synccompactor: pass runCtx into doOneCompaction so c.runDuration actually bounds the loop#899
Merged
Merged
Conversation
Compact() built runCtx = context.WithTimeout(ctx, c.runDuration) but the loop body called doOneCompaction(ctx, ...). The deadline only gated the pre-flight select; individual compactions ran under the unbounded parent ctx and could blow past c.runDuration. This is the sibling of the bug fixed in #893 (WriteEnvelope walking under an unbounded context): a deadline-bearing context exists but the inner work uses a different one, so the deadline dangles instead of bounding the work. Pass runCtx into doOneCompaction so c.runDuration actually limits the loop, and when runCtx fires due to its deadline (rather than parent cancellation), surface the same clean "compaction run duration has expired" message the pre-flight check already uses — instead of letting a bare context.DeadlineExceeded bubble out of inner SQLite operations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
General PR Review: synccompactor: pass runCtx into doOneCompaction so c.runDuration actually bounds the loopBlocking Issues: 0 | Suggestions: 0 | Threads Resolved: 0 Review SummaryThe new commit adds two tests covering the Security IssuesNone found. Correctness IssuesNone found. SuggestionsNone. |
…mbiguation TestCompactorRunDurationExpired locks in the user-visible contract that when c.runDuration expires (with parent ctx still alive), Compact returns the clean "compaction run duration has expired" error wrapping context.DeadlineExceeded. TestCompactorParentCtxCancelDoesNotMaskAsRunDuration locks in the disambiguation between runDuration expiry and parent ctx cancellation — a parent-cancelled runCtx must surface as the raw cause (context.Canceled) and must NOT be misreported as runDuration expiry. Without the disambiguation either branch could silently regress and mislabel cancellation as a deadline. Both tests reuse the existing makeEmptySync helper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
btipling
approved these changes
May 30, 2026
btipling
left a comment
Contributor
There was a problem hiding this comment.
semantically, it should be correct for doOneCompaction to use the c.runDuration context.
mj-palanker
approved these changes
May 30, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Compact()buildsrunCtx = context.WithTimeout(ctx, c.runDuration)but the loop callsdoOneCompaction(ctx, ...). The deadline only gates the pre-flight select; individual compactions then run under the unbounded parent ctx and can blow pastc.runDuration.This is the sibling of the bug fixed in #893 (
WriteEnvelopewalked under an unbounded context). Same shape: a deadline-bearing context exists but inner work uses a different one, so the deadline dangles instead of bounding the work.Change
pkg/synccompactor/compactor.go: passrunCtxintodoOneCompactionsoc.runDurationactually limits the loop.runCtxfires due to its own deadline (not parent cancellation), surface the same clean "compaction run duration has expired" message the pre-flight select already uses, instead of letting a barecontext.DeadlineExceededbubble out of inner SQLite operations. The error is still returned so callers can decide whether to retry.Impact
Caller-visible behavior change for callers that pass
WithRunDuration:c.runDurationonly checked at entry; individual compactions could run as long as the parent ctx allowed. If they did eventually fail (e.g. heartbeat timeout during a long SQLite operation), the error was a rawcontext deadline exceededfrom insideExecContext.doOneCompactionis bounded byc.runDuration. When the deadline fires, the loop exits withcompaction run duration has expired: context deadline exceeded, and the structured log identifies which sync was in flight and how many remain.Callers that don't use
WithRunDurationare unaffected —runCtx == ctxin that branch.Test plan
go build ./pkg/synccompactor/...go vet ./pkg/synccompactor/...go test ./pkg/synccompactor/...(all existing tests pass)Context
Surfaced by a code-review bot on a separate compaction stuck-state investigation. Lilly's Entra connector compaction was failing at ~98 min with
context deadline exceededfrom insideCompactGrantseven thoughWithRunDuration(1h)was passed — becauserunCtxwas created but not threaded into the loop, the deadline never fired cleanly; instead the activity hit its Temporal heartbeat timeout while SQLite was blocking.This PR alone won't unblock Lilly (the compaction still doesn't fit in 1h), but it makes the failure intentional and legible, and matches the contract the option name implies.
🤖 Generated with Claude Code