Skip to content

synccompactor: pass runCtx into doOneCompaction so c.runDuration actually bounds the loop#899

Merged
arreyder merged 2 commits into
mainfrom
arreyder/synccompactor-runctx-in-loop
May 30, 2026
Merged

synccompactor: pass runCtx into doOneCompaction so c.runDuration actually bounds the loop#899
arreyder merged 2 commits into
mainfrom
arreyder/synccompactor-runctx-in-loop

Conversation

@arreyder

Copy link
Copy Markdown
Contributor

Summary

Compact() builds runCtx = context.WithTimeout(ctx, c.runDuration) but the loop calls doOneCompaction(ctx, ...). The deadline only gates the pre-flight select; individual compactions then run under the unbounded parent ctx and can blow past c.runDuration.

This is the sibling of the bug fixed in #893 (WriteEnvelope walked under an unbounded context). Same shape: a deadline-bearing context exists but inner work uses a different one, so the deadline dangles instead of bounding the work.

Change

  • pkg/synccompactor/compactor.go: pass runCtx into doOneCompaction so c.runDuration actually limits the loop.
  • When runCtx fires due to its own deadline (not parent cancellation), surface the same clean "compaction run duration has expired" message the pre-flight select already uses, instead of letting a bare context.DeadlineExceeded bubble out of inner SQLite operations. The error is still returned so callers can decide whether to retry.

Impact

Caller-visible behavior change for callers that pass WithRunDuration:

  • Before: c.runDuration only checked at entry; individual compactions could run as long as the parent ctx allowed. If they did eventually fail (e.g. heartbeat timeout during a long SQLite operation), the error was a raw context deadline exceeded from inside ExecContext.
  • After: each doOneCompaction is bounded by c.runDuration. When the deadline fires, the loop exits with compaction run duration has expired: context deadline exceeded, and the structured log identifies which sync was in flight and how many remain.

Callers that don't use WithRunDuration are unaffected — runCtx == ctx in that branch.

Test plan

  • go build ./pkg/synccompactor/...
  • go vet ./pkg/synccompactor/...
  • go test ./pkg/synccompactor/... (all existing tests pass)
  • CI

Context

Surfaced by a code-review bot on a separate compaction stuck-state investigation. Lilly's Entra connector compaction was failing at ~98 min with context deadline exceeded from inside CompactGrants even though WithRunDuration(1h) was passed — because runCtx was created but not threaded into the loop, the deadline never fired cleanly; instead the activity hit its Temporal heartbeat timeout while SQLite was blocking.

This PR alone won't unblock Lilly (the compaction still doesn't fit in 1h), but it makes the failure intentional and legible, and matches the contract the option name implies.

🤖 Generated with Claude Code

Compact() built runCtx = context.WithTimeout(ctx, c.runDuration) but the
loop body called doOneCompaction(ctx, ...). The deadline only gated the
pre-flight select; individual compactions ran under the unbounded parent
ctx and could blow past c.runDuration.

This is the sibling of the bug fixed in #893 (WriteEnvelope walking under
an unbounded context): a deadline-bearing context exists but the inner
work uses a different one, so the deadline dangles instead of bounding
the work.

Pass runCtx into doOneCompaction so c.runDuration actually limits the
loop, and when runCtx fires due to its deadline (rather than parent
cancellation), surface the same clean "compaction run duration has
expired" message the pre-flight check already uses — instead of letting
a bare context.DeadlineExceeded bubble out of inner SQLite operations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@arreyder arreyder requested a review from a team May 30, 2026 01:26
@github-actions

github-actions Bot commented May 30, 2026

Copy link
Copy Markdown
Contributor

General PR Review: synccompactor: pass runCtx into doOneCompaction so c.runDuration actually bounds the loop

Blocking Issues: 0 | Suggestions: 0 | Threads Resolved: 0
Review mode: incremental since fa64fc5
View review run

Review Summary

The new commit adds two tests covering the runCtx-in-loop fix: TestCompactorRunDurationExpired verifies that a 1ns run-duration budget triggers the clean "compaction run duration has expired" error wrapping context.DeadlineExceeded, and TestCompactorParentCtxCancelDoesNotMaskAsRunDuration confirms that a parent-context cancellation is not misreported as a run-duration expiry. The production code and tests are correct with no new issues found.

Security Issues

None found.

Correctness Issues

None found.

Suggestions

None.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking issues found.

…mbiguation

TestCompactorRunDurationExpired locks in the user-visible contract that
when c.runDuration expires (with parent ctx still alive), Compact returns
the clean "compaction run duration has expired" error wrapping
context.DeadlineExceeded.

TestCompactorParentCtxCancelDoesNotMaskAsRunDuration locks in the
disambiguation between runDuration expiry and parent ctx cancellation —
a parent-cancelled runCtx must surface as the raw cause (context.Canceled)
and must NOT be misreported as runDuration expiry. Without the disambiguation
either branch could silently regress and mislabel cancellation as a deadline.

Both tests reuse the existing makeEmptySync helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking issues found.

@arreyder arreyder requested a review from btipling May 30, 2026 03:28

@btipling btipling left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

semantically, it should be correct for doOneCompaction to use the c.runDuration context.

@arreyder arreyder merged commit 3016c91 into main May 30, 2026
7 checks passed
@arreyder arreyder deleted the arreyder/synccompactor-runctx-in-loop branch May 30, 2026 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants