feat(agent-core): rework compaction to keep only user prompts and summary #1214
🦋 Changeset detected. Latest commit: 30c633b. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 40b3d7aaf2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Rework the full-compaction summary to read as the agent's own continuing notes instead of a third-party report:
- compaction-instruction.md: free-form first-person continuation that preserves exact commands, paths and outcomes, states the precise next action, and flags claimed-but-unverified work rather than trusting it.
- compaction-summary-prefix.md: skeptical "your own working notes" framing; drop the collaborative third-party prefix.
- system.md: add compaction-awareness guidance so the model continues naturally from a summary and re-checks any reported "done".
- Rename the compaction helpers module to handoff.ts.
Update tests and regenerate snapshots for the new prompt text, and fill in contextSummary in the restored-compaction replay expectations.
💡 Codex Review
Reviewed commit: 8f1aa54673
export const COMPACTION_SUMMARY_PREFIX = summaryPrefixTemplate.trimEnd();
export const COMPACT_USER_MESSAGE_MAX_TOKENS = 20_000;
Make kept user prompt budget fit the model window
For models with smaller context windows (for example 16k, or after observeContextOverflow lowers the effective window), compaction can retain up to 20k tokens of recent user prompts before adding the summary, system prompt, and tools. Since ContextMemory.applyCompaction uses this fixed default budget, the compacted request can still exceed the model window and overflow recovery will keep re-compacting without ever shrinking the retained user suffix. Derive this cap from the effective model context instead of using an unconditional 20k.
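One way to read this suggestion is sketched below. It is only an illustration: `keptUserTokenBudget`, `reservedTokens`, and the 25% share are assumptions made for the example, not names or values from the PR. The fixed 20k default becomes an upper bound, and the actual cap shrinks with the effective window.

```typescript
// Hypothetical sketch: derive the kept-user-prompt cap from the effective window.
// Only COMPACT_USER_MESSAGE_MAX_TOKENS comes from the diff; everything else is assumed.
const COMPACT_USER_MESSAGE_MAX_TOKENS = 20_000;

interface KeptUserBudgetInput {
  /** Model window after any overflow-driven lowering. */
  effectiveContextWindow: number;
  /** Assumed allowance for summary, system prompt, tools, and reply headroom. */
  reservedTokens: number;
  /** Assumed share of the remaining window that kept prompts may use. */
  maxFraction?: number;
}

function keptUserTokenBudget(input: KeptUserBudgetInput): number {
  const { effectiveContextWindow, reservedTokens, maxFraction = 0.25 } = input;
  // Never go negative when the reserved part alone exceeds the window.
  const available = Math.max(0, effectiveContextWindow - reservedTokens);
  const scaled = Math.floor(available * maxFraction);
  // The old fixed default survives as a ceiling for large-window models.
  return Math.min(COMPACT_USER_MESSAGE_MAX_TOKENS, scaled);
}
```

With a 16k window and 8k reserved, this yields a 2k cap instead of 20k, so a re-compaction after overflow has something left to shrink.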
if (this.agent.context.history.length === 0) {
  throw new KimiError(ErrorCodes.COMPACTION_UNABLE, 'No messages to compact in current history.');
Reject compaction while a tool exchange is open
When an SDK or REST caller starts compaction while the history ends with an assistant tool call whose result has not arrived, this new non-empty-history check lets compaction proceed instead of rejecting the unsafe state. The compaction projector trims that open exchange from the summarizer input, and applyCompaction clears the pending tool ids, so the later tool.result is treated as an orphan and dropped. Keep a compactability/open-tool-exchange guard here, or defer compaction until the tool exchange closes.
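A guard of the kind requested here could look roughly like the following. The message shape and the names `openToolCallIds` and `assertCompactable` are invented for this sketch; the real history types in agent-core are not shown in the conversation.

```typescript
// Hypothetical message shape, simplified for illustration.
type Msg =
  | { role: 'user'; text: string }
  | { role: 'assistant'; text: string; toolCallIds: string[] }
  | { role: 'tool'; toolCallId: string; text: string };

/** Returns the ids of tool calls that have no recorded result yet. */
function openToolCallIds(history: readonly Msg[]): string[] {
  const open = new Set<string>();
  for (const m of history) {
    if (m.role === 'assistant') {
      for (const id of m.toolCallIds) open.add(id);
    } else if (m.role === 'tool') {
      open.delete(m.toolCallId);
    }
  }
  return Array.from(open);
}

/** Rejects compaction for an empty history or while a tool exchange is open. */
function assertCompactable(history: readonly Msg[]): void {
  if (history.length === 0) {
    throw new Error('No messages to compact in current history.');
  }
  const open = openToolCallIds(history);
  if (open.length > 0) {
    throw new Error(`Cannot compact while tool calls are open: ${open.join(', ')}`);
  }
}
```

Deferring compaction until the exchange closes would be the alternative; the sketch only covers the reject path.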
estimateTokensForContentPart returned 0 for image_url/audio_url/video_url, so auto-compaction triggers, the overflow-shrink budget, the kept-user budget, and the reported context size all went blind to media — a media-heavy session could overflow the model window while the estimate reported a near-empty context. Media parts now carry a fixed estimate (MEDIA_TOKEN_ESTIMATE), and the content-part switch is exhaustive so a new ContentPart kind must declare its estimate rather than silently count as zero.
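The exhaustive-switch idea can be sketched as follows. The `ContentPart` union here is a reduced stand-in, the value of `MEDIA_TOKEN_ESTIMATE` is a placeholder (the real constant is not quoted in the conversation), and the four-characters-per-token text heuristic is an assumption.

```typescript
// Reduced stand-in for the real ContentPart union.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; url: string }
  | { type: 'audio_url'; url: string }
  | { type: 'video_url'; url: string };

// Placeholder value; the actual constant is not shown in the PR conversation.
const MEDIA_TOKEN_ESTIMATE = 1_000;

function estimateTokensForContentPart(part: ContentPart): number {
  switch (part.type) {
    case 'text':
      // Assumed heuristic: roughly four characters per token.
      return Math.ceil(part.text.length / 4);
    case 'image_url':
    case 'audio_url':
    case 'video_url':
      // Media can no longer count as zero.
      return MEDIA_TOKEN_ESTIMATE;
    default: {
      // Adding a new ContentPart kind without a case makes this line a compile error.
      const unhandled: never = part;
      throw new Error(`Unhandled content part: ${JSON.stringify(unhandled)}`);
    }
  }
}
```

The `never` assignment is what forces a new part kind to declare its estimate at compile time rather than silently falling through.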
💡 Codex Review
Reviewed commit: 72657f2350
Folding the live context to [recent user prompts, summary] drops the messages that started background tasks and their status updates, so the model could forget a task is still running and spawn a duplicate. injectAfterCompaction now appends a system-reminder listing active background tasks (with guidance to use TaskOutput/TaskList/TaskStop instead of re-spawning). It runs only post-compaction and carries an injection origin, so the next compaction drops and rebuilds it rather than stacking copies; the all-user-role post-compaction shape is preserved (no tool-pairing reintroduced).
💡 Codex Review
Reviewed commit: 782d73ad42
const messages = [
  ...this.agent.context.project(messagesToCompact),
  createUserMessage(renderPrompt(compactionInstructionTemplate, { customInstruction: data.instruction ?? '' })),
  ...this.agent.context.project(historyForModel),
Rebase micro-compaction cutoff after shrinking history
When the compaction request overflows and historyForModel is shrunk to a recent suffix, this still projects that suffix through ContextMemory.project(), which applies MicroCompaction.compact() using the original history cutoff as if the slice started at index 0. With experimental micro-compaction enabled and a cutoff before the retained suffix, recent tool results in the suffix can be replaced by the old-result marker, so the summary is generated without the fresh tool output it is supposed to preserve. Rebase the cutoff by the number of dropped messages or bypass micro-compaction for the already-shrunk suffix.
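The rebasing option can be illustrated with a toy model. Everything below is a simplified assumption: `microCompact` stands in for `MicroCompaction.compact()`, the marker text is a placeholder, and `projectShrunkSuffix` is a hypothetical helper, not the project's API.

```typescript
interface HistMsg {
  role: 'user' | 'assistant' | 'tool';
  text: string;
}

// Placeholder marker text; the real old-result marker is not quoted in the PR.
const OLD_RESULT_MARKER = '[old tool result cleared]';

/** Toy micro-compaction: clears tool results that sit before the cutoff index. */
function microCompact(messages: readonly HistMsg[], cutoff: number): HistMsg[] {
  return messages.map((m, i) =>
    m.role === 'tool' && i < cutoff ? { ...m, text: OLD_RESULT_MARKER } : m,
  );
}

/** Shifts a cutoff expressed in full-history indices onto a suffix. */
function rebaseCutoff(originalCutoff: number, droppedCount: number): number {
  return Math.max(0, originalCutoff - droppedCount);
}

/** Projects a shrunk suffix with the cutoff rebased by the number of dropped messages. */
function projectShrunkSuffix(
  history: readonly HistMsg[],
  droppedCount: number,
  originalCutoff: number,
): HistMsg[] {
  const suffix = history.slice(droppedCount);
  return microCompact(suffix, rebaseCutoff(originalCutoff, droppedCount));
}
```

With six messages, a cutoff of 5, and four dropped messages, the rebased cutoff is 1, so the tool result at suffix index 1 stays intact. Applying the unrebased cutoff of 5 to the same suffix would clear it.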
Adds compaction-scenarios.test.ts driving the real Agent/ContextMemory/FullCompaction machinery:
- A guard test locking in that repeated compaction folds the prior summary into the new one instead of stacking two summaries.
- Seven `it.fails` probes that executably reproduce known, currently-accepted edge-case defects so the suite stays green while documenting each one precisely; any of them will flip red (forcing removal of `.fails`) the day the behavior is fixed.
The probes cover:
- assistant/tool appended during an in-flight summarizer call being dropped;
- unbounded shrink on empty summaries;
- the fixed 20k kept-user budget overflowing a small model window;
- a tool result orphaned when compaction starts mid-exchange;
- legacy compaction records dropping their verbatim tail on replay;
- micro-compaction clearing recent tool results in an overflow-shrunk suffix;
- media being discarded when the oldest kept user message is truncated.
💡 Codex Review
Reviewed commit: 1ddc463f0c
…ontext A tool call and its result can end up non-adjacent in history — a background-task notification or flushed steer lands between them, or an interrupted/nested step delays the result — which strict providers reject with HTTP 400. The projector now moves each tool_use's result up to immediately follow it (projection-time only; the stored history is untouched), and full compaction projects its summarizer input with a synthetic result for any still-open call so the summary request stays well-formed. Micro-compaction only surfaced this latent ordering by busting the prompt cache, so it now defaults off. Includes projector adjacency regression tests, a context-level integration test, and a compaction synthesize-missing guard; the prior "keeps an unresolved tool exchange out of the compaction prompt" test is updated to the now-well-formed (synthetic-result) behavior.
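The projection-time repair described in this commit can be approximated as below. This is a sketch under assumptions: the message shape and `repairToolAdjacency` are invented for illustration, the synthetic result text is a placeholder, and dropping results whose call is absent is a simplification of this sketch rather than confirmed projector behavior.

```typescript
type PMsg =
  | { role: 'user'; text: string }
  | { role: 'assistant'; text: string; toolCallIds: string[] }
  | { role: 'tool'; toolCallId: string; text: string };

/**
 * Returns a new array in which each tool result directly follows the assistant
 * message that issued the call. The input (the stored history) is not mutated.
 */
function repairToolAdjacency(messages: readonly PMsg[], synthesizeMissing = false): PMsg[] {
  // Index the first recorded result for every call id.
  const results = new Map<string, PMsg>();
  for (const m of messages) {
    if (m.role === 'tool' && !results.has(m.toolCallId)) results.set(m.toolCallId, m);
  }

  const out: PMsg[] = [];
  for (const m of messages) {
    // Results are re-emitted next to their call; results without a call are dropped here.
    if (m.role === 'tool') continue;
    out.push(m);
    if (m.role !== 'assistant') continue;
    for (const id of m.toolCallIds) {
      const result = results.get(id);
      if (result !== undefined) {
        out.push(result);
      } else if (synthesizeMissing) {
        // Placeholder text for a still-open call, used only for the summarizer request.
        out.push({ role: 'tool', toolCallId: id, text: '[no result recorded]' });
      }
    }
  }
  return out;
}
```

A history of `[assistant(call a), notification, tool result a]` projects to `[assistant(call a), tool result a, notification]`, which is the adjacency strict providers expect.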
💡 Codex Review
Reviewed commit: e8f0589bc3
…pactions A pre-rework `context.apply_compaction` record used `[summary, ...history.slice(compactedCount)]` semantics and kept a verbatim recent tail, but it has no `keptUserMessageCount`. The reworked applyCompaction re-folded such records into the all-user shape, dropping the recent assistant/tool tail — so resuming a session compacted by an older version silently lost its most recent context. On restore of such a record (gated on records.restoring, no keptUserMessageCount, and compactedCount < history length) reproduce the old shape instead. The forward/live path is unchanged; the projector's tool-adjacency repair keeps the restored tail well-formed, and compaction only runs at clean step boundaries so the tail has no open exchange. The legacy-tail probe now passes as a regression guard via the real restore path.
💡 Codex Review
Reviewed commit: 914ae179c0
The transcript reducer re-derived foldedLength for pre-rework context.apply_compaction records (no keptUserMessageCount) using the new kept-user+summary rule, but ContextMemory's restore now reproduces the legacy [summary, ...history.slice(compactedCount)] shape for those records. The two diverged for legacy sessions, so MessageService's foldedLength-vs-live-history comparison could mis-handle GET /messages (miss or misorder recent output). The reducer now mirrors the live legacy fold: when compactedCount is below the pre-compaction length it computes 1 + (length - compactedCount); otherwise it falls back to the kept-user derivation. The MessageService transcript test's fixture is corrected to a new-format record, matching its all-user live mock.
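The arithmetic in this commit message can be written out as a small function. The function name, the record shape, and the `keptUserFallback` callback are assumptions for illustration. In particular, treating the new-format folded length as kept-user count plus one summary message is inferred from the `[...keptUserMessages, summaryMessage]` shape, not stated by the source.

```typescript
interface CompactionRecord {
  compactedCount: number;
  /** Absent on records written before the rework. */
  keptUserMessageCount?: number;
}

/**
 * Length of the live context after a compaction record is applied.
 * keptUserFallback is a hypothetical hook that derives the kept-user count
 * when a legacy record compacted the whole history.
 */
function foldedLengthFor(
  record: CompactionRecord,
  preCompactionLength: number,
  keptUserFallback: () => number,
): number {
  if (record.keptUserMessageCount !== undefined) {
    // New format: kept user prompts plus one summary message (assumed).
    return record.keptUserMessageCount + 1;
  }
  if (record.compactedCount < preCompactionLength) {
    // Legacy fold: summary plus the verbatim tail history.slice(compactedCount).
    return 1 + (preCompactionLength - record.compactedCount);
  }
  // Legacy record with no tail: fall back to the kept-user derivation.
  return keptUserFallback() + 1;
}
```

For a legacy record with `compactedCount = 6` over a 10-message history, the result is `1 + (10 - 6) = 5`: one summary and four tail messages.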
The Anthropic message merge keyed on isToolResultOnly(last) === isToolResultOnly(converted), which left a tool_result-only user turn followed by a plain-text user turn unmerged. After tool-exchange repair this shape (assistant tool_use -> tool_result -> injected notification) produces two adjacent user messages, which strict Anthropic-compatible backends reject with HTTP 400. Switch to the asymmetric predicate isToolResultOnly(last) || !isToolResultOnly(converted): a tool-result-only running message absorbs whatever user turn follows (parallel tool_results or a trailing text), yielding a valid [tool_result, ..., text] message; a plain-text running message still only absorbs plain text. [tool_result, text] is valid for both native Anthropic (which concatenates anyway) and strict backends.
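The asymmetric predicate reads as follows when written out. The block and message types are simplified stand-ins, and this version of `mergeConsecutiveUserMessages` is a sketch of the described rule, not the provider's actual code.

```typescript
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_result'; toolUseId: string; content: string };

interface WireMsg {
  role: 'user' | 'assistant';
  content: Block[];
}

const isToolResultOnly = (m: WireMsg): boolean =>
  m.content.length > 0 && m.content.every((b) => b.type === 'tool_result');

/** Merges adjacent user messages using the asymmetric predicate from the commit message. */
function mergeConsecutiveUserMessages(messages: readonly WireMsg[]): WireMsg[] {
  const out: WireMsg[] = [];
  for (const converted of messages) {
    const last = out[out.length - 1];
    if (
      last !== undefined &&
      last.role === 'user' &&
      converted.role === 'user' &&
      // A tool-result-only running message absorbs anything;
      // a plain-text running message absorbs only plain text.
      (isToolResultOnly(last) || !isToolResultOnly(converted))
    ) {
      last.content.push(...converted.content);
    } else {
      // Copy so the caller's arrays are never mutated by later merges.
      out.push({ role: converted.role, content: [...converted.content] });
    }
  }
  return out;
}
```

Under the old symmetric predicate, `isToolResultOnly(last) === isToolResultOnly(converted)`, a tool_result turn followed by a text turn compared `true === false` and stayed as two user messages.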
💡 Codex Review
Reviewed commit: 24a508364b
The 'does not clear recent tool results when projecting a shrunk suffix' probe is an it.fails that only documents a real defect while micro-compaction is active. It inherited the ambient KIMI_CODE_EXPERIMENTAL master switch, so its pass/fail flipped with the runner: green locally (master switch on) but a hard failure in CI, where the flag defaults off and MicroCompaction.compact() is a no-op that leaves the tool result intact. Enable KIMI_CODE_EXPERIMENTAL_MICRO_COMPACTION explicitly for this probe so it deterministically exercises the micro-compaction path regardless of the environment.
💡 Codex Review
Reviewed commit: 572e17edcb
const summaryMessage: ContextMessage = {
  role: 'user',
  content: [{ type: 'text', text: contextSummary }],
Preserve legacy summary role on restore
When resuming pre-rework context.apply_compaction records without keptUserMessageCount, the legacy branch now preserves the old tail but still reuses this hard-coded user-role summaryMessage. Those records originally restored the summary as an assistant message, so upgraded sessions with a legacy compacted tail will send the same summary as a new user turn, changing provider role alternation and how the summary is interpreted. Fresh evidence: the new isLegacyRestore branch below keeps the legacy tail but does not create a legacy-role summary.
…unded shrink, and media loss
Three compaction-path fixes surfaced by review, each flipping its documenting it.fails probe to a passing it:
- Append race (CMP-02): after the summarizer returns, the post-summary history check only compared the compacted prefix. A live step appending to the tail while a manual/SDK compaction was in flight slipped through: an appended assistant/tool turn is neither summarized (the summary covers only the snapshot) nor kept (the rebuild keeps user input), so it vanished. Now cancel when the appended tail contains a non-user message; an appended user message is still kept (rebuild picks it up), preserving the existing 'keeps messages appended while compacting an unchanged prefix' behavior.
- Unbounded empty/truncated shrink: an empty or truncated summary dropped the oldest message and reset retryCount, so a model that kept returning empty could issue ~one request per history entry. Bound the shrink attempts by MAX_COMPACTION_RETRY_ATTEMPTS, mirroring the overflow-shrink counter.
- Media dropped on truncation (CMP-07): truncating the oldest kept user message replaced its whole content with one text block, discarding any image/audio/video. Keep the non-text parts and spend the remaining budget (maxTokens minus their cost) on truncated text.
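The bounded-shrink fix can be sketched in isolation. The summarizer is synchronous here to keep the example short (the real call is presumably asynchronous), the value of `MAX_COMPACTION_RETRY_ATTEMPTS` is a placeholder, and `summarizeWithBoundedShrink` is a hypothetical name.

```typescript
// Placeholder value; the real constant is not quoted in the PR conversation.
const MAX_COMPACTION_RETRY_ATTEMPTS = 3;

type Summarize = (input: readonly string[]) => string;

/**
 * Retries with a shorter input when the summary comes back empty, but gives up
 * after a fixed number of shrink attempts instead of walking the whole history.
 */
function summarizeWithBoundedShrink(history: readonly string[], summarize: Summarize): string {
  let input = history;
  let shrinkAttempts = 0;
  for (;;) {
    const summary = summarize(input).trim();
    if (summary.length > 0) return summary;
    if (shrinkAttempts >= MAX_COMPACTION_RETRY_ATTEMPTS || input.length <= 1) {
      throw new Error(`Summary still empty after ${shrinkAttempts} shrink attempts`);
    }
    // The counter is never reset, so the request count stays bounded.
    shrinkAttempts += 1;
    input = input.slice(1); // drop the oldest message and retry
  }
}
```

With a summarizer that always returns an empty string, a 50-entry history now costs four requests (one initial plus three shrinks) rather than about fifty.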
For a pre-rework context.apply_compaction record (no keptUserMessageCount), agent-core's ContextMemory restore and the transcript reducer keep the old [summary, ...history.slice(compactedCount)] tail — a verbatim recent tail including assistant/tool. The vis model-mode projector always applied the new kept-user selection, so opening an older compacted session in model mode hid the assistant/tool tail the resumed agent still holds (and surfaced a pre-compaction user message the agent dropped). Branch on a missing keptUserMessageCount with compactedCount < history length and reproduce the legacy shape, matching the agent-core restore.
💡 Codex Review
Reviewed commit: 0b7562b7b2
Resolve conflicts from main adding the plugin_command PromptOrigin kind alongside this branch's compaction rework, which centralized real-user detection into isRealUserInput / compactionUserMessageDisposition:
- Drop the per-file isRealUserPrompt helpers main re-added in sessionService, context, transcript, and the vis projector. Every caller on this branch already routes through isRealUserInput, so they were dead duplicates. Refreshed two stale comments that still named isRealUserPrompt in the vis projector.
- Teach compactionUserMessageDisposition the new plugin_command kind (user-slash => keep, like skill_activation), restoring the exhaustive switch, and add it to the handoff disposition test's origin matrices.
💡 Codex Review
Reviewed commit: 3cb4a9e947
const isLegacyRestore =
  this.agent.records.restoring !== null &&
  input.keptUserMessageCount === undefined &&
  input.compactedCount < this._history.length;
Preserve summary-only legacy compactions
When resuming a pre-rework context.apply_compaction record that compacted the entire live history (keptUserMessageCount absent and compactedCount === this._history.length), this < guard routes to the new kept-user branch. Old versions restored [summary, ...history.slice(compactedCount)], so this case should restore only the summary; the current branch resurrects user prompts that were already summarized and changes the model context for upgraded sessions. Fresh evidence: the new legacy branch now handles only compactedCount < this._history.length, leaving equality in the new path.
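One possible reading of this suggestion is to widen the guard from `<` to `<=`, sketched below on a generic message type. `rebuildHistory` and its parameters are hypothetical; only the two array shapes come from the quoted diff.

```typescript
interface ApplyCompactionInput {
  compactedCount: number;
  /** Absent on records written before the rework. */
  keptUserMessageCount?: number;
}

/** Rebuilds the live history after a compaction record is applied. */
function rebuildHistory<T>(
  history: readonly T[],
  summary: T,
  keptUserMessages: readonly T[],
  input: ApplyCompactionInput,
  restoring: boolean,
): T[] {
  // "<=" (instead of "<") also routes summary-only legacy records to the legacy shape.
  const isLegacyRestore =
    restoring &&
    input.keptUserMessageCount === undefined &&
    input.compactedCount <= history.length;
  return isLegacyRestore
    ? [summary, ...history.slice(input.compactedCount)] // slice is empty when everything was compacted
    : [...keptUserMessages, summary];
}
```

When `compactedCount` equals the history length, the slice is empty and the restore yields only the summary, matching the old `[summary, ...history.slice(compactedCount)]` semantics.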
The in-flight append guard cancelled only when the tail grew with a non-user role. A user-role message that compaction would still drop — a background-task notification, hook/cron reminder, or shell-command output — slipped through: appended after the summary snapshot (so absent from the summary) and dropped by the all-user rebuild (which keeps only real user input), vanishing silently. Key the guard on the same predicate applyCompaction uses (!isRealUserInput) so it cancels whenever the appended tail holds anything compaction would drop. A real user message is still kept, so a live user turn racing a manual/SDK compaction continues to complete.
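The guard described above can be sketched as a single predicate over the appended tail. The message shape, the origin values, and this stand-in for `isRealUserInput` are all assumptions; the real predicate lives in agent-core and is not quoted in the conversation.

```typescript
interface OriginMsg {
  role: 'user' | 'assistant' | 'tool';
  /** Hypothetical origin tag; undefined means typed by the user. */
  origin?: 'user' | 'background_task' | 'hook' | 'shell_output';
}

/** Stand-in for the real isRealUserInput predicate. */
function isRealUserInput(m: OriginMsg): boolean {
  return m.role === 'user' && (m.origin === undefined || m.origin === 'user');
}

/**
 * Cancels when anything appended after the summary snapshot would be dropped
 * by the all-user rebuild, since it is also absent from the summary.
 */
function shouldCancelCompaction(snapshotLength: number, current: readonly OriginMsg[]): boolean {
  return current.slice(snapshotLength).some((m) => !isRealUserInput(m));
}
```

Using the same predicate for the guard and for the rebuild keeps the two from drifting apart: whatever the rebuild would discard is exactly what triggers the cancel.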
The transcript reducer's legacy fallback (records predating keptUserMessageCount, compacted with no verbatim tail) re-derived the kept-user count from the whole transcript, including messages before the last context.clear. Live ContextMemory rebuilds _history from post-clear messages only, so counting pre-clear prompts overstated foldedLength; MessageService then saw context.history.length <= foldedLength and skipped appending unflushed live tail messages, dropping recent output from the messages endpoint for old sessions compacted after a clear. Derive only from entries at or after clearFloor to match the live context.
💡 Codex Review
Reviewed commit: f3dbf4bd88
Revert the media-preserving truncation: keeping non-text parts on the truncated boundary message overshot the kept-user budget when the media alone exceeded it, and reordered interleaved text/media parts. Both codex (no media-aware truncation) and Claude Code (strips media at compaction) decline to preserve media on a truncated message, since media cannot be partially truncated and keeping it whole breaks the budget. truncateUserMessage now keeps only the truncated text. Recent messages that fit the budget are still kept verbatim with their media; only the oldest, partially-overflowing boundary message loses its attachments.
A manual/SDK compaction could start while a turn was streaming, or a new
turn could launch while a compaction was in flight. Either way the turn
mutates the shared context (streaming content into an existing assistant
message, or appending new messages) during the summarizer await, and that
output is neither summarized nor preserved by the all-user rebuild —
silent loss that object-identity checks can't detect (the streamed message
is mutated in place).
Guard both directions so the agent does one of {turn, compaction} at a
time: begin() refuses a manual compaction while a turn is active, and
launch() refuses a new turn while a compaction is in progress. Auto
compaction is exempt — it runs from within the turn at a step boundary,
which blocks the turn for its duration.
💡 Codex Review
Reviewed commit: b0fd186b72
// most recent user messages followed by a user-role summary. The
// transcript keeps the full history and appends the summary marker;
// foldedLength tracks the post-compaction live context length.
transcript.push({
Insert legacy compaction summaries before retained tails
For legacy context.apply_compaction records that lack keptUserMessageCount and keep a verbatim tail (record.compactedCount < foldedLength), this unconditional append puts the summary marker after the retained tail in the REST/web transcript. Those records used the old [summary, ...history.slice(compactedCount)] model shape, so the tail was not summarized and should appear after the summary marker; otherwise older compacted sessions show messages in the wrong order and make the summary look like it covers tail messages it never included.
💡 Codex Review
Reviewed commit: 7ceef0014d
💡 Codex Review
Reviewed commit: fa519a74be
…ction Drive real compaction output and the compaction summarizer projection through the real Anthropic provider conversion and assert the wire request is well-formed: strict user/assistant alternation and every tool_use answered by an adjacent tool_result. Locks in the cross-layer guarantee (projector merge + Anthropic consecutive-user merge + adjacency repair + synthesizeMissing) that compacted sessions stay valid for strict Anthropic-compatible backends.
💡 Codex Review
Reviewed commit: fd2d46224e
    start = i;
  }
  if (start === 0) start = 1;
  return dropLeadingToolResults(messages.slice(start));
Drop orphan tool results after shrinking compaction input
When the summarizer request overflows, this suffix can start after an assistant tool_use but before its delayed tool.result if a notification/user message sits between them. dropLeadingToolResults() only removes tool messages at index 0, so a retained suffix like [notification, tool.result] still sends an orphan tool result; project(..., { synthesizeMissing: true }) cannot repair it without the assistant call, and strict providers can reject the compaction request. Pick a safe split or filter tool results whose call is no longer present in the retained suffix.
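The filtering option mentioned at the end of this comment might look like the sketch below. The message shape and `dropOrphanToolResults` are hypothetical; the idea is to remove any tool result whose call does not appear earlier in the retained suffix, not only those at index 0.

```typescript
type SMsg =
  | { role: 'user'; text: string }
  | { role: 'assistant'; text: string; toolCallIds: string[] }
  | { role: 'tool'; toolCallId: string; text: string };

/** Drops tool results whose originating call is not present earlier in the suffix. */
function dropOrphanToolResults(suffix: readonly SMsg[]): SMsg[] {
  const seenCalls = new Set<string>();
  const out: SMsg[] = [];
  for (const m of suffix) {
    if (m.role === 'assistant') {
      for (const id of m.toolCallIds) seenCalls.add(id);
    }
    // A result without a preceding call in this suffix cannot be repaired, so skip it.
    if (m.role === 'tool' && !seenCalls.has(m.toolCallId)) continue;
    out.push(m);
  }
  return out;
}
```

For a suffix of `[notification, tool.result]` this returns only the notification, while a complete `[assistant call, tool.result]` pair passes through unchanged.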
# Conflicts:
#   packages/agent-core/src/agent/compaction/full.ts
#   packages/agent-core/test/agent/compaction/full.test.ts
💡 Codex Review
Reviewed commit: 7a1d84b271
  input.compactedCount < this._history.length;
this._history = isLegacyRestore
  ? [summaryMessage, ...this._history.slice(input.compactedCount)]
  : [...keptUserMessages, summaryMessage];
Coalesce kept user prompts for Google providers
When compaction keeps any real user prompt, this rebuild creates adjacent role: 'user' messages: the kept prompt(s) followed by the summary. Anthropic now has a provider-side merge pass, but I checked messagesToGoogleGenAIContents and the Google/Vertex path pushes each non-tool message as a separate Content (packages/kosong/src/providers/google-genai.ts:446), while Gemini multi-turn history is expected to be a single ordered conversation with user/model turns. The first post-compaction request for Google-backed models can therefore be rejected or misinterpreted as a malformed turn sequence; merge adjacent user contents in projection or in that provider before sending.
…tead of rejecting
Manual/SDK compaction runs outside a turn, so the earlier guard rejected prompts/steers that arrived while it held the context. That broke three things: a REST/web prompt got stuck 'running' (no terminal turn event), a background-task/cron steer was silently lost (null was read as 'buffered' but nothing was), and a follow-up prompt could land in the window after isCompacting cleared but before reminders were reinjected.
Reuse the existing defer-and-replay model instead of rejecting:
- steer() and launch() buffer into steerBuffer while a compaction is in progress (returning null = buffered), mirroring how an active turn defers input.
- FullCompaction.compactionWorker keeps isCompacting true through refreshSystemPrompt + injectAfterCompaction (moving markCompleted and the completed event after reinjection), then replays the buffer via TurnFlow.onCompactionFinished: on success, on an A1 prefix/tail cancel, and on failure/abort.
- onCompactionFinished flushes into an active turn if one exists, else launches a fresh turn from the deferred input.
No PromptService change: a deferred prompt's eventual turn.started lets it associate the pending prompt and clear it on turn.ended.
💡 Codex Review
Reviewed commit: ecbe650fba
if (this.agent.fullCompaction.isCompacting) {
  this.steerBuffer.push({ input, origin });
  return null;
Remove aborted prompts from the compaction buffer
When a REST prompt is submitted while manual compaction is still running, this branch stores it only in steerBuffer and returns without a turn id. If the client aborts that prompt before compaction finishes, PromptService calls cancel with no turnId; TurnFlow.cancel has no active turn to abort and does not remove this buffered entry. onCompactionFinished() then launches or flushes the supposedly aborted input, so a prompt whose abort returned success can still execute after compaction completes.
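A buffer that supports the cancellation asked for here could be sketched as below. This is not the project's `steerBuffer`: the `DeferredInputBuffer` class and the `promptId` key are invented for illustration, on the assumption that an aborting caller can identify its prompt by some id even when no turn id exists yet.

```typescript
interface DeferredInput {
  /** Hypothetical identifier that lets a caller cancel before any turn exists. */
  promptId: string;
  input: string;
}

/** Holds input that arrived during a compaction until it can be replayed. */
class DeferredInputBuffer {
  private entries: DeferredInput[] = [];

  defer(entry: DeferredInput): void {
    this.entries.push(entry);
  }

  /** Removes a buffered prompt; returns true if something was actually removed. */
  cancel(promptId: string): boolean {
    const before = this.entries.length;
    this.entries = this.entries.filter((e) => e.promptId !== promptId);
    return this.entries.length < before;
  }

  /** Hands over everything still buffered, in arrival order, and empties the buffer. */
  drain(): DeferredInput[] {
    const drained = this.entries;
    this.entries = [];
    return drained;
  }
}
```

A cancel path with no active turn would call `cancel(promptId)` first, so the replay after compaction never sees the aborted entry.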
💡 Codex Review
Reviewed commit: 6755eeb54b
  this.steerBuffer.push({ input, origin });
  return null;
Make deferred prompts durable before accepting them
When a REST prompt arrives while manual compaction is running, prompt() has already written a turn.prompt record and PromptService returns a running prompt, but this branch keeps the input only in the in-memory steerBuffer. If the process exits or restarts before onCompactionFinished() replays that buffer, replay sees the turn.prompt record via restorePrompt() but never appends or relaunches the saved input, so the accepted user prompt disappears after resume; persist a deferred-prompt record that replay can drain, or don't accept the prompt until it can be launched.
Gemini/Vertex require strictly alternating user/model turns and reject consecutive user turns with HTTP 400. They arise after compaction (kept prompts + user-role summary + injected reminders) and when a turn is steered in right after a tool result. Anthropic already merged them inline; the Google converter did not, so post-compaction requests failed. Extract the asymmetric merge into a shared mergeConsecutiveUserMessages helper applied at each strict provider's conversion boundary: refactor Anthropic to use it (behavior unchanged) and apply it at the Google converter's exit. A conformance suite drives every strict provider with the post-compaction shape and a steer-after-tool-result shape, asserting no consecutive same-role turns reach the wire, so a new strict provider cannot silently omit the merge. The provider-agnostic projector stays structure-preserving: lenient providers (OpenAI/Kimi) keep distinct turns for clearer message boundaries; only strict providers normalize, where the requirement lives.
No description provided.