fix(web): cap live session subscriptions to reduce lag with many sessions#1320

Merged
wbxl2000 merged 4 commits into main from web/ws-subscription-cap
Jul 2, 2026

@wbxl2000 wbxl2000 commented Jul 2, 2026

Collaborator

Related Issue

No linked issue — this comes from a user report that the web UI gets sluggish once they have a hundred or so sessions.

Problem

Every session the user opens subscribes to its WebSocket event stream, and the socket keeps all subscriptions across reconnects (re-sending them in client_hello). Subscriptions were only dropped on archive / delete, never on session switch. After opening many sessions, every background session's status / meta / usage event flows through the reducer and dirties the sidebar computeds, so the whole UI slows down as the session count grows.

What changed

Cap the number of live WebSocket session subscriptions with a small MRU list (4). The active session is always retained; when the cap is exceeded, the least-recently-opened subscription is evicted. Eviction only drops the live subscription — the per-session seq / epoch cursor is kept, so re-opening an evicted session resumes from the cursor (the daemon replays missed durable events, or answers resync_required). Reconnects also get cheaper because client_hello only carries the retained subscriptions.

Trade-off: a background session that has fallen out of the 4-slot window no longer lights up its unread dot / completion notification in real time; it syncs when the user switches back to it.
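The cap described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `createSubscriptionCap`, `retain`, `retained`, and the `unsubscribe` callback are hypothetical names. Only the cap of 4, the always-retained active session, and the kept cursor come from the description.

```typescript
// Hypothetical sketch of an MRU-capped subscription list; names are illustrative.
const MAX_LIVE_SUBSCRIPTIONS = 4; // cap stated in the PR description

function createSubscriptionCap(unsubscribe: (sessionId: string) => void) {
  // Most-recently-opened session id first.
  let order: string[] = [];
  // Per-session cursor; deliberately NOT cleared when a session is evicted.
  const lastSeqBySession: Record<string, number> = {};

  // Mark a session as most recently opened, then evict anything over the cap.
  function retain(sessionId: string, activeSessionId: string): void {
    order = [sessionId, ...order.filter((id) => id !== sessionId)];
    while (order.length > MAX_LIVE_SUBSCRIPTIONS) {
      // Scan from the least-recently-opened end, never picking the active session.
      let victimIdx = -1;
      for (let i = order.length - 1; i >= 0; i--) {
        if (order[i] !== activeSessionId) {
          victimIdx = i;
          break;
        }
      }
      if (victimIdx === -1) break;
      const victim = order.splice(victimIdx, 1)[0];
      if (victim === undefined) break;
      unsubscribe(victim); // drops only the live subscription
    }
  }

  // The ids a reconnect handshake would need to re-send.
  return { retain, retained: () => [...order], lastSeqBySession };
}
```

In this sketch, opening a fifth session calls `unsubscribe` for the least-recently-opened id, while its entry in `lastSeqBySession` is left untouched.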

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset.
  • Ran gen-docs skill, or this PR needs no doc update.

…ions

Every opened session stayed subscribed to its WebSocket event stream across reconnects, so opening hundreds of sessions turned background events into a constant reducer and sidebar recompute storm. Keep only the four most-recently-opened sessions subscribed; evicted sessions resume from their tracked cursor on re-open.
@changeset-bot

changeset-bot Bot commented Jul 2, 2026

🦋 Changeset detected

Latest commit: a12b756

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @moonshot-ai/kimi-code | Patch |

@pkg-pr-new

pkg-pr-new Bot commented Jul 2, 2026

pnpm dlx https://pkg.pr.new/@moonshot-ai/kimi-code@a12b756
npx https://pkg.pr.new/@moonshot-ai/kimi-code@a12b756

commit: a12b756

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 53b633790b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

const tail = wsSubscriptionOrder.at(-1);
if (tail === undefined || tail === rawState.activeSessionId) break;
wsSubscriptionOrder.pop();
eventConn?.unsubscribe(tail);

P1 Badge Keep evicted cursors from skipping missed events

When this unsubscribes an older session, that session can still receive global session events (for example event.session.status_changed / session.meta.updated are broadcast to all connections), and reduceAppEvent advances lastSeqBySession for every delivered event. If an evicted session emits per-session durable events and then a global status/meta event, the UI cursor jumps past the missed events; subscribeToSessionEvents() later resumes from that advanced cursor, so the daemon will not replay the skipped transcript/task updates. This leaves reopened sessions stale unless eviction forces a snapshot or global events stop advancing cursors for sessions that are no longer subscribed.

Some session events (status_changed, meta_updated, ...) are broadcast to every connection and still advance lastSeqBySession for an unsubscribed session. If an evicted session emits per-session durable events and then a global event, the cursor jumps past the missed events, so resuming from it later would skip them and leave the reopened session stale. Track evicted sessions and reset their cursor on the next re-subscribe so the daemon replays or snapshots what was missed.
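The failure mode and the reset described in this commit can be illustrated with a small simulation. It assumes a single sequence counter per session, and `SeqEvent`, `deliver`, `route`, `evictedSessions`, and `cursorForResubscribe` are hypothetical stand-ins rather than the real reducer or connection code.

```typescript
// Hypothetical model of the cursor-skip problem; all names are illustrative.
type SeqEvent = { sessionId: string; seq: number; global: boolean };

const cursorBySession: Record<string, number> = {};
const evictedSessions: string[] = []; // sessions whose cursor may be stale

// Stand-in for the reducer: every delivered event advances the cursor.
function deliver(ev: SeqEvent): void {
  cursorBySession[ev.sessionId] = ev.seq;
}

// Global events reach every connection; others need a live subscription.
function route(ev: SeqEvent, subscribed: string[]): boolean {
  const delivered = ev.global || subscribed.indexOf(ev.sessionId) !== -1;
  if (delivered) deliver(ev);
  return delivered;
}

// Cursor to send on re-subscribe; undefined means "no trusted cursor".
function cursorForResubscribe(sessionId: string): number | undefined {
  const idx = evictedSessions.indexOf(sessionId);
  if (idx !== -1) {
    evictedSessions.splice(idx, 1);
    delete cursorBySession[sessionId]; // reset instead of trusting it
    return undefined;
  }
  return cursorBySession[sessionId];
}

// Session "a" is live, gets evicted, then emits two more events.
route({ sessionId: "a", seq: 1, global: false }, ["a"]);
evictedSessions.push("a");
const perSessionDelivered = route({ sessionId: "a", seq: 2, global: false }, []);
route({ sessionId: "a", seq: 3, global: true }, []);
const cursorWhileEvicted = cursorBySession["a"]; // 3, although seq 2 never arrived
const resumeFrom = cursorForResubscribe("a"); // undefined: the cursor was reset
```

Without the `evictedSessions` check, the re-subscribe would resume from 3 and the event with seq 2 would never be requested.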
@wbxl2000

wbxl2000 commented Jul 2, 2026

Collaborator Author

@codex

1 similar comment
@wbxl2000

wbxl2000 commented Jul 2, 2026

Collaborator Author

@codex

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8142210a6d

Comment on lines +1248 to +1250
if (sessionsWithStaleCursor.delete(sessionId)) {
delete rawState.lastSeqBySession[sessionId];
delete epochBySession[sessionId];

P1 Badge Refresh evicted sessions via snapshot instead of seq 0 replay

When an evicted session is reopened, deleting its cursor makes subscribe() send {seq: 0}. For any short/normal session whose journal gap is within the server replay cap, getBufferedSince replays all durable events instead of returning resync_required; the raw projector generates fresh assistant/tool message ids for those replayed turns, so an already-loaded transcript gains duplicate/empty historical messages just by opening a fifth session and returning. Force a snapshot/resync for stale cursors, or keep a safe cursor, instead of resuming from zero.

Comment on lines +1222 to +1223
const tail = wsSubscriptionOrder.at(-1);
if (tail === undefined || tail === rawState.activeSessionId) break;

P2 Badge Continue evicting when the active session is last

The active session is not guaranteed to be at the front: first-time opens call retainWsSubscription() only after syncSessionFromSnapshot() awaits, so rapid clicks can complete out of order. If the current active session completed first and sits at the tail, this break leaves every older completion subscribed and the array can grow far beyond 4, recreating the lag for the multi-session/slow-snapshot case this cap targets. Skip over the active id or move it to the front before enforcing the cap.

Two fixes for the subscription cap:

- Re-opening a session that was evicted now rebuilds it from a snapshot instead of resuming from seq 0. Replaying from zero made the projector regenerate assistant/tool message ids, which duplicated the already-loaded transcript; resuming from the kept cursor could skip per-session events that arrived while unsubscribed.

- Eviction now skips the active session wherever it sits in the list, instead of breaking when it lands at the tail. First-time opens retain only after an awaited snapshot, so rapid clicks can complete out of order and leave the active session at the tail, which previously let the list grow past the cap.
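The difference between the two eviction strategies in the second fix can be shown side by side. This is a sketch under assumptions: `pickVictimBreakAtTail`, `pickVictimSkipActive`, and `enforceCap` are hypothetical helpers operating on a plain array, not the functions in the diff.

```typescript
// Earlier shape: give up as soon as the tail is the active session.
function pickVictimBreakAtTail(order: string[], activeId: string): number {
  const tailIdx = order.length - 1;
  if (tailIdx < 0 || order[tailIdx] === activeId) return -1;
  return tailIdx;
}

// Fixed shape: scan from the tail and skip the active session wherever it sits.
function pickVictimSkipActive(order: string[], activeId: string): number {
  for (let i = order.length - 1; i >= 0; i--) {
    if (order[i] !== activeId) return i;
  }
  return -1;
}

// Apply a victim-picking strategy until the list fits the cap (or nothing can go).
function enforceCap(
  order: string[],
  activeId: string,
  cap: number,
  pick: (order: string[], activeId: string) => number,
): string[] {
  const kept = [...order];
  while (kept.length > cap) {
    const idx = pick(kept, activeId);
    if (idx === -1) break;
    kept.splice(idx, 1);
  }
  return kept;
}

// Out-of-order completion left the active session "a" at the tail.
const outOfOrder = ["e", "d", "c", "b", "a"];
const withBreak = enforceCap(outOfOrder, "a", 4, pickVictimBreakAtTail);
const withSkip = enforceCap(outOfOrder, "a", 4, pickVictimSkipActive);
```

With the active session at the tail, the break-at-tail strategy leaves all five entries in place, while the skip-active strategy removes the oldest non-active entry and brings the list back to four.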
@wbxl2000

wbxl2000 commented Jul 2, 2026

Collaborator Author

@codex

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6b58c69f99

Comment on lines +1268 to +1269
if (sessionsWithStaleCursor.delete(sessionId)) {
return syncSessionFromSnapshot(sessionId);

P2 Badge Preserve stale cursor marker until snapshot succeeds

When an evicted session is reopened and getSessionSnapshot fails transiently, this delete consumes the only marker that forces a safe rebuild. syncSessionFromSnapshot returns 'failed' for non-404 errors without subscribing, so the now-active session can be left without live events; after switching away and back it will resume from rawState.lastSeqBySession, which this patch notes may have jumped past missed per-session events while evicted. Keep the marker until the snapshot returns 'ok' (or otherwise fall back to a safe resync path) so a temporary snapshot failure cannot permanently skip events.

Re-opening an evicted session deleted the stale-cursor marker before the snapshot ran. If the snapshot failed transiently, the marker was gone and a later re-open would fall back to subscribeToSessionEvents, resuming from a cursor that may have skipped per-session events while evicted. Read the marker instead of deleting it, and let syncSessionFromSnapshot clear it once the snapshot succeeds.
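A rough sketch of the read-then-clear ordering described in this commit. The names `staleMarkers`, `syncFromSnapshot`, `reopenSession`, and the injected `fetchSnapshot` are assumptions made for illustration. The real `syncSessionFromSnapshot` and `sessionsWithStaleCursor` may differ in shape.

```typescript
type SnapshotResult = "ok" | "failed";

// Hypothetical stand-in for the stale-cursor marker set.
const staleMarkers: Record<string, true> = {};

// Hypothetical snapshot sync: clears the marker only once the snapshot succeeds.
async function syncFromSnapshot(
  sessionId: string,
  fetchSnapshot: (id: string) => Promise<SnapshotResult>,
): Promise<SnapshotResult> {
  const result = await fetchSnapshot(sessionId);
  if (result === "ok") delete staleMarkers[sessionId];
  return result;
}

// Decide how a re-opened session catches up.
async function reopenSession(
  sessionId: string,
  fetchSnapshot: (id: string) => Promise<SnapshotResult>,
): Promise<"snapshot" | "cursor" | "failed"> {
  if (staleMarkers[sessionId]) {
    // Read the marker without deleting it, so a failed snapshot keeps it in place.
    const result = await syncFromSnapshot(sessionId, fetchSnapshot);
    return result === "ok" ? "snapshot" : "failed";
  }
  return "cursor"; // cursor is trusted, resume from it
}
```

In this sketch a transient snapshot failure leaves the marker set, so the next re-open attempts the snapshot again instead of falling back to the cursor.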
@wbxl2000

wbxl2000 commented Jul 2, 2026

Collaborator Author

@codex

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a12b7568f1

if (victimIdx === -1) break;
const [victim] = wsSubscriptionOrder.splice(victimIdx, 1);
if (victim === undefined) break;
eventConn?.unsubscribe(victim);

P2 Badge Keep blocking request events for evicted sessions

When a user has opened more than four sessions, this removes the LRU session from the server-side subscription entirely. The server only broadcasts isGlobalSessionEvent frames to all connections (packages/server/src/services/gateway/wsBroadcastService.ts:112-114), and approval/question request frames are not in that global set (packages/server/src/services/gateway/wsBroadcastService.ts:239-256), so an evicted background session that later reaches event.question.requested or event.approval.requested will not populate pending badges or trigger onQuestionRequested notifications/sounds until the user manually reopens it. That can leave an agent blocked without the existing attention signal in exactly the many-session scenario this cap targets.

@wbxl2000 wbxl2000 merged commit 444e6b1 into main Jul 2, 2026
9 checks passed
@wbxl2000 wbxl2000 deleted the web/ws-subscription-cap branch July 2, 2026 13:48
@github-actions github-actions Bot mentioned this pull request Jul 2, 2026