
feat(backend): resume protocol with missed ephemeral-event replay #278

Open
Olorunfemi20 wants to merge 1 commit into codebestia:main from Olorunfemi20:feat/resume-missed-event-replay

Conversation

@Olorunfemi20

Contributor

Summary

Adds a resume protocol so a reconnecting device can recover the lightweight, non-durable events it missed while offline, then fall back to a full envelope sync for durable messages.

Ephemeral events (read/delivery receipts, presence changes, system notices) are appended to a short-lived Redis stream as they are emitted live. On reconnect the client sends resume { lastEventId }; the gateway replays everything recorded after that id and then tells the client to run a message/envelope sync. Durable chat messages live in Postgres and are deliberately never written to the stream, keeping the resume path cheap and bounded.

closes #200

Design

  • services/resumeStream.ts owns the stream. A per-user Redis stream (resume:events:<userId>) is trimmed with MAXLEN ~ 500 and expires after 300s of inactivity, so it stays small and self-cleaning. Redis stream ids are monotonic and unique, which makes them the natural lastEventId clients persist.
  • Reads use an exclusive lower bound, written (<lastEventId> in Redis range syntax, so re-issuing resume with an advanced cursor never re-delivers an event the client already saw. Combined with stable per-event ids that clients can dedupe on, this makes replay idempotent.
  • Per-device semantics are achieved with a per-device cursor over the per-user stream: each device tracks its own lastEventId and resumes independently, without the server having to fan ephemeral events into a separate stream per device. This avoids extra bookkeeping while delivering exactly the "each device resumes from its own position" behaviour.
  • Ephemeral vs durable is enforced by construction: only receipts/presence/system events are ever recorded; messages are recovered through sync. resume_complete always carries syncRequired: true.
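
The design points above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the code in services/resumeStream.ts: InMemoryStreams, append, readAfter and compareIds are hypothetical names, the timestamp is passed in explicitly (Redis generates ids itself), trimming is exact rather than approximate, and the 300s TTL is not modelled. Only the key format, the 500-entry cap and the exclusive-cursor read come from the description.

```typescript
type StreamEntry = { id: string; type: string; data: unknown };

const MAX_LEN = 500; // mirrors MAXLEN ~ 500; this model trims exactly

// Key format taken from the PR description.
function eventStreamKey(userId: string): string {
  return `resume:events:${userId}`;
}

// Split a "<ms>-<seq>" stream id into its numeric parts.
function parseId(id: string): [number, number] {
  const parts = id.split("-");
  return [Number(parts[0]), Number(parts[1])];
}

// Order ids the way Redis does: by millisecond part, then sequence part.
function compareIds(a: string, b: string): number {
  const [aMs, aSeq] = parseId(a);
  const [bMs, bSeq] = parseId(b);
  return aMs !== bMs ? aMs - bMs : aSeq - bSeq;
}

// Hypothetical stand-in for the per-user Redis stream.
class InMemoryStreams {
  private streams = new Map<string, StreamEntry[]>();

  // Append one ephemeral event and return its monotonic id.
  append(userId: string, nowMs: number, type: string, data: unknown): string {
    const key = eventStreamKey(userId);
    const entries = this.streams.get(key) ?? [];
    const last = entries[entries.length - 1];
    let ms = nowMs;
    let seq = 0;
    if (last) {
      const [lastMs, lastSeq] = parseId(last.id);
      if (lastMs >= nowMs) {
        // Same (or earlier) clock reading: bump the sequence so ids stay unique.
        ms = lastMs;
        seq = lastSeq + 1;
      }
    }
    const id = `${ms}-${seq}`;
    entries.push({ id, type, data });
    if (entries.length > MAX_LEN) entries.splice(0, entries.length - MAX_LEN);
    this.streams.set(key, entries);
    return id;
  }

  // Exclusive lower bound: only entries strictly after lastEventId are returned.
  readAfter(userId: string, lastEventId?: string): StreamEntry[] {
    const entries = this.streams.get(eventStreamKey(userId)) ?? [];
    if (lastEventId === undefined) return [...entries];
    return entries.filter((e) => compareIds(e.id, lastEventId) > 0);
  }
}

// Two devices of the same user resume from their own cursors.
const streams = new InMemoryStreams();
const first = streams.append("alice", 1000, "read_receipt", { messageId: "m1" });
streams.append("alice", 1000, "presence", { userId: "bob", online: true });
const phoneMissed = streams.readAfter("alice", first); // phone already saw `first`
const laptopMissed = streams.readAfter("alice"); // laptop has no cursor yet
```

Because the cursor lives on the device, the server keeps one stream per user and each device simply passes its own lastEventId.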

Changes

  • services/resumeStream.ts (new): recordEphemeralEvent, publishEphemeral (best-effort fan-out, no-op when Redis is down or the recipient list is empty), readMissedEvents (exclusive range + tolerant payload parsing), plus eventStreamKey and the TTL/MAXLEN constants.
  • socket/messaging.ts:
    • new resume handler — replays missed events as ephemeral_replay { id, type, data }, then emits resume_complete { lastEventId, syncRequired: true }. Falls back to a sync-only completion when Redis is unavailable.
    • message_read now also records the read_receipt to each member's resume stream (guarded so the member lookup is skipped entirely when Redis is unavailable).
  • index.ts: presence changes on connect/disconnect are recorded to co-members' resume streams via a small recordPresenceForCoMembers helper, so members offline at the moment of the change replay it on reconnect.
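
The resume handler's reply sequence can be sketched as a pure function that plans the frames to emit. This is an illustration only: planResumeReply, Frame and MissedEvent are hypothetical names, and two details are assumptions rather than facts from the PR: that a null result stands for "Redis unavailable", and that the sync-only completion echoes the client's own cursor.

```typescript
type MissedEvent = { id: string; type: string; data: unknown };

type Frame =
  | { event: "ephemeral_replay"; payload: MissedEvent }
  | {
      event: "resume_complete";
      payload: { lastEventId: string | null; syncRequired: true };
    };

// Plan the frames for one resume request.
// `missed` is null when the stream could not be read (assumed Redis-down signal).
function planResumeReply(
  missed: MissedEvent[] | null,
  clientCursor: string | null
): Frame[] {
  const frames: Frame[] = [];
  let cursor = clientCursor;
  for (const ev of missed ?? []) {
    frames.push({
      event: "ephemeral_replay",
      payload: { id: ev.id, type: ev.type, data: ev.data },
    });
    cursor = ev.id; // advance to the last replayed event
  }
  // Always finish by telling the client to run the envelope sync.
  frames.push({
    event: "resume_complete",
    payload: { lastEventId: cursor, syncRequired: true },
  });
  return frames;
}

// Example: two missed events, then the completion frame.
const exampleFrames = planResumeReply(
  [
    { id: "1000-0", type: "read_receipt", data: { messageId: "m1" } },
    { id: "1000-1", type: "presence", data: { userId: "bob", online: false } },
  ],
  "999-0"
);
```

A socket handler could then emit each frame in order, for example with a Socket.IO-style socket.emit(frame.event, frame.payload), which keeps the ordering logic testable without a live connection.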

Acceptance criteria

  • Reconnecting client receives missed ephemeral events: resume replays receipts and presence recorded while the device was offline.
  • Durable messages fetched via sync, not the resume stream: only ephemeral events are recorded; resume_complete instructs the client to run a full envelope sync.
  • Resume is idempotent: exclusive cursor reads plus stable per-event ids mean no duplicate UI events on repeated resumes.
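
On the client, the idempotency criterion could be backed by a cursor that ignores anything at or before the last id it has applied. The sketch below is hypothetical (DeviceCursor and accept are not names from this PR) and assumes ids have the standard Redis stream form <ms>-<seq>.

```typescript
// Split a "<ms>-<seq>" stream id into its numeric parts.
function splitId(id: string): [number, number] {
  const parts = id.split("-");
  return [Number(parts[0]), Number(parts[1])];
}

// True when id `a` sorts strictly after id `b`.
function isAfter(a: string, b: string): boolean {
  const [aMs, aSeq] = splitId(a);
  const [bMs, bSeq] = splitId(b);
  return aMs !== bMs ? aMs > bMs : aSeq > bSeq;
}

// Hypothetical per-device cursor that a client would persist between sessions.
class DeviceCursor {
  private lastEventId: string | null = null;

  // Returns true if the event is new and should reach the UI; advances the cursor.
  accept(eventId: string): boolean {
    if (this.lastEventId !== null && !isAfter(eventId, this.lastEventId)) {
      return false; // duplicate or stale replay, drop it
    }
    this.lastEventId = eventId;
    return true;
  }

  // Value to send as resume { lastEventId } on the next reconnect.
  get current(): string | null {
    return this.lastEventId;
  }
}

const cursor = new DeviceCursor();
const applied = ["1000-0", "1000-0", "999-5", "1000-1"].filter((id) =>
  cursor.accept(id)
);
```

Even if a resume were repeated with a stale cursor, a guard like this would keep the same event from being applied to the UI twice.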

Testing

  • src/__tests__/resumeStream.test.ts: stream key namespacing, capped append + TTL refresh + returned id, recipient de-duplication, no-op when Redis is null / recipients empty, full vs exclusive-cursor reads, and malformed-payload tolerance.
  • src/__tests__/resume.socket.test.ts: replay of missed events with sync signal, exclusive-cursor idempotency (nothing missed leaves the cursor unchanged), and missing-lastEventId handled as a full replay.
  • Full backend suite passes (140 tests), tsc --noEmit clean, ESLint clean (no new warnings), Prettier clean.
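
The malformed-payload tolerance listed above could look roughly like the following. It is a sketch with assumptions: the field name "payload", its JSON shape { type, data } and the helper name parseEntries are all hypothetical, and the raw entry shape [id, [field, value, ...]] is the form that clients such as ioredis return for stream range reads.

```typescript
type MissedEvent = { id: string; type: string; data: unknown };

// Raw stream entry: [id, [field1, value1, field2, value2, ...]]
type RawEntry = [string, string[]];

// Convert raw entries to events, silently skipping anything unparseable.
function parseEntries(raw: RawEntry[]): MissedEvent[] {
  const events: MissedEvent[] = [];
  for (const [id, fields] of raw) {
    // Walk field/value pairs looking for the assumed "payload" field.
    let json: string | undefined;
    for (let i = 0; i + 1 < fields.length; i += 2) {
      if (fields[i] === "payload") json = fields[i + 1];
    }
    if (json === undefined) continue; // entry has no payload field

    try {
      const parsed: unknown = JSON.parse(json);
      if (
        typeof parsed === "object" &&
        parsed !== null &&
        typeof (parsed as { type?: unknown }).type === "string"
      ) {
        const p = parsed as { type: string; data?: unknown };
        events.push({ id, type: p.type, data: p.data });
      }
    } catch {
      // Invalid JSON: drop this entry and keep replaying the rest.
    }
  }
  return events;
}

const rawEntries: RawEntry[] = [
  ["1-0", ["payload", '{"type":"presence","data":{"online":true}}']],
  ["2-0", ["payload", "{not json"]],
  ["3-0", ["other", "x"]],
  ["4-0", ["payload", '{"data":1}']],
];
const parsedEvents = parseEntries(rawEntries);
```

Skipping a bad entry instead of throwing means one corrupt record cannot block the replay of the events that follow it.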

@drips-wave

drips-wave Bot commented Jun 28, 2026


@Olorunfemi20 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Development

Successfully merging this pull request may close these issues.

Resume protocol + missed-event replay