Skip to content

feat(ui): smooth + cheap agent streaming (adaptive reveal + block-split markdown)#2716

Open
KarloAldrete wants to merge 3 commits into
PostHog:mainfrom
KarloAldrete:perf/streaming-markdown-blocks
Open

feat(ui): smooth + cheap agent streaming (adaptive reveal + block-split markdown)#2716
KarloAldrete wants to merge 3 commits into
PostHog:mainfrom
KarloAldrete:perf/streaming-markdown-blocks

Conversation

@KarloAldrete

@KarloAldrete KarloAldrete commented Jun 16, 2026

Copy link
Copy Markdown

Problem

Streamed agent replies have two issues:

  1. Choppy — tokens arrive in irregular bursts and are painted exactly as they land, so the text jumps forward unevenly instead of reading as smooth typing.
  2. Slow on long messagesAgentMessageMarkdownRenderer re-parses the entire accumulated markdown (react-markdown + remark-gfm) on every token: O(n) per token, O(n²) per message. Past ~8k chars a single token costs ~19 ms of parsing, over the 16.6 ms frame budget, so long answers stutter and saturate the main thread.

Addresses #2517 (smooth token streaming).

Changes

Two pieces that reinforce each other.

1. Block-split rendering (StreamingMarkdown + splitMarkdownBlocks) — the active message is split into top-level blocks (fenced code kept intact). Completed blocks keep a stable string, so the memoized MarkdownRenderer skips them and only the growing tail re-parses. An open code fence renders as plain text until it closes, so syntax highlighting runs once, not per token.

Measured on a 3,000-char answer streamed in ~250 tokens (jsdom + react-dom/server, default components — a lower bound):

total markdown CPU per token last token
full re-parse 878 ms 3.5 ms 6.8 ms
block-split 88 ms 0.35 ms 0.16 ms

~10× less, and the per-token cost stays flat instead of growing with length.

2. Adaptive reveal (useSmoothedText) — eases the displayed prefix toward the accumulated text via requestAnimationFrame, with a step proportional to the backlog so it tracks the token rate (catching up in ~6 frames) rather than a fixed chars/sec. Reveals on word boundaries so partial markdown tokens never flash. Completed messages show in full immediately.

The two are complementary: the adaptive reveal makes streaming look smooth, and the block-split keeps each frame cheap so it stays ~60 fps even on long messages — which a reveal over a full per-frame re-parse can't hold. Completed messages render via a single full MarkdownRenderer parse, so the final output is byte-identical.

Relationship to #2685

#2685 also implements smooth token streaming (a fixed ~120 chars/sec reveal). This PR overlaps on that goal but takes a different approach — an adaptive reveal rate (no lag-then-snap on fast streams) plus the block-split renderer so the reveal holds up on long messages. Flagging the overlap so it's not a surprise — happy to coordinate or fold these together however the team prefers.

How did you test this?

  • Unit tests: splitMarkdownBlocks.test.ts (round-trip / no text dropped, paragraph + fenced-block splitting, parseOpenFence incl. the completed-then-open-fence edge case) and useSmoothedText.test.ts (immediate on mount, gradual reveal, convergence, prefix-only).
  • pnpm --filter @posthog/ui test → 763 pass.
  • pnpm --filter @posthog/ui typecheck passes; pre-commit pnpm typecheck passes.
  • biome check clean on changed files; node scripts/check-host-boundaries.mjs reports no new violations.
  • Block-split numbers from a local micro-benchmark of the markdown render path (not committed).

Automatic notifications

  • Publish to changelog?
  • Alert Sales and Marketing teams?

Created with PostHog Code

Each streamed token re-parsed the entire accumulated markdown of the active
agent message (react-markdown + remark-gfm) — O(n) per token, O(n^2) per
message. Past ~8k chars a single token costs ~19ms of parsing, over the 16.6ms
frame budget, so long answers stutter and saturate the main thread.

StreamingMarkdown splits the active message into top-level blocks (fenced code
kept intact). Completed blocks keep a stable string so the memoized
MarkdownRenderer skips them and only the growing tail re-parses; an open code
fence renders as plain text until it closes, so syntax highlighting runs once
rather than per token. On a 3k-char answer this is ~10x less markdown CPU
(878ms -> 88ms total; 3.5ms -> 0.35ms per token) and the per-token cost stays
flat instead of growing with message length. Completed messages still render
via a single full MarkdownRenderer parse, so output is byte-identical — no
visual change, only less work per token.

Complements PostHog#2685 (smooth token reveal): that PR eases when text appears, this
one keeps rendering it cheap so the smooth reveal stays at 60fps even on long
messages.

Generated-By: PostHog Code
Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca
@greptile-apps

greptile-apps Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
packages/ui/src/features/editor/components/StreamingMarkdown.tsx:16-22
**`parseOpenFence` targets the first fence, not the unterminated one**

`/.exec(block)` finds the first fence marker in the block. If the tail block contains a completed fence followed by an open fence with no blank line between them — e.g. `` ```ts\ncode\n```\ntext\n```ts\npartial `` (no blank lines, so `splitMarkdownBlocks` keeps it as one block) — `before` becomes `""` and `code` absorbs everything from the first fence opener onward, including the intermediate closing fence and second opening marker. The result is garbled plain-text output during streaming (it self-corrects once the fence closes and `MarkdownRenderer` takes over, but the in-progress display is wrong).

A fix is to scan all lines with the same fence-tracking logic as `hasOpenCodeFence`, record the last fence-open line index, then slice `before` and `code` relative to that line.

### Issue 2 of 3
packages/ui/src/features/editor/components/splitMarkdownBlocks.ts:59-74
**Fence-tracking logic duplicated between the two exported functions**

`hasOpenCodeFence` re-implements the identical per-line regex match + `inFence`/`fenceChar` state machine that lives inside `splitMarkdownBlocks`. Extracting a shared helper (e.g. `parseFenceState(lines)` that returns `{ inFence, fenceChar }`) would satisfy the codebase's OnceAndOnlyOnce rule and make future spec-compliance fixes (e.g. requiring the closing fence to be at least as long as the opening fence) apply to both functions automatically.

### Issue 3 of 3
packages/ui/src/features/editor/components/splitMarkdownBlocks.test.ts:5-44
**Tests prefer parameterised style over manual loops and multi-assertion cases**

The "never drops text" case iterates over samples with a `for` loop, and the `hasOpenCodeFence` suite packs four independent assertions into one `it`. Vitest's `it.each` / `test.each` makes each sample a separate, named test entry — failures name the exact input and expected output, which saves time when something breaks. Converting both suites to parameterised form would align with the project's stated preference.

Reviews (1): Last reviewed commit: "perf(markdown): re-parse only the tail b..." | Re-trigger Greptile

Comment on lines +16 to +22
const m = /(^|\n) {0,3}(`{3,}|~{3,})/.exec(block);
if (!m) return { before: "", code: block };
const fenceLineStart = m.index + (block[m.index] === "\n" ? 1 : 0);
const before = block.slice(0, fenceLineStart);
const afterMarker = block.indexOf("\n", fenceLineStart);
const code = afterMarker === -1 ? "" : block.slice(afterMarker + 1);
return { before, code };

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 parseOpenFence targets the first fence, not the unterminated one

/.exec(block) finds the first fence marker in the block. If the tail block contains a completed fence followed by an open fence with no blank line between them — e.g. ```ts\ncode\n```\ntext\n```ts\npartial (no blank lines, so splitMarkdownBlocks keeps it as one block) — before becomes "" and code absorbs everything from the first fence opener onward, including the intermediate closing fence and second opening marker. The result is garbled plain-text output during streaming (it self-corrects once the fence closes and MarkdownRenderer takes over, but the in-progress display is wrong).

A fix is to scan all lines with the same fence-tracking logic as hasOpenCodeFence, record the last fence-open line index, then slice before and code relative to that line.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/ui/src/features/editor/components/StreamingMarkdown.tsx
Line: 16-22

Comment:
**`parseOpenFence` targets the first fence, not the unterminated one**

`/.exec(block)` finds the first fence marker in the block. If the tail block contains a completed fence followed by an open fence with no blank line between them — e.g. `` ```ts\ncode\n```\ntext\n```ts\npartial `` (no blank lines, so `splitMarkdownBlocks` keeps it as one block) — `before` becomes `""` and `code` absorbs everything from the first fence opener onward, including the intermediate closing fence and second opening marker. The result is garbled plain-text output during streaming (it self-corrects once the fence closes and `MarkdownRenderer` takes over, but the in-progress display is wrong).

A fix is to scan all lines with the same fence-tracking logic as `hasOpenCodeFence`, record the last fence-open line index, then slice `before` and `code` relative to that line.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +59 to +74
export function hasOpenCodeFence(src: string): boolean {
let inFence = false;
let fenceChar = "";
for (const line of src.split("\n")) {
const trimmed = line.replace(/^ {0,3}/, "");
const fence = /^(`{3,}|~{3,})/.exec(trimmed);
if (!fence) continue;
if (!inFence) {
inFence = true;
fenceChar = fence[1][0];
} else if (trimmed[0] === fenceChar) {
inFence = false;
}
}
return inFence;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Fence-tracking logic duplicated between the two exported functions

hasOpenCodeFence re-implements the identical per-line regex match + inFence/fenceChar state machine that lives inside splitMarkdownBlocks. Extracting a shared helper (e.g. parseFenceState(lines) that returns { inFence, fenceChar }) would satisfy the codebase's OnceAndOnlyOnce rule and make future spec-compliance fixes (e.g. requiring the closing fence to be at least as long as the opening fence) apply to both functions automatically.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/ui/src/features/editor/components/splitMarkdownBlocks.ts
Line: 59-74

Comment:
**Fence-tracking logic duplicated between the two exported functions**

`hasOpenCodeFence` re-implements the identical per-line regex match + `inFence`/`fenceChar` state machine that lives inside `splitMarkdownBlocks`. Extracting a shared helper (e.g. `parseFenceState(lines)` that returns `{ inFence, fenceChar }`) would satisfy the codebase's OnceAndOnlyOnce rule and make future spec-compliance fixes (e.g. requiring the closing fence to be at least as long as the opening fence) apply to both functions automatically.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +5 to +44
it("never drops text — joining the blocks reproduces the input", () => {
const samples = [
"",
"single line",
"para one\n\npara two\n\npara three",
"# Heading\n\nText with **bold**.\n\n- a\n- b\n",
"Intro\n\n```ts\nconst x = 1;\nconst y = 2;\n```\n\nOutro",
"trailing blanks\n\n\n\n",
];
for (const s of samples) {
expect(splitMarkdownBlocks(s).join("")).toBe(s);
}
});

it("splits paragraphs at blank lines", () => {
expect(splitMarkdownBlocks("a\n\nb\n\nc")).toEqual(["a\n\n", "b\n\n", "c"]);
});

it("keeps a fenced code block (with blank lines inside) as one block", () => {
const md = "```\nline1\n\nline2\n```\n\nafter";
const blocks = splitMarkdownBlocks(md);
expect(blocks[0]).toBe("```\nline1\n\nline2\n```\n\n");
expect(blocks[1]).toBe("after");
});

it("does not split inside an unterminated fence (the tail stays whole)", () => {
const md = "intro\n\n```ts\nconst a = 1;\n\nconst b = 2;";
const blocks = splitMarkdownBlocks(md);
expect(blocks[blocks.length - 1]).toContain("const b = 2;");
expect(blocks.join("")).toBe(md);
});
});

describe("hasOpenCodeFence", () => {
it("is true while a fence is open and false once it closes", () => {
expect(hasOpenCodeFence("```ts\nconst a = 1;")).toBe(true);
expect(hasOpenCodeFence("```ts\nconst a = 1;\n```")).toBe(false);
expect(hasOpenCodeFence("no code here")).toBe(false);
expect(hasOpenCodeFence("text\n\n```\npartial")).toBe(true);
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Tests prefer parameterised style over manual loops and multi-assertion cases

The "never drops text" case iterates over samples with a for loop, and the hasOpenCodeFence suite packs four independent assertions into one it. Vitest's it.each / test.each makes each sample a separate, named test entry — failures name the exact input and expected output, which saves time when something breaks. Converting both suites to parameterised form would align with the project's stated preference.

Context Used: Do not attempt to comment on incorrect alphabetica... (source)

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/ui/src/features/editor/components/splitMarkdownBlocks.test.ts
Line: 5-44

Comment:
**Tests prefer parameterised style over manual loops and multi-assertion cases**

The "never drops text" case iterates over samples with a `for` loop, and the `hasOpenCodeFence` suite packs four independent assertions into one `it`. Vitest's `it.each` / `test.each` makes each sample a separate, named test entry — failures name the exact input and expected output, which saves time when something breaks. Converting both suites to parameterised form would align with the project's stated preference.

**Context Used:** Do not attempt to comment on incorrect alphabetica... ([source](https://app.greptile.com/review/custom-context?memory=instruction-0))

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

…fence, it.each

- Extract a single stepFence state machine shared by splitMarkdownBlocks,
  hasOpenCodeFence and parseOpenFence (OnceAndOnlyOnce — no duplicated
  fence-tracking logic).
- parseOpenFence now targets the LAST unterminated fence, so a completed fence
  earlier in the same block stays in `before` and renders normally instead of
  being swallowed as plain text mid-stream.
- Parameterize the splitter tests with it.each and add parseOpenFence cases,
  including the completed-then-open fence edge case.

Generated-By: PostHog Code
Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca
@KarloAldrete

Copy link
Copy Markdown
Author

Thanks for the review! Addressed all three in 3f6bc14:

  1. parseOpenFence targeting the wrong fence — it now walks the block tracking fence state and targets the last unterminated fence, so a completed fence earlier in the same block stays in before and renders normally instead of being swallowed as plain text mid-stream. Added a test for the completed-then-open-fence case.
  2. Duplicated fence logic — extracted a single stepFence state machine shared by splitMarkdownBlocks, hasOpenCodeFence, and parseOpenFence.
  3. Tests — parameterized the splitter suite with it.each and split the hasOpenCodeFence assertions into individual cases.

pnpm --filter @posthog/ui test → 762 pass; typecheck + biome clean.

Smooth the reveal of streamed tokens on top of the block-split renderer.
useSmoothedText eases the displayed prefix toward the accumulated text via
requestAnimationFrame with a step proportional to the backlog, so it tracks a
fast token stream (catching up in ~6 frames) instead of a fixed chars/sec that
lags behind and then snaps. It reveals on word boundaries so partial markdown
tokens never flash, and shows completed (non-streaming) messages in full
immediately.

Pairs with the block-split renderer: only the growing tail re-parses each
frame, so the reveal holds ~60fps even on long messages — which a fixed-rate
reveal over a full markdown re-parse can't.

Generated-By: PostHog Code
Task-Id: 8e9f327d-84b1-4608-9f48-c3038dbf87ca
@KarloAldrete KarloAldrete changed the title perf(markdown): re-parse only the tail block while streaming feat(ui): smooth + cheap agent streaming (adaptive reveal + block-split markdown) Jun 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant