ai-bot: fulfill readRealmFile via Matrix attachments on the host-command path#5369

Merged
jurgenwerk merged 2 commits into cs-11554-loadskill-tool-in-ai-bot-body-references from cs-11554-readrealmfile-matrix-attachment on Jun 30, 2026

Conversation

@lukemelia

Contributor

Stacked on #5344. Base is that PR's branch, so the diff here is only the delta. Review/merge #5344 first.

What this explores

An alternative architecture for the readRealmFile feature: instead of ai-bot running reads in an inline, same-turn loop and feeding content to the model in-process, the read content lives in Matrix as an attachment, and the continuation rides the existing host-command result path. This is the "content in Matrix, attachment variant" we wanted to evaluate against the same-turn approach in #5344.

How it works now

  1. The model calls readRealmFile. It surfaces as a normal command request on the bot's message, tagged executedBy: 'ai-bot' (the host records it but never runs it; those guards from #5344 are unchanged).
  2. After the answer streams, ai-bot fetches each file, uploads it to the Matrix media repo, and posts a command-result event carrying the file as data.attachedFiles (success) or an invalid result with a reason (failure).
  3. That result event re-enters the handler (a narrow exception lets the bot's own command-result events through; everything else it posts is still ignored). getShouldRespond then waits until every request — reads and any host commands — has a result, exactly as it already does for host commands.
  4. On the next turn, the existing reconstruction (toResultMessages → buildAttachmentsMessagePart) downloads the attachment and feeds its content to the model, the same path the host readFile uses.
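
Step 2 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the type names, field names (`commandRequestId`, `failureReason`, `mediaUrl`) and the `buildReadResult` helper are all assumptions; only the `applied` / `invalid` statuses and `data.attachedFiles` come from the description above.

```typescript
// Hypothetical shapes for illustration; the real event schema is not shown here.
interface ReadRequest {
  commandRequestId: string;
  url: string; // realm file the model asked to read
}

interface AttachedFile {
  sourceUrl: string; // where the file lives in the realm
  mediaUrl: string; // where the uploaded copy lives in the Matrix media repo
  name: string;
}

// Outcome of fetching + uploading one file, before it is turned into an event.
type ReadOutcome =
  | { ok: true; file: AttachedFile }
  | { ok: false; reason: string };

interface CommandResultContent {
  commandRequestId: string;
  status: "applied" | "invalid";
  data?: { attachedFiles: AttachedFile[] };
  failureReason?: string;
}

// Build the command-result content for one read: the file rides along as an
// attachment on success; a failure carries only a reason.
function buildReadResult(
  request: ReadRequest,
  outcome: ReadOutcome,
): CommandResultContent {
  if (outcome.ok) {
    return {
      commandRequestId: request.commandRequestId,
      status: "applied",
      data: { attachedFiles: [outcome.file] },
    };
  }
  return {
    commandRequestId: request.commandRequestId,
    status: "invalid",
    failureReason: outcome.reason,
  };
}
```

Keeping the result a plain data object means the next-turn reconstruction only has to look at `data.attachedFiles`, regardless of whether the host or the bot produced the result.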

What this deletes

Because reads now resolve next-turn like host commands, the two timing models collapse into one. Gone:

  • the inline for(;;) generation loop in main.ts and messagesOverride;
  • the mixed-round rejection (readRealmFile can now coexist with host commands in one answer);
  • the "thinking"-event indicator rotation (beginCommandResultIndicator / sendCommandResultIndicator / resetForNextEvent);
  • buildReadRealmFileFollowup / in-memory tool-result splicing.

Net −90 lines, with main.ts substantially simpler.

Storage / dedup

Uploads dedupe by content hash, so a skill read repeatedly is stored in Matrix once, and a changed file misses the cache and re-uploads (dedup without staleness). Within a room, a file already attached isn't re-read at all (the existing isFileAttachedInRoom path). Matrix media itself is not content-addressable, so this app-level cache is what avoids re-storing identical bytes.
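
A content-hash upload cache along these lines might look like the sketch below. The class name, the `Upload` callback signature and the choice of SHA-256 are assumptions for illustration; the PR does not show the actual cache or hash function.

```typescript
import { createHash } from "node:crypto";

// Hypothetical upload callback: stores the content and resolves to its media URL.
type Upload = (content: string) => Promise<string>;

class ContentHashUploadCache {
  // Keyed by content hash, so identical bytes map to one stored upload and
  // changed bytes (a new hash) always miss the cache.
  private readonly byHash = new Map<string, Promise<string>>();

  uploadOnce(content: string, upload: Upload): Promise<string> {
    const hash = createHash("sha256").update(content).digest("hex");
    const cached = this.byHash.get(hash);
    if (cached) {
      return cached;
    }
    // Cache the in-flight promise so concurrent reads of the same content
    // share a single upload.
    const pending = upload(content);
    this.byHash.set(hash, pending);
    // Do not keep failed uploads around; a later read should retry.
    pending.catch(() => this.byHash.delete(hash));
    return pending;
  }
}
```

Because the key is derived from the content rather than the URL, the cache never serves a stale version of a file: an edited file simply hashes to a different key.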

Trade-offs to weigh vs #5344

  • (+) Far less bespoke machinery; reads and host commands share one path and mix freely.
  • (+) Reads are transparent in the timeline as real command requests + results.
  • (−) File content now persists in Matrix (media repo), where #5344 deliberately kept it out (privacy/size).
  • (−) "Always live" softens to "live as of when read" — a later turn reconstructs from the attachment snapshot rather than re-fetching.
  • (−) Each read batch costs an extra Matrix round-trip + a per-turn credit check (the continuation is a normal billed turn).

Notes on correctness

  • Billing / on-behalf-of: a self-triggered continuation arrives sender = bot; it's re-attributed to the single human in the room (reads only happen in single-human rooms, so this is unambiguous), so billing and the delegated read act on the user's behalf.
  • No double-response: getRoomEvents slices history at the trigger event, so for a batch of reads only the handler whose result completes the set passes getShouldRespond; the bot's answer is posted after all results and is never in a result-handler's sliced history. Same mechanism that protects the host multi-command flow.
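
Both points can be illustrated with a small model. This is a sketch under simplifying assumptions: one event per request or result, and hypothetical names (`RoomEvent`, `sliceAtTrigger`, `shouldRespond`, `billableUser`); it is not the real getRoomEvents or getShouldRespond.

```typescript
// Simplified, hypothetical event model: one event per request or result.
interface RoomEvent {
  eventId: string;
  kind: "request" | "result" | "other";
  commandRequestId?: string;
}

// History as seen by the handler for `triggerId`: everything up to and
// including the trigger event, nothing after it.
function sliceAtTrigger(timeline: RoomEvent[], triggerId: string): RoomEvent[] {
  const index = timeline.findIndex((e) => e.eventId === triggerId);
  return index === -1 ? [] : timeline.slice(0, index + 1);
}

// Respond only once every request in the sliced history has a result.
function shouldRespond(history: RoomEvent[]): boolean {
  const requested = new Set<string>();
  const resolved = new Set<string>();
  for (const e of history) {
    if (!e.commandRequestId) continue;
    if (e.kind === "request") requested.add(e.commandRequestId);
    if (e.kind === "result") resolved.add(e.commandRequestId);
  }
  return (
    requested.size > 0 &&
    Array.from(requested).every((id) => resolved.has(id))
  );
}

// Re-attribute a bot-sent continuation to the single human in the room.
// Returns undefined when the room does not have exactly one human.
function billableUser(
  sender: string,
  members: string[],
  botId: string,
): string | undefined {
  if (sender !== botId) return sender;
  const humans = members.filter((m) => m !== botId);
  return humans.length === 1 ? humans[0] : undefined;
}
```

With two reads `a` and `b`, the handler triggered by the result for `a` sees a history in which `b` is still unresolved and stays silent; only the handler triggered by the result for `b` sees the complete set.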

Tests

Unit tests for the fulfillment (applied result + attachment on success, invalid + reason on failure, malformed args, and content-hash dedup → one upload), plus classifyToolCalls (reads and host commands coexist) and fileLabelFromUrl. All green; ai-bot type-check and lint clean.

🤖 Generated with Claude Code

…and path

Re-architect readRealmFile so the bot no longer runs reads in an inline,
same-turn loop. Instead it surfaces each read as an `executedBy: 'ai-bot'`
command request, fetches the file, uploads it to the Matrix media repo, and
posts a command-result event carrying the file as `data.attachedFiles`. The
existing host-command result path then reconstructs the content on the next
turn (via the same attachment-download the host `readFile` uses), and the
bot's own result event re-triggers generation.

Because reads now resolve next-turn like host commands, the two timing models
collapse into one: an answer may freely mix reads and host commands, so the
mixed-round rejection, the inline `for(;;)` loop, the "thinking"-event
indicator rotation, and the in-memory followup splicing are all deleted.

Uploads dedupe by content hash, so the same skill read repeatedly is stored
in Matrix once; keying on the hash keeps it version-correct (changed content
re-uploads).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jurgenwerk

Contributor
image

I'm getting this problem: the bot won't continue after reading. Will try to fix it.

The bot fulfills a readRealmFile read by posting a command-result event,
which is meant to re-trigger the bot for the continuation turn. Two things
prevented that continuation from running:

- Fulfillment happened while the room lock was still held. The re-trigger
  has to acquire that lock, so it was dropped. Fulfill after the lock is
  released instead.

- The re-trigger runs on the result event's local echo, before the
  homeserver has indexed it into /messages. getRoomEvents is a server fetch,
  so the continuation's history missed the just-posted result and the read
  looked unresolved (shouldRespond=false). Splice the in-hand result event
  into the history when the fetch didn't include it. Host command results
  arrive via sync (already server-side), so this only affects the bot's own
  results.
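
The second fix can be pictured with the sketch below. The `TimelineEvent` shape and the `withTriggerEvent` name are hypothetical, and the sketch assumes the trigger is the newest event, so appending it keeps the history in order.

```typescript
// Hypothetical minimal event shape for illustration.
interface TimelineEvent {
  eventId: string;
  originServerTs: number;
}

// If the server fetch has not yet indexed the event that triggered this
// handler, append the in-hand copy so the continuation sees its own result.
function withTriggerEvent<T extends TimelineEvent>(
  fetched: T[],
  trigger: T,
): T[] {
  if (fetched.some((e) => e.eventId === trigger.eventId)) {
    return fetched; // the server already returned it; nothing to splice
  }
  return [...fetched, trigger];
}
```

The check by event ID keeps the splice idempotent, so a result that the server did return is never counted twice.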

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jurgenwerk

Contributor

Added a fix, tested it and it looks good!

image

@jurgenwerk jurgenwerk merged commit faf693f into cs-11554-loadskill-tool-in-ai-bot-body-references Jun 30, 2026
23 checks passed