feat(agents): agent-applications console (port from agent_platform)#2700

Open
benjackwhite wants to merge 41 commits into main from posthog-code/m3a-final

Conversation

@benjackwhite

@benjackwhite benjackwhite commented Jun 16, 2026

Ports the agent-applications console (deployed "agent applications") from posthog/posthog's agent_platform into posthog/code, surfaced under /code/agents → Applications. Each console feature is re-implemented in code's layered architecture and style — deployed-agent conversations render through code's native ConversationView via an SSE→ACP adapter rather than porting the console's bespoke chat.

Supersedes #2697 — its branch became unwritable after a workspace reset; this continues the same work on a fresh branch.

What's included

  • Browse & monitor — agent list with inline analytics, a per-agent overview, a cross-agent live-now panel, and filterable session history → read-only transcript (Conversation + structured Logs) with a KPI strip and "fired by" cron badge.
  • Approvals — per-agent and fleet-wide approval queues with a master/detail decide flow (approve / reject + edited args + reason) and the proposing session embedded for context.
  • Configuration — filesystem-style spec/bundle explorer (model · instructions · triggers · tools · skills · mcps · secrets · limits) + a revision picker with freeze → promote → archive lifecycle, Slack setup, and cron "run now".
  • Secrets & memory — guarded set / rotate / clear of encrypted env keys; a memory file browser with BM25 search + tables (render-only).
  • Observability — spend / sessions / failure-rate / p95 KPIs (trends + WoW deltas) over the team's own $ai_* events, blended onto the Applications overview and a per-agent board.
  • Live chat + Agent Builder — a per-agent Chat preview (stream / send / cancel / resume) against the agent ingress, plus an always-on Agent Builder dock that chats with the deployed meta-agent to inspect, debug, and edit agents: it sees the current page (context envelope + get_context), drives the UI (focus_*), and runs an interactive set_secret punch-out. Authoring happens server-side via consent-gated staged draft revisions.

Architecture

  • Wire types in @posthog/shared/agent-platform-types (plain TS, snake_case). Reads on PostHogAPIClient via useAuthenticatedQuery hooks; live-chat state in a @posthog/core store with transport in a renderer hook.
  • SSE→ACP mappers are pure, unit-tested modules that translate each agent-ingress event into the ACP message code's buildConversationItems already reduces — no re-implementation of the console's runnerReducer.
  • Render-first / Agent-Builder authoring: this surface renders config and exposes only operational mutations (enable/disable, promote/freeze/archive); creating and editing agents is the Agent Builder's job. The ingress base URL is region-derived (dev → localhost).
  • Out of scope (owned elsewhere): billing and the registry.

Backend routes (/api/projects/{teamId}/agent_applications/…, /agent_fleet/…) currently live on posthog/posthog's agent_platform branch, not production Cloud yet.

🤖 Generated with Claude Code

First milestone of porting the agent_platform console into Code.

Adds a new `agent-applications` UI feature for deployed agent-platform
agents, surfaced as a sub-option of the existing `/code/agents` config
landing (alongside Scouts), not a new top-level sidebar tab.

- agent-applications feature: list view chrome + empty feature module
- routes: /code/agents/applications (layout + index), regenerated tree
- ConfigureAgentsSection: "Applications" subsection link card

List data, the SSE→ACP chat adapter, and the concierge dock land in
later milestones.

Generated-By: PostHog Code
Task-Id: 3f40d432-67cc-4df1-bc7b-3ee34c7b1d70
M2: wire the agent-applications area to the agent_platform REST API.

- shared: agent-platform-types.ts — domain types for applications,
  revisions, sessions, approvals, fleet (plain TS wire shapes, matching
  inbox-types/git-types convention); exported as a subpath.
- api-client: PostHogAPIClient methods for the read surface (list/get
  applications, stats, sessions list/detail, approvals list + decide,
  revisions, fleet stats/live-sessions/approvals) using the raw fetcher,
  mirroring the signals methods.
- ui: query-key factory + TanStack Query hooks via useAuthenticatedQuery;
  list view now renders a live fleet stat strip + application rows; new
  per-agent detail route (/code/agents/applications/$idOrSlug) showing an
  overview stat strip + recent sessions.

Follows the inbox pattern (shared types → client methods → UI hooks); no
core passthrough service — the core service lands in M3 with the
SSE→ACP chat reducer, where there's real orchestration to own.

Generated-By: PostHog Code
Task-Id: 3f40d432-67cc-4df1-bc7b-3ee34c7b1d70
Prepares the shared wire contract for the SSE→ACP chat adapter (M3).

- Add `AgentSessionEvent`, a discriminated union over the agent-ingress
  `/listen` SSE event catalogue (session_started, user_message,
  turn_started, assistant_text/thinking deltas, tool_call lifecycle,
  tool_result, completed/waiting/failed/closed, client_tool_call/result),
  replacing the loose `{ kind, data, ts }` frame with typed `data` payloads.
- Type the stored conversation transcript content parts precisely
  (text/thinking/toolCall for assistants, text/image for users, text for
  tool results) instead of `unknown`.
- Fix the tool-result message role: the runtime serializer emits
  `toolResult`, not `tool`.

Generated-By: PostHog Code
Task-Id: 3f40d432-67cc-4df1-bc7b-3ee34c7b1d70
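
The typed event union described in this commit could look roughly like the sketch below. Only the event kinds (`user_message`, `tool_result`, `failed`) come from the commit message; the type name, the payload fields, and the three-member subset are assumptions for illustration, not the actual `AgentSessionEvent` definition.

```typescript
// Hypothetical subset of the /listen event catalogue. Payload field names
// are assumed for illustration, not copied from agent-platform-types.ts.
type SketchSessionEvent =
  | { kind: "user_message"; ts: string; data: { text: string } }
  | { kind: "tool_result"; ts: string; data: { tool_call_id: string; is_error: boolean } }
  | { kind: "failed"; ts: string; data: { error: string } };

// Narrowing on `kind` gives each branch a typed `data` payload; the `never`
// assignment makes the compiler flag any event kind left unhandled.
function describeEvent(ev: SketchSessionEvent): string {
  switch (ev.kind) {
    case "user_message":
      return `user: ${ev.data.text}`;
    case "tool_result":
      return `result for ${ev.data.tool_call_id}${ev.data.is_error ? " (error)" : ""}`;
    case "failed":
      return `failed: ${ev.data.error}`;
    default: {
      const unreachable: never = ev;
      return unreachable;
    }
  }
}
```

Compared with a loose `{ kind, data, ts }` frame, a union like this lets consumers read `data` without casts, and adding a new event kind produces a compile error wherever it is not yet handled.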
@github-actions

github-actions Bot commented Jun 16, 2026

React Doctor found 10 issues in 6 files · 10 warnings.

10 warnings

src/features/agent-applications/agent-builder/AgentBuilderDock.tsx

src/features/agent-applications/components/AgentChatPane.tsx

src/features/agent-applications/components/AgentConfigurationPane.tsx

src/features/agent-applications/components/AgentMemoryPane.tsx

src/features/agent-applications/components/AgentRevisionBar.tsx

src/features/agent-applications/components/FileExplorer.tsx

Reviewed by React Doctor for commit a03ee55.

@greptile-apps

greptile-apps Bot commented Jun 16, 2026

Contributor

Comments Outside Diff (1)

  1. packages/ui/src/features/agent-applications/components/AgentApplicationDetailView.tsx, line 1024-1044 (link)

    P2 Identical EmptyState defined in both view components

    AgentApplicationDetailView.tsx and AgentApplicationsListView.tsx each declare a file-private EmptyState({ title, description }) component with identical markup. This duplicates the same idea in two places. A shared component (e.g., in a common components/ folder or inlined from a shared primitive) would say it OnceAndOnlyOnce.

Reviews (1): Last reviewed commit: "feat(agents): typed SSE event union + tr..." | Re-trigger Greptile

Comment on lines +69 to +76
export interface AgentAggregateStats {
    liveCount: number;
    sessionsInWindowCount: number;
    spendInWindowUsd: number;
    lastActivityAt: string | null;
    failedInWindowCount: number;
    pendingApprovalsCount: number;
}

Contributor

P1 AgentAggregateStats uses camelCase while the file's own convention is snake_case

The file header explicitly states "Field names stay snake_case to match the JSON exactly" and every other REST-API type in this file follows that — AgentApplication.team_id, AgentRevision.created_at, AgentSpec.max_turns, etc. AgentAggregateStats is the only exception, using liveCount, sessionsInWindowCount, spendInWindowUsd, etc.

If the Django endpoint returns the standard live_count, sessions_in_window_count shape (which DRF does by default), every field on the cast result will be undefined at runtime. The ?? 0 fallbacks in AgentApplicationsListView and AgentApplicationDetailView will then silently show 0 for all stats — sessions, spend, failures, and approvals — regardless of actual activity.

Author

(Reply from a Claude Code session, on Ben's behalf — flagging that an agent verified this, not a human.)

False positive — the stats endpoint explicitly serializes camelCase, not DRF-default snake_case. See products/agent_platform/backend/presentation/views.py → the AgentAggregateStats inline_serializer, whose fields are liveCount, sessionsInWindowCount, spendInWindowUsd, lastActivityAt, failedInWindowCount, pendingApprovalsCount. So the camelCase TS type matches the wire shape and the values resolve at runtime. Leaving as-is.
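
For readers following the thread, the failure mode under discussion can be reproduced in isolation. The sketch below is not code from the PR: `StatsSketch` and `readLiveCount` are hypothetical, and the two JSON strings stand in for the camelCase wire shape the reply describes and the snake_case shape the reviewer assumed.

```typescript
// Two fields borrowed from the AgentAggregateStats type quoted in the review.
interface StatsSketch {
  liveCount: number;
  pendingApprovalsCount: number;
}

// Casting unvalidated JSON: the compiler trusts the annotation, so a
// mismatched wire shape only shows up at runtime as `undefined`.
function readLiveCount(json: string): number {
  const stats = JSON.parse(json) as Partial<StatsSketch>;
  return stats.liveCount ?? 0;
}

const camelWire = '{"liveCount": 3, "pendingApprovalsCount": 1}';
const snakeWire = '{"live_count": 3, "pending_approvals_count": 1}';

console.log(readLiveCount(camelWire)); // 3
console.log(readLiveCount(snakeWire)); // 0
```

The `?? 0` fallback hides the mismatch rather than reporting it, which is why the question turns entirely on what the serializer actually emits.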

benjackwhite and others added 23 commits June 16, 2026 13:40
First half of the SSE→ACP chat adapter (M3b): a pure, unit-tested mapper
that turns a stored agent_platform conversation transcript into the
`AcpMessage[]` code's `ConversationView` already renders.

- `acpEnvelope.ts` — pure constructors for the JSON-RPC envelopes the
  `buildConversationItems` reducer consumes (session/prompt request,
  session/update notification, _posthog/turn_complete) plus SessionUpdate
  builders for agent text/thinking chunks and tool-call lifecycle.
- `conversationToAcp.ts` — walks the pi-ai `conversation` array (user /
  assistant / toolResult), bracketing turns and attaching tool results to
  their call by `toolCallId`. Reuses code's builder for all accumulation
  rather than re-implementing the console's runnerReducer.
- 7 unit tests covering text, thinking, tool call+result, error results,
  multi-turn bracketing, and content flattening.

Generated-By: PostHog Code
Task-Id: 3f40d432-67cc-4df1-bc7b-3ee34c7b1d70
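
The "attach tool results to their call by `toolCallId`" step can be sketched as a single pass over the transcript. The message shapes and the `pairToolResults` helper below are simplified assumptions; the real `conversationToAcp.ts` emits ACP envelopes rather than this flat structure.

```typescript
// Hypothetical, simplified transcript shapes; the real pi-ai conversation
// types carry more fields than this.
type SketchMessage =
  | { role: "user"; text: string }
  | { role: "assistant"; toolCalls: { id: string; name: string }[] }
  | { role: "toolResult"; toolCallId: string; text: string; isError: boolean };

interface PairedCall {
  id: string;
  name: string;
  result?: { text: string; isError: boolean };
}

// Walk the transcript once: register each call when the assistant issues it,
// then attach a later toolResult to the call with the matching id.
function pairToolResults(messages: SketchMessage[]): PairedCall[] {
  const byId = new Map<string, PairedCall>();
  for (const msg of messages) {
    if (msg.role === "assistant") {
      for (const call of msg.toolCalls) {
        byId.set(call.id, { id: call.id, name: call.name });
      }
    } else if (msg.role === "toolResult") {
      const call = byId.get(msg.toolCallId);
      if (call) call.result = { text: msg.text, isError: msg.isError };
    }
  }
  return Array.from(byId.values());
}
```

A `Map` keeps insertion order, so calls come back in the order the assistant issued them even when results arrive interleaved.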
Second half of the SSE→ACP chat adapter (M3b): the incremental, stateful
mapper for the live agent-ingress `/listen` stream.

- `createAgentChatMapper()` translates each `AgentSessionEvent` into zero or
  more `AcpMessage`s, threading monotonic prompt-request ids and tracking
  seen tool-call ids to distinguish a first sighting (emit `tool_call`) from
  a follow-up (emit a merging `tool_call_update`). Defensively synthesizes a
  call for an orphan tool_result.
- Streaming `tool_call_args_delta` frames are dropped — the canonical
  `tool_call` carries full args a beat later; rendering half-streamed JSON
  reads worse than a brief gap. `turn_started`/`assistant_text` snapshots and
  other lifecycle frames are no-ops (the builder derives turns from prompts).
- `sessionEventsToAcpMessages()` folds a full buffer for replay/tests.
- 11 unit tests incl. a full end-to-end streaming turn.

Generated-By: PostHog Code
Task-Id: 3f40d432-67cc-4df1-bc7b-3ee34c7b1d70
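
The first-sighting versus follow-up logic, including the orphan `tool_result` case, can be sketched as a small closure over a set of seen ids. The input and output shapes here are hypothetical stand-ins (the real mapper consumes `AgentSessionEvent` and emits `AcpMessage`s), and the `"unknown"` tool name for a synthesized call is an assumption.

```typescript
// Hypothetical input/output shapes for illustration only.
type SketchIn =
  | { kind: "tool_call"; id: string; name: string }
  | { kind: "tool_result"; id: string; output: string }
  | { kind: "tool_call_args_delta"; id: string; delta: string };

type SketchOut =
  | { update: "tool_call"; id: string; name: string }
  | { update: "tool_call_update"; id: string; output?: string };

function createSketchMapper(): (ev: SketchIn) => SketchOut[] {
  const seen = new Set<string>();
  return (ev: SketchIn): SketchOut[] => {
    switch (ev.kind) {
      case "tool_call_args_delta":
        return []; // dropped: the full tool_call carries the complete args
      case "tool_call":
        // A repeat sighting becomes a merging update instead of a new call.
        if (seen.has(ev.id)) return [{ update: "tool_call_update", id: ev.id }];
        seen.add(ev.id);
        return [{ update: "tool_call", id: ev.id, name: ev.name }];
      case "tool_result": {
        const out: SketchOut[] = [];
        if (!seen.has(ev.id)) {
          // Orphan result: synthesize the call so the update has a target.
          seen.add(ev.id);
          out.push({ update: "tool_call", id: ev.id, name: "unknown" });
        }
        out.push({ update: "tool_call_update", id: ev.id, output: ev.output });
        return out;
      }
    }
  };
}
```

Returning an array per event keeps the "zero or more messages" contract explicit: dropped frames yield `[]`, and an orphan result yields two messages.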
Completes M3 — deployed-agent session transcripts now render through code's
native chat UI (read-only playback).

- `useAgentApplicationSession` — fetches one session's detail (with the
  stored conversation) via the M2 client method.
- `AgentSessionTranscriptView` — maps the conversation to `AcpMessage[]` with
  `conversationToAcpMessages` and feeds code's `ConversationView`
  (`isPromptPending={null}`, the cloud-session mode), with loading / error /
  empty / trimmed states and a back link.
- New route `/code/agents/applications/$idOrSlug/sessions/$sessionId`; the
  per-agent detail's recent-session rows now link to it.

Live streaming + sending (the SSE transport + `createAgentChatMapper` wiring)
land in M4.

Generated-By: PostHog Code
Task-Id: 3f40d432-67cc-4df1-bc7b-3ee34c7b1d70
Replaces the single Agents config page + Applications link-card with a
two-tab header for a clearer top-level distinction:

- **Scouts** (`/code/agents/scouts`) — the existing scheduled-agent /
  self-driving configuration (`ConfigureAgentsSection`).
- **Applications** (`/code/agents/applications`) — the deployed agent-platform
  applications list.

- New `AgentsTabLayout` provides the shared "Agents" header + tab bar; both
  list views render through it, while detail pages (a scout, an agent, a
  session) keep their own focused chrome.
- `/code/agents` now redirects to the Scouts tab; added a `scouts/` index
  route. The legacy `/code/inbox/agents` redirect and the scout back-link
  still resolve through it.
- Dropped the now-redundant "Applications" subsection (and its unused icons)
  from `ConfigureAgentsSection`.

Generated-By: PostHog Code
Task-Id: 3f40d432-67cc-4df1-bc7b-3ee34c7b1d70
Working plan + milestone status for the agent-console port, committed so it
travels with the branch. Marked TEMPORARY — to be removed before merge.

Generated-By: PostHog Code
Task-Id: 3f40d432-67cc-4df1-bc7b-3ee34c7b1d70
Add a per-agent detail tab shell (Overview · Approvals) shared across the
deployed agent_platform applications, and an Approvals pane: filter requests
by state, expand to see proposed args, and approve/reject queued tool calls
(with edited args + reason where the tool's policy allows). Reads go through
PostHogAPIClient; the decide mutation invalidates the agent's approval queries.

Also expands AGENT_APPLICATIONS_PLAN.md into a full feature-parity map for the
remaining console port (registry/billing dropped — owned elsewhere).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a Sessions tab to the per-agent shell (filterable by state, load-more
paging) sharing a reusable AgentSessionRow with the overview. Enrich the
session transcript with a KPI strip (messages, tool calls, cost, duration,
errors), a "fired by cron" badge from trigger_metadata, and a structured
Logs tab backed by a new …/sessions/{id}/logs/ read on PostHogAPIClient
(client-side level + substring filtering). The conversation still renders
read-only through code's native ConversationView.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… controls

Rework the per-agent Approvals pane into a master/detail surface (selection via
?request=<id>): a filterable list on the left and, on selection, an
AgentApprovalDetail panel with Approval (proposed args + decision controls) and
Session tabs — the Session tab embeds the agent run that proposed the gated
call so the approver sees full context. Extract the reusable
AgentSessionDetailBody (KPI strip + Conversation/Logs tabs) from the transcript
route and embed it there; the route view is now thin chrome over it.

Add a RefreshIndicator (updated-ago + manual refresh) to the sessions list and
session detail, with react-query refetchInterval auto-poll (paused when the tab
is unfocused; session detail polls only while non-terminal). Adds a `fill` mode
to AgentDetailLayout for full-height master/detail panes.

Also notes a follow-up in the plan: deep-dive the agent_platform conversation
format vs what ConversationView expects (renders slightly off), to land with
the concierge work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
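
The polling rule described here (pause when the tab is unfocused, stop once a session is terminal) reduces to a small pure function. This is a sketch under assumptions: the state names, the `sessionPollInterval` helper, and the 5000 ms default are illustrative, and the PR may instead rely on react-query's own focus handling.

```typescript
// Hypothetical state names; the real session states come from the API.
type SketchSessionState = "running" | "waiting" | "completed" | "failed" | "closed";

const TERMINAL_STATES: ReadonlySet<SketchSessionState> = new Set<SketchSessionState>([
  "completed",
  "failed",
  "closed",
]);

// Returns a poll interval in milliseconds, or false to stop polling.
function sessionPollInterval(
  state: SketchSessionState | undefined,
  tabFocused: boolean,
  intervalMs = 5000, // assumed default, not taken from the PR
): number | false {
  if (!tabFocused) return false;
  if (state !== undefined && TERMINAL_STATES.has(state)) return false;
  return intervalMs;
}
```

A value of this shape (a number or `false`) is what a `refetchInterval` option typically expects, so a helper like this could be evaluated against the latest query data on each render.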
Replace the flat config pane with a full-bleed, master/detail explorer modeled
on the agent-console: a reusable FileExplorer primitive (resizable tree + detail,
search-ready, controlled selection) backs a Configuration tab that renders the
live revision's whole spec + bundle as one filesystem — model, instructions
(agent.md), triggers, tools (+ source), skills (+ SKILL.md), mcps, integrations,
secrets (set/not-set), limits. Per-node detail panes + section overviews with
jump rows; selection persists in ?node=. Bundle files render via MarkdownRenderer
(markdown) / CodeBlock (source, schema). New reads: getAgentRevisionBundle,
listAgentEnvKeys; new hooks useAgentRevision/Bundle/EnvKeys. The FileExplorer is
built to also back the Memory tab.

Deferred to follow-ups: Slack setup card, trigger endpoints/usage, cron-fire,
missing-secret warnings, secret set/rotate, revision lifecycle bar (M9),
edit-with-AI (concierge). Also reframes the plan: render-first, concierge-authored
agents, M6 deferred for an observability entrypoint, M12 retired.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ts, cron-fire

Per-trigger detail now carries auth modes + blurbs (intrinsic note + public
warning), missing-required-secret warnings with jump-to-secret buttons, a cron
"Run now" button (fires + jumps to the session), curl/MCP usage snippets, the
Triggers-overview endpoints block with copy, and the Slack setup card (derived
app manifest + request URLs) under slack triggers. MCP detail gains a tools grid
+ missing-secret warnings. Config tree reordered to instructions · model ·
triggers · secrets · skills · tools. New reads: getAgentSlackManifest; new
mutation: fireAgentCron; trigger-required-secrets util.

Pulls in the api-client agent-analytics module (posthog-client depends on it)
from the parallel observability work on this branch; the analytics/observability
UI is committed separately.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The secret detail pane now carries a write-only editor: set a value, rotate an
existing one, or clear it, posting straight to env_keys (PUT/DELETE) and never
reading the value back. On success the env-keys list refetches so set/not-set
status flips across the tree, secret detail, and the trigger/mcp missing-secret
warnings. The "Set <key>" jump buttons land here. New client methods
setAgentEnvKey / clearAgentEnvKey + a useAgentEnvKeyMutations hook.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…secrets

Clearing a secret was a single click — destructive and far too easy. A set
secret now shows only its status + Rotate / Clear; the value input stays hidden
until you choose to rotate. Clear is a two-step confirm ("Yes, clear it" /
Cancel) that spells out the consequence. The input only shows by default when
the secret is not yet set.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a revision bar above the config explorer: a state-filtered picker that
switches which revision the explorer renders (URL ?revision=, default live),
and the operational lifecycle actions for the selected revision — freeze
(draft→ready), promote (ready→live), archive — each behind a confirm spelling
out the consequence. Actions are contextual by state and the destructive ones
are suppressed where they'd strand the agent (e.g. archiving the sole live
revision). New client method transitionAgentRevision + useAgentRevisions /
useAgentRevisionLifecycle hooks; promote/archive invalidate the application +
revisions so the live badge and explorer update.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
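
The contextual lifecycle actions can be sketched as a lookup from revision state to allowed actions. The freeze and promote transitions and the "do not archive the sole live revision" guard come from the commit message; the `archived` state name, the helper name, and allowing archive on draft and ready revisions are assumptions.

```typescript
// "archived" and the archive rules for draft/ready are assumptions.
type SketchRevisionState = "draft" | "ready" | "live" | "archived";
type SketchRevisionAction = "freeze" | "promote" | "archive";

function availableRevisionActions(
  state: SketchRevisionState,
  isSoleLiveRevision: boolean,
): SketchRevisionAction[] {
  switch (state) {
    case "draft":
      return ["freeze", "archive"]; // freeze: draft -> ready
    case "ready":
      return ["promote", "archive"]; // promote: ready -> live
    case "live":
      // Archiving the only live revision would strand the agent.
      return isSoleLiveRevision ? [] : ["archive"];
    case "archived":
      return [];
  }
}
```

Deriving the button set from one function keeps the picker and the confirm dialogs in agreement about which actions exist for a given state.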
…view + per-agent board

Adds the agent observability surface (M7) over the team's own `$ai_*` events
(HogQL via PostHogAPIClient.getAgentAnalytics):

- Applications overview blends a 7-day KPI strip (spend / sessions / failure
  rate / p95, with sparkline trends + WoW deltas) on top of the agent list, and
  merges per-agent rollups into each list row as inline stats — one fetch powers
  both. Header is a label + small "Open in AI observability" link.
- Per-agent Overview tab leads with the same KPI strip + a link into the new
  Observability tab.
- New per-agent Observability tab (route + tab): KPIs + cost-by-model + tool
  reliability, with an "Open in AI observability" deep link.

The analytics data layer (shared AgentAnalytics* types, api-client
agent-analytics query builders + pure shaping + unit tests, getAgentAnalytics /
runHogQLQuery client methods, useAgentAnalytics hook + query key) landed in
aed8929.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
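
The KPI arithmetic behind the strip (WoW delta, failure rate, p95) is simple enough to show directly. The sketch below is illustrative only: the function names are hypothetical, the nearest-rank percentile is one common definition among several, and the PR may well compute these inside the HogQL query rather than on the client.

```typescript
// Week-over-week change as a fraction; null when there is no baseline.
function wowDelta(current: number, previous: number): number | null {
  if (previous === 0) return null;
  return (current - previous) / previous;
}

// Share of sessions that failed; 0 when there were no sessions.
function failureRate(failed: number, total: number): number {
  return total === 0 ? 0 : failed / total;
}

// Nearest-rank 95th percentile over a copy of the input (input is not mutated).
function p95(values: number[]): number | null {
  if (values.length === 0) return null;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100);
  return sorted[rank - 1];
}
```

Returning `null` for a missing baseline lets the UI render "no change data" instead of an infinite or misleading percentage.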
Reflects the shipped observability surface: features 25/26 (and 1/3) done,
M7 moved to "done this session", M6 narrowed to live-now + operational counts,
global approvals (feature 10) flagged as the natural next step.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mark the shipped config explorer (spec/bundle/triggers/secrets/revisions —
features 12-14, 17-19, 23 + M8/M9/M10) and sessions/logs/approvals (5,7,8,9,11)
as done in the parity map + milestones; M11 Memory marked in progress.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a Memory tab that browses the agent's S3-backed memory store through the
same reusable FileExplorer as the config bundle: a folder tree + read view
(markdown rendered, with description + tags + updated), a Files/Tables toggle,
BM25 search (the explorer's search mode → scored flat results), and a tables
view (list + row grid). Render-only — create/update/delete deferred. New reads:
getAgentMemoryTree / readAgentMemoryFile / searchAgentMemory /
listAgentMemoryTables / readAgentMemoryTable + useAgentMemory hooks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a Chat tab to the per-agent detail surface that streams a live session
against the agent's ingress (the console's "preview/test"), rendered through
code's native ConversationView so it reads like a real chat.

Transport lives in the useAgentChat hook (api-client is renderer-scoped): run /
send / cancel plus the SSE loop, mapped to AcpMessage[] via the M3
createAgentChatMapper and pumped into a new core agentChatStore. The ingress
base URL is region-derived (dev → localhost:3030, since the dev trycloudflare
tunnel buffers SSE; us/eu use the record's ingress_base_url as-is).

Fidelity + QoL:
- ConversationView gains an optional `collapseMode` override; the preview
  passes "none" so the agent's prose (and tool work) renders inline instead of
  being folded into a collapsed tool-call chip (which made replies look empty).
- The user's message renders optimistically on send and the streamed echo is
  deduped, so hitting send gives immediate feedback.
- A banner makes clear you're talking to the currently deployed revision.
- A left rail lists the preview chats you started *here* (persisted locally,
  per agent) — deliberately NOT the agent's full server session list, which can
  include real customer conversations. New-chat resets the surface; selecting a
  past chat rebuilds its transcript from the stored session detail (`/listen`
  only tails, it does not replay) and re-attaches the live stream when the
  session is still active, so you can continue where you left off.

Client tools: toast/get_context resolve inline; focus_*/set_secret degrade to
unhandled_client_tool until the concierge milestone wires UI-driving + the
inline secret form.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
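
The region-derived ingress rule amounts to a single branch. In this sketch the `SketchRegion` type, the helper name, and the `http://` scheme for localhost are assumptions; only the `localhost:3030` host and the "use the record's `ingress_base_url` as-is" rule come from the commit message.

```typescript
type SketchRegion = "us" | "eu" | "dev";

// dev bypasses the tunnel (which buffers SSE) and talks to the local ingress;
// us/eu use the URL stored on the agent record unchanged.
function resolveIngressBaseUrl(region: SketchRegion, recordIngressBaseUrl: string): string {
  if (region === "dev") return "http://localhost:3030"; // scheme is assumed
  return recordIngressBaseUrl;
}
```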
…stone

Record the shipped per-agent Chat preview (feature 27) and that it resolved the
M-Live transport open question (renderer-hook + region-derived ingress). Reframe
the remaining live track (in-chat approvals, draft preview) and expand
M-Concierge into staged sub-tasks (C1 dock shell + two-chat store, C2 page-context
registry, C3 focus_* tools, C4 set_secret punch-out, C5 edit-with-AI seeds),
reflecting the decided architecture: a global right-hand dock across /code/agents
talking to the deployed agent-concierge, reusing the live-chat stack.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add the concierge: a global right-hand dock that talks to the deployed
agent-concierge and can inspect/debug/edit agents and drive the UI. Builds on
the live-chat stack.

- C1 dock shell: a resizable right rail in the /code/agents layout
  (react-resizable-panels + autoSaveId), toggled via an edge affordance, the
  hide button, or Cmd/Ctrl+I; open/follow-mode persisted. Generalizes the core
  agentChatStore to hold multiple concurrent chats keyed by chatId, so the dock
  ("concierge") and the per-agent preview ("preview:<slug>") coexist; useAgentChat
  now takes a chatId. Extracts the shared AgentChatSurface (conversation +
  composer) from the preview pane.
- C2 page context: useSetConciergePage registers what the user is viewing
  (wired in AgentDetailLayout, AgentsTabLayout, the session transcript). The
  context is prepended as a delimited envelope to the first message (stripped
  from the transcript + deduped by the mapper) and answers the get_context tool.
- C3 focus_* tools: useConciergeClientTools navigates code's agent routes
  (focus_tab/file/revision/spec_section/session), gated by a follow-mode toggle.
- C5 edit-with-AI: EditWithAIButton seeds the dock with a prompt; the agent
  overview gets an "Ask the concierge" entry point.

set_secret (C4) still degrades to unhandled_client_tool (the agent falls back to
its deep-link flow); to land next. Verified live: the concierge listed agents,
loaded a skill, called focus_tab, and the main panel navigated to the target
agent's configuration while the dock stayed put.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
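
The page-context envelope from C2 can be pictured as a wrap/strip pair: the context is prepended to the first outgoing message and removed again before the transcript is shown. The delimiters, the JSON payload, and both helper names below are hypothetical; the PR text does not show the real envelope format.

```typescript
// Hypothetical delimiters; the real envelope format is not shown in this PR.
const ENVELOPE_OPEN = "<page_context>";
const ENVELOPE_CLOSE = "</page_context>";

// Prepend the serialized page context to the user's first message.
function wrapWithContext(context: Record<string, string>, userText: string): string {
  return `${ENVELOPE_OPEN}${JSON.stringify(context)}${ENVELOPE_CLOSE}\n${userText}`;
}

// Remove a leading envelope so the transcript shows only what the user typed.
// Messages without a well-formed leading envelope pass through unchanged.
function stripContext(message: string): string {
  if (!message.startsWith(ENVELOPE_OPEN)) return message;
  const end = message.indexOf(ENVELOPE_CLOSE);
  if (end === -1) return message;
  return message.slice(end + ENVELOPE_CLOSE.length).replace(/^\n/, "");
}
```

One caveat with this simple form: a context value containing the closing delimiter would end the envelope early, so a real implementation would need escaping or a length-prefixed format.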
Wire the interactive set_secret client tool end-to-end. The agent's server-side
tool returns {queued, interactive} and parks the session; the handler defers
({defer:true}) and stashes a pendingSecret instead of posting a result. The dock
renders ConciergeSecretForm above the composer; on submit it PUTs the value
straight to the env-keys API (the raw value never reaches the agent) and posts
the outcome via POST /send (sendAgentInteractiveToolResult → a client_tool_result
marker) to wake the parked session, which resumes confirming the set. Cancel
posts user_cancelled.

useAgentChat gains resolveInteractiveTool and a `defer` outcome; AgentChatSurface
gains an aboveComposer slot. Also includes the shared AgentChatSurface composer
upgrade to the Quill InputGroup input shell. Verified live against
agent-approval-demo: env_keys PUT 200 + the session woke with confirmation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
benjackwhite and others added 4 commits June 16, 2026 19:30
Product/UI rename — "concierge" reads as cringe. The dock, its store, hooks,
components, page-context registry, client tools, and all user-facing strings are
now "Agent Builder" (folder features/agent-applications/agent-builder/, symbols
AgentBuilder*/useAgentBuilder*/useSetAgentBuilderPage, chatId "agent-builder",
persist key agent-builder-dock).

The deployed meta-agent's slug stays `agent-concierge` — that's the real backend
agent; only the surface name changed. Pure rename: no behavior change; full
typecheck + agent-applications tests pass, verified live (dock streams, lists
agents, focus_* + set_secret still work).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Restores the operational signal the M7 analytics KPIs displaced:
- live-now panel on the Applications landing — cross-agent in-flight
  sessions over `agent_fleet/live_sessions/`, joined with the cached
  agent list so each row shows the agent name; click → per-agent
  session detail
- fleet approvals queue at `/code/agents/applications/approvals` —
  master/detail mirroring the per-agent pane (`AgentApprovalsPane`),
  reusing `AgentApprovalDetail` unchanged on the detail side; filter
  chips lifted into `agentApprovalsFilters.ts` so both panes share one
  source of truth
- operational strip on the landing — "X live now · Y pending approvals
  →"; pending links to the new approvals route and flips amber when
  non-zero. Counts come from the same hooks the panel/route already use
- 5s poll for live sessions, 10s for approvals; both pause when the
  tab is unfocused
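
Joining live sessions with the cached agent list, as the live-now panel does, can be sketched as a map lookup. All field names and the `joinLiveSessions` helper are assumptions; falling back to the raw application id when an agent is missing from the cache is also an illustrative choice.

```typescript
// Hypothetical shapes; the real wire types live in agent-platform-types.
interface SketchAgentRow {
  id: string;
  name: string;
}
interface SketchLiveSession {
  session_id: string;
  application_id: string;
}
interface LiveNowRow {
  sessionId: string;
  agentName: string;
}

// Index agents by id once, then label each live session with its agent name.
function joinLiveSessions(sessions: SketchLiveSession[], agents: SketchAgentRow[]): LiveNowRow[] {
  const nameById = new Map<string, string>();
  for (const agent of agents) nameById.set(agent.id, agent.name);
  return sessions.map((s) => ({
    sessionId: s.session_id,
    // Fall back to the raw id if the cached list has not caught up yet.
    agentName: nameById.get(s.application_id) ?? s.application_id,
  }));
}
```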
…orgs

The dev app populated the org/projects map purely from
`tokenResponse.scoped_organizations`, then fetched each org via
`/api/organizations/{id}/`. That left the map empty whenever
`scoped_organizations` was empty — which is the default state for
project-scoped OAuth tokens issued by Local Development, where
`/api/organizations/{id}/` also 403s with "API keys with scoped
projects are only supported on project-based endpoints." Result: sign
in succeeded, user pill stuck on "No project selected" forever, no
diagnostic.

`/api/users/@me/` already carries `organization.teams[]` with the same
{id, name} shape the org endpoint returns, and `fetchUserContext` was
already fetching it but throwing the projects away. Extended it to
return the parsed `OrgProjects`, then in `createSessionFromTokenResponse`:

- Seed the org map with the @me/ data when it includes projects,
  before calling `buildOrgProjectsMap`. Gated on
  `projects.length > 0` so thin @me/ responses (tests, edge cases)
  can't clobber previously-known data on refresh.
- When `scoped_organizations` is empty but the seed populated the
  current org, skip the org fetch entirely — the seed already has
  what we need and the org endpoint would just 403 in Local Dev.
- After the fetch step, make sure the seeded entry survives in the final
  map, including when the fetch was skipped.

No Cloud behavior change: Cloud's `scoped_organizations` is populated,
so the existing fetch path runs identically and the seed (when present)
is overwritten by the fetch result.
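
The three rules above (seed, skip, merge) can be sketched roughly as follows. This is an illustration under assumptions, not the code in `createSessionFromTokenResponse`: the `OrgProjects` field names and the helpers `seedOrgMap`, `shouldSkipOrgFetch`, and `mergeFetched` are hypothetical.

```typescript
// Hypothetical shapes and helper names; only the rules they encode come
// from the commit message.
interface Project { id: number; name: string }
interface OrgProjects { orgId: string; projects: Project[] }
type OrgProjectsMap = Map<string, OrgProjects>;

// Seed from the @me/ response, gated on a non-empty project list so a thin
// response cannot clobber previously-known data on refresh.
function seedOrgMap(previous: OrgProjectsMap, fromMe: OrgProjects | null): OrgProjectsMap {
  const seeded = new Map(previous);
  if (fromMe && fromMe.projects.length > 0) {
    seeded.set(fromMe.orgId, fromMe);
  }
  return seeded;
}

// Skip the org fetch when the token carries no scoped organizations but the
// seed already covers the current org.
function shouldSkipOrgFetch(
  scopedOrganizations: string[],
  seeded: OrgProjectsMap,
  currentOrgId: string,
): boolean {
  return scopedOrganizations.length === 0 && seeded.has(currentOrgId);
}

// Fetch results overwrite seeded entries; seeded entries without a fetch
// result survive in the final map.
function mergeFetched(seeded: OrgProjectsMap, fetched: OrgProjectsMap): OrgProjectsMap {
  const merged = new Map(seeded);
  fetched.forEach((value, key) => merged.set(key, value));
  return merged;
}
```

Under this sketch the Cloud path is unchanged: `scopedOrganizations` is non-empty, the fetch runs, and `mergeFetched` lets its result win over the seed.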
- Add the `5bfeccef` SHA to the M6 entry to match the convention every
  other Done entry follows.
- M5's "Remaining: fleet-wide global approvals queue" pointed at work
  that has since shipped in M6; replaced with a forward link.
- M7's "those return with M6" rephrased to past tense now that M6
  shipped.
@benjackwhite benjackwhite marked this pull request as ready for review June 17, 2026 07:57
@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Contributor

Reviews (2): Last reviewed commit: "docs: stamp M6 commit SHA + clear stale ..."

benjackwhite and others added 5 commits June 17, 2026 10:34
Flag gating
- The `agent-platform` flag now gates the Applications tab (content + tab link)
  and the Agent Builder dock; Scouts is unaffected. New shared `featureFlag.ts`;
  gates in `applications` route, AgentsTabLayout, and AgentBuilderDockLayout.

Local-dev flag tooling
- `pnpm posthog:local` (scripts/use-local-posthog.mjs) points VITE_POSTHOG_* at a
  local PostHog so synced flags resolve in dev; posthog exposed on `window` in
  dev for console overrides; documented in LOCAL-DEVELOPMENT.md.

Agent Builder dock polish
- Fix horizontal-scroll/overflow in the narrow dock via a `scrollX` prop
  (VirtualizedList → ConversationView → AgentChatSurface); nested code blocks
  keep their own scroll.
- Page-aware empty-state suggestions (agentBuilderSuggestions).
- Abstract, context-driven header controls (AgentBuilderHeaderControls): a
  contextual AI action (New agent / Explain this session / …), an icon-only
  show button visible only when the dock is hidden, and a "Following" indicator
  while the builder is mid-turn with follow mode on. Replaces the floating edge
  affordance and the ad-hoc per-header buttons.
- Seed confirm dialog (start fresh / continue) when a chat is already running.
- Per-tab header descriptions (Scouts vs Applications).
- Reorder detail tabs: Configuration after Overview, Chat last.
- Toggle shortcut is Cmd/Ctrl+Shift+I (Cmd+I is the inbox); shown in tooltips.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Give the Agent Builder dock composer a clearer, rotating placeholder from a
  fun pool (picked per mount) instead of the generic "Message this agent…";
  AgentChatSurface takes an optional `placeholder` (preview keeps the default).
- Consolidate AgentApplicationsListView's private EmptyState onto the shared
  AgentDetailEmptyState (addresses a PR review nit; the detail view already used
  the shared one).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Opens the fleet approvals inbox focused on a request id. Pairs with the
agent-runner change (PostHog/posthog#64217) that emits these links on a
gated tool call so non-PostHog-Code clients (Slack, MCP) can land on the
approval. Mirrors the existing scout deep-link wiring end-to-end:

- ApprovalLinkService (core) registers the `approval` handler, parses
  `<scheme>://approval/<requestId>`, focuses the window, emits OpenApproval
  (queues if the renderer isn't ready)
- deepLinkRouter onOpenApproval subscription + getPendingApprovalLink query
- main-process DI wiring (token, bindings, container, boot get)
- useApprovalDeepLink renderer hook + navigateToApproval bridge, mounted in
  __root; navigates to /code/agents/applications/approvals?request=<id>
- approval-link.test.ts (8 cases) + docs/DEEP-LINKS.md entry

Verified live over CDP: posthog-code-dev://approval/ar_test123 routes
through the dispatcher to the handler and navigates to the fleet inbox.
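
For illustration, the link parsing and route mapping described above could look roughly like this. It is a sketch, not the `ApprovalLinkService` implementation: `parseApprovalLink` and `approvalRoute` are hypothetical names, and the regex is one assumed way to match `<scheme>://approval/<requestId>`.

```typescript
// Hypothetical helpers; the link shape and the target route come from the
// commit message, everything else is illustrative.

// Extracts the request id from `<scheme>://approval/<requestId>`, or returns
// null when the link is not an approval link.
function parseApprovalLink(link: string): string | null {
  const match = /^[a-z][a-z0-9+.-]*:\/\/approval\/([^/?#]+)\/?$/i.exec(link);
  return match?.[1] ?? null;
}

// Builds the renderer route for the fleet approvals inbox focused on a request.
function approvalRoute(requestId: string): string {
  return `/code/agents/applications/approvals?request=${encodeURIComponent(requestId)}`;
}
```

With these, `posthog-code-dev://approval/ar_test123` would yield the id `ar_test123` and then the route `/code/agents/applications/approvals?request=ar_test123`.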

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The agent builder's get_context (and first-message context envelope) now
includes the user's current project_id, project_name, and org_id, so the
tenant-neutral concierge can resolve and act on the project the user is
in — threading project_id into the @posthog/* tools instead of relying on
the session principal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@benjackwhite benjackwhite requested a review from a team June 17, 2026 11:44
benjackwhite and others added 6 commits June 17, 2026 13:51
- Format scripts/use-local-posthog.mjs (biome ci diff)
- FileExplorer: reconcile folder auto-expand during render instead of in an
  effect to satisfy React Doctor no-adjust-state-on-prop-change

Generated-By: PostHog Code
Task-Id: cc7b7df1-e4ed-4d05-8525-6aea4c171f5d
Two cleanups from the Greptile pass on #2700:

- Delete the `getAgentApplicationStats` / `getAgentFleetStats` client
  methods + their hooks (`useAgentApplicationStats`,
  `useAgentFleetStats`), the `AgentAggregateStats` wire type, and the
  `stats` / `fleetStats` query keys. Nobody called them — M7 replaced
  these endpoints with the HogQL `$ai_*` analytics rollup, but the
  endpoint-backed surface was never removed. Side-benefit: the type's
  camelCase fields no longer violate the file's stated snake_case
  convention (Greptile #1).
- Replace `AgentApplicationsListView`'s file-private `EmptyState` with
  the already-exported `AgentDetailEmptyState` (same markup, same prop
  shape) so we have one canonical empty-tile component (Greptile #2).
…feature 28)

Closes feature 28 of the agent-applications port. Before this, an
approval-gated tool call during a live chat preview just stalled the
conversation — the user had to leave the Chat tab, navigate to Approvals,
decide there, and only then return to see the chat resume. Now the
decision surfaces inline as a card between the conversation and the
composer, and the composer parks until decided.

- Detection: poll-based, since the agent-runner emits no SSE event for
  "waiting on approval" (`waiting` is for `@posthog/meta-ask-for-input`,
  not approvals). New `useAgentChatPendingApproval(idOrSlug, sessionId)`
  fetches `/approvals/?state=queued`, filters to the current chat's
  `session_id`, and returns the oldest queued entry. 2s cadence; pauses
  when the tab is unfocused. Cache key lives under the existing
  `approvals` prefix that `useDecideAgentApproval` invalidates, so a
  decide clears the card immediately.
- UI: extracted the existing decision controls from `AgentApprovalDetail`
  into a presentational `AgentApprovalDecisionForm` so both surfaces
  share one form. New `AgentChatPendingApprovalCard` wraps it with the
  tool name + state badge + proposed args (lifted `ArgsSection` to an
  export) + a deep-link to the full Approvals tab.
- Surface: `AgentChatSurface` gains optional `belowConversation` and
  `composerDisabledReason` props mirroring the existing `aboveComposer`
  pattern. Composer disables (with a tooltip explaining why) while a
  decision is pending. Wired into both `AgentChatPane` (per-agent Chat
  tab) and `AgentBuilderDock` (shared surface, harmlessly inert for the
  concierge today but consistent if its toolset ever changes).
- Flow on decide: existing `useDecideAgentApproval` posts to
  `/approvals/{id}/decide/`, the prefix invalidation clears the pending
  card, the runner resumes the session, the `/listen` SSE stream emits
  the tool result and the conversation continues naturally.
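
The selection step inside the detection bullet (filter to the current chat's `session_id`, return the oldest queued entry) can be sketched as below. This is an assumption-laden illustration, not the body of `useAgentChatPendingApproval`: the `ApprovalRequest` shape, the `created_at` field, and `pickPendingApproval` are hypothetical.

```typescript
// Hypothetical wire shape; `created_at` as an ISO 8601 string is an assumption.
interface ApprovalRequest {
  id: string;
  session_id: string;
  state: string; // e.g. "queued"
  created_at: string;
}

// Returns the oldest queued approval for the given chat session, or null
// when nothing is waiting on a decision.
function pickPendingApproval(
  approvals: ApprovalRequest[],
  sessionId: string,
): ApprovalRequest | null {
  const queued = approvals
    .filter((a) => a.state === "queued" && a.session_id === sessionId)
    // filter() returned a fresh array, so sorting does not mutate the input.
    .sort((a, b) => Date.parse(a.created_at) - Date.parse(b.created_at));
  return queued[0] ?? null;
}
```

Filtering on `state` again on the client is defensive here, since the request already asks for `?state=queued`.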
…SION → 6

Django's decide handler for `allow_agent_approver: false` approvals
(`agent_platform/backend/presentation/views.py:approvals_decide`) has been
hardened to require the literal `agent_approvals:write` scope on OAuth
bearer tokens — it does not accept `*` as a substitute, modeled on the
same exclusion as INTERNAL scopes (`posthog/permissions.py:586`).

Without the explicit scope:
- The new in-chat approval card (M-Live-InChat, `b1a374fe`) 404s on
  Approve/Reject.
- The existing per-agent Approvals tab (M5) 404s on the same path; it
  shipped without anyone exercising the decide button against a
  human-only approval.

Adding `agent_approvals:write` alongside `*` is a no-op if Django ever
relaxes the gate to accept wildcards, and is exactly what's required
under the current shape. Bumping `OAUTH_SCOPE_VERSION` forces existing
sessions to re-consent on next refresh (`auth.ts:405`) so the new scope
actually lands on already-issued tokens.
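
The two checks this commit relies on can be sketched as follows. Neither function is taken from the codebase: `canDecide` and `needsReconsent` are hypothetical names, and comparing a stored version against the current one is an assumed reading of how `OAUTH_SCOPE_VERSION` forces re-consent.

```typescript
// The literal scope named in the commit message.
const DECIDE_SCOPE = "agent_approvals:write";

// "*" deliberately does not count: per the commit message, the decide
// handler requires the literal scope on OAuth bearer tokens.
function canDecide(tokenScopes: string[]): boolean {
  return tokenScopes.some((scope) => scope === DECIDE_SCOPE);
}

// Hypothetical re-consent check: a session that consented under an older
// scope version must re-consent to pick up newly added scopes.
function needsReconsent(consentedVersion: number, currentVersion: number): boolean {
  return consentedVersion < currentVersion;
}
```

For example, `canDecide(["*"])` is false while `canDecide(["*", "agent_approvals:write"])` is true, which is why the scope is requested alongside the wildcard.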
Approving / rejecting from the Approvals tab, fleet queue, or the
in-chat card was silent: the form swapped for a thin "Decided 1m ago by
<uuid>" line and that was it. No success cue, no failure cue, and the
state-flip was easy to miss in the master/detail view.

- `useDecideAgentApproval` fires a Sonner toast on success ("Approved —
  dispatched to the agent", or "Rejected — the agent will see the
  rejection") and on error ("Decision failed — <message>"). Every
  surface that uses the hook gets the feedback for free.
- The `DecidedOutcome` panel now leads with a coloured status banner
  (CheckCircle on green for approved, XCircle on gray for rejected,
  Warning on red for `dispatched_failed` or a `dispatch_outcome.error`,
  Warning on amber for `expired`) anchored on the tool name, decision
  time, and decider. Reason / edited args / dispatch error stay below
  it.
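
The banner rules in the second bullet amount to a small mapping, sketched here. This is not the `DecidedOutcome` component: the `DecidedApproval` and `Banner` shapes and `bannerFor` are hypothetical, and giving a dispatch error precedence over the other states is an assumption.

```typescript
// Hypothetical shapes; icon names and colours follow the commit message.
interface DecidedApproval {
  state: "approved" | "rejected" | "dispatched_failed" | "expired";
  dispatch_outcome?: { error?: string };
}

interface Banner {
  icon: "CheckCircle" | "XCircle" | "Warning";
  tone: "green" | "gray" | "red" | "amber";
}

function bannerFor(approval: DecidedApproval): Banner {
  // Assumed precedence: a failed dispatch is surfaced even when the decision
  // itself was an approval.
  if (approval.state === "dispatched_failed" || approval.dispatch_outcome?.error) {
    return { icon: "Warning", tone: "red" };
  }
  if (approval.state === "expired") {
    return { icon: "Warning", tone: "amber" };
  }
  if (approval.state === "rejected") {
    return { icon: "XCircle", tone: "gray" };
  }
  return { icon: "CheckCircle", tone: "green" };
}
```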
@dmarticus dmarticus force-pushed the posthog-code/m3a-final branch from 76cf496 to a03ee55 Compare June 17, 2026 16:55
2 participants