Skip to content

feat(channels): generic chat box with lazy repo attach#2735

Open
raquelmsmith wants to merge 4 commits into
mainfrom
posthog-code/channels-generic-chatbox
Open

feat(channels): generic chat box with lazy repo attach#2735
raquelmsmith wants to merge 4 commits into
mainfrom
posthog-code/channels-generic-chatbox

Conversation

@raquelmsmith

@raquelmsmith raquelmsmith commented Jun 17, 2026

Copy link
Copy Markdown
Member

video explanation

https://www.loom.com/share/f352af8d74194f7a9c9e5d2010b6f00f

What & why

In the channels experience, this drops the up-front repo/branch picker. A channel task becomes a generic chat box: it can be a coding task (needs a repo), an analysis task (PostHog MCP, maybe no repo), or a non-code task like sending an email (no repo at all). The agent decides at runtime whether it needs a repo, clones one only when needed, and asks the user (via the built-in AskUserQuestion tool) when it can't determine which repo to use. Candidates = all repos, prioritizing those named in the channel's CONTEXT.md.

Mirrors how the Slack bot behaves (that repo-inference lives in PostHog/posthog), but implemented here. Works in local/worktree mode (for dev testing) and cloud mode.

How it works

UI (channel-only allowNoRepo)

  • WebsiteNewTaskTaskInput hides the repo/folder picker, branch selector, env selector, and additional-dirs button, and removes the submit gate. The workspace-mode select stays (so you can run local). /code/ is untouched.

Task creation / saga

  • A repo-less local task no longer skips the agent session — it starts in a per-task scratch dir (ensureScratchDir). No repo ⇒ no workspace, no clone.
  • workspace.verify treats an existing scratch dir as valid (no spurious "working directory no longer exists").

Agent

  • channelMode is derived server-side from the scratch path (isScratchPath), so it survives reconnects with no client plumbing.
  • Adds a channel system-prompt (lazy-repo guidance) and two local tools gated on channelMode:
    • list_repos — lists candidates via gh
    • clone_repo — clones into <cwd>/repos/<owner/repo> (a scratch subdir, no cwd rebind, conversation preserved)

Scope / follow-ups

  • Cloud channel tasks rely on the existing cloud "No Repository Mode" prompt; the new list_repos/clone_repo tools are local-only for now.
  • Django: confirm PostHog/posthog accepts a no-repository cloud Task/TaskRun before relying on cloud channel mode (out of scope for this repo).

Testing

  • pnpm typecheck (22 pkgs), biome lint clean
  • ✅ Unit tests: core (1568, incl. a new repo-less saga test), workspace-server (478), agent (734)
  • ⏳ Live in-app E2E not yet run — recommend: open a channel → "+", confirm no repo picker; submit an analysis prompt (no clone); submit a coding prompt without naming a repo (→ list_repos/AskUserQuestion); then name a repo (→ clone_repo).

Created with PostHog Code

@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown

React Doctor found no issues in the changed files. 🎉

Reviewed by React Doctor for commit 15cf907.

@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Security Review

  • GitHub token embedded in clone URL (clone-repo.ts lines 63–65): the token is interpolated as basic-auth in cloneUrl. If CloneSaga surfaces the URL in result.error, the token is returned to the LLM in the tool result and may be stored in conversation history or server-side logs. The sibling list_repos tool correctly injects the token via GH_TOKEN env var instead.
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
packages/agent/src/adapters/local-tools/tools/clone-repo.ts:63-73
**Token leakage via clone error messages**

The GitHub token is embedded directly in `cloneUrl` as basic-auth (`x-access-token:${token}@...`). When `CloneSaga` fails — on a network error, DNS failure, or an HTTP 4xx/5xx from GitHub — some git backends propagate the remote URL verbatim into `result.error`. That string is then returned as the tool result (`fail(\`clone_repo failed: ${result.error}\`)`), which lands in the LLM's conversation context and is potentially logged or stored in the agent transcript.

The `list_repos` tool next door uses the safer `GH_TOKEN` environment variable for `gh`, which never appears in error output. Consider the same approach here: configure git's credential helper or `GH_TOKEN` via the process environment (as `CloneSaga` may already support), and construct `cloneUrl` without the credential in the URL — fall back to `https://github.com/${slug}.git` and let the env-supplied credential flow through git's own credential handling.

### Issue 2 of 3
packages/agent/src/adapters/local-tools/tools/clone-repo.ts:67-74
**No guard against re-cloning an existing directory**

`targetPath` is deterministic (`<cwd>/repos/<owner>/<repo>`), so if the agent calls `clone_repo` twice for the same repo — during a retry, an LLM confusion loop, or a reconnected session — `CloneSaga` will attempt to clone into a directory that already exists. Git aborts with "destination path '…' already exists and is not an empty directory", which the agent receives as an opaque error and may misinterpret.

A simple pre-check would let the tool return the existing path immediately: if `targetPath` already exists and is a non-empty git repo, skip the clone and return the path directly (possibly also checking out `branch` if requested). This makes the tool idempotent and avoids confusing the agent.

### Issue 3 of 3
packages/workspace-server/src/services/workspace/scratch.ts:21-23
**Path separator assumption may miss Windows paths, and `isScratchPath` is too broad**

`workingDir.split(path.sep)` splits on the OS separator, which is correct on the current platform. However, the agent's cloned repo lives at `<scratchDir>/repos/<owner>/<repo>`, and once the agent `cd`s into that repo, the working directory passed to future sessions would still contain `posthog-code-scratch` — so `channelMode` would remain `true` for the cloned-repo sessions, which is the intended behaviour.

The broader concern is that any path on the system that happens to contain a directory named `posthog-code-scratch` — a coincidence or a malicious symlink — would silently enable channel mode and expose `list_repos`/`clone_repo` to a normal coding session. The check would be more precise if it also verified that `scratchBasePath(worktreeLocation)` is an actual prefix of `workingDir`, rather than just scanning for the folder name as a component anywhere in the path.

Reviews (1): Last reviewed commit: "feat(channels): generic chat box with la..." | Re-trigger Greptile

Comment on lines +63 to +73
const cloneUrl = token
? `https://x-access-token:${token}@github.com/${slug}.git`
: `https://github.com/${slug}.git`;

try {
const result = await new CloneSaga().run({
repoUrl: cloneUrl,
targetPath,
});
if (!result.success) {
return fail(`clone_repo failed: ${result.error}`);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 security Token leakage via clone error messages

The GitHub token is embedded directly in cloneUrl as basic-auth (x-access-token:${token}@...). When CloneSaga fails — on a network error, DNS failure, or an HTTP 4xx/5xx from GitHub — some git backends propagate the remote URL verbatim into result.error. That string is then returned as the tool result (fail(\clone_repo failed: ${result.error}`)`), which lands in the LLM's conversation context and is potentially logged or stored in the agent transcript.

The list_repos tool next door uses the safer GH_TOKEN environment variable for gh, which never appears in error output. Consider the same approach here: configure git's credential helper or GH_TOKEN via the process environment (as CloneSaga may already support), and construct cloneUrl without the credential in the URL — fall back to https://github.com/${slug}.git and let the env-supplied credential flow through git's own credential handling.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/agent/src/adapters/local-tools/tools/clone-repo.ts
Line: 63-73

Comment:
**Token leakage via clone error messages**

The GitHub token is embedded directly in `cloneUrl` as basic-auth (`x-access-token:${token}@...`). When `CloneSaga` fails — on a network error, DNS failure, or an HTTP 4xx/5xx from GitHub — some git backends propagate the remote URL verbatim into `result.error`. That string is then returned as the tool result (`fail(\`clone_repo failed: ${result.error}\`)`), which lands in the LLM's conversation context and is potentially logged or stored in the agent transcript.

The `list_repos` tool next door uses the safer `GH_TOKEN` environment variable for `gh`, which never appears in error output. Consider the same approach here: configure git's credential helper or `GH_TOKEN` via the process environment (as `CloneSaga` may already support), and construct `cloneUrl` without the credential in the URL — fall back to `https://github.com/${slug}.git` and let the env-supplied credential flow through git's own credential handling.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +67 to +74
try {
const result = await new CloneSaga().run({
repoUrl: cloneUrl,
targetPath,
});
if (!result.success) {
return fail(`clone_repo failed: ${result.error}`);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 No guard against re-cloning an existing directory

targetPath is deterministic (<cwd>/repos/<owner>/<repo>), so if the agent calls clone_repo twice for the same repo — during a retry, an LLM confusion loop, or a reconnected session — CloneSaga will attempt to clone into a directory that already exists. Git aborts with "destination path '…' already exists and is not an empty directory", which the agent receives as an opaque error and may misinterpret.

A simple pre-check would let the tool return the existing path immediately: if targetPath already exists and is a non-empty git repo, skip the clone and return the path directly (possibly also checking out branch if requested). This makes the tool idempotent and avoids confusing the agent.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/agent/src/adapters/local-tools/tools/clone-repo.ts
Line: 67-74

Comment:
**No guard against re-cloning an existing directory**

`targetPath` is deterministic (`<cwd>/repos/<owner>/<repo>`), so if the agent calls `clone_repo` twice for the same repo — during a retry, an LLM confusion loop, or a reconnected session — `CloneSaga` will attempt to clone into a directory that already exists. Git aborts with "destination path '…' already exists and is not an empty directory", which the agent receives as an opaque error and may misinterpret.

A simple pre-check would let the tool return the existing path immediately: if `targetPath` already exists and is a non-empty git repo, skip the clone and return the path directly (possibly also checking out `branch` if requested). This makes the tool idempotent and avoids confusing the agent.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +21 to +23
export function isScratchPath(workingDir: string): boolean {
return workingDir.split(path.sep).includes(SCRATCH_DIR_NAME);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Path separator assumption may miss Windows paths, and isScratchPath is too broad

workingDir.split(path.sep) splits on the OS separator, which is correct on the current platform. However, the agent's cloned repo lives at <scratchDir>/repos/<owner>/<repo>, and once the agent cds into that repo, the working directory passed to future sessions would still contain posthog-code-scratch — so channelMode would remain true for the cloned-repo sessions, which is the intended behaviour.

The broader concern is that any path on the system that happens to contain a directory named posthog-code-scratch — a coincidence or a malicious symlink — would silently enable channel mode and expose list_repos/clone_repo to a normal coding session. The check would be more precise if it also verified that scratchBasePath(worktreeLocation) is an actual prefix of workingDir, rather than just scanning for the folder name as a component anywhere in the path.

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/workspace-server/src/services/workspace/scratch.ts
Line: 21-23

Comment:
**Path separator assumption may miss Windows paths, and `isScratchPath` is too broad**

`workingDir.split(path.sep)` splits on the OS separator, which is correct on the current platform. However, the agent's cloned repo lives at `<scratchDir>/repos/<owner>/<repo>`, and once the agent `cd`s into that repo, the working directory passed to future sessions would still contain `posthog-code-scratch` — so `channelMode` would remain `true` for the cloned-repo sessions, which is the intended behaviour.

The broader concern is that any path on the system that happens to contain a directory named `posthog-code-scratch` — a coincidence or a malicious symlink — would silently enable channel mode and expose `list_repos`/`clone_repo` to a normal coding session. The check would be more precise if it also verified that `scratchBasePath(worktreeLocation)` is an actual prefix of `workingDir`, rather than just scanning for the folder name as a component anywhere in the path.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@raquelmsmith raquelmsmith marked this pull request as ready for review June 17, 2026 18:41
Base automatically changed from bluebird/ui-update to main June 17, 2026 22:39
In the channels experience, drop the up-front repo/branch picker. A
channel task is now a generic chat box: it can be a coding task, an
analysis task, or a non-code task (e.g. an email) that needs no repo.
The agent decides at runtime whether it needs a repo, clones one only
when needed, and asks the user (via AskUserQuestion) if it can't tell
which repo to use. Works in local/worktree mode and cloud mode.

- UI: allowNoRepo flag (channel-only) hides the repo/branch pickers and
  removes the submit gate; /code/ is unchanged.
- Saga: a repo-less local task starts an agent in a per-task scratch dir
  (ensureScratchDir) instead of skipping the session; workspace.verify
  treats the scratch dir as valid.
- Agent: channelMode is derived server-side from the scratch path (so it
  survives reconnects). It enables a channel system-prompt and two local
  tools, list_repos (gh) and clone_repo (clones into a scratch subdir, no
  cwd rebind), gated on channelMode.

Cloud channel tasks rely on the existing cloud "No Repository Mode"
prompt; the new tools are local-only for now.

Generated-By: PostHog Code
Task-Id: 9c51d9dc-d108-41b2-abdd-0be442a5e36d
A repo-less channel task has no workspace row, so useCwd resolved no cwd
and TaskLogsPanel rendered the WorkspaceSetupPrompt ("Select a repository
folder") for every such task — before the agent ever connected, and via a
dedicated screen rather than letting the agent decide and ask.

Synthesize a local-mode workspace from the on-disk scratch dir (the path
is deterministic from the task id) in getWorkspace and getAllWorkspaces,
and clean the scratch dir up on delete. The task now resolves the scratch
dir as its cwd, skips the prompt, and connects the agent — which decides
whether a repo is needed and asks via AskUserQuestion only for coding work.

Generated-By: PostHog Code
Task-Id: 9c51d9dc-d108-41b2-abdd-0be442a5e36d
…po task

The synthetic scratch workspace was correct server-side, but the client's
workspace.getAll query is only invalidated after workspace.create — which
repo-less channel tasks skip. So the cached (empty) result kept repoPath
null and the "Select a repository folder" prompt stayed up over the running
task.

- Provision the scratch dir before onTaskReady so it exists when the task
  view mounts.
- Invalidate workspace.getAll after a no-repo task is created so the view
  refetches the synthetic scratch workspace, resolves its cwd, and shows the
  running session instead of the repo prompt.

Generated-By: PostHog Code
Task-Id: 9c51d9dc-d108-41b2-abdd-0be442a5e36d
@raquelmsmith raquelmsmith force-pushed the posthog-code/channels-generic-chatbox branch from 013c3a5 to f8d8fc4 Compare June 17, 2026 22:42
- clone_repo: redact the GitHub token from any error text, so a git failure
  that echoes the credential-bearing remote URL can't leak it into the tool
  result / transcript.
- clone_repo: make it idempotent — if the repo is already cloned at the
  target path, reuse it (optionally re-checking out the branch) instead of
  letting git abort on a non-empty destination.
- isScratchPath: match an actual path prefix against the real scratch base
  (derived from the worktree location) instead of scanning for the folder
  name anywhere in the path, so an unrelated/malicious dir named
  posthog-code-scratch can't spuriously enable channel mode. AgentService now
  injects IWorkspaceSettings to resolve the base.

Generated-By: PostHog Code
Task-Id: 9c51d9dc-d108-41b2-abdd-0be442a5e36d
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant