AgentRail Integration Guide

AgentRail is a lifecycle API that sits beside a coding agent. It does not replace Claude Code, Codex, Cursor, git, GitHub, or CI. It gives the agent one compact source of truth for assigned work, submission state, CI status, review feedback, events, and ship requests.

This repository is the local and self-managed source-available surface. It should be useful without a hosted account. The planned AgentRail Cloud surface is the managed team/fleet operations layer: connector operations, durable shared run history and memory, routing and wakes, SSO/RBAC/SCIM, audit, dashboards, support, compliance, and hosted reliability. See Cloud boundary.

Use this guide when you are integrating AgentRail into a real agent workflow. If you want to bootstrap a local self-hosted setup quickly, start with the five-minute quick start. If you want copy-paste agent instructions for Claude Code, Codex, or Cursor, use agent recipes.

The current onboarding path exists to close the routing bootstrap gap described in AGEA-95 and the integration-doc clarity gap called out in AGEA-93.

What Runs Where

Keep these processes separate:

| Process | Where it runs | What it does |
| --- | --- | --- |
| AgentRail API server | This repository | Serves /tasks, /ci-status, /review-feedback, events, webhooks, and ship operations. |
| AgentRail intake router | AgentRail control plane | Normalizes provider issues, evaluates routing rules, assigns AgentRail tasks, records routing explanations, and wakes the assigned agent. |
| Coding agent CLI | The target code repository | Edits files, runs tests, commits, pushes, and opens PRs. |
| Agent integration code | Your agent harness or target repo | Calls AgentRail through the TypeScript SDK, Python SDK, or HTTP. |
| GitHub / CI providers | External services | Own source control, pull requests, checks, and reviews. |

For local evaluation, the AgentRail server and the target repo may be the same checkout. In production, they are usually separate: AgentRail runs as shared infrastructure, while agents work inside individual project repositories. A self-managed source-available deployment can run that shared infrastructure at small scale, but it does not carry the same promise as AgentRail Cloud, which operates the team control plane with managed connectors, access control, audit, support, and reliability.

Current / Legacy / Planned Capability Labels

This repository mixes working source-available runtime paths with operator contracts for the planned control plane. Use these labels when deciding what an integration can rely on today:

| Capability | Current live adapter support | Legacy demo note | Planned MVP control-plane behavior |
| --- | --- | --- | --- |
| Intake | Current: Provider intake is documented in the routing OpenAPI, but the self-managed server does not yet run a live provider intake worker. | Legacy: The removed demo used a pre-seeded task instead of ingesting a provider issue. | Planned: The control plane receives or pulls provider issue snapshots and normalizes them into AgentRail task candidates. |
| Routing | Current: Routing rules, optional AI routing, dry-run evaluation, assignment, and audit are implemented as operator/admin contracts; AGENTRAIL_ROUTING_AUDIT_STORE_PATH persists decisions and evaluation/intake idempotency replay locally. | Legacy: The removed demo skipped routing by starting with a pre-assigned task. | Planned: Hosted control-plane deployments evaluate deterministic rules, use AI to route tasks to the right agents where configured, store routingReason, wake the selected agent, and expose managed audit history. |
| Auth | Current: Agent API key creation, scopes, rate limits, and route enforcement are implemented on the default server path. | Legacy: Placeholder demo keys are no longer valid on the core runtime. | Planned: Hosted control-plane deployments issue least-privilege scoped keys per agent and expose operator rotation workflows. |
| Local/self-hosted setup | Current: agentrail init writes local .agentrail scaffolding and operator bootstrap state, agentrail agent create creates scoped local agent credentials/profile/routing, and agentrail doctor --agent-id <agentId> verifies health, auth, profile/routing state, and /tasks/mine visibility. | Legacy: The removed demo runtime used a built-in fixture task instead of explicit task-store configuration. | Planned: Hosted setup will wrap the same identity/profile/routing concepts in a managed team onboarding service. |
| Live task store | Current: The server reads durable task records from AGENTRAIL_TASK_STORE_PATH, can persist routing audit records through AGENTRAIL_ROUTING_AUDIT_STORE_PATH, and never falls back to hidden fixture data. | Legacy: The removed demo used an in-memory deterministic lifecycle store. | Planned: The control plane persists assigned tasks, routing decisions, lifecycle state, and event cursors in managed storage. |
| Submit | Current: mode: "adapter_managed" lets the GitHub submit adapter create or reuse provider PRs from persisted task.source metadata. | Legacy: Artifact-style placeholder PR examples remain documentation-only; they are not a runtime mode. | Planned: Submit is always mediated by provider adapters, with idempotent create-or-reuse behavior and compact response state. |
| CI / review | Current: GitHub Actions, CircleCI, and GitHub review feedback adapters expose compact status summaries from persisted task.source metadata. | Legacy: The removed demo simulated CI and review transitions locally. | Planned: The control plane stores provider status snapshots, emits task events, and prefers push delivery over agent polling. |
| Ship | Current: Ship and rollback routes are implemented behind adapter interfaces with idempotency keys and common state/error handling. | Legacy: The removed demo returned a deterministic queued ship result. | Planned: Managed control-plane deployments coordinate merge, deploy, rollback, and audit with least-privilege provider permissions. |

First-Time Routing Bootstrap

The first routing rule is not created by a worker agent, and it is not inferred from GitHub labels alone.

Current source-available server model:

  • A trusted operator or setup script first creates the AgentProfile for the new agentId.
  • That same setup path then creates the initial routing rule set through PUT /operator/routing/rule-sets/current.
  • The first rule should stay narrow: target the new agentId only for the selected repo allowlist and capability tags, with triage as the fallback.
  • After that, AgentRail seeds or ingests one setup verification task through the normal assignment path so the new agent can prove it can read its task.

Current CLI-assisted model:

  • agentrail init gathers repo/base URL defaults, writes local setup files, and creates local operator bootstrap state.
  • During init, choose either rules-only routing or "Use AI to route tasks to the right agents." Rules-only never calls a model; AI routing uses your chosen local runner such as Codex, Claude Code, or Cursor.
  • In AI routing mode, choose whether AgentRail must require a suitable agent and retry waiting tasks after agents change, or assign the closest available match as a recorded best-effort decision.
  • Routing audit entries include role-template metadata for template-backed agents. When no suitable agent exists, AgentRail suggests updating an existing agent or creating a new one from a likely built-in template, then retries waiting tasks after agent profile or routing rule changes.
  • AI routing uses a local runner timeout of 180 seconds by default. If Codex, Claude Code, or Cursor is slow on a machine, raise routing.classifier.timeoutMs in config.json; the ceiling is 600 seconds (a value of 600000, since the key is expressed in milliseconds). Timeout failures go to triage.
  • agentrail agent create creates the scoped agent key, AgentProfile, starter routing state, managed agent env file, and optional per-agent model/profile.
  • agentrail doctor --agent-id <agentId> uses the generated agent key plus operator state to verify that the bootstrap produced visible assigned work. In AI routing mode it also verifies that the configured local runner is available without spending model tokens.
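The timeout bounds in the list above can be checked before editing config.json. The helper below is hypothetical and not part of the AgentRail CLI. It assumes routing.classifier.timeoutMs is in milliseconds (as the Ms suffix suggests) and that out-of-range values should be clamped; the guide does not say whether AgentRail clamps or rejects them.

```python
DEFAULT_TIMEOUT_MS = 180 * 1000  # documented default: 180 seconds
MAX_TIMEOUT_MS = 600 * 1000      # documented ceiling: 600 seconds


def classifier_timeout_ms(requested_ms=None):
    """Return a runner timeout within the documented bounds.

    Hypothetical helper: clamping is an assumption, AgentRail may
    instead reject values above the ceiling.
    """
    if requested_ms is None:
        return DEFAULT_TIMEOUT_MS
    if requested_ms <= 0:
        raise ValueError("timeout must be positive")
    return min(requested_ms, MAX_TIMEOUT_MS)


print(classifier_timeout_ms())         # 180000
print(classifier_timeout_ms(900_000))  # 600000
```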

Why it works this way:

  • Routing is control-plane configuration, not worker-owned behavior.
  • AgentRail needs an auditable initial rule-set revision before the first real task arrives, so later ownership changes remain data changes instead of code edits.
  • Starting with one narrow bootstrap rule keeps the first assignment deterministic while AI routing policy handles anything the setup flow did not cover.

Intended End-to-End Flow

The intended production flow is AgentRail-owned:

  1. AgentRail pulls or receives provider issue data from providers such as GitHub.
  2. The AgentRail intake router evaluates deterministic assignment rules and, when enabled, uses AI to route tasks to the right agents.
  3. AgentRail records the assignment and routingReason, then wakes the assigned agent.
  4. In managed local mode, AgentRail wakes the assigned coding agent with the current task context.
  5. The agent edits files, runs tests, and commits locally.
  6. The agent reports back to AgentRail.
  7. AgentRail's adapter creates or reuses the provider PR, tracks CI and review, and returns compact lifecycle state to the agent.
  8. AgentRail follows lifecycle actions until the task is fixed, approved, and shippable. External harnesses may follow availableActions directly.

In local/self-hosted mode, agentrail server start owns wake orchestration. It loads managed agent env files from ~/.agentrail/agents/, starts an event-driven agentrail agent run loop for each configured local agent, and restarts those loops if they exit. Provider webhook and polling delivery still follow the configured provider mode; the runner wake loop only consumes AgentRail task events.

In that flow, the agent should not manually pass a PR URL as the primary automation contract. The PR URL is provider state that AgentRail should create, discover, and return through the task lifecycle response.

Routing is not a worker-agent responsibility. Operators manage routing through the separate intake routing architecture and operator routing OpenAPI contracts. Those endpoints require routing scopes and should not be generated into the normal lifecycle SDK used by coding agents.

The source-available repository now exposes this as the primary submit contract: mode: "adapter_managed" lets GitHubSubmitAdapter create or reuse PRs from persisted task.source metadata, and the response can include prUrl, prNumber, and whether the PR was created or reused.

Local deterministic examples may still show placeholder PR artifacts so the demo can run without provider credentials. Do not model production automation on a human pasting a PR URL into AgentRail.

Choose an Integration Track

Track A: Local Self-Hosted Bootstrap

Use this when you want to run the real server locally with CLI-managed config, agent credentials, routing, and provider state. The default path is:

  1. Run agentrail init. Choose rules-only routing if your issues already carry enough labels, projects, or type metadata. Choose AI routing if you want AgentRail to use AI to route tasks to the right agents. Choose the GitHub review policy that matches your repo: follow GitHub's own review signals, always require approval for AgentRail PRs, or skip approval waits once CI is green.
  2. Start the server with agentrail server start; this also keeps configured local agents awake.
  3. Create or connect the first local agent with agentrail agent create if init did not already do it interactively.
  4. Finish with agentrail doctor --agent-id <agentId>.

Track A enables anonymous product telemetry by default during setup. It uses an anonymous install ID. AgentRail does not send source code, issue text, prompts, logs, env vars, secrets, tokens, diffs, or raw provider payloads.

agentrail telemetry status
agentrail telemetry off
agentrail telemetry on

The copy-paste version of that flow lives in the five-minute quick start. Raw lifecycle curls are developer reference material after doctor passes.

npm install -g @agentrail-core/cli
agentrail init
agentrail server start

In a second terminal:

agentrail agent create
agentrail doctor --agent-id agt_example

Cloud boundary note: Track A proves the local lifecycle contract. It does not include managed provider connectors, team access control, audited fleet routing, dashboards, support, compliance, backups, or hosted reliability.

Track B: Claude Code / Codex / Cursor Uses AgentRail

Use this when a coding agent should work through AgentRail instead of manually polling GitHub, CI, and review APIs.

  1. Start AgentRail with Track A, your own auth-enabled deployment, or an explicitly provisioned hosted API base URL. Public AgentRail Cloud is not generally available yet.
  2. Open the target repository where the agent should edit code.
  3. For managed local agents, use the env files created by AgentRail setup and let agentrail server start wake the agent. For manual harnesses, export the AgentRail connection settings in the agent's shell.
  4. Start the coding agent with the AgentRail operating instructions, or let the server-owned supervisor start it.

Example:

cd /path/to/target-repo
export AGENTRAIL_BASE_URL=http://127.0.0.1:3000
export AGENTRAIL_API_KEY=ar_live_replace_with_real_key

Claude Code interactive launch:

claude --append-system-prompt-file /path/to/agentrail/docs/agent-recipes.md

Codex or Cursor:

  • Use agentrail agent run directly only for debugging or when intentionally running a single local agent outside the server-owned supervisor.
  • AgentRail waits on task events, starts the configured coding agent only for actionable code work, and consumes the local report after the child process exits.
  • Managed children may use agentrail run current, agentrail run actions, and agentrail agent report. Do not ask the child LLM to query broad AgentRail task, CI, review, provider, or operator endpoints.

The agent still edits files in the target repository. AgentRail owns:

  • task assignment and lifecycle state,
  • CI and review observation,
  • provider PR creation, shipping, and rollback,
  • relaunching the agent only when CI or review feedback requires code changes.

Track C: Application or Harness Uses the SDK

Use this when you are writing an agent harness, MCP server, workflow runner, or internal service that calls AgentRail directly.

The SDKs are client libraries for a running AgentRail API. They do not start the local server, connect providers, create local config, or wake managed local agents. Follow the CLI setup above first, then use the SDK from your external harness. The dedicated SDK guide has complete TypeScript and Python examples.

Install the TypeScript SDK in that application:

npm install @agentrail-core/sdk

For local development against this repository before publication, build and install from the local SDK directory:

cd /path/to/agentrail/sdk/typescript
npm install
npm run build
cd /path/to/your-agent-harness
npm install /path/to/agentrail/sdk/typescript

Install the Python SDK in a Python harness:

pip install agentrail

For local development against this repository:

pip install -e /path/to/agentrail/sdk/python

Core Runtime Loop

Managed local child agents are different from external harnesses. They should not list tasks or poll lifecycle APIs. They receive a single run-scoped context and may use agentrail run current, agentrail run actions, and agentrail agent report.

External harnesses that use the SDK or HTTP API directly should follow the API's availableActions field instead of guessing the next step:

  1. List assigned work.
  2. Read the selected task.
  3. Edit and test locally.
  4. Submit an attempt with an idempotency key.
  5. Wait for task events, or read CI and review summaries.
  6. Fix the task if CI or review requires changes.
  7. Ship only when CI is green and review is approved.
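For an external harness, the loop above reduces to a small decision function driven by availableActions. The sketch below is illustrative. The "submit" action and the "passed" and "approved" values appear in this guide's TypeScript example; the "ship" and "fix" action names and the local "wait" label are assumptions.

```python
def next_step(available_actions, ci_status=None, review_outcome=None):
    """Pick the next lifecycle step from server-advertised actions.

    The server's availableActions list stays authoritative; the CI and
    review checks only add the ship gate described in step 7.
    """
    actions = set(available_actions)
    if "ship" in actions and ci_status == "passed" and review_outcome == "approved":
        return "ship"
    if "fix" in actions:        # hypothetical action name
        return "fix"
    if "submit" in actions:
        return "submit"
    return "wait"               # local label: wait for task events


print(next_step(["submit"]))                      # submit
print(next_step(["ship"], "passed", "approved"))  # ship
print(next_step(["ship"], "failed", "approved"))  # wait
```

Returning "wait" instead of guessing keeps the harness from acting on a transition the server has not offered.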

Submit Model

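The current repo describes two submit modes. Only adapter-managed submit is a runtime mode; the artifact style survives in documentation examples only: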

| Mode | Intended use | PR URL handling |
| --- | --- | --- |
| Adapter-managed submit | Production and serious dogfooding | AgentRail creates or reuses the PR through its provider adapter and returns the PR URL. |
| Artifact demo submit | Deterministic local demo with no provider credentials | The request includes a placeholder pull_request artifact so CI/review/ship examples can run locally. |

Prefer adapter-managed submit for real automation. The artifact demo mode is a local scaffold, not the product architecture.

HTTP shape:

# List assigned in-progress work.
curl -s "$AGENTRAIL_BASE_URL/tasks/mine?status=in_progress&limit=1" \
  -H "authorization: Bearer $AGENTRAIL_API_KEY"
# Submit an attempt; AgentRail creates or reuses the provider PR.
curl -s -X POST "$AGENTRAIL_BASE_URL/tasks/tsk_DEMOISSUETOSHIP01/submit" \
  -H "authorization: Bearer $AGENTRAIL_API_KEY" \
  -H "content-type: application/json" \
  -H "idempotency-key: submit-adapter-1" \
  -d '{
    "summary": "Implemented the failing endpoint and pushed commits to the task branch.",
    "mode": "adapter_managed",
    "pullRequest": {
      "title": "Implement failing endpoint",
      "draft": false
    }
  }'
# Read the compact CI summary.
curl -s "$AGENTRAIL_BASE_URL/tasks/tsk_DEMOISSUETOSHIP01/ci-status" \
  -H "authorization: Bearer $AGENTRAIL_API_KEY"
# Read the compact review summary.
curl -s "$AGENTRAIL_BASE_URL/tasks/tsk_DEMOISSUETOSHIP01/review-feedback" \
  -H "authorization: Bearer $AGENTRAIL_API_KEY"
# Ship only after CI is green and review is approved.
curl -s -X POST "$AGENTRAIL_BASE_URL/tasks/tsk_DEMOISSUETOSHIP01/ship" \
  -H "authorization: Bearer $AGENTRAIL_API_KEY" \
  -H "content-type: application/json" \
  -H "idempotency-key: ship-demo-1" \
  -d '{
    "mode": "merge_and_deploy",
    "targetEnvironment": "production",
    "expectedHeadSha": "b5bc7f86b9ad94f4f18f83d28bdf3e27a31e53a0"
  }'
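The idempotency-key headers in the requests above are free-form strings. One way to choose them, sketched below, is to derive the key from the action, the task id, and a hash of the request body: a retry of the same attempt reuses the key, while a changed body produces a new one. That avoids the 409 conflict this guide's troubleshooting table attributes to reusing a key with a different body. The key format and both helper names are assumptions, not an AgentRail convention.

```python
import hashlib
import json


def idempotency_key(action, task_id, body):
    """Derive a stable key from the action, task, and canonical JSON body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{action}-{task_id}-{digest}"


def submit_request(base_url, api_key, task_id, body):
    """Return (url, headers, payload) for an adapter-managed submit.

    Mirrors the curl example; it only builds the request and does not
    send anything.
    """
    headers = {
        "authorization": f"Bearer {api_key}",
        "content-type": "application/json",
        "idempotency-key": idempotency_key("submit", task_id, body),
    }
    url = f"{base_url}/tasks/{task_id}/submit"
    return url, headers, json.dumps(body).encode("utf-8")


body = {
    "summary": "Implemented the failing endpoint.",
    "mode": "adapter_managed",
    "pullRequest": {"title": "Implement failing endpoint", "draft": False},
}
url, headers, payload = submit_request(
    "http://127.0.0.1:3000", "ar_live_replace_with_real_key", "tsk_DEMOISSUETOSHIP01", body
)
print(url)
print(headers["idempotency-key"])
```

The returned triple can be handed to any HTTP client; urllib.request.Request(url, data=payload, headers=headers, method="POST") is one standard-library option.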

SDK Examples

TypeScript

import { AgentRailClient } from "@agentrail-core/sdk";

const client = new AgentRailClient({
  baseUrl: process.env.AGENTRAIL_BASE_URL ?? "http://127.0.0.1:3000",
  apiKey: process.env.AGENTRAIL_API_KEY!,
});

const tasks = await client.listMyTasks({ status: "in_progress", limit: 1 });
const task = tasks.data[0];

if (!task) {
  process.exit(0);
}

if (task.availableActions.includes("submit")) {
  await client.submitTask(
    task.id,
    {
      summary: "Implemented the task and pushed commits to the task branch.",
      mode: "adapter_managed",
      pullRequest: {
        title: `Submit ${task.identifier}`,
        draft: false,
      },
    },
    `submit-${task.id}-v1`,
  );
}

const ci = await client.getTaskCiStatus(task.id);
const review = await client.getTaskReviewFeedback(task.id);

if (
  ci.data.overallStatus === "passed" &&
  review.data.latestDecision?.outcome === "approved"
) {
  await client.shipTask(
    task.id,
    {
      mode: "merge_and_deploy",
      targetEnvironment: "production",
      expectedHeadSha: "b5bc7f86b9ad94f4f18f83d28bdf3e27a31e53a0",
    },
    `ship-${task.id}-v1`,
  );
}

Python

import asyncio
import os

from agentrail import AgentRailClient, TaskStatus


async def main():
    async with AgentRailClient(
        base_url=os.getenv("AGENTRAIL_BASE_URL", "http://127.0.0.1:3000"),
        api_key=os.environ["AGENTRAIL_API_KEY"],
    ) as client:
        tasks = await client.list_my_tasks(status=TaskStatus.IN_PROGRESS, limit=1)
        if not tasks.data:
            return

        task = tasks.data[0]
        ci = await client.get_task_ci_status(task.id)
        review = await client.get_task_review_feedback(task.id)

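        # latest_decision may be absent before any review has been recorded.
        decision = review.data.latest_decision
        outcome = decision.outcome if decision else None
        print(task.identifier, ci.data.overall_status, outcome)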


asyncio.run(main())

Auth-Enabled Operation

Agent auth is supported on the default server path. The first bootstrap request creates an auth:admin key, and subsequent task routes require the returned secret data.apiKey.

Current behavior: auth-enabled setup is an operator/server wiring path, not a one-command local CLI. Planned behavior: agentrail agent create/connect wraps the API key and profile calls described below, writes .agentrail/agents/<agentId>.env, and verifies that the generated key can call /tasks/mine.

Create the first admin key:

curl -s -X POST "$AGENTRAIL_BASE_URL/agent-api-keys" \
  -H "content-type: application/json" \
  -H "idempotency-key: bootstrap-admin-v1" \
  -d '{
    "agent": {
      "id": "agt_cto",
      "displayName": "CTO",
      "role": "cto"
    },
    "scopes": ["auth:admin"],
    "rateLimit": {
      "windowSeconds": 60,
      "maxRequests": 600
    }
  }'

The response returns data.apiKey once. Store that secret in the agent runtime or secret manager. The data.id value starts with akey_ and is only the key identifier.

Recommended scopes:

| Agent responsibility | Minimum scopes |
| --- | --- |
| Read assigned tasks | tasks:read |
| Submit completed work | tasks:read, tasks:write |
| Inspect CI | ci:read |
| Inspect review feedback | reviews:read |
| Ship or roll back | ship:write |
| Stream task events | events:read |
| Manage event webhook subscriptions | webhooks:read, webhooks:write |
| Manage API keys | auth:admin, usage:read |
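When one agent covers several responsibilities, the least-privilege scope list is the union of the matching rows. The sketch below encodes the table as data. The scope strings come from this guide; the responsibility keys (read_tasks, submit_work, and so on) are hypothetical labels chosen for the example.

```python
MINIMUM_SCOPES = {
    "read_tasks": {"tasks:read"},
    "submit_work": {"tasks:read", "tasks:write"},
    "inspect_ci": {"ci:read"},
    "inspect_reviews": {"reviews:read"},
    "ship_or_rollback": {"ship:write"},
    "stream_events": {"events:read"},
    "manage_webhooks": {"webhooks:read", "webhooks:write"},
    "manage_api_keys": {"auth:admin", "usage:read"},
}


def scopes_for(responsibilities):
    """Return the sorted union of minimum scopes for the given duties."""
    scopes = set()
    for name in responsibilities:
        scopes |= MINIMUM_SCOPES[name]  # KeyError flags an unknown duty
    return sorted(scopes)


print(scopes_for(["submit_work", "inspect_ci", "inspect_reviews"]))
# ['ci:read', 'reviews:read', 'tasks:read', 'tasks:write']
```

The resulting list is what would go in the "scopes" array of the key-creation request shown earlier in this section.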

Live GitHub and CI Adapters

The local demo is deterministic only in explicit demo mode. To connect task lifecycle calls to live providers, configure provider tokens before starting the default server. Live adapters read persisted task.source metadata from the task record; if an older task is missing required source fields, repair it through the operator API or agentrail task source repair.

export GITHUB_TOKEN=ghp_...
export CIRCLECI_TOKEN=...
export CIRCLECI_WEBHOOK_SECRET=...
npm start

Provider behavior:

  • GITHUB_TOKEN enables GitHub PR submission and GitHub Actions CI status.
  • CIRCLECI_TOKEN enables CircleCI status for tasks with ciProvider: "circleci".
  • CIRCLECI_WEBHOOK_SECRET verifies inbound CircleCI webhook requests at POST /providers/circleci/webhooks.
  • agentrail provider connect <provider> validates credentials, runs readiness, applies safe local setup fixes, and reports classified blockers such as missing files, missing env vars, missing config, remote settings, or repo linkage gaps.
  • agentrail provider test <provider> and agentrail provider doctor <provider> use the same readiness engine; test is not a lightweight token probe.
  • If no live adapter matches a task, the route returns 404; server mode never falls back to the deterministic demo task store.

Do not commit provider tokens or generated AgentRail API keys.
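A harness can run a quick preflight over its environment to see which of the provider behaviors listed above would be active before starting the server. The function below is a sketch: the environment variable names come from this guide, while the capability labels it returns are hypothetical. It does not validate the tokens themselves; agentrail provider connect does that.

```python
def live_capabilities(env):
    """Map configured provider variables to the behaviors they enable.

    The returned labels are illustrative names, not AgentRail identifiers.
    """
    caps = []
    if env.get("GITHUB_TOKEN"):
        caps += ["github_pr_submit", "github_actions_ci"]
    if env.get("CIRCLECI_TOKEN"):
        caps.append("circleci_status")
    if env.get("CIRCLECI_WEBHOOK_SECRET"):
        caps.append("circleci_webhook_verification")
    return caps


# Pass dict(os.environ) in a real preflight; a literal keeps the demo stable.
print(live_capabilities({"GITHUB_TOKEN": "example-token"}))
# ['github_pr_submit', 'github_actions_ci']
```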

Push Instead of Polling

Managed AgentRail runners avoid blind status polling. They use:

  • GET /task-events/stream for server-sent events with cursor replay.
  • /event-subscriptions for signed outbound webhook delivery.

Manual harnesses may use polling only as a compatibility fallback when they cannot receive push events.
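For a manual harness that reads GET /task-events/stream itself, the core of the client is a parser for the standard server-sent events wire format. The sketch below implements that format (id, event, and data fields, with a blank line ending each event) and keeps the last id as a cursor. The event names and payloads in the sample are invented for illustration, and how AgentRail expects the cursor to be sent back on reconnect is not specified here.

```python
def parse_sse(lines):
    """Yield (event_id, event_type, data) tuples from SSE text lines."""
    event_id, event_type, data = None, "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if any.
            if data:
                yield event_id, event_type, "\n".join(data)
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "id":
            event_id = value  # persists across events, per the SSE format
        elif field == "event":
            event_type = value
        elif field == "data":
            data.append(value)


sample = [
    "id: 41\n", "event: task.ci_failed\n", 'data: {"taskId": "tsk_1"}\n', "\n",
    ": keep-alive\n",
    "id: 42\n", 'data: {"taskId": "tsk_1"}\n', "\n",
]
cursor = None
for event_id, event_type, data in parse_sse(sample):
    cursor = event_id  # persist this to resume after a disconnect
    print(event_type, data)
```

Because parse_sse accepts any iterable of text lines, it can be fed decoded lines from a streaming HTTP response as well as from a list.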

Managed Run Reclaim Policy

Managed local runners keep durable run records so AgentRail can avoid launching two agents on the same task at once. If the local runner process disappears while a run is still marked starting or running, AgentRail can reclaim that record after a conservative timeout and let the task run again.

AgentRail only reclaims active managed runs when the recorded process is no longer alive and the configured stale threshold has passed. It does not reclaim awaiting_user runs because those represent unresolved human action.

Defaults:

| Config key | Default | Meaning |
| --- | --- | --- |
| managedRuns.startingStaleAfterMs | 300000 | Reclaim a stuck launch after five minutes. |
| managedRuns.runningStaleAfterMs | 5400000 | Reclaim a running local agent only after ninety minutes. |
| managedRuns.failureWindowMs | 3600000 | Count repeated runner infrastructure failures within one hour. |
| managedRuns.maxInfrastructureFailures | 2 | Retry one abandoned run, then block for user action on the second recent infrastructure failure. |
| managedRuns.supervisorRestartWindowMs | 60000 | Window for local supervisor restart-loop detection. |
| managedRuns.supervisorMaxRestarts | 5 | Pause automatic restarts after more than five exits in the restart window. |

Environment overrides use seconds, not milliseconds, for the timeout and window values; the two MAX variables are plain counts:

  • AGENTRAIL_MANAGED_RUN_STARTING_STALE_SECONDS
  • AGENTRAIL_MANAGED_RUN_RUNNING_STALE_SECONDS
  • AGENTRAIL_MANAGED_RUN_FAILURE_WINDOW_SECONDS
  • AGENTRAIL_MANAGED_RUN_MAX_INFRA_FAILURES
  • AGENTRAIL_LOCAL_RUNNER_RESTART_WINDOW_SECONDS
  • AGENTRAIL_LOCAL_RUNNER_MAX_RESTARTS

When repeated infrastructure failures hit the limit, AgentRail blocks the task for user action with a managed-run reclaim reason instead of silently looping. This is reliability protection for local managed runs; it is not full queue fairness, agent capacity planning, or priority scheduling.
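The reclaim rule in this section can be summarized as a small predicate. The sketch below uses the documented defaults and environment variable names. It assumes that an environment override takes precedence over the default and that a run exactly at the threshold counts as stale; it is not AgentRail's implementation.

```python
import os

# Documented defaults, in milliseconds.
DEFAULTS_MS = {"starting": 300_000, "running": 5_400_000}

# Documented overrides, expressed in seconds.
ENV_OVERRIDES = {
    "starting": "AGENTRAIL_MANAGED_RUN_STARTING_STALE_SECONDS",
    "running": "AGENTRAIL_MANAGED_RUN_RUNNING_STALE_SECONDS",
}


def stale_after_ms(status, env=None):
    """Resolve the stale threshold, converting a seconds override to ms."""
    env = os.environ if env is None else env
    override = env.get(ENV_OVERRIDES[status])
    if override:
        return int(override) * 1000
    return DEFAULTS_MS[status]


def should_reclaim(status, process_alive, age_ms, env=None):
    """Reclaim only active runs whose process is gone and threshold passed."""
    if status not in DEFAULTS_MS:
        return False  # e.g. awaiting_user runs are never reclaimed
    if process_alive:
        return False
    return age_ms >= stale_after_ms(status, env)


print(should_reclaim("starting", False, 300_000, env={}))         # True
print(should_reclaim("running", False, 600_000, env={}))          # False
print(should_reclaim("awaiting_user", False, 9_999_999, env={}))  # False
```

Passing env explicitly, as the demo calls do, keeps the result independent of whatever overrides happen to be set in the current shell.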

Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| auth_store_unavailable from /agent-api-keys | The server was started without auth wiring or an older runtime is still running | Restart with the current default server entrypoint and retry the bootstrap request. |
| 401 Unauthorized | Auth-enabled server received a missing or wrong bearer key | Use the one-time data.apiKey secret, not the akey_... id. |
| 403 insufficient_scope | The key lacks the route's required scope | Create or rotate a key with the minimum required scope. |
| 409 conflict on submit or ship | The idempotency key was reused with a different body, or the task is not in a valid state | Use a new key for a new attempt, or follow availableActions. |
| 503 with x-agentrail-fallback: true | Fallback mode is enabled | Set AGENTRAIL_FALLBACK_MODE=false and restart. |
| Empty tasks.data | No task matches the status filter | Try status=todo, remove the filter, or check the task assignment source. |
| CI stays pending | No live CI adapter is configured for the task, or the task is missing persisted source metadata | Set GITHUB_TOKEN or CIRCLECI_TOKEN, then inspect or repair task.source. |

Related Documentation