FE-730: Orchestrator POC — Greenfield orchestrator with pi-agent dispatch#143
kostandinang wants to merge 22 commits into
lunelson
left a comment
I think as a POC this is great. I just want to make clear that it's diverging from true Petri-net properties, with output places being dynamically determined by the transition. This might need some remodelling, otherwise we're precluding some of the theoretical benefits.
Thanks, this is a fair point. This PR reflects a first take at the PoC, focusing on the overall architecture rather than going deep on the Petri interaction, which is why topology-level analysis is not yet possible. FE-738 (petri-semantic-lanes) moved this partway by separating topology compilation from runtime wiring and adding declared output sets to handlers. The remaining work, splitting conditional handlers into explicit graph transitions with declared outputs/guards, is in progress and will land in subsequent PRs, which should address this. Appreciate the early catch here.
- Add orchestrator capability requirements (R46–R50) to SPEC.md
- Add decisions D155-K–D159-K (dual-engine, reports.jsonl, ActionRegistry, plan model, worktree isolation)
- Add invariants I121-K–I123-K (contract test parity, token discipline, worktree safety)
- Add orchestrator lexicon entries
- Add orchestrator-poc frontier definition to PLAN.md
- Move design doc to docs/design/orchestrator.md
- Update Linear FE-730 description to match design doc

Co-authored-by: Amp <amp@ampcode.com>
#1
- types.ts: Plan, Epic, Slice, Orchestrator seam, ReportSink, ActionHandlers
- report-sink.ts: InMemoryReportSink (append + query by id)
- engine-proc.ts: ProceduralOrchestrator with TDD inner loop, topo-sort, epic-level verification, retry loop
- engine-contract.test.ts: 4 tests (status completed, correct outcomes, TDD cycle call order, report sink contents)
- Code lives under src/orchestrator/ (cook is CLI subcommand name only)
- All 4 contract tests pass; npm run verify clean

Co-authored-by: Amp <amp@ampcode.com>
… CLI, fixture
- plan-loader.ts: YAML parsing with validation (3 tests)
- test-runner.ts: BunTestRunner wrapping `bun test`
- worktree.ts: createWorktree with .cook/runs/<runId>/worktree/ (2 tests)
- file-report-sink.ts: JSONL-backed ReportSink with stdout streaming
- pi-actions.ts: createPiActions() dispatching pi CLI for each agent role
- prompts/: test-writer.md, code-writer.md, evaluator.md
- cook-cli.ts: parseCookArgs + runCook wiring everything together (5 tests)
- cli.ts: `brunch cook` command registered alongside agent
- fixtures/txt/plan.yaml: Fixture #1 (2 epics, 5 slices)
- 34 orchestrator tests pass; build clean with cook-cli chunk
Design doc §8: worktree at <cwd>/.cook/runs/ not <dir>/.cook/runs/
R49, D159-K, I123-K: updated to cwd-scoped worktree
Lexicon: worktree entry clarifies cwd-scoped
Card 16 scoped: cwd worktree + fixture cleanup

Co-authored-by: Amp <amp@ampcode.com>
- cook-cli: structured header/footer with engine, plan, worktree, retries; per-epic/slice result table; total duration
- pi-actions: elapsed timer from session start, compact one-line-per-action with icons (▸ start, ✓ done, ✗ fail, ● verdict, ○ needs work, ? evaluate)
- file-report-sink: stop streaming raw JSON to stdout; JSON stays in file only
- 35 tests pass, build clean
The field was always the agent working directory, not the fixture directory. Also removes unused ReportLine import from engine-proc. Co-authored-by: Amp <amp@ampcode.com>
topoSort<T>(items, getId, getDeps) replaces topoSort(epics) + topoSortSlices(slices). Co-authored-by: Amp <amp@ampcode.com>
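A generic sort of this shape could look roughly like the sketch below. Only the `topoSort<T>(items, getId, getDeps)` signature comes from the commit message; the depth-first implementation, the cycle and unknown-dependency errors, and the `SliceLike` type with its `dependsOn` field are assumptions made for illustration.

```typescript
// Hypothetical sketch: one generic topological sort for both epics and slices.
// Only the (items, getId, getDeps) signature is taken from the commit message.
function topoSort<T>(
  items: T[],
  getId: (item: T) => string,
  getDeps: (item: T) => string[],
): T[] {
  const byId = new Map<string, T>();
  for (const item of items) byId.set(getId(item), item);

  const sorted: T[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (item: T): void => {
    const id = getId(item);
    const current = state.get(id);
    if (current === "done") return;
    if (current === "visiting") throw new Error(`dependency cycle at "${id}"`);
    state.set(id, "visiting");
    for (const depId of getDeps(item)) {
      const dep = byId.get(depId);
      if (dep === undefined) {
        throw new Error(`unknown dependency "${depId}" of "${id}"`);
      }
      visit(dep); // dependencies are emitted before their dependents
    }
    state.set(id, "done");
    sorted.push(item);
  };

  for (const item of items) visit(item);
  return sorted;
}

// Usage with an illustrative slice shape (field names are assumptions).
type SliceLike = { id: string; dependsOn: string[] };
const sliceInput: SliceLike[] = [
  { id: "help-flag", dependsOn: ["version-flag"] },
  { id: "version-flag", dependsOn: [] },
];
const sliceOrder = topoSort(sliceInput, (s) => s.id, (s) => s.dependsOn).map((s) => s.id);
```

Passing the accessors as callbacks is what lets one function serve both levels of the plan, since epics and slices only need to agree on "has an id" and "has dependency ids".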
report-helpers.ts: createReport(sink, fields) handles id generation + timestamp + append. Replaces 5 inline report-construction sites across engine-proc, engine-petri, and pi-actions. Co-authored-by: Amp <amp@ampcode.com>
Delete old module-level callOrder/evalCallCount/fakeActions/fakeTestRunner. All 9 contract test suites now use the same createFakes() factory. ~100 lines removed. Co-authored-by: Amp <amp@ampcode.com>
…sults
- Status banner: landed POC with SPEC cross-references
- §2 seam: fixtureDir → worktreeDir, ActionRegistry → ActionHandlers
- §3: POC note pointing to §12 deferral
- §12: streaming UX row updated (implemented, not deferred)
- §13: experiment results with verdict (proc wins on simplicity)

Co-authored-by: Amp <amp@ampcode.com>
…propagation, verify-epic parity
- cook-cli: validate --max-retries is finite non-negative (prevents NaN infinite loop)
- engine-petri: epic deps use single transition with ALL dep-done places as inputs (was one transition per dep → fired on first dep instead of all)
- engine-petri: PetriNet.run() accepts shouldHalt callback, checked each iteration (was ignoring ctx.halted so transitions kept firing after a halt)
- engine-proc: verify-epic called once per epic, not once per verification entry (handler owns all targets; matches petri engine behavior)

Co-authored-by: Amp <amp@ampcode.com>
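The `--max-retries` check could be sketched as follows. The function name `parseMaxRetries`, the default of 3, and the extra integer requirement are assumptions, not the PR's code. The underlying hazard is general: every comparison against `NaN` is false, so a guard such as `attempt >= maxRetries` never trips when the flag fails to parse.

```typescript
// Hypothetical sketch of validating a --max-retries value.
// The name, the default, and the integer check are assumptions for illustration.
function parseMaxRetries(raw: string | undefined, fallback = 3): number {
  if (raw === undefined) return fallback;
  const value = Number(raw);
  // Number("") is 0, so an empty string is rejected explicitly.
  const valid =
    raw.trim() !== "" &&
    Number.isFinite(value) && // rejects NaN and +/-Infinity
    Number.isInteger(value) && // assumption: fractional retries make no sense
    value >= 0;
  if (!valid) {
    throw new Error(`--max-retries must be a finite non-negative integer, got "${raw}"`);
  }
  return value;
}

// Why validation matters: a NaN limit never satisfies a "stop" comparison.
const nanLimitNeverReached = !(5 >= Number("abc"));
```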
- engine-petri: unreached slices/epics now set ctx.halted=true so the overall status correctly reports 'halted' instead of 'completed'
- report-helpers: append monotonic sequence counter to IDs to prevent collisions when multiple reports are created in the same millisecond

Co-authored-by: Amp <amp@ampcode.com>
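A report helper with a collision-proof ID might look like the sketch below. The `createReport(sink, fields)` name and the `rpt-<actor>-<sliceId>-<ms>-<seq>` ID shape are read off the commit messages and the sample reports.jsonl in this PR; the interfaces, the optional `now` parameter (added here only so the example is deterministic), and the return value are assumptions.

```typescript
// Hypothetical sketch of a report helper with a monotonic sequence suffix.
interface ReportFields {
  epicId: string;
  sliceId: string;
  actor: string;
  event: string;
  payload: Record<string, unknown>;
}

interface ReportLine extends ReportFields {
  id: string;
  ts: string;
}

interface ReportSink {
  append(line: ReportLine): void;
}

// Module-level counter: it never resets, so two reports created in the
// same millisecond still get distinct IDs.
let nextSeq = 0;

function createReport(
  sink: ReportSink,
  fields: ReportFields,
  now: Date = new Date(), // assumption: injectable clock for testing
): ReportLine {
  const line: ReportLine = {
    id: `rpt-${fields.actor}-${fields.sliceId}-${now.getTime()}-${nextSeq++}`,
    ts: now.toISOString(),
    ...fields,
  };
  sink.append(line);
  return line;
}

// Usage: two reports at the same instant differ only in the sequence suffix.
const collected: ReportLine[] = [];
const memorySink: ReportSink = {
  append: (line) => {
    collected.push(line);
  },
};
const sameInstant = new Date("2026-05-21T11:42:39.893Z");
const firstReport = createReport(
  memorySink,
  { epicId: "scaffolding", sliceId: "version-flag", actor: "evaluator", event: "eval-done", payload: { done: false } },
  sameInstant,
);
const secondReport = createReport(
  memorySink,
  { epicId: "scaffolding", sliceId: "version-flag", actor: "test-writer", event: "tests-written", payload: {} },
  sameInstant,
);
```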
- engine-petri: epic deps use per-dependent signal places (same pattern as slice deps) so multiple epics depending on the same predecessor each get their own token instead of competing for one
- engine-proc: haltedResult() fills in unreached epics/slices as halted before returning, matching petri engine behavior

Co-authored-by: Amp <amp@ampcode.com>
- engine-proc: hoist reportIds to run() scope so catch preserves them
- engine-petri: remove dead ep(epicId, 'ready') place (readiness fans out directly to slice eligible places)
- pi-actions: verify-epic write step uses test-writer.md, not evaluator.md

Co-authored-by: Amp <amp@ampcode.com>
- Extract PetriNet class, Token, TransitionDef, FiringPolicy into petri-net.ts
- Extract compilePlan() and RunCtx into net-compiler.ts
- Both engines now call shared compilePlan() with serial firing policy
- Migrate retry state from ctx.retries Map into in-net retry-budget places
- Add adapter tests pinning compiled net place/transition counts
- engine-petri.ts and engine-proc.ts are thin wrappers (~65 LOC each)

Amp-Thread-ID: https://ampcode.com/threads/T-019e4b4e-1543-7602-b99d-c32342fb3938
Co-authored-by: Amp <amp@ampcode.com>
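To make the firing rule concrete, here is a deliberately small net with declared input and output places, a serial firing loop, and a `shouldHalt` callback checked on every iteration. The names `PetriNet`, `TransitionDef`, and `shouldHalt` appear in the commit messages; everything else (token counts instead of token objects, the `maxSteps` guard, the place names) is an assumption and not the code in petri-net.ts.

```typescript
// Hypothetical minimal Petri net: places hold token counts, transitions
// declare their input and output places up front.
interface TransitionDef {
  id: string;
  inputs: string[]; // assumed to contain no duplicate places
  outputs: string[];
}

class PetriNet {
  private readonly transitions: TransitionDef[];
  private readonly marking = new Map<string, number>();

  constructor(transitions: TransitionDef[]) {
    this.transitions = transitions;
  }

  addToken(place: string): void {
    this.marking.set(place, this.tokens(place) + 1);
  }

  tokens(place: string): number {
    return this.marking.get(place) ?? 0;
  }

  // A transition is enabled only when ALL of its input places hold a token.
  private isEnabled(t: TransitionDef): boolean {
    return t.inputs.every((place) => this.tokens(place) > 0);
  }

  // Serial policy: fire one enabled transition at a time, in declaration
  // order, until nothing is enabled, shouldHalt() is true, or maxSteps is hit.
  run(shouldHalt: () => boolean = () => false, maxSteps = 1000): string[] {
    const fired: string[] = [];
    while (fired.length < maxSteps && !shouldHalt()) {
      const next = this.transitions.find((t) => this.isEnabled(t));
      if (next === undefined) break;
      for (const place of next.inputs) this.marking.set(place, this.tokens(place) - 1);
      for (const place of next.outputs) this.addToken(place);
      fired.push(next.id);
    }
    return fired;
  }
}

// Usage: an epic that depends on two predecessors starts only after both finish.
const net = new PetriNet([
  { id: "start-c", inputs: ["a-done", "b-done"], outputs: ["c-ready"] },
]);
net.addToken("a-done");
const firedAfterFirstDep = net.run(); // only one dependency satisfied
net.addToken("b-done");
const firedAfterBothDeps = net.run();
```

Because inputs and outputs are plain data on each transition, questions such as "which places can ever receive a token" can be answered by inspecting the definitions without executing any handler, which is the kind of topology-level analysis raised in the review comment.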
…oped
- Mark orchestrator-poc done in PLAN.md (Phase 0 complete)
- Add petri-semantic-lanes and petri-parallel-execution frontier definitions
- Add petri-graph-compilation and petri-simulation-oracle to Horizon
- Add Track F dependency graph for H-6476 umbrella
- Scope Card 1-3 queue in CARDS.md for petri-semantic-lanes

Amp-Thread-ID: https://ampcode.com/threads/T-019e4b4e-1543-7602-b99d-c32342fb3938
Co-authored-by: Amp <amp@ampcode.com>
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 5585e4e.
Anchor pi-action log elapsed time to cook run start instead of module import, and consolidate proc/petri run logic into engine-run with thin policy wrappers. Co-authored-by: Cursor <cursoragent@cursor.com>
See the output of git range-diff at https://github.com/hashintel/brunch/actions/runs/26397697802


Orchestrator POC. First take at representing orchestration on a minimal Petri-net structure — the goal is to validate the substrate on a simple fixture and evolve from there with more complex plans, richer action types, and parallel execution. Two interchangeable engines behind a shared seam, driven test-first with fake agents and validated end-to-end with real pi/Sonnet. The plan schema is speculative — brunch does not yet emit execution plans; the YAML shape is forward-compatible and will sharpen as canonical plan output lands.
CLI
Architecture
graph TD
  CLI["brunch cook <dir>"] --> Loader["Plan Loader"]
  Loader --> Engine
  subgraph Engine["Orchestrator Interface"]
    Proc["proc engine"]
    Petri["petri engine"]
  end
  Engine --> Actions["Action Handlers"]
  Engine --> Runner["Test Runner"]
  Actions --> Pi["pi CLI"]
  Engine --> Reports["reports.jsonl"]
  Engine --> WT["Worktree"]
Per-slice TDD loop
graph TD
  subgraph "Petri net inner loop"
    Ready(("spec\nready")) -->|evaluate| NM(("needs\nmore"))
    Ready -->|evaluate| Done(("done"))
    NM -->|write-tests| FT(("failing\ntests"))
    FT -->|write-code| UC(("untested\ncode"))
    UC -->|"run-tests ✓"| Ready
    UC -->|"run-tests ✗"| FT
    Done --> Comp(("completed"))
  end
Key decisions
- `ActionRegistry` deferred until a 3rd action type lands
- `reports.jsonl` as communication medium: petri enforces token-pointer discipline; proc passes data normally; shared seam is inputs/outputs
- `<cwd>/.cook/runs/<runId>/`, fixtures stay pristine
- `--mode text`; `extractJson()` parses evaluator responses
Verification
- `txt` CLI from empty worktree
Proc vs Petri
`Promise.all`: structural change
Verdict: Proc wins on simplicity and debuggability. Petri earns its complexity only when parallel execution or dynamic replanning enters scope. More testing with the Petri net is needed to understand how it behaves under parallelism.
Out of scope
Milestones, resumability, parallel execution, brownfield seed, `ActionRegistry` abstraction, dynamic replanning.
Fixture: `txt`
Greenfield TypeScript CLI built from nothing, with 2 epics and 5 slices:
- `--version`, `--help` (lists subcommands)
- `reverse`, `count`, `slugify`
Exercises: happy-path TDD cycles, intra-epic slice deps (`help-flag` waits on `version-flag`), inter-epic deps (`text-ops` waits on `scaffolding`), epic-level integration verification, and the retry loop (slugify edge cases).
Running examples
TBD — sample CLI output and fixture run recordings to be added.
Reports.jsonl
{"id":"rpt-evaluator-version-flag-1779363759893-0","ts":"2026-05-21T11:42:39.893Z","epicId":"scaffolding","sliceId":"version-flag","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"The verification target `tests/version.test.ts` does not exist in the worktree. The worktree directory is empty (only contains `.` and `..`). No test file has been created to verify the `--version` flag implementation. To satisfy this slice, a test file must be created at `tests/version.test.ts` that defines and tests the functionality of adding a `--version` flag that prints the version from package.json, and those tests must pass when run with `bun test`."}}
{"id":"rpt-test-writer-version-flag-1779363997897-1","ts":"2026-05-21T11:46:37.898Z","epicId":"scaffolding","sliceId":"version-flag","actor":"test-writer","event":"tests-written","payload":{"sliceId":"version-flag","targets":["tests/version.test.ts"]}}
{"id":"rpt-code-writer-version-flag-1779364039448-2","ts":"2026-05-21T11:47:19.449Z","epicId":"scaffolding","sliceId":"version-flag","actor":"code-writer","event":"code-written","payload":{"sliceId":"version-flag"}}
{"id":"rpt-test-runner-version-flag-1779364040678-3","ts":"2026-05-21T11:47:20.678Z","epicId":"scaffolding","sliceId":"version-flag","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-version-flag-1779364062168-4","ts":"2026-05-21T11:47:42.168Z","epicId":"scaffolding","sliceId":"version-flag","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 7 tests in tests/version.test.ts pass successfully. The implementation correctly handles the --version flag specification:\n\n✅ Exits with code 0 when --version is passed\n✅ Prints the version from package.json to stdout\n✅ Does not write to stderr\n✅ Does not launch web UI or show help banners\n✅ Supports -V short flag variant\n✅ Version output is semver-formatted\n✅ Version output is single line\n\nThe implementation in src/server/cli.ts correctly reads from package.json and outputs the version before exiting. All verification targets are satisfied."}}
{"id":"rpt-evaluator-help-flag-1779364081347-5","ts":"2026-05-21T11:48:01.348Z","epicId":"scaffolding","sliceId":"help-flag","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"Verification target `tests/help.test.ts` does not exist. The test file that should validate the --help flag functionality (listing subcommands: reverse, count, slugify) is missing entirely. No tests can be run to verify the implementation. The slice specification requires both the implementation of the --help flag and the corresponding test coverage."}}
{"id":"rpt-test-writer-help-flag-1779364118018-6","ts":"2026-05-21T11:48:38.018Z","epicId":"scaffolding","sliceId":"help-flag","actor":"test-writer","event":"tests-written","payload":{"sliceId":"help-flag","targets":["tests/help.test.ts"]}}
{"id":"rpt-code-writer-help-flag-1779364162780-7","ts":"2026-05-21T11:49:22.781Z","epicId":"scaffolding","sliceId":"help-flag","actor":"code-writer","event":"code-written","payload":{"sliceId":"help-flag"}}
{"id":"rpt-test-runner-help-flag-1779364164386-8","ts":"2026-05-21T11:49:24.386Z","epicId":"scaffolding","sliceId":"help-flag","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-help-flag-1779364183312-9","ts":"2026-05-21T11:49:43.312Z","epicId":"scaffolding","sliceId":"help-flag","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 10 tests in tests/help.test.ts pass. The verification targets comprehensively cover the slice specification: --help flag exits with code 0, prints help text to stdout, lists all three subcommands (reverse, count, slugify), supports -h short flag, does not launch the web UI, and includes a usage/commands header. No test failures or missing tests."}}
{"id":"rpt-orchestrator-version-flag-1779364276953-10","ts":"2026-05-21T11:51:16.953Z","epicId":"scaffolding","sliceId":"version-flag","actor":"orchestrator","event":"epic-verified","payload":{"passed":true}}
{"id":"rpt-evaluator-reverse-1779364313167-11","ts":"2026-05-21T11:51:53.167Z","epicId":"text-ops","sliceId":"reverse","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"The verification target file 'tests/reverse.test.ts' does not exist in the worktree. Additionally, the reverse subcommand is listed in the help output but has no actual implementation in src/server/cli.ts — there is no handler for rawArgs[0] === 'reverse', no pure string reversal function, and no wiring to argv[2]. The specification requires tests to exist and pass, but the test file is completely missing, making it impossible to verify the slice is satisfied."}}
{"id":"rpt-test-writer-reverse-1779364369807-12","ts":"2026-05-21T11:52:49.807Z","epicId":"text-ops","sliceId":"reverse","actor":"test-writer","event":"tests-written","payload":{"sliceId":"reverse","targets":["tests/reverse.test.ts"]}}
{"id":"rpt-code-writer-reverse-1779364421099-13","ts":"2026-05-21T11:53:41.099Z","epicId":"text-ops","sliceId":"reverse","actor":"code-writer","event":"code-written","payload":{"sliceId":"reverse"}}
{"id":"rpt-test-runner-reverse-1779364422488-14","ts":"2026-05-21T11:53:42.488Z","epicId":"text-ops","sliceId":"reverse","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-reverse-1779364440832-15","ts":"2026-05-21T11:54:00.832Z","epicId":"text-ops","sliceId":"reverse","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 22 tests in tests/reverse.test.ts pass successfully. The test suite verifies: (1) the pure `reverse()` function is exported and correctly reverses strings of all types (ASCII, unicode-compatible, with spaces/numbers/punctuation, palindromes, empty strings, single chars), (2) the CLI `reverse` subcommand exits with code 0, (3) the subcommand reads from argv[2] and outputs the reversed string to stdout, (4) output is newline-terminated with no extra lines, and (5) stderr is empty on normal invocation. All specification requirements are satisfied."}}
{"id":"rpt-evaluator-count-1779364466667-16","ts":"2026-05-21T11:54:26.668Z","epicId":"text-ops","sliceId":"count","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"The verification target file `tests/count.test.ts` does not exist. The slice specification requires a `count` subcommand that counts whitespace-separated words with empty input returning 0, but there is no test file to verify this functionality. The worktree contains only: cli-scaffolding.integration.test.ts, help.test.ts, reverse.test.ts, and version.test.ts. Neither the test file nor any implementation of the count function exists."}}
{"id":"rpt-test-writer-count-1779364575496-17","ts":"2026-05-21T11:56:15.496Z","epicId":"text-ops","sliceId":"count","actor":"test-writer","event":"tests-written","payload":{"sliceId":"count","targets":["tests/count.test.ts"]}}
{"id":"rpt-code-writer-count-1779364636796-18","ts":"2026-05-21T11:57:16.796Z","epicId":"text-ops","sliceId":"count","actor":"code-writer","event":"code-written","payload":{"sliceId":"count"}}
{"id":"rpt-test-runner-count-1779364638241-19","ts":"2026-05-21T11:57:18.242Z","epicId":"text-ops","sliceId":"count","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-count-1779364663762-20","ts":"2026-05-21T11:57:43.762Z","epicId":"text-ops","sliceId":"count","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 26 tests in tests/count.test.ts pass successfully. The implementation includes: (1) a pure count() function exported from src/server/count.ts that counts whitespace-separated words and returns 0 for empty/whitespace-only input, (2) a count CLI subcommand that reads argv[2] and prints the word count to stdout with proper exit code and formatting. Test coverage includes pure function behavior (empty strings, whitespace delimiters, leading/trailing whitespace, tabs, newlines, long inputs) and CLI behavior (exit codes, stdout output, empty arguments, whitespace handling, output formatting). No failing tests."}}
{"id":"rpt-evaluator-slugify-1779364682705-21","ts":"2026-05-21T11:58:02.705Z","epicId":"text-ops","sliceId":"slugify","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"The verification target tests/slugify.test.ts does not exist. The test file is required to validate the slice specification, but it is missing from the tests/ directory. The current directory only contains: cli-scaffolding.integration.test.ts, count.test.ts, help.test.ts, reverse.test.ts, and version.test.ts. No implementation of the slugify subcommand or its tests have been created."}}
{"id":"rpt-test-writer-slugify-1779364751272-22","ts":"2026-05-21T11:59:11.274Z","epicId":"text-ops","sliceId":"slugify","actor":"test-writer","event":"tests-written","payload":{"sliceId":"slugify","targets":["tests/slugify.test.ts"]}}
{"id":"rpt-code-writer-slugify-1779364804214-23","ts":"2026-05-21T12:00:04.214Z","epicId":"text-ops","sliceId":"slugify","actor":"code-writer","event":"code-written","payload":{"sliceId":"slugify"}}
{"id":"rpt-test-runner-slugify-1779364806089-24","ts":"2026-05-21T12:00:06.089Z","epicId":"text-ops","sliceId":"slugify","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-slugify-1779364851082-25","ts":"2026-05-21T12:00:51.082Z","epicId":"text-ops","sliceId":"slugify","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 46 tests in tests/slugify.test.ts pass. The implementation satisfies the slice specification:\n\n1. Pure function tests (24 tests) verify:\n - Lowercasing (all-uppercase, mixed-case)\n - Non-alphanumeric replacement with dashes (spaces, hyphens, underscores, dots, special chars)\n - Dash collapsing (consecutive dashes, mixed separators)\n - Leading/trailing dash trimming\n - Numeric preservation\n - Edge cases (empty strings, whitespace-only, special-char-only)\n\n2. Unicode diacritic tests (10 tests) verify:\n - Diacritic stripping for é, ü, ö, à, Ñ, and others\n - Combined diacritic + case handling\n\n3. CLI subcommand tests (12 tests) verify:\n - Exit code 0 on success\n - Correct stdout output (slug on single line with newline termination)\n - No stderr output on normal invocation\n - All slugify behaviors work through CLI interface\n\nImplementation in src/server/slugify.ts uses Unicode NFD normalization + combining-mark removal for diacritics, followed by the required transformations. CLI integration in src/server/cli.ts correctly handles the 'slugify' subcommand."}}
{"id":"rpt-orchestrator-reverse-1779364971688-26","ts":"2026-05-21T12:02:51.688Z","epicId":"text-ops","sliceId":"reverse","actor":"orchestrator","event":"epic-verified","payload":{"passed":true}}
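A log in this shape can be consumed with a few lines of code. The sketch below parses JSONL text and keeps the last evaluator verdict per slice; the field names match the records above, while the function names and the abbreviated sample (reasoning text omitted) are assumptions for illustration.

```typescript
// Hypothetical reader for reports.jsonl: one JSON object per line.
interface ReportLine {
  id: string;
  ts: string;
  epicId: string;
  sliceId: string;
  actor: string;
  event: string;
  payload: Record<string, unknown>;
}

function parseReports(jsonl: string): ReportLine[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "") // tolerate a trailing newline
    .map((line) => JSON.parse(line) as ReportLine);
}

// Later eval-done records overwrite earlier ones, so each slice ends up
// with its most recent verdict.
function latestVerdicts(reports: ReportLine[]): Map<string, boolean> {
  const verdicts = new Map<string, boolean>();
  for (const report of reports) {
    if (report.actor === "evaluator" && report.event === "eval-done") {
      verdicts.set(report.sliceId, report.payload.done === true);
    }
  }
  return verdicts;
}

// Abbreviated sample built from the records above (reasoning omitted).
const sampleJsonl = [
  { id: "rpt-evaluator-version-flag-1779363759893-0", ts: "2026-05-21T11:42:39.893Z", epicId: "scaffolding", sliceId: "version-flag", actor: "evaluator", event: "eval-done", payload: { done: false } },
  { id: "rpt-test-runner-version-flag-1779364040678-3", ts: "2026-05-21T11:47:20.678Z", epicId: "scaffolding", sliceId: "version-flag", actor: "test-runner", event: "tests-run", payload: { passed: true } },
  { id: "rpt-evaluator-version-flag-1779364062168-4", ts: "2026-05-21T11:47:42.168Z", epicId: "scaffolding", sliceId: "version-flag", actor: "evaluator", event: "eval-done", payload: { done: true } },
  { id: "rpt-evaluator-help-flag-1779364081347-5", ts: "2026-05-21T11:48:01.348Z", epicId: "scaffolding", sliceId: "help-flag", actor: "evaluator", event: "eval-done", payload: { done: false } },
]
  .map((record) => JSON.stringify(record))
  .join("\n");

const verdicts = latestVerdicts(parseReports(sampleJsonl + "\n"));
```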