Milestone 1 for ADR-269: Refactor dev-team for cloud and credentials#29

Merged
jodavis merged 4 commits into main from feature/ADR-269-milestone-1 on Jun 11, 2026

@jodavis jodavis commented Jun 11, 2026


Commit the first milestone for ADR-269. The pipeline is refactored and several bugs are fixed, which makes it ready for general availability.

jodavis and others added 3 commits June 9, 2026 14:06
* Initial draft of the new agent orchestration spec

* Finalized spec with tasks defined
… task-runner agent (#27)

* Updating spec with work item IDs

* ADR-273: Implement core step-machine — dev_team.py, dev-team.md loop, task-runner agent

- dev_team.py: refactored from subprocess-spawn orchestrator to step machine.
  Each step exits with a JSON descriptor when an agent is needed (exit_with_action).
  Removes call_agent(), _MdStreamWriter, _resolve_claude(). Adds --context-file
  required arg; log_dir derived from context file path. PipelineContext gains
  signoff_cycle_count, consecutive_failures, review_cycle_count, troubleshooter_input,
  pending_agent, signoff_review, signoff_research fields. Troubleshooter trigger
  conditions check thresholds and route to exit_with_action.

- test_dev_team.py: 31 unit tests covering exit_with_action (subprocess isolation),
  compute_context_path (with/without DEV_TEAM_STATE_DIR), and counter
  increment/reset logic for all three counters (signoff_cycle_count,
  review_cycle_count, consecutive_failures) including save/load roundtrip.

- agents/task-runner.md: new agent that owns the orchestration protocol —
  reads context sections, invokes a skill, writes result to context file,
  returns exactly one result_format line.

- commands/dev-team.md: new orchestration loop skill that computes the context
  file path, drives the step-machine loop, and spawns task-runner or
  troubleshooter agents via the Agent tool.

- commands/implement.md, fix.md: updated to invoke dev-team skill instead of
  run-workflow.

- commands/run-workflow.md: deprecated with redirect to dev-team.md.

Key decisions: SignoffStep runs reviewer-sign-off and researcher-validate
sequentially (not in parallel) tracking completion via context file sections;
PR URL parsed from "PR URL" section by PipelineContext.load(); compute_context_path
provided as tested helper; review_cycle_count increments on every ReviewStep
completion.
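The counter behaviour described above (increment, reset, and a threshold check that routes to the troubleshooter) can be sketched roughly as follows. This is a minimal illustration, not the code in dev_team.py: the three field names come from the commit message, while the `Counters` class, its method names, and all threshold values are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the real values are not shown in this PR.
MAX_CONSECUTIVE_FAILURES = 3
MAX_REVIEW_CYCLES = 5
MAX_SIGNOFF_CYCLES = 3


@dataclass
class Counters:
    """Stand-in for the counter fields that PipelineContext gains."""

    signoff_cycle_count: int = 0
    review_cycle_count: int = 0
    consecutive_failures: int = 0

    def record_failure(self) -> None:
        # Each failed agent run extends the failure streak.
        self.consecutive_failures += 1

    def record_success(self) -> None:
        # A successful agent run resets the failure streak.
        self.consecutive_failures = 0

    def needs_troubleshooter(self) -> bool:
        # Any counter reaching its threshold triggers the troubleshooter.
        return (
            self.consecutive_failures >= MAX_CONSECUTIVE_FAILURES
            or self.review_cycle_count >= MAX_REVIEW_CYCLES
            or self.signoff_cycle_count >= MAX_SIGNOFF_CYCLES
        )
```

In the real pipeline these values are saved to and loaded from the context file, so they survive across invocations of the step machine.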

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Move validate scripts to the REPO_ROOT/scripts folder where they are expected

* ADR-273: Clear review_notes in FixPrStep so ReviewStep re-runs reviewer on next cycle

* ADR-273: Remove unconditional _handle_agent_success from pipeline loop; add explicit calls in SignoffStep on each agent result detection

* ADR-273: Fix researcher-validate result parsing — exact indicator match with JSON-array fallback, replacing fragile substring check

* ADR-273: Pass troubleshooter user answer via stdin heredoc instead of inline string interpolation to prevent shell injection
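The commit passes the answer through a shell heredoc. The same idea, sketched in Python under the assumption that the consumer reads its input from stdin, looks like this. The helper name `send_answer_via_stdin` and the echo child process are illustrative only and are not part of the PR.

```python
import subprocess
import sys


def send_answer_via_stdin(command: list[str], answer: str) -> str:
    """Run `command` with `answer` on stdin and return its stdout.

    The answer is never interpolated into a command line, so quotes,
    `$(...)`, or `;` inside it cannot be interpreted by a shell.
    """
    completed = subprocess.run(
        command,
        input=answer,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Demo child process (hypothetical): echoes whatever arrives on stdin.
echo_stdin = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]

# A deliberately hostile answer that would be dangerous if interpolated.
hostile_answer = 'done"; rm -rf ~ #$(whoami)'
echoed = send_answer_via_stdin(echo_stdin, hostile_answer)
```

Because the command is an argument list and the answer travels as data, the hostile string comes back unchanged instead of being executed.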

* ADR-273: Rename _make_ctx() to make_sut() per CONTRIBUTING.md naming convention

* Fix script paths in build-and-test.yml

Updated script paths in the build and test workflow.

* ADR-273: Apply PR #27 review comment fixes (Items 1-8)

- Item 1: Fix re.sub lambda in dev-team.md troubleshooter_input update
- Item 2: Clarify agent field in task-runner.md is display/logging only
- Item 3: task-runner spawns sub-agent via Agent tool instead of calling Skill directly; remove Skill/Bash/Glob/Grep from frontmatter tools
- Item 4: Add --print-context-path CLI flag to dev_team.py; update dev-team.md Step 1 to use it
- Item 5: Add message field to every exit_with_action call site; orchestration loop displays it
- Item 6: ReviewStep extracts pr_url from PR URL section and saves to frontmatter; remove section fallback from load()
- Item 7: ValidateStep already runs build/test before review cycle — no change needed
- Item 8: SignoffStep emits parallel actions descriptor; FixPrStep uses gh pr checks when PR exists

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Update spec to support parallelism and agent-run scripts

* Bump version from 1.0.2 to 1.2.0

* ADR-273: Add get-context-path.sh — deterministic shell script for slug extraction

Replaces the model-interpreted git URL derivation with a standalone bash
script that handles HTTPS/.git, HTTPS, SSH/.git, and SSH remote URL forms
via sed, then delegates to dev_team.py --print-context-path.
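For illustration, the four remote URL forms named above can be handled by a single pattern. The sketch below is in Python rather than the bash and sed used by get-context-path.sh. The function name `slug_from_remote` and the `owner/repo` slug shape are assumptions, since the commit does not show the actual slug format.

```python
import re

# Matches HTTPS and SSH (scp-style) remotes, with or without a ".git" suffix.
_REMOTE_RE = re.compile(
    r"""^(?:https?://[^/]+/|[^@/]+@[^:/]+[:/])
        (?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$""",
    re.VERBOSE,
)


def slug_from_remote(url: str) -> str:
    """Return an 'owner/repo' slug (assumed format) for a git remote URL."""
    match = _REMOTE_RE.match(url.strip())
    if match is None:
        raise ValueError(f"unrecognised remote URL: {url!r}")
    return f"{match['owner']}/{match['repo']}"


# The four forms listed in the commit message, using placeholder names.
EXAMPLE_REMOTES = [
    "https://github.com/example-owner/example-repo.git",
    "https://github.com/example-owner/example-repo",
    "git@github.com:example-owner/example-repo.git",
    "git@github.com:example-owner/example-repo",
]
slugs = [slug_from_remote(url) for url in EXAMPLE_REMOTES]
```

All four example remotes reduce to the same slug, which is the property that makes the derived context path deterministic.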

* ADR-273: Replace Step 1 model-interpreted git URL with deterministic script call

Removes the prose asking the model to parse `git remote get-url origin`
output manually. Step 1 now calls get-context-path.sh, making the slug
derivation cheap and deterministic. Adds a note that Git Bash (bundled
with Git-for-Windows) is the runtime on Windows.

* ADR-273: Add parameterised tests for get-context-path.sh slug extraction

Adds TestGetContextPathShSlugExtraction (15 cases covering four remote URL
forms) and TestGetContextPathShErrorHandling (git-fails and no-args). Uses
GIT_REMOTE_URL_OVERRIDE env seam to avoid cross-platform fake-binary issues.
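An environment-variable seam of this kind can be sketched as below. Only the variable name `GIT_REMOTE_URL_OVERRIDE` comes from the commit. The actual seam lives in the bash script, and the assumption here is that a non-empty override replaces the call to git entirely. The function name `get_remote_url` is hypothetical.

```python
import os
import subprocess
from collections.abc import Mapping
from typing import Optional


def get_remote_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the origin URL, preferring the test override when it is set."""
    env = os.environ if env is None else env
    override = env.get("GIT_REMOTE_URL_OVERRIDE")
    if override:
        # Test seam: skip git entirely so no fake binary is needed on PATH.
        return override
    completed = subprocess.run(
        ["git", "remote", "get-url", "origin"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# A test can inject a remote without touching the real repository.
fake_env = {"GIT_REMOTE_URL_OVERRIDE": "git@github.com:example-owner/example-repo.git"}
injected = get_remote_url(fake_env)
```

Passing the environment as a parameter keeps the tests independent of the platform, which is the stated reason for avoiding fake git binaries.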

* Update spec with Step protocol refactor task (ADR-277)

* ADR-273: Rename exit_with_action to exit_with_actions emitting JSON array; add script-runner agent

- exit_with_action(dict) -> exit_with_actions(list[dict]): stdout is now a JSON
  array so the orchestration loop can dispatch mixed spawn_agent + run_script items
  in a single step
- All ~17 call sites wrapped in [...]
- SignOffStep.run(): parallel case now emits a flat array with reviewer, researcher,
  and run_script (validate-build + validate-tests) items; in-process script execution
  removed; sequential fallbacks emit single-item arrays; re-entry checks
  signoff_build_result section (populated by dev-team.md from script-runner result)
- PipelineContext gains signoff_build_result field (section, not frontmatter) to
  track build/test pass-fail across pipeline re-invocations
- dev-team.md: parse JSON array from stdout; remove nested "actions" list branch;
  add run_script dispatch via script-runner agent; write run_script result to
  write_section in context file after parallel collect
- task-runner.md: remove Agent tool, add Skill/Bash/Glob/Grep; Step 3 uses
  Skill(<skill-name>) directly instead of Agent(subagent_type=...)
- agents/script-runner.md: new agent — runs one command, writes log, returns
  "passed — log: <path>" or "failed — log: <path>"
- test_dev_team.py: updated helper and test classes to verify array output format;
  added TestSignoffBuildResult (5 tests); 81 tests total, all pass
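As a rough sketch of the new output contract, a step can print one JSON array and exit, leaving the orchestration loop to dispatch each item. The item types `spawn_agent` and `run_script` and the `message` field appear in this PR. The other key names, the example values, and the exit code of 0 are assumptions made for illustration.

```python
import json
import sys


def exit_with_actions(actions: list[dict]) -> None:
    """Write the actions as a single JSON array on stdout, then exit."""
    json.dump(actions, sys.stdout)
    sys.stdout.write("\n")
    sys.stdout.flush()
    sys.exit(0)  # assumed exit code; the real value is not shown in the PR


# Hypothetical parallel sign-off descriptor: two agents plus one script.
example_actions = [
    {"type": "spawn_agent", "agent": "reviewer", "message": "Running reviewer sign-off"},
    {"type": "spawn_agent", "agent": "researcher", "message": "Running researcher validation"},
    {"type": "run_script", "script": "scripts/validate", "message": "Running build and tests"},
]
```

A sequential step would call the same function with a single-item list, so the loop only ever needs to parse one shape.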

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ADR-273: Fix signoff_build_result check to use startswith; update tests to real format

script-runner returns "passed — log: <path>", not bare "passed"; the equality
check `== "passed"` would always fail. Changed to `.startswith("passed")`.

Updated TestSignoffBuildResult round-trip tests to store the real script-runner
format string; added two tests that explicitly verify the startswith logic.
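The difference between the two checks is easy to see on a sample value. In this sketch the helper name `build_passed` and the log path are hypothetical. The result format is the one script-runner returns.

```python
def build_passed(signoff_build_result: str) -> bool:
    """Treat any result that begins with 'passed' as a successful build."""
    return signoff_build_result.startswith("passed")


# script-runner appends the log location, so the value is never bare "passed".
sample_result = "passed — log: logs/validate-build.log"

equality_check = sample_result == "passed"   # the old check
prefix_check = build_passed(sample_result)   # the new check
```

With the real format, the equality check is false even for a passing build, while the prefix check is true.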

* ADR-273: Replace local build/test with wait-pr-checks.sh script for CI-backed signoff

* ADR-273: Remove redundant single-spawn_agent special case; general parallel block handles it

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Joe Davis <jodasoft@outlook.com>
* Add the ADR work item ID for Task 2

* Fix pipeline issues from first run
1. The "validating" step is synchronous and blocks the step engine script with no feedback. Change it to return a script task to the agent. The task can simply run scripts/validate, which runs validate-build and then, if the build succeeds, validate-test.
2. The Create PR step failed with a message that the "task-runner doesn't have shell access". It is unclear what that means: task-runner used CLI tools in other tasks and they worked.
3. The developer didn't create a branch for the task prior to starting implementation.
4. The developer didn't commit its changes after implementation. Changes were committed as "uncommitted changes during validation". Come to think of it, maybe the developer couldn't use shell access either.
5. The reviewer had no comments after its first run and returned "approved" as its result, but the script output "Reviewer requested changes". This might just be an output error; I don't think the developer was invoked to fix anything.
6. Agents are using the gh CLI, which is still asking me to choose an account. In general, they should all use the GitHub MCP whenever possible. If they must use the gh CLI, I want them to use the token that's configured as the GH_TOKEN environment variable. Is there a different way to provide this token just to the agent environment, so it will be used more reliably?
7. During sign-off, all three parallel tasks passed, but the pipeline proceeded as if they had failed. The developer was invoked, had nothing to fix, and then all three sign-offs passed again, but the pipeline continued to fixing-pr as if they had failed.

github-actions Bot commented Jun 11, 2026


build-and-test: Python test results

Status: ✅ Passed

Test log

@jodavis jodavis merged commit 2da2dc1 into main Jun 11, 2026
1 check passed
2 participants