Milestone 1 for ADR-269: Refactor dev-team for cloud and credentials #29
Merged
* Initial draft of the new agent orchestration spec
* Finalized spec with tasks defined
… task-runner agent (#27)
* Updating spec with work item IDs
* ADR-273: Implement core step-machine (dev_team.py, dev-team.md loop, task-runner agent)
  - dev_team.py: refactored from subprocess-spawn orchestrator to step machine. Each step exits with a JSON descriptor when an agent is needed (exit_with_action). Removes call_agent(), _MdStreamWriter, _resolve_claude(). Adds --context-file required arg; log_dir derived from context file path. PipelineContext gains signoff_cycle_count, consecutive_failures, review_cycle_count, troubleshooter_input, pending_agent, signoff_review, signoff_research fields. Troubleshooter trigger conditions check thresholds and route to exit_with_action.
  - test_dev_team.py: 31 unit tests covering exit_with_action (subprocess isolation), compute_context_path (with/without DEV_TEAM_STATE_DIR), and counter increment/reset logic for all three counters (signoff_cycle_count, review_cycle_count, consecutive_failures) including save/load roundtrip.
  - agents/task-runner.md: new agent that owns the orchestration protocol: reads context sections, invokes a skill, writes result to context file, returns exactly one result_format line.
  - commands/dev-team.md: new orchestration loop skill that computes the context file path, drives the step-machine loop, and spawns task-runner or troubleshooter agents via the Agent tool.
  - commands/implement.md, fix.md: updated to invoke dev-team skill instead of run-workflow.
  - commands/run-workflow.md: deprecated with redirect to dev-team.md.
  Key decisions: SignoffStep runs reviewer-sign-off and researcher-validate sequentially (not in parallel), tracking completion via context file sections; PR URL parsed from "PR URL" section by PipelineContext.load(); compute_context_path provided as tested helper; review_cycle_count increments on every ReviewStep completion.
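The step-machine hand-off described in this commit (a step prints a JSON descriptor and exits when an agent is needed) could be sketched roughly as follows. This is only an illustration: the descriptor field names ("action", "agent", "message"), the helper make_descriptor, and the exit code 0 are assumptions, not taken from dev_team.py.

```python
import json
import sys
from typing import Any


def make_descriptor(agent: str, message: str) -> dict[str, Any]:
    """Build a hypothetical action descriptor (field names are assumptions)."""
    return {"action": "spawn_agent", "agent": agent, "message": message}


def exit_with_action(descriptor: dict[str, Any]) -> None:
    """Print the descriptor as one JSON line on stdout, then end the process.

    The orchestration loop is assumed to read stdout, spawn the requested
    agent, and re-invoke the script afterwards.
    """
    print(json.dumps(descriptor))
    sys.exit(0)  # assumed exit code; the real script may use a different one
```

A step would call exit_with_action(make_descriptor("task-runner", "Running review")) at the point where it needs an agent. Because the state lives in the context file, the script can pick up where it left off on the next invocation.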
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Move validate scripts to the REPO_ROOT/scripts folder where they are expected
* ADR-273: Clear review_notes in FixPrStep so ReviewStep re-runs reviewer on next cycle
* ADR-273: Remove unconditional _handle_agent_success from pipeline loop; add explicit calls in SignoffStep on each agent result detection
* ADR-273: Fix researcher-validate result parsing: exact indicator match with JSON-array fallback, replacing fragile substring check
* ADR-273: Pass troubleshooter user answer via stdin heredoc instead of inline string interpolation to prevent shell injection
* ADR-273: Rename _make_ctx() to make_sut() per CONTRIBUTING.md naming convention
* Fix script paths in build-and-test.yml
  Updated script paths in the build and test workflow.
* ADR-273: Apply PR #27 review comment fixes (Items 1-8)
  - Item 1: Fix re.sub lambda in dev-team.md troubleshooter_input update
  - Item 2: Clarify agent field in task-runner.md is display/logging only
  - Item 3: task-runner spawns sub-agent via Agent tool instead of calling Skill directly; remove Skill/Bash/Glob/Grep from frontmatter tools
  - Item 4: Add --print-context-path CLI flag to dev_team.py; update dev-team.md Step 1 to use it
  - Item 5: Add message field to every exit_with_action call site; orchestration loop displays it
  - Item 6: ReviewStep extracts pr_url from PR URL section and saves to frontmatter; remove section fallback from load()
  - Item 7: ValidateStep already runs build/test before review cycle, so no change needed
  - Item 8: SignoffStep emits parallel actions descriptor; FixPrStep uses gh pr checks when PR exists
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Update spec to support parallelism and agent-run scripts
* Bump version from 1.0.2 to 1.2.0
* ADR-273: Add get-context-path.sh, a deterministic shell script for slug extraction
  Replaces the model-interpreted git URL derivation with a standalone bash script that handles HTTPS/.git, HTTPS, SSH/.git, and SSH remote URL forms via sed, then delegates to dev_team.py --print-context-path.
* ADR-273: Replace Step 1 model-interpreted git URL with deterministic script call
  Removes the prose asking the model to parse `git remote get-url origin` output manually. Step 1 now calls get-context-path.sh, making the slug derivation cheap and deterministic. Adds a note that Git Bash (bundled with Git-for-Windows) is the runtime on Windows.
* ADR-273: Add parameterised tests for get-context-path.sh slug extraction
  Adds TestGetContextPathShSlugExtraction (15 cases covering four remote URL forms) and TestGetContextPathShErrorHandling (git-fails and no-args). Uses GIT_REMOTE_URL_OVERRIDE env seam to avoid cross-platform fake-binary issues.
* Update spec with Step protocol refactor task (ADR-277)
* ADR-273: Rename exit_with_action to exit_with_actions emitting JSON array; add script-runner agent
  - exit_with_action(dict) -> exit_with_actions(list[dict]): stdout is now a JSON array so the orchestration loop can dispatch mixed spawn_agent + run_script items in a single step
  - All ~17 call sites wrapped in [...]
  - SignOffStep.run(): parallel case now emits a flat array with reviewer, researcher, and run_script (validate-build + validate-tests) items; in-process script execution removed; sequential fallbacks emit single-item arrays; re-entry checks signoff_build_result section (populated by dev-team.md from script-runner result)
  - PipelineContext gains signoff_build_result field (section, not frontmatter) to track build/test pass-fail across pipeline re-invocations
  - dev-team.md: parse JSON array from stdout; remove nested "actions" list branch; add run_script dispatch via script-runner agent; write run_script result to write_section in context file after parallel collect
  - task-runner.md: remove Agent tool, add Skill/Bash/Glob/Grep; Step 3 uses Skill(<skill-name>) directly instead of Agent(subagent_type=...)
  - agents/script-runner.md: new agent that runs one command, writes log, returns "passed — log: <path>" or "failed — log: <path>"
  - test_dev_team.py: updated helper and test classes to verify array output format; added TestSignoffBuildResult (5 tests); 81 tests total, all pass
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* ADR-273: Fix signoff_build_result check to use startswith; update tests to real format
  script-runner returns "passed — log: <path>", not bare "passed"; the equality check `== "passed"` would always fail. Changed to `.startswith("passed")`. Updated TestSignoffBuildResult round-trip tests to store the real script-runner format string; added two tests that explicitly verify the startswith logic.
* ADR-273: Replace local build/test with wait-pr-checks.sh script for CI-backed signoff
* ADR-273: Remove redundant single-spawn_agent special case; general parallel block handles it
---------
Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Joe Davis <jodasoft@outlook.com>
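The startswith fix above can be illustrated with a minimal sketch. The function name build_passed and the log path are hypothetical; only the result format and the prefix check come from the commit message.

```python
def build_passed(result: str) -> bool:
    """Return True when a script-runner result line reports success."""
    # Prefix match: the status word is followed by a dash and a log path,
    # so comparing the whole string to "passed" can never succeed.
    return result.strip().startswith("passed")


# \u2014 is the dash character used in the script-runner result format.
sample = "passed \u2014 log: /tmp/build.log"  # hypothetical log path
print(sample == "passed")    # False: the old equality check
print(build_passed(sample))  # True: the prefix check
```

A prefix check still treats "failed" results and empty sections as not passed, so a missing result does not let sign-off through.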
* Add the ADR work item ID for Task 2
* Fix pipeline issues from first run
  1. The "validating" step is synchronous, blocking the step engine script with no feedback. Change it to return a script task to the agent. The task can simply run scripts/validate, which will run validate-build and then validate-test if the build succeeds.
  2. The Create PR step failed with a message that the "task-runner doesn't have shell access". Not sure what that means; task-runner used CLI tools in other tasks and it worked.
  3. The developer didn't create a branch for the task prior to starting implementation.
  4. The developer didn't commit its changes after implementation. Changes were committed as "uncommitted changes during validation". Come to think of it, maybe the developer couldn't use shell access either.
  5. The reviewer had no comments after it ran the first time, and returned "approved" as its result. But the script output "Reviewer requested changes". This might just be an output error; I don't think the developer was invoked to fix anything.
  6. Agents are using the gh CLI, which is still asking me to choose an account. In general, they should all use the GitHub MCP whenever possible. If they must use the gh CLI, I want them to use the token that's configured as the GH_TOKEN environment variable. Is there a different way to provide this token just to the agent environment, so it will be used more reliably?
  7. During sign-off, all three parallel tasks passed, but the pipeline proceeded as if they had failed. The developer was invoked, had nothing to fix, and then all three sign-offs passed again, but it continued to fixing-pr again as if they had failed.
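Item 1 describes a validate wrapper that runs validate-build and only then validate-test. A rough sketch of that stop-at-first-failure sequencing is below; run_validation and the stand-in steps are hypothetical, and the actual scripts/validate may be a shell script rather than Python.

```python
from typing import Callable, Sequence

# (name, callable returning a process-style exit code); hypothetical shape.
Step = tuple[str, Callable[[], int]]


def run_validation(steps: Sequence[Step]) -> tuple[bool, list[str]]:
    """Run steps in order and stop at the first non-zero exit code."""
    ran: list[str] = []
    for name, step in steps:
        ran.append(name)
        if step() != 0:
            return False, ran  # later steps are skipped
    return True, ran


# Stand-in steps; a real wrapper might return subprocess.run([...]).returncode.
ok, ran = run_validation([
    ("validate-build", lambda: 1),  # simulate a failing build
    ("validate-test", lambda: 0),
])
print(ok, ran)  # False ['validate-build']
```

Returning the list of steps that actually ran gives the agent something concrete to report, which addresses the "no feedback" part of the complaint.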
build-and-test: Python test results. Status: ✅ Passed
Commits the first milestone for ADR-269. The pipeline is refactored and several bugs are fixed, so it can be made generally available.