Refactor the workflow support into a set of skills with their own scripts #40
Merged
jodavis merged 21 commits on Jun 18, 2026
The workflow-setup skill ensures we are on the right branch and that the context file is initialized. All other skills will depend on this one, so that any skill will start a context file and work in the right branch regardless of how it is invoked. The researcher-plan skill is the first to take advantage of this.
…re they behave correctly within a workflow context.
- Developer needs access to GitHub MCP for reading comments
- Researcher and Reviewer need Edit to update the context file
- Moved workflow-related scripts to the workflow-orchestrator skill
- Deleted some obsolete commands (create-branch, dev-team) and steps (SetupWorkspaceStep)
- Developer should not resolve review comments
- Pass the precomputed log-file into workflow-script
The developer agent was prepending `cd <repo-root> &&` before git commands despite the Claude Code system prompt already prohibiting this pattern. Added an explicit reminder in both developer-implement and developer-fix so the instruction is visible inside the skill context where the agent operates.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ssful agent result
Previously the orchestrator would stop and report on dev_team.py non-zero exit (which invited the top-level agent to investigate inline, building up context). It also had no handling at all when a workflow-worker or workflow-script returned anything other than 'successful'. Now both conditions spawn the troubleshooter sub-agent, keeping investigation out of the orchestrator's context window.
Also added the `workflow-troubleshooter` skill, since it was missing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
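The routing rule described in this commit can be sketched as a small decision function. This is only an illustration of the rule: the orchestrator itself is a skill prompt rather than Python, and `next_action`, its parameters, and the returned labels are hypothetical names that do not appear in the PR.

```python
from typing import Optional


def next_action(dev_team_exit_code: int, agent_result: Optional[str]) -> str:
    """Decide what the orchestrator does after one loop iteration.

    Hypothetical helper: the names and return labels are illustrative only.
    """
    # A non-zero exit from dev_team.py goes to the troubleshooter sub-agent
    # instead of being investigated inline by the top-level agent.
    if dev_team_exit_code != 0:
        return "spawn-troubleshooter"
    # Any worker or script result other than 'successful' is routed the same way.
    if agent_result is not None and agent_result != "successful":
        return "spawn-troubleshooter"
    return "continue"
```

Routing both conditions to one sub-agent keeps the investigation transcript out of the orchestrator's own context window, which is the motivation the commit message gives.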
…cks.sh
`gh pr checks --watch` relies on interactive terminal features and was returning immediately when run in a non-interactive subprocess (the script-runner agent), causing the signoff step to evaluate checks before they had completed.
Replace with an explicit polling loop that queries the JSON bucket field every 15 seconds until no checks remain in the "pending" bucket, with a 30-minute timeout.
Also changed exit codes: exit 1 on failure or timeout, exit 0 only on pass, so the workflow-script agent records a meaningful failure message in the context file rather than always writing "Succeeded".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
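The polling behavior described above can be sketched in Python as follows. The real script is a shell script and is not reproduced here; `wait_for_checks` and its parameters are hypothetical, the caller supplies `fetch_buckets` (in the real script this would be the bucket values reported by `gh pr checks`), and treating a `"fail"` bucket as the failure signal is an assumption.

```python
import time
from typing import Callable, List


def wait_for_checks(
    fetch_buckets: Callable[[], List[str]],
    interval_s: float = 15.0,
    timeout_s: float = 30 * 60,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until no check is pending; return a process-style exit code.

    Returns 0 only when every check has finished without failure, and 1 on
    failure or timeout, mirroring the exit codes described in the commit.
    """
    deadline = clock() + timeout_s
    while True:
        buckets = fetch_buckets()
        if "pending" not in buckets:
            # Assumption: a failed check is reported in a bucket named "fail".
            return 1 if "fail" in buckets else 0
        if clock() >= deadline:
            return 1  # still pending after the timeout
        sleep(interval_s)
```

Injecting `sleep` and `clock` is a choice made for this sketch so the loop can be exercised without waiting in real time.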
The workflow-script agent writes 'Succeeded' to the context section when the
script exits 0, and a failure description when it exits non-zero. The signoff
build-validation step was checking startswith("passed"), which never matched
'Succeeded', so the step always returned "failed" even when all checks passed.
This caused the signoff->fixing-pr loop to cycle indefinitely on a clean PR.
Updated both BuildValidationStep.handle_results() and SignoffStep.handle_results()
to check for 'Succeeded', consistent with what workflow-script actually writes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…jects
The original Fix 4 was wrong in two ways:
- For spawn_agent skills (debugger-investigate, developer-create-pr), the step
handlers were using text heuristics instead of parsing the JSON the skills
already write to their sections.
- For the run_script build-validation step, changing to 'Succeeded' addressed
the symptom but not the root cause: the section should carry structured status,
not a generic word.
Changes:
workflow-script/SKILL.md: when the last non-empty log line is a valid JSON
object, write it as the section result instead of 'Succeeded'. Scripts that
don't output JSON continue to get 'Succeeded'/'failure description' as before.
wait-pr-checks.sh: emit a JSON status object as the final stdout line on every
exit path so workflow-script picks it up into the section:
{"status": "passed"} or {"status": "failed", "reason": "..."}
dev_team.py:
- DebugStep: read {"status": "reproduced"} from debug_report instead of
scanning for the heading string "# Debug report for"
- CreatePrStep: read {"pr_url": "..."} from the PR URL section instead of
applying a regex over the raw section text
- BuildValidationStep / SignoffStep: revert 'Succeeded' back to 'passed', now
correctly sourced from parse_json_output(ctx.signoff_build_result)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
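A minimal sketch of the "last non-empty log line is a JSON object" rule, and of a step handler reading the structured status, might look like this. All function names are hypothetical (the PR's own helper is `parse_json_output`, whose signature is not shown here), and the wording of the non-JSON failure description is an assumption.

```python
import json
from typing import Any, Dict, Optional


def last_json_object(log_text: str) -> Optional[Dict[str, Any]]:
    """Return the last non-empty line parsed as a JSON object, else None."""
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    if not lines:
        return None
    try:
        value = json.loads(lines[-1])
    except json.JSONDecodeError:
        return None
    # Only JSON objects count; arrays, numbers, and strings are ignored.
    return value if isinstance(value, dict) else None


def section_result(log_text: str, exit_code: int) -> str:
    """Text written to the context section for a finished script."""
    obj = last_json_object(log_text)
    if obj is not None:
        return json.dumps(obj)
    # Fallback for scripts that do not emit JSON (failure text is assumed).
    return "Succeeded" if exit_code == 0 else f"Failed with exit code {exit_code}"


def checks_passed(section_text: str) -> bool:
    """Step-handler side: read the structured status, not a text prefix."""
    obj = last_json_object(section_text)
    return obj is not None and obj.get("status") == "passed"
```

With this shape, a section containing the generic word 'Succeeded' is never mistaken for a passing build; only `{"status": "passed"}` is.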
build-and-test: Python test results. Status: ✅ Passed
Pull request overview
This PR refactors the dev-team pipeline’s “workflow support” into first-class skills (orchestrator/worker/script/setup/troubleshooter), removing intermediary wrapper agents and moving common workflow logic into reusable scripts and standardized context-file conventions.
Changes:
- Adds new workflow skills (`workflow-orchestrate`, `workflow-worker`, `workflow-script`, `workflow-setup`, `workflow-troubleshoot`) plus supporting scripts/assets for context-file management and PR-check polling.
- Updates `dev_team.py` orchestration descriptors and result parsing to align with the new worker/script execution model and structured JSON outputs.
- Removes legacy wrapper agents/commands (task-runner, old script-runner, dev-team.md, workspace-setup) and updates existing command prompts to rely on the new workflow approach.
Reviewed changes
Copilot reviewed 33 out of 36 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| plugins/dev-team/skills/workflow-worker/SKILL.md | New worker protocol for invoking a skill and writing output to a context section. |
| plugins/dev-team/skills/workflow-troubleshoot/SKILL.md | New troubleshooter skill spec for diagnosing pipeline failures via context file edits. |
| plugins/dev-team/skills/workflow-setup/SKILL.md | New setup skill spec to resolve context/spec/base branch and prep working branch. |
| plugins/dev-team/skills/workflow-setup/scripts/prepare-working-branch.py | New git-branch preparation script for workflow runs. |
| plugins/dev-team/skills/workflow-setup/scripts/init-context-file.py | Initializes context files from a template if missing. |
| plugins/dev-team/skills/workflow-setup/scripts/find-spec-file.py | Finds _spec_*.md containing the work item and writes spec_path to context. |
| plugins/dev-team/skills/workflow-setup/scripts/compute-context-file.py | Computes canonical context file path from repo slug and work item. |
| plugins/dev-team/skills/workflow-setup/assets/context_template.md | Template for workflow context frontmatter and header. |
| plugins/dev-team/skills/workflow-script/SKILL.md | New script-runner protocol: run a command, log output, write result+log path to context. |
| plugins/dev-team/skills/workflow-orchestrate/SKILL.md | New orchestrator loop spec replacing the legacy dev-team.md orchestration wrapper. |
| plugins/dev-team/skills/workflow-orchestrate/scripts/wait-pr-checks.sh | New polling-based PR checks waiter that emits a final JSON status line. |
| plugins/dev-team/skills/workflow-orchestrate/scripts/test_dev_team.py | Adds/updates tests for dev_team.py step-machine and helpers. |
| plugins/dev-team/skills/workflow-orchestrate/scripts/get-context-path.sh | Canonical context-path resolver delegating to dev_team.py --print-context-path. |
| plugins/dev-team/skills/workflow-orchestrate/scripts/dev_team.py | Updates step-machine behavior and descriptors to work with new skills + JSON parsing. |
| plugins/dev-team/skills/workflow-orchestrate/assets/implement-task-plan.md | Refactors the implement workflow definition (Mermaid state machine). |
| plugins/dev-team/skills/workflow-orchestrate/assets/fix-issue-plan.md | Adds a new “fix issue” workflow definition with a debugging phase. |
| plugins/dev-team/skills/identify-project-work-items/SKILL.md | New skill documenting project work-item ID/type patterns. |
| plugins/dev-team/scripts/wait-pr-checks.sh | Removes legacy wait-pr-checks script (moved under workflow-orchestrate). |
| plugins/dev-team/scripts/test_contributing_md.py | Removes CONTRIBUTING.md existence/section tests. |
| plugins/dev-team/commands/workspace-setup.md | Removes legacy workspace-setup command (setup moved into workflow-setup). |
| plugins/dev-team/commands/reviewer-sign-off.md | Updates sign-off reviewer instructions to rely on context file inputs. |
| plugins/dev-team/commands/reviewer-review.md | Updates reviewer instructions to rely on context file inputs. |
| plugins/dev-team/commands/researcher-validate.md | Updates researcher validation instructions to rely on context file inputs. |
| plugins/dev-team/commands/researcher-plan.md | Updates researcher planning instructions to rely on context file inputs. |
| plugins/dev-team/commands/implement.md | Switches to invoking workflow-orchestrate instead of the old dev-team skill. |
| plugins/dev-team/commands/developer-implement.md | Updates developer implementation instructions for the new workflow model. |
| plugins/dev-team/commands/developer-fix.md | Updates developer fix instructions for the new workflow model. |
| plugins/dev-team/commands/developer-create-pr.md | Updates PR creation instructions to use context-file fields for base/pr state. |
| plugins/dev-team/commands/dev-team.md | Removes legacy orchestration command doc. |
| plugins/dev-team/commands/create-branch.md | Removes legacy branch-creation command doc. |
| plugins/dev-team/agents/workspace-setup.md | Removes legacy workspace-setup agent. |
| plugins/dev-team/agents/task-runner.md | Removes legacy task-runner wrapper agent. |
| plugins/dev-team/agents/script-runner.md | Simplifies script-runner agent to the new workflow-script/context-editing model. |
| plugins/dev-team/agents/reviewer.md | Adds Edit tool to reviewer agent for context-file section updates. |
| plugins/dev-team/agents/researcher.md | Adds Edit tool to researcher agent for context-file section updates. |
| plugins/dev-team/agents/developer.md | Adds GitHub MCP tools for PR reading/review/comment/update actions. |
Comments suppressed due to low confidence (1)
plugins/dev-team/skills/workflow-orchestrate/assets/implement-task-plan.md:4
- This workflow jumps from `init` straight to `researching`, which means `spec_path` will never be populated by dev_team.py (since FindSpecStep runs only in `spec-finding`). With the default context template leaving `spec_path` empty, the researcher step won’t have an authoritative spec path to read.
jodavis requested changes on Jun 18, 2026
Fix 'tthe' typo on line 22, and extend Step 3 to also fetch PR check failures alongside review comment threads.
The line 'You are performing the first-pass code review for the work item described in the context file.' was accidentally removed during the refactor.
…lement.md
workflow-orchestrate expects --work-item-id, --workflow, and --research-skill named arguments; the previous invocation used positional args which don't match.
…ILL.md
The --workflow flag was referencing $SCRIPT_DIR, which is undefined; all other paths in the file correctly use $SKILL_DIR.
…up/SKILL.md
Add a 'Determining work-item-type' table so the branches that reference work-item-type are actionable. Fix duplicate '4e' heading (renamed second to '4f') and correct typo 'ins' -> 'is'.
…repare-working-branch.py
- Add .expanduser() so ~-prefixed context file paths resolve correctly.
- Make pull() exit on failure; it is only called when the branch is known to exist on the remote, so any pull error is a real problem.
- Update checkout() to use 'git checkout -b <branch> origin/<branch>' when the branch exists only on the remote, avoiding failures on clean clones where no local ref exists yet.
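The three fixes listed above can be sketched as follows. This is not the contents of prepare-working-branch.py: the function names and signatures are hypothetical, the exact `git pull` arguments are an assumption, and the behavior when the branch exists nowhere (create it from the current HEAD) is also an assumption.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, List


def resolve_context_path(raw: str) -> Path:
    """Expand a leading ~ so home-relative context paths resolve."""
    return Path(raw).expanduser()


def checkout_command(branch: str, exists_locally: bool, exists_on_remote: bool) -> List[str]:
    """Choose the git checkout invocation for the working branch."""
    if exists_locally:
        return ["git", "checkout", branch]
    if exists_on_remote:
        # Remote-only branch (e.g. a clean clone): create a local branch
        # starting from the remote-tracking ref.
        return ["git", "checkout", "-b", branch, f"origin/{branch}"]
    # Assumption: a brand-new branch is created from the current HEAD.
    return ["git", "checkout", "-b", branch]


def pull(branch: str, run: Callable = subprocess.run) -> None:
    """Pull the branch and abort the script if git reports an error."""
    result = run(["git", "pull", "origin", branch])
    if result.returncode != 0:
        # Only called when the branch exists on the remote, so any
        # failure here is treated as fatal.
        sys.exit(f"git pull failed for branch {branch}")
```

Returning the command as a list instead of running it directly is a choice made for this sketch so the branch-selection logic can be checked without a repository.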
…SON section format
The test was not updated when Fix 4 changed CreatePrStep.handle_results() to parse a JSON object (via parse_json_output) instead of a raw URL string.
…faulting to 0
Previously, '|| echo "0"' masked auth errors, network failures, and missing gh CLI by treating them as "no pending/failing checks" and reporting success. Now uses 'if !' to capture the error output and exit with a JSON failure object when gh returns a non-zero exit code.
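The difference between masking an error and surfacing it can be sketched in Python. The actual fix is in the shell script; `run_or_fail` is a hypothetical name, and the `{"status": "failed", "reason": ...}` shape follows the status objects described earlier in this PR.

```python
import json
import subprocess
from typing import List, Tuple


def run_or_fail(cmd: List[str]) -> Tuple[int, str]:
    """Run a command; on any error return exit code 1 and a JSON failure object.

    On success, returns (0, stdout). Nothing is silently replaced by a
    default value such as "0".
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except OSError as exc:
        # Covers a missing or non-executable binary (e.g. CLI not installed).
        return 1, json.dumps({"status": "failed", "reason": str(exc)})
    if proc.returncode != 0:
        reason = proc.stderr.strip() or f"command exited with {proc.returncode}"
        return 1, json.dumps({"status": "failed", "reason": reason})
    return 0, proc.stdout
```

A caller that previously fell back to a count of zero would instead stop and write the returned JSON object as the section result.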
jodavis approved these changes on Jun 18, 2026
jodavis pushed a commit that referenced this pull request on Jun 19, 2026
…ipts (#40)
* Create the workflow-setup skill and consume it from researcher-plan.
  The workflow-setup skill ensures we are on the right branch and that the context file is initialized. All other skills will depend on this one, so that any skill will start a context file and work in the right branch regardless of how it is invoked. The researcher-plan skill is the first to take advantage of this.
* Create the workflow-worker skill, which wraps other skills to make sure they behave correctly within a workflow context.
* Integrate workflow-setup into developer-implement
* Create the `workflow-script` skill to replace the `script-runner` agent
* Add context file usage to all the step commands used by `implement-task-pipeline.md`
* workflow-orchestrate skill to run the orchestration loop
* Bug fixes in the implement-task workflow
* Fix more bugs in the workflow system
  - Developer needs access to GitHub MCP for reading comments
  - Researcher and Reviewer need Edit to update the context file
  - Moved workflow-related scripts to the workflow-orchestrator skill
  - Deleted some obsolete commands (create-branch, dev-team) and steps (SetupWorkspaceStep)
  - Developer should not resolve review comments
  - Pass the precomputed log-file into workflow-script
* Fix 1: Forbid cd-prefixed git commands in developer skill files
  The developer agent was prepending `cd <repo-root> &&` before git commands despite the Claude Code system prompt already prohibiting this pattern. Added an explicit reminder in both developer-implement and developer-fix so the instruction is visible inside the skill context where the agent operates.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix 2: Spawn troubleshooter on non-zero dev_team.py exit or non-successful agent result
  Previously the orchestrator would stop and report on dev_team.py non-zero exit (which invited the top-level agent to investigate inline, building up context). It also had no handling at all when a workflow-worker or workflow-script returned anything other than 'successful'. Now both conditions spawn the troubleshooter sub-agent, keeping investigation out of the orchestrator's context window.
  Also added the `workflow-troubleshooter` skill, since it was missing.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix 3: Replace --watch flag with explicit polling loop in wait-pr-checks.sh
  `gh pr checks --watch` relies on interactive terminal features and was returning immediately when run in a non-interactive subprocess (the script-runner agent), causing the signoff step to evaluate checks before they had completed.
  Replace with an explicit polling loop that queries the JSON bucket field every 15 seconds until no checks remain in the "pending" bucket, with a 30-minute timeout.
  Also changed exit codes: exit 1 on failure or timeout, exit 0 only on pass, so the workflow-script agent records a meaningful failure message in the context file rather than always writing "Succeeded".
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix 4: Check for 'Succeeded' not 'passed' in signoff build result
  The workflow-script agent writes 'Succeeded' to the context section when the script exits 0, and a failure description when it exits non-zero. The signoff build-validation step was checking startswith("passed"), which never matched 'Succeeded', so the step always returned "failed" even when all checks passed. This caused the signoff->fixing-pr loop to cycle indefinitely on a clean PR.
  Updated both BuildValidationStep.handle_results() and SignoffStep.handle_results() to check for 'Succeeded', consistent with what workflow-script actually writes.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix 4 (revised): Standardize section result parsing on JSON status objects
  The original Fix 4 was wrong in two ways:
  - For spawn_agent skills (debugger-investigate, developer-create-pr), the step handlers were using text heuristics instead of parsing the JSON the skills already write to their sections.
  - For the run_script build-validation step, changing to 'Succeeded' addressed the symptom but not the root cause: the section should carry structured status, not a generic word.
  Changes:
  workflow-script/SKILL.md: when the last non-empty log line is a valid JSON object, write it as the section result instead of 'Succeeded'. Scripts that don't output JSON continue to get 'Succeeded'/'failure description' as before.
  wait-pr-checks.sh: emit a JSON status object as the final stdout line on every exit path so workflow-script picks it up into the section: {"status": "passed"} or {"status": "failed", "reason": "..."}
  dev_team.py:
  - DebugStep: read {"status": "reproduced"} from debug_report instead of scanning for the heading string "# Debug report for"
  - CreatePrStep: read {"pr_url": "..."} from the PR URL section instead of applying a regex over the raw section text
  - BuildValidationStep / SignoffStep: revert 'Succeeded' back to 'passed', now correctly sourced from parse_json_output(ctx.signoff_build_result)
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* PR #40: Fix typo and add PR check failures fetch in developer-fix.md
  Fix 'tthe' typo on line 22, and extend Step 3 to also fetch PR check failures alongside review comment threads.
* PR #40: Restore first-pass code review intro line in reviewer-review.md
  The line 'You are performing the first-pass code review for the work item described in the context file.' was accidentally removed during the refactor.
* PR #40: Use named arguments when invoking workflow-orchestrate in implement.md
  workflow-orchestrate expects --work-item-id, --workflow, and --research-skill named arguments; the previous invocation used positional args which don't match.
* PR #40: Fix $SCRIPT_DIR typo to $SKILL_DIR in workflow-orchestrate/SKILL.md
  The --workflow flag was referencing $SCRIPT_DIR, which is undefined; all other paths in the file correctly use $SKILL_DIR.
* PR #40: Clarify work-item-type and fix step 4e issues in workflow-setup/SKILL.md
  Add a 'Determining work-item-type' table so the branches that reference work-item-type are actionable. Fix duplicate '4e' heading (renamed second to '4f') and correct typo 'ins' -> 'is'.
* PR #40: Fix path expansion, fatal pull, and remote-only checkout in prepare-working-branch.py
  - Add .expanduser() so ~-prefixed context file paths resolve correctly.
  - Make pull() exit on failure; it is only called when the branch is known to exist on the remote, so any pull error is a real problem.
  - Update checkout() to use 'git checkout -b <branch> origin/<branch>' when the branch exists only on the remote, avoiding failures on clean clones where no local ref exists yet.
* PR #40: Fix test_handle_results_extracts_pr_url_from_section to use JSON section format
  The test was not updated when Fix 4 changed CreatePrStep.handle_results() to parse a JSON object (via parse_json_output) instead of a raw URL string.
* PR #40: Fail wait-pr-checks.sh when gh pr checks errors instead of defaulting to 0
  Previously, '|| echo "0"' masked auth errors, network failures, and missing gh CLI by treating them as "no pending/failing checks" and reporting success. Now uses 'if !' to capture the error output and exit with a JSON failure object when gh returns a non-zero exit code.
---------
Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Eliminates the need for intermediary agents, and makes it easier to share credentials and tools with subagents.