Refactor the workflow support into a set of skills with their own scripts#40

Merged
jodavis merged 21 commits into feature/ADR-269-agent-orchestration from dev/jodavis/skills-refactor on Jun 18, 2026

Conversation

@jodavis-claude

Collaborator

Eliminates the need for intermediary agents, and makes it easier to share credentials and tools with subagents.

ElwoodMoves and others added 13 commits June 16, 2026 08:55
The workflow-setup skill ensures we are on the right branch and that the context file is initialized. All other skills will depend on this one, so that any skill will start a context file and work in the right branch regardless of how it is invoked. The researcher-plan skill is the first to take advantage of this.
…re they behave correctly within a workflow context.
- Developer needs access to GitHub MCP for reading comments
- Researcher and Reviewer need Edit to update the context file
- Moved workflow-related scripts to the workflow-orchestrator skill
- Deleted some obsolete commands (create-branch, dev-team) and steps (SetupWorkspaceStep)
- Developer should not resolve review comments
- Pass the precomputed log-file into workflow-script
The developer agent was prepending `cd <repo-root> &&` before git commands
despite the Claude Code system prompt already prohibiting this pattern. Added
an explicit reminder in both developer-implement and developer-fix so the
instruction is visible inside the skill context where the agent operates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ssful agent result

Previously the orchestrator would stop and report on dev_team.py non-zero exit
(which invited the top-level agent to investigate inline, building up context).
It also had no handling at all when a workflow-worker or workflow-script returned
anything other than 'successful'.

Now both conditions spawn the troubleshooter sub-agent, keeping investigation
out of the orchestrator's context window.

Also added the `workflow-troubleshooter` skill, since it was missing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…cks.sh

`gh pr checks --watch` relies on interactive terminal features and was returning
immediately when run in a non-interactive subprocess (the script-runner agent),
causing the signoff step to evaluate checks before they had completed.

Replace with an explicit polling loop that queries the JSON bucket field every
15 seconds until no checks remain in the "pending" bucket, with a 30-minute
timeout.

Also changed exit codes: exit 1 on failure or timeout, exit 0 only on pass,
so the workflow-script agent records a meaningful failure message in the context
file rather than always writing "Succeeded".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
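
The polling behaviour described in this commit can be sketched in Python as follows. This is an illustration only, not the contents of wait-pr-checks.sh (which is a shell script): the function name `wait_for_checks`, the injected `fetch_buckets` callable, and the treatment of a "fail" bucket are assumptions made for the sketch. The 15-second interval, 30-minute timeout, "pending" bucket, and exit codes come from the commit message.

```python
import time
from typing import Callable, Iterable


def wait_for_checks(
    fetch_buckets: Callable[[], Iterable[str]],  # hypothetical: returns one bucket name per check
    interval_s: float = 15.0,                    # poll every 15 seconds
    timeout_s: float = 30 * 60,                  # give up after 30 minutes
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until no check is in the "pending" bucket; return a process exit code."""
    deadline = clock() + timeout_s
    while True:
        buckets = list(fetch_buckets())
        if "pending" not in buckets:
            # Exit 0 only on pass; a "fail" bucket (assumed name) yields exit 1.
            return 1 if "fail" in buckets else 0
        if clock() >= deadline:
            return 1  # timed out with checks still pending
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without waiting in real time; a real caller would leave the defaults in place.
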
The workflow-script agent writes 'Succeeded' to the context section when the
script exits 0, and a failure description when it exits non-zero. The signoff
build-validation step was checking startswith("passed"), which never matched
'Succeeded', so the step always returned "failed" even when all checks passed.
This caused the signoff->fixing-pr loop to cycle indefinitely on a clean PR.

Updated both BuildValidationStep.handle_results() and SignoffStep.handle_results()
to check for 'Succeeded', consistent with what workflow-script actually writes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…jects

The original Fix 4 was wrong in two ways:
- For spawn_agent skills (debugger-investigate, developer-create-pr), the step
  handlers were using text heuristics instead of parsing the JSON the skills
  already write to their sections.
- For the run_script build-validation step, changing to 'Succeeded' addressed
  the symptom but not the root cause: the section should carry structured status,
  not a generic word.

Changes:

workflow-script/SKILL.md: when the last non-empty log line is a valid JSON
object, write it as the section result instead of 'Succeeded'. Scripts that
don't output JSON continue to get 'Succeeded'/'failure description' as before.

wait-pr-checks.sh: emit a JSON status object as the final stdout line on every
exit path so workflow-script picks it up into the section:
  {"status": "passed"} or {"status": "failed", "reason": "..."}

dev_team.py:
- DebugStep: read {"status": "reproduced"} from debug_report instead of
  scanning for the heading string "# Debug report for"
- CreatePrStep: read {"pr_url": "..."} from the PR URL section instead of
  applying a regex over the raw section text
- BuildValidationStep / SignoffStep: revert 'Succeeded' back to 'passed', now
  correctly sourced from parse_json_output(ctx.signoff_build_result)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
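
As a rough illustration of the convention this commit describes, the sketch below takes the last non-empty log line, uses it as the section result when it is a JSON object, and otherwise falls back to the plain-word result. The helper names (`last_json_object`, `section_result`, `build_passed`) and the wording of the fallback failure text are hypothetical; they are not the actual `parse_json_output` implementation in dev_team.py.

```python
import json
from typing import Optional


def last_json_object(log_text: str) -> Optional[dict]:
    """Return the last non-empty line parsed as a JSON object, else None."""
    for line in reversed(log_text.splitlines()):
        if line.strip():
            try:
                value = json.loads(line)
            except json.JSONDecodeError:
                return None
            # Only a JSON object counts; arrays, numbers, and strings do not.
            return value if isinstance(value, dict) else None
    return None


def section_result(log_text: str, exit_code: int) -> str:
    """Choose what a script runner would write into the context section."""
    obj = last_json_object(log_text)
    if obj is not None:
        return json.dumps(obj)
    # Fallback for scripts that do not emit JSON (failure wording is assumed).
    return "Succeeded" if exit_code == 0 else f"Failed with exit code {exit_code}"


def build_passed(section: str) -> bool:
    """A step handler reads the structured status rather than matching a word."""
    obj = last_json_object(section)
    return obj is not None and obj.get("status") == "passed"
```

With this shape, `build_passed("Succeeded")` is false and `build_passed('{"status": "passed"}')` is true, which is the distinction the original Fix 4 blurred.
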
@github-actions

github-actions Bot commented Jun 18, 2026


build-and-test: Python test results

Status: ✅ Passed

Test log

Copilot AI left a comment

Pull request overview

This PR refactors the dev-team pipeline’s “workflow support” into first-class skills (orchestrator/worker/script/setup/troubleshooter), removing intermediary wrapper agents and moving common workflow logic into reusable scripts and standardized context-file conventions.

Changes:

  • Adds new workflow skills (workflow-orchestrate, workflow-worker, workflow-script, workflow-setup, workflow-troubleshoot) plus supporting scripts/assets for context-file management and PR-check polling.
  • Updates dev_team.py orchestration descriptors and result parsing to align with the new worker/script execution model and structured JSON outputs.
  • Removes legacy wrapper agents/commands (task-runner, old script-runner, dev-team.md, workspace-setup) and updates existing command prompts to rely on the new workflow approach.

Reviewed changes

Copilot reviewed 33 out of 36 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| plugins/dev-team/skills/workflow-worker/SKILL.md | New worker protocol for invoking a skill and writing output to a context section. |
| plugins/dev-team/skills/workflow-troubleshoot/SKILL.md | New troubleshooter skill spec for diagnosing pipeline failures via context file edits. |
| plugins/dev-team/skills/workflow-setup/SKILL.md | New setup skill spec to resolve context/spec/base branch and prep working branch. |
| plugins/dev-team/skills/workflow-setup/scripts/prepare-working-branch.py | New git-branch preparation script for workflow runs. |
| plugins/dev-team/skills/workflow-setup/scripts/init-context-file.py | Initializes context files from a template if missing. |
| plugins/dev-team/skills/workflow-setup/scripts/find-spec-file.py | Finds _spec_*.md containing the work item and writes spec_path to context. |
| plugins/dev-team/skills/workflow-setup/scripts/compute-context-file.py | Computes canonical context file path from repo slug and work item. |
| plugins/dev-team/skills/workflow-setup/assets/context_template.md | Template for workflow context frontmatter and header. |
| plugins/dev-team/skills/workflow-script/SKILL.md | New script-runner protocol: run a command, log output, write result+log path to context. |
| plugins/dev-team/skills/workflow-orchestrate/SKILL.md | New orchestrator loop spec replacing the legacy dev-team.md orchestration wrapper. |
| plugins/dev-team/skills/workflow-orchestrate/scripts/wait-pr-checks.sh | New polling-based PR checks waiter that emits a final JSON status line. |
| plugins/dev-team/skills/workflow-orchestrate/scripts/test_dev_team.py | Adds/updates tests for dev_team.py step-machine and helpers. |
| plugins/dev-team/skills/workflow-orchestrate/scripts/get-context-path.sh | Canonical context-path resolver delegating to dev_team.py --print-context-path. |
| plugins/dev-team/skills/workflow-orchestrate/scripts/dev_team.py | Updates step-machine behavior and descriptors to work with new skills + JSON parsing. |
| plugins/dev-team/skills/workflow-orchestrate/assets/implement-task-plan.md | Refactors the implement workflow definition (Mermaid state machine). |
| plugins/dev-team/skills/workflow-orchestrate/assets/fix-issue-plan.md | Adds a new “fix issue” workflow definition with a debugging phase. |
| plugins/dev-team/skills/identify-project-work-items/SKILL.md | New skill documenting project work-item ID/type patterns. |
| plugins/dev-team/scripts/wait-pr-checks.sh | Removes legacy wait-pr-checks script (moved under workflow-orchestrate). |
| plugins/dev-team/scripts/test_contributing_md.py | Removes CONTRIBUTING.md existence/section tests. |
| plugins/dev-team/commands/workspace-setup.md | Removes legacy workspace-setup command (setup moved into workflow-setup). |
| plugins/dev-team/commands/reviewer-sign-off.md | Updates sign-off reviewer instructions to rely on context file inputs. |
| plugins/dev-team/commands/reviewer-review.md | Updates reviewer instructions to rely on context file inputs. |
| plugins/dev-team/commands/researcher-validate.md | Updates researcher validation instructions to rely on context file inputs. |
| plugins/dev-team/commands/researcher-plan.md | Updates researcher planning instructions to rely on context file inputs. |
| plugins/dev-team/commands/implement.md | Switches to invoking workflow-orchestrate instead of the old dev-team skill. |
| plugins/dev-team/commands/developer-implement.md | Updates developer implementation instructions for the new workflow model. |
| plugins/dev-team/commands/developer-fix.md | Updates developer fix instructions for the new workflow model. |
| plugins/dev-team/commands/developer-create-pr.md | Updates PR creation instructions to use context-file fields for base/pr state. |
| plugins/dev-team/commands/dev-team.md | Removes legacy orchestration command doc. |
| plugins/dev-team/commands/create-branch.md | Removes legacy branch-creation command doc. |
| plugins/dev-team/agents/workspace-setup.md | Removes legacy workspace-setup agent. |
| plugins/dev-team/agents/task-runner.md | Removes legacy task-runner wrapper agent. |
| plugins/dev-team/agents/script-runner.md | Simplifies script-runner agent to the new workflow-script/context-editing model. |
| plugins/dev-team/agents/reviewer.md | Adds Edit tool to reviewer agent for context-file section updates. |
| plugins/dev-team/agents/researcher.md | Adds Edit tool to researcher agent for context-file section updates. |
| plugins/dev-team/agents/developer.md | Adds GitHub MCP tools for PR reading/review/comment/update actions. |

Comments suppressed due to low confidence (1)

plugins/dev-team/skills/workflow-orchestrate/assets/implement-task-plan.md:4

  • This workflow jumps from init straight to researching, which means spec_path will never be populated by dev_team.py (since FindSpecStep runs only in spec-finding). With the default context template leaving spec_path empty, the researcher step won’t have an authoritative spec path to read.

Comment thread plugins/dev-team/skills/workflow-orchestrate/SKILL.md
Comment thread plugins/dev-team/skills/workflow-setup/SKILL.md
Comment thread plugins/dev-team/skills/workflow-setup/SKILL.md
Comment thread plugins/dev-team/skills/workflow-setup/scripts/prepare-working-branch.py Outdated
Comment thread plugins/dev-team/skills/workflow-orchestrate/scripts/wait-pr-checks.sh Outdated
Comment thread plugins/dev-team/commands/developer-fix.md Outdated
Comment thread plugins/dev-team/commands/developer-fix.md
Comment thread plugins/dev-team/commands/developer-fix.md Outdated
Comment thread plugins/dev-team/commands/reviewer-review.md
Comment thread plugins/dev-team/commands/implement.md Outdated
Fix 'tthe' typo on line 22, and extend Step 3 to also fetch PR check
failures alongside review comment threads.
The line 'You are performing the first-pass code review for the work item
described in the context file.' was accidentally removed during the refactor.
…lement.md

workflow-orchestrate expects --work-item-id, --workflow, and --research-skill
named arguments; the previous invocation used positional args which don't match.
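
To make the mismatch concrete, here is a small sketch using Python's `argparse`. It is an analogy only: workflow-orchestrate is a skill spec, and how it actually reads its arguments is not shown in this PR. The three flag names come from the commit message; the parser itself and the example values are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that accepts only the three named arguments."""
    parser = argparse.ArgumentParser(prog="workflow-orchestrate")
    parser.add_argument("--work-item-id", required=True)
    parser.add_argument("--workflow", required=True)
    parser.add_argument("--research-skill", required=True)
    return parser


# Named invocation parses cleanly; dashes become underscores on the namespace.
args = build_parser().parse_args(
    ["--work-item-id", "ADR-269",
     "--workflow", "implement-task-plan.md",
     "--research-skill", "researcher-plan"]
)
print(args.work_item_id, args.workflow, args.research_skill)
```

Passing the same three values positionally to a parser like this exits with a usage error, because none of the required named arguments are supplied.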
…ILL.md

The --workflow flag was referencing $SCRIPT_DIR which is undefined; all other
paths in the file correctly use $SKILL_DIR.
…up/SKILL.md

Add a 'Determining work-item-type' table so the branches that reference
work-item-type are actionable. Fix duplicate '4e' heading (renamed second
to '4f') and correct typo 'ins' -> 'is'.
…repare-working-branch.py

- Add .expanduser() so ~-prefixed context file paths resolve correctly.
- Make pull() exit on failure; it is only called when the branch is known
  to exist on the remote, so any pull error is a real problem.
- Update checkout() to use 'git checkout -b <branch> origin/<branch>'
  when the branch exists only on the remote, avoiding failures on clean
  clones where no local ref exists yet.
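
The three fixes above might look roughly like the sketch below. It is not the code in prepare-working-branch.py: the function names, their signatures, and the handling of a branch that exists nowhere are assumptions for illustration. Only `.expanduser()`, the fatal pull, and the `git checkout -b <branch> origin/<branch>` form come from the commit message.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def resolve_context_path(raw: str) -> Path:
    """Expand a leading ~ so home-relative context file paths resolve."""
    return Path(raw).expanduser()


def checkout_command(branch: str, exists_locally: bool, exists_on_remote: bool) -> List[str]:
    """Pick the git checkout invocation for the branch's current state."""
    if exists_locally:
        return ["git", "checkout", branch]
    if exists_on_remote:
        # Clean clone: create the local branch from the remote ref.
        return ["git", "checkout", "-b", branch, f"origin/{branch}"]
    # Assumed behaviour for a brand-new branch; not described in the commit.
    return ["git", "checkout", "-b", branch]


def pull() -> None:
    """Treat any pull error as fatal, since the remote branch is known to exist."""
    result = subprocess.run(["git", "pull"], capture_output=True, text=True)
    if result.returncode != 0:
        sys.exit(f"git pull failed: {result.stderr.strip()}")
```

Returning the command as a list rather than running it directly keeps the branch-selection logic easy to unit test.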
…SON section format

The test was not updated when Fix 4 changed CreatePrStep.handle_results() to
parse a JSON object (via parse_json_output) instead of a raw URL string.
…faulting to 0

Previously, '|| echo "0"' masked auth errors, network failures, and missing
gh CLI by treating them as "no pending/failing checks" and reporting success.
Now uses 'if !' to capture the error output and exit with a JSON failure
object when gh returns a non-zero exit code.
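
A Python analogue of this change is sketched below, under stated assumptions: the real script is shell, `pending_or_failure` and the injected `query` callable are hypothetical names, and the JSON array-of-objects output shape with a `bucket` key is assumed. The point it illustrates is that a failed query becomes a JSON failure object instead of being silently counted as zero pending checks.

```python
import json
from typing import Callable, Tuple

# (exit_code, stdout, stderr) of one checks query; injected so the sketch
# does not depend on the gh CLI being installed.
QueryResult = Tuple[int, str, str]


def pending_or_failure(query: Callable[[], QueryResult]) -> Tuple[int, str]:
    """Return (pending_count, "") on success, or (-1, failure_json) on error."""
    try:
        code, out, err = query()
    except OSError as exc:  # for example, the CLI binary is missing
        code, out, err = 127, "", str(exc)
    if code != 0:
        # Surface the error rather than defaulting the count to 0.
        reason = err.strip() or f"gh exited with code {code}"
        return -1, json.dumps({"status": "failed", "reason": reason})
    buckets = [check.get("bucket") for check in json.loads(out)]
    return buckets.count("pending"), ""
```

A caller would print the failure JSON as its final stdout line and exit non-zero whenever the count is -1.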
@jodavis jodavis merged commit d9004d5 into feature/ADR-269-agent-orchestration Jun 18, 2026
1 check passed
@jodavis jodavis deleted the dev/jodavis/skills-refactor branch June 18, 2026 23:45
jodavis pushed a commit that referenced this pull request Jun 19, 2026
…ipts (#40)

* Create the workflow-setup skill and consume it from researcher-plan.

The workflow-setup skill ensures we are on the right branch and that the context file is initialized. All other skills will depend on this one, so that any skill will start a context file and work in the right branch regardless of how it is invoked. The researcher-plan skill is the first to take advantage of this.

* Create the workflow-worker skill, which wraps other skills to make sure they behave correctly within a workflow context.

* Integrate workflow-setup into developer-implement

* Create the `workflow-script` skill to replace the `script-runner` agent

* Add context file usage to all the step commands used by `implement-task-pipeline.md`

* workflow-orchestrate skill to run the orchestration loop

* Bug fixes in the implement-task workflow

* Fix more bugs in the workflow system
- Developer needs access to GitHub MCP for reading comments
- Researcher and Reviewer need Edit to update the context file
- Moved workflow-related scripts to the workflow-orchestrator skill
- Deleted some obsolete commands (create-branch, dev-team) and steps (SetupWorkspaceStep)
- Developer should not resolve review comments
- Pass the precomputed log-file into workflow-script

* Fix 1: Forbid cd-prefixed git commands in developer skill files

The developer agent was prepending `cd <repo-root> &&` before git commands
despite the Claude Code system prompt already prohibiting this pattern. Added
an explicit reminder in both developer-implement and developer-fix so the
instruction is visible inside the skill context where the agent operates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix 2: Spawn troubleshooter on non-zero dev_team.py exit or non-successful agent result

Previously the orchestrator would stop and report on dev_team.py non-zero exit
(which invited the top-level agent to investigate inline, building up context).
It also had no handling at all when a workflow-worker or workflow-script returned
anything other than 'successful'.

Now both conditions spawn the troubleshooter sub-agent, keeping investigation
out of the orchestrator's context window.

Also added the `workflow-troubleshooter` skill, since it was missing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix 3: Replace --watch flag with explicit polling loop in wait-pr-checks.sh

`gh pr checks --watch` relies on interactive terminal features and was returning
immediately when run in a non-interactive subprocess (the script-runner agent),
causing the signoff step to evaluate checks before they had completed.

Replace with an explicit polling loop that queries the JSON bucket field every
15 seconds until no checks remain in the "pending" bucket, with a 30-minute
timeout.

Also changed exit codes: exit 1 on failure or timeout, exit 0 only on pass,
so the workflow-script agent records a meaningful failure message in the context
file rather than always writing "Succeeded".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix 4: Check for 'Succeeded' not 'passed' in signoff build result

The workflow-script agent writes 'Succeeded' to the context section when the
script exits 0, and a failure description when it exits non-zero. The signoff
build-validation step was checking startswith("passed"), which never matched
'Succeeded', so the step always returned "failed" even when all checks passed.
This caused the signoff->fixing-pr loop to cycle indefinitely on a clean PR.

Updated both BuildValidationStep.handle_results() and SignoffStep.handle_results()
to check for 'Succeeded', consistent with what workflow-script actually writes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix 4 (revised): Standardize section result parsing on JSON status objects

The original Fix 4 was wrong in two ways:
- For spawn_agent skills (debugger-investigate, developer-create-pr), the step
  handlers were using text heuristics instead of parsing the JSON the skills
  already write to their sections.
- For the run_script build-validation step, changing to 'Succeeded' addressed
  the symptom but not the root cause: the section should carry structured status,
  not a generic word.

Changes:

workflow-script/SKILL.md: when the last non-empty log line is a valid JSON
object, write it as the section result instead of 'Succeeded'. Scripts that
don't output JSON continue to get 'Succeeded'/'failure description' as before.

wait-pr-checks.sh: emit a JSON status object as the final stdout line on every
exit path so workflow-script picks it up into the section:
  {"status": "passed"} or {"status": "failed", "reason": "..."}

dev_team.py:
- DebugStep: read {"status": "reproduced"} from debug_report instead of
  scanning for the heading string "# Debug report for"
- CreatePrStep: read {"pr_url": "..."} from the PR URL section instead of
  applying a regex over the raw section text
- BuildValidationStep / SignoffStep: revert 'Succeeded' back to 'passed', now
  correctly sourced from parse_json_output(ctx.signoff_build_result)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* PR #40: Fix typo and add PR check failures fetch in developer-fix.md

Fix 'tthe' typo on line 22, and extend Step 3 to also fetch PR check
failures alongside review comment threads.

* PR #40: Restore first-pass code review intro line in reviewer-review.md

The line 'You are performing the first-pass code review for the work item
described in the context file.' was accidentally removed during the refactor.

* PR #40: Use named arguments when invoking workflow-orchestrate in implement.md

workflow-orchestrate expects --work-item-id, --workflow, and --research-skill
named arguments; the previous invocation used positional args which don't match.

* PR #40: Fix $SCRIPT_DIR typo to $SKILL_DIR in workflow-orchestrate/SKILL.md

The --workflow flag was referencing $SCRIPT_DIR which is undefined; all other
paths in the file correctly use $SKILL_DIR.

* PR #40: Clarify work-item-type and fix step 4e issues in workflow-setup/SKILL.md

Add a 'Determining work-item-type' table so the branches that reference
work-item-type are actionable. Fix duplicate '4e' heading (renamed second
to '4f') and correct typo 'ins' -> 'is'.

* PR #40: Fix path expansion, fatal pull, and remote-only checkout in prepare-working-branch.py

- Add .expanduser() so ~-prefixed context file paths resolve correctly.
- Make pull() exit on failure; it is only called when the branch is known
  to exist on the remote, so any pull error is a real problem.
- Update checkout() to use 'git checkout -b <branch> origin/<branch>'
  when the branch exists only on the remote, avoiding failures on clean
  clones where no local ref exists yet.

* PR #40: Fix test_handle_results_extracts_pr_url_from_section to use JSON section format

The test was not updated when Fix 4 changed CreatePrStep.handle_results() to
parse a JSON object (via parse_json_output) instead of a raw URL string.

* PR #40: Fail wait-pr-checks.sh when gh pr checks errors instead of defaulting to 0

Previously, '|| echo "0"' masked auth errors, network failures, and missing
gh CLI by treating them as "no pending/failing checks" and reporting success.
Now uses 'if !' to capture the error output and exit with a JSON failure
object when gh returns a non-zero exit code.

---------

Co-authored-by: Joe Davis <ElwoodMoves@hotmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
4 participants