Skip to content

Commit 8c40c39

Browse files
authored
Add PPL bugfix skill for Claude Code (#5307)
* Add PPL bugfix harness and /ppl-bugfix slash command - ppl-bugfix-harness.md: systematic bugfix SOP distilled from 15+ historical commits, covering triage, TDD-style fix paths (A-E by bug layer), test templates (UT/IT/YAML), verification, PR creation, decision log, and post-PR follow-up - .claude/commands/ppl-bugfix.md: slash command that auto-resolves issue↔PR, dispatches subagent with worktree isolation and scoped permissions, handles both initial fix and follow-up flows - CLAUDE.md: reference to /ppl-bugfix as the entry point for PPL bugs - .gitignore: allow .claude/commands/ and settings.json to be tracked Signed-off-by: Heng Qian <qianheng@amazon.com> * Improve PPL bugfix command: permission modes, parallel dispatch, file reorg - Add --safe/--yolo permission mode flags (default: bypassPermissions) - Support multiple issue/PR references for parallel processing - Move harness files to .claude/harness/ directory - Add ppl-bugfix-followup.md for post-PR follow-up workflow - Add .claude/settings.json with pre-approved tool permissions - Add CLAUDE_GUIDE.md documenting all slash commands Signed-off-by: Heng Qian <qianheng@amazon.com> * Fix follow-up harness to check PR comments including bot feedback The Reconstruct Context section now explicitly loads all PR comments (bot and human) and categorizes bot suggestions as actionable review feedback, not just CI checks and human reviews. Signed-off-by: Heng Qian <qianheng@amazon.com> * Address review feedback from bot comments on PR #5307 - Add missing git merge permission to settings.json (used in harness merge steps) - Search all PR closing keyword variants (Resolves/Fixes/Closes) for issue-PR linkage - Replace grep -oP with portable grep -oE for macOS compatibility - Use --force-with-lease instead of --force in cherry-pick cleanup flow - Make post-merge test re-run mandatory with explicit command - Add --repo opensearch-project/sql to gh pr create for correct targeting Signed-off-by: Heng Qian <qianheng@amazon.com> * Tighten git permissions: restrict push and reset to safe variants - Replace broad `git push:*` with `git push -u:*` and `git push --force-with-lease:*` to prevent accidental force-pushes to upstream - Restrict `git reset:*` to `git reset --soft:*` since the harness only uses soft reset - Update followup harness push commands to use `-u` flag consistently Signed-off-by: Heng Qian <qianheng@amazon.com> * Add retrospective step to follow-up harness Followup agent now reflects on each feedback comment to identify gaps in the harness that should have prevented the issue, and fixes them in the same commit. Signed-off-by: Heng Qian <qianheng@amazon.com> * Streamline ppl-bugfix workflow: fix 5 cross-cutting issues - Add fork remote discovery step to both harness and followup (replace <your_fork_remote> placeholder with concrete instructions) - Remove redundant follow-up type from command dispatch — subagent discovers it from PR state via followup harness categorization table - Fix Phase 0.1: remove fake REST API call, use integration test only - Add no-coauthor rule to followup harness (was only in harness) - Add --repo flag to gh pr checkout in followup for worktree context Signed-off-by: Heng Qian <qianheng@amazon.com> * Phase 4: clarify that harness improvements go in the same PR Signed-off-by: Heng Qian <qianheng@amazon.com> * Enforce hard stop when bug is already fixed on main Subagent was ignoring the "stop and report back" instruction and still creating regression tests + PRs for bugs that no longer reproduce. - Harness: clarify definition of "not reproducible", replace soft "stop and report back" with explicit HARD STOP + forbidden actions - Skill: add CRITICAL instruction in subagent dispatch prompt so it sees the constraint before even reading the harness Signed-off-by: Heng Qian <qianheng@amazon.com> * Streamline ppl-bugfix harness and CLAUDE.md after eval Harness: 369→151 lines. Move fix paths, test templates, symptom table, and case index to ppl-bugfix-reference.md (loaded on demand). Add completion gate with 8 mandatory deliverables. Add guardrails for runaway agents. Merge overlapping checklists. CLAUDE.md: trim build commands, remove v2 engine references, simplify ppl-bugfix section. Command: add bypassPermissions allow-list note. Signed-off-by: Heng Qian <qianheng@amazon.com> --------- Signed-off-by: Heng Qian <qianheng@amazon.com>
1 parent d984298 commit 8c40c39

8 files changed

Lines changed: 631 additions & 44 deletions

File tree

.claude/commands/ppl-bugfix.md

Lines changed: 125 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,125 @@
1+
---
2+
allowed-tools: Agent, Read, Bash(gh:*), Bash(git:*)
3+
description: Run the PPL bugfix harness for a GitHub issue or follow up on an existing PR
4+
---
5+
6+
Fix a PPL bug or follow up on an existing PR using the harness in `.claude/harness/ppl-bugfix-harness.md`.
7+
8+
## Input
9+
10+
Accepts one or more issue/PR references. Multiple references are processed in parallel (each gets its own subagent + worktree).
11+
12+
- `/ppl-bugfix #1234` — single issue
13+
- `/ppl-bugfix PR#5678` — single PR
14+
- `/ppl-bugfix #1234 #5678 PR#9012` — multiple in parallel
15+
- `/ppl-bugfix https://github.com/opensearch-project/sql/issues/1234` — URL
16+
17+
Optional mode flag (append to any of the above):
18+
- `--safe``acceptEdits` mode. Auto-approve file edits only, Bash commands require manual approval. (Most conservative)
19+
- `--yolo``bypassPermissions` mode. Fully trusted, no prompts. Subagent runs in an isolated worktree so this is safe. (Default)
20+
21+
> **Note**: `bypassPermissions` skips the interactive prompt but still respects the allow-list in `~/.claude/settings.json`. Ensure git/gh write commands are in the global allow-list.
22+
23+
Examples:
24+
- `/ppl-bugfix #1234` — single issue, defaults to yolo
25+
- `/ppl-bugfix #1234 #5678 --yolo` — two issues in parallel
26+
- `/ppl-bugfix PR#5293 PR#5300` — two PRs in parallel
27+
- `/ppl-bugfix #1234 PR#5678 --safe` — mix of issue and PR
28+
29+
If no argument given, ask for an issue or PR number.
30+
31+
## Step 0: Resolve Permission Mode
32+
33+
Parse the mode flag from the input arguments:
34+
35+
| Flag | Mode |
36+
|------|------|
37+
| `--safe` | `acceptEdits` |
38+
| `--yolo` | `bypassPermissions` |
39+
| _(no flag)_ | `bypassPermissions` (default) |
40+
41+
Use the resolved mode as the `mode` parameter when dispatching the subagent in Step 2A/2B.
42+
43+
## Step 1: Resolve Each Reference
44+
45+
For each issue/PR reference in the input, resolve its state. Run these lookups in parallel when there are multiple references.
46+
47+
```bash
48+
# Issue → PR (check multiple closing keyword variants)
49+
gh pr list --search "Resolves #<issue_number>" --json number,url,state --limit 5
50+
gh pr list --search "Fixes #<issue_number>" --json number,url,state --limit 5
51+
gh pr list --search "Closes #<issue_number>" --json number,url,state --limit 5
52+
53+
# PR → Issue
54+
gh pr view <pr_number> --json body | jq -r '.body' | grep -oiE '(resolves|fixes|closes) #[0-9]+' | grep -oE '[0-9]+'
55+
```
56+
57+
| State | Action |
58+
|-------|--------|
59+
| Issue exists, no PR | **Initial Fix** (Step 2A) |
60+
| Issue exists, open PR found | **Follow-up** (Step 2B) |
61+
| PR provided directly | **Follow-up** (Step 2B) |
62+
63+
## Step 2: Dispatch Subagents
64+
65+
Dispatch one subagent per reference. When there are multiple references, dispatch all subagents in a single message (parallel execution).
66+
67+
### 2A: Initial Fix
68+
69+
```
70+
Agent(
71+
mode: "<resolved_mode>",
72+
isolation: "worktree",
73+
name: "bugfix-<issue_number>",
74+
description: "PPL bugfix #<issue_number>",
75+
prompt: "Read .claude/harness/ppl-bugfix-harness.md and follow it to fix GitHub issue #<issue_number>.
76+
Follow Phase 0 through Phase 3 in order.
77+
Phase 0.3 defines TDD execution flow. Do NOT skip any phase.
78+
CRITICAL: If Phase 0.1 determines the bug is already fixed on main, HARD STOP.
79+
Do NOT write tests, do NOT create a PR — just comment/close the issue and report back.
80+
If the bug IS reproducible, post the Decision Log (Phase 3.4) before completing."
81+
)
82+
```
83+
84+
### 2B: Follow-up
85+
86+
```
87+
Agent(
88+
mode: "<resolved_mode>",
89+
isolation: "worktree",
90+
name: "bugfix-<issue_number>",
91+
description: "PPL bugfix #<issue_number> followup",
92+
prompt: "Read .claude/harness/ppl-bugfix-followup.md and follow it.
93+
PR: <pr_number> (<pr_url>), Issue: #<issue_number>"
94+
)
95+
```
96+
97+
## Step 3: Report Back
98+
99+
After all subagents complete, report a summary for each:
100+
- Classification, fix summary, PR URL, worktree path and branch, items needing human attention (2A)
101+
- What was addressed, current PR state, whether another round is needed (2B)
102+
103+
## Subagent Lifecycle
104+
105+
Subagents are task-scoped. They complete and release context — they cannot poll for events.
106+
107+
```
108+
Agent A (Phase 0-3) → creates PR → completes
109+
(CI runs, reviewers comment, conflicts arise)
110+
Agent B (Phase 3.5) → handles feedback → completes
111+
(repeat as needed)
112+
Agent N (Phase 3.5) → gh pr ready → done
113+
```
114+
115+
Context is preserved across agents via:
116+
- **Decision Log** (PR comment) — single source of truth for rejected alternatives, pitfalls, design rationale
117+
- **GitHub state** (PR diff, review comments, CI logs) — reconstructed by each follow-up agent
118+
119+
## Rules
120+
121+
- Subagent reads `.claude/harness/ppl-bugfix-harness.md` and fetches issue/PR details itself — do NOT inline content into the prompt
122+
- If bug is not reproducible (Phase 0.1), stop and report — do not proceed
123+
- Issue ↔ PR auto-resolution means the user never needs to track PR numbers manually
124+
- **Do NOT use `mode: "auto"` for subagents**`auto` mode does not work for subagents; Bash commands still require manual approval. Only `bypassPermissions` reliably skips permission checks.
125+
- **Always dispatch subagent** — even for trivial follow-ups (remove co-author, force push). Do NOT run commands directly in the main session; subagents with `bypassPermissions` skip permission prompts, the main session does not.
Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,103 @@
1+
# PPL Bugfix Follow-up
2+
3+
## Rules
4+
5+
- Do NOT add `Co-Authored-By` lines in commits — only DCO `Signed-off-by`
6+
7+
---
8+
9+
## Reconstruct Context
10+
11+
The follow-up agent runs in a fresh worktree. First checkout the PR branch, then load state:
12+
13+
```bash
14+
# Checkout the PR branch in this worktree
15+
gh pr checkout <pr_number>
16+
17+
# Resolve fork remote — the worktree may only have origin (upstream)
18+
git remote -v
19+
# If no fork remote exists, add it:
20+
git remote add fork https://github.com/<fork_owner>/sql.git
21+
22+
# Load PR state — reviews, CI, mergeability
23+
gh pr view <pr_number> --json title,body,state,reviews,statusCheckRollup,mergeable
24+
gh pr checks <pr_number>
25+
26+
# Load ALL comments — includes bot comments (Code-Diff-Analyzer, PR Reviewer Guide, Code Suggestions) and human comments
27+
gh pr view <pr_number> --json comments --jq '.comments[] | {author: .author.login, body: .body}'
28+
```
29+
30+
Categorize ALL signals — not just CI and human reviews:
31+
32+
| Signal | Type |
33+
|--------|------|
34+
| `statusCheckRollup` has failures | CI failure |
35+
| `reviews` has CHANGES_REQUESTED | Review feedback |
36+
| `mergeable` is CONFLICTING | Merge conflict |
37+
| Bot comments with actionable suggestions | Review feedback (treat like human review) |
38+
| All pass + approved | Ready — run `gh pr ready` |
39+
40+
## Handle Review Feedback
41+
42+
For each comment (human OR bot), **cross-check against the Decision Log first**:
43+
44+
| Type | Action |
45+
|------|--------|
46+
| Code change | If already rejected in Decision Log, reply with reasoning. Otherwise make the change, new commit, push |
47+
| Question | Reply with explanation — Decision Log often has the answer |
48+
| Nit | Fix if trivial |
49+
| Disagreement | Reply with Decision Log reasoning; if reviewer insists, escalate to user |
50+
51+
```bash
52+
git add <files> && git commit -s -m "Address review feedback: <description>"
53+
git push -u fork <branch_name>
54+
```
55+
56+
## Clean Up Commit History
57+
58+
When you need to amend a commit (e.g. remove Co-Authored-By, reword message) and the branch has a merge commit on top, don't try `git reset --soft origin/main` — it will include unrelated changes if main has moved. Instead cherry-pick the fix onto latest main:
59+
60+
```bash
61+
git checkout -B clean-branch origin/main
62+
git cherry-pick <fix_commit_sha>
63+
git commit --amend -s -m "<updated message>"
64+
git push fork clean-branch:<pr_branch> --force-with-lease
65+
```
66+
67+
## Handle CI Failures
68+
69+
```bash
70+
gh pr checks <pr_number> # Identify failures
71+
gh run view <run_id> --log-failed # Read logs
72+
# Test failure → fix locally, push new commit
73+
# Spotless → ./gradlew spotlessApply, push
74+
# Flaky → gh run rerun <run_id> --failed
75+
```
76+
77+
## Handle Merge Conflicts
78+
79+
```bash
80+
git fetch origin && git merge origin/main # Resolve conflicts
81+
./gradlew spotlessApply && ./gradlew test && ./gradlew :integ-test:integTest # Re-verify
82+
git commit -s -m "Resolve merge conflicts with main"
83+
git push -u fork <branch_name>
84+
```
85+
86+
## Mark Ready
87+
88+
```bash
89+
gh pr ready <pr_number>
90+
```
91+
92+
## Retrospective
93+
94+
After handling follow-up, reflect on the feedback received and check if it reveals gaps in the harness or command:
95+
96+
For each comment addressed (bot or human):
97+
- **Does the feedback point to a pattern the harness should have prevented?** → Add guidance to the relevant Phase in `ppl-bugfix-harness.md`
98+
- **Was this a repeated mistake across PRs?** → Add to Quick Reference or Case Index
99+
- **Did the harness template produce the problematic code?** → Fix the template directly
100+
- **Was a permission or tool missing?** → Add to `.claude/settings.json`
101+
- **Did the follow-up workflow itself miss this signal?** → Update this file
102+
103+
If any improvement is needed, make the edit and include it in the same commit.
Lines changed: 151 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,151 @@
1+
# PPL Bugfix Harness
2+
3+
## Phase 0: Triage
4+
5+
### 0.1 Load & Reproduce
6+
7+
```bash
8+
gh issue view <issue_number> --repo opensearch-project/sql
9+
```
10+
11+
Write a failing test or run an existing one to reproduce the bug on `main`.
12+
13+
If the bug **does not reproduce** (correct results, not infra failure):
14+
15+
| Finding | Action |
16+
|---------|--------|
17+
| Already fixed | `gh issue comment` + `gh issue close` |
18+
| Older version only | `gh issue comment` + `gh issue close` |
19+
| Intermittent | Label `flaky` or `needs-info`, do NOT close |
20+
| Can't reproduce | Comment asking for repro steps, label `needs-info` |
21+
22+
**HARD STOP** — do not proceed. Report back.
23+
24+
### 0.2 Classify
25+
26+
Identify the bug layer (Grammar, AST/Functions, Type System, Optimizer, Execution, DI/Resource) and record it. Consult `.claude/harness/ppl-bugfix-reference.md` for fix-path-specific guidance if needed.
27+
28+
### 0.3 Guardrails
29+
30+
Stop and report back if:
31+
- Root cause unclear after reading 15+ source files
32+
- Fix breaks 5+ unrelated tests
33+
- Same build error 3 times in a row
34+
35+
### 0.4 Execution Flow
36+
37+
```
38+
Triage → Write FAILING test → Fix → Remaining tests → Verify → Commit → PR → Decision Log → Completion Gate
39+
```
40+
41+
---
42+
43+
## Phase 1: Fix
44+
45+
Find and fix the root cause. Consult `.claude/harness/ppl-bugfix-reference.md` for path-specific patterns and examples.
46+
47+
---
48+
49+
## Phase 2: Tests
50+
51+
Consult `.claude/harness/ppl-bugfix-reference.md` for test templates.
52+
53+
Required deliverables:
54+
- Failing test reproducing the bug (written BEFORE the fix)
55+
- Unit tests covering happy path and edge cases
56+
- Integration test (`*IT.java` extending `CalcitePPLIT`)
57+
- YAML REST test at `integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/<ISSUE>.yml`
58+
59+
---
60+
61+
## Phase 3: Verify & Submit
62+
63+
### 3.1 Verify
64+
65+
```bash
66+
./gradlew spotlessApply
67+
./gradlew :<module>:test --tests "<TestClass>"
68+
./gradlew test
69+
./gradlew :integ-test:integTest -Dtests.class="*<YourIT>"
70+
```
71+
72+
Run `./gradlew :integ-test:yamlRestTest` if YAML tests were added. Run `./gradlew generateGrammarSource && ./gradlew :ppl:test` if grammar was modified.
73+
74+
### 3.2 Commit & PR
75+
76+
```bash
77+
git add <changed_files>
78+
git commit -s -m "[BugFix] Fix <description> (#<issue_number>)"
79+
git fetch origin && git merge origin/main
80+
./gradlew test && ./gradlew :integ-test:integTest -Dtests.class="*<YourIT>"
81+
82+
# Resolve fork remote (check git remote -v; add if missing)
83+
git remote add fork https://github.com/<fork_owner>/sql.git
84+
git push -u fork <branch_name>
85+
```
86+
87+
Do NOT add Co-Authored-By lines. Use the git user name to infer the fork owner, or fall back to "qianheng-aws".
88+
89+
```bash
90+
gh pr create --draft --repo opensearch-project/sql \
91+
--title "[BugFix] Fix <description> (#<issue_number>)" \
92+
--body "$(cat <<'EOF'
93+
### Description
94+
<Brief description of fix and root cause>
95+
96+
### Related Issues
97+
Resolves #<issue_number>
98+
99+
### Check List
100+
- [x] New functionality includes testing
101+
- [x] Commits signed per DCO (`-s`)
102+
- [x] `spotlessCheck` passed
103+
- [x] Unit tests passed
104+
- [x] Integration tests passed
105+
EOF
106+
)"
107+
```
108+
109+
### 3.3 Decision Log
110+
111+
Post as a PR comment:
112+
113+
```bash
114+
gh pr comment <pr_number> --body "$(cat <<'EOF'
115+
## Decision Log
116+
**Root Cause**: <what actually caused the bug>
117+
**Approach**: <what was implemented and why>
118+
**Alternatives Rejected**: <what was considered and why it didn't work>
119+
**Pitfalls**: <dead ends, subtle edge cases discovered>
120+
**Things to Watch**: <known limitations, areas needing follow-up>
121+
EOF
122+
)"
123+
```
124+
125+
---
126+
127+
## Completion Gate
128+
129+
Do NOT report "done" until every item below is checked. List each in your final report:
130+
131+
- [ ] **Unit tests**: New test class or methods
132+
- [ ] **Integration test**: New `*IT.java` test
133+
- [ ] **YAML REST test**: `issues/<ISSUE>.yml`
134+
- [ ] **spotlessApply**: Ran successfully
135+
- [ ] **Tests pass**: Affected modules
136+
- [ ] **Commit**: DCO sign-off, `[BugFix]` prefix, no Co-Authored-By
137+
- [ ] **Draft PR**: `--draft`, body contains `Resolves #<issue>`
138+
- [ ] **Decision Log**: PR comment posted
139+
140+
If any item is blocked, report which and why.
141+
142+
---
143+
144+
## Phase 4: Retrospective
145+
146+
- [ ] Symptom in Quick Reference? Add if missing.
147+
- [ ] Classification correct? Fix routing if misleading.
148+
- [ ] Test template worked as-is? Fix if broken.
149+
- [ ] New pattern? Add to Case Index.
150+
151+
Include harness improvements in the same PR.

0 commit comments

Comments
 (0)