
Release: skill descriptions, evals, and writing audit for all 32 skills #76

Merged
coreyhaines31 merged 13 commits into main from development on Mar 5, 2026


@coreyhaines31
Owner

Summary

Three improvements inspired by Anthropic's skill-creator guidance, applied across all 32 skills: pushier skill descriptions, per-skill evals, and a writing audit of the skill bodies.

All three PRs were independently reviewed, and fixes were applied before merge.

What changed

  • 64 files changed, ~3,100 lines added
  • 32 new evals/evals.json files (one per skill)
  • 32 updated SKILL.md description fields
  • 16 updated SKILL.md body sections (writing audit)

Test plan

  • All 32 eval files are valid JSON with correct schema
  • All description cross-references point to real skills
  • Zero ALL-CAPS imperatives remaining in skill bodies
  • No trigger phrase conflicts between skills
  • All assertion references verified against SKILL.md content
  • Independent Codex review passed on evals
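
The first check in the test plan could be automated along these lines. This is a minimal sketch, not the script used for this PR: the field names `prompt`, `expected_output`, and `assertions` come from the commit messages, while the top-level layout (a bare list or an `{"evals": [...]}` wrapper) and the field types are assumptions.

```python
import json

# Field names taken from the commit messages in this PR; the types are assumed.
REQUIRED_FIELDS = {"prompt": str, "expected_output": str, "assertions": list}


def validate_evals(raw_json: str) -> list[str]:
    """Return a list of problems found in one evals.json document."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    # Assumption: the top level is either a list of evals or {"evals": [...]}.
    evals = data.get("evals") if isinstance(data, dict) else data
    if not isinstance(evals, list) or not evals:
        return ["expected a non-empty list of evals"]

    problems = []
    for index, entry in enumerate(evals):
        if not isinstance(entry, dict):
            problems.append(f"eval {index}: not an object")
            continue
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(entry.get(field), expected_type):
                problems.append(f"eval {index}: missing or mistyped '{field}'")
    return problems
```

Running it over each skill's `evals/evals.json` and failing on any non-empty result would cover the "valid JSON with correct schema" item, provided the assumed schema matches the real one.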

🤖 Generated with Claude Code

coreyhaines31 and others added 13 commits March 4, 2026 13:02
Following Anthropic skill-creator guidance that Claude undertriggers
skills, make descriptions pushier across all 32 skills:

- Add casual/frustrated user phrases
- Add implicit need triggers where users need the skill but don't name it
- Add catch-all sentences explaining when to use
- Add missing cross-references between related skills
- Ensure consistent format across all descriptions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
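
One way to confirm that the added cross-references point to real skills is sketched below. The convention that a cross-reference is written as `see <skill-name>` is hypothetical, chosen only for illustration; the actual wording in the descriptions may differ, and the pattern would need adjusting to match.

```python
import re

# Hypothetical convention: a cross-reference is written "see <skill-name>".
CROSS_REF = re.compile(r"\bsee ([a-z0-9]+(?:-[a-z0-9]+)*)\b")


def dangling_references(descriptions: dict[str, str]) -> dict[str, list[str]]:
    """Map each skill to the referenced skill names that do not exist."""
    known = set(descriptions)
    dangling = {}
    for skill, text in descriptions.items():
        missing = sorted(
            {ref for ref in CROSS_REF.findall(text) if ref not in known}
        )
        if missing:
            dangling[skill] = missing
    return dangling
```

Here `descriptions` maps each skill directory name to its SKILL.md description text; an empty result means every reference resolved.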
5 eval prompts per skill testing realistic user scenarios:
- page-cro: landing page audit, pricing page, homepage, feature page, redesign regression
- copywriting: homepage copy, headline rewrite, pricing page, landing page, CTA improvement
- seo-audit: full site audit, ranking diagnosis, migration recovery, e-commerce technical, blog content

Follows the skill-creator eval format with prompt + expected_output assertions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changes per skill:

page-cro (5 -> 7 evals):
- Added structured assertions array to all evals
- Added Blog Post CRO eval (was missing)
- Added boundary eval: signup form should defer to signup-flow-cro
- Replaced feature page A/B test eval with fuller feature page eval

copywriting (5 -> 7 evals):
- Added structured assertions array to all evals
- Added About Page eval with voice/tone adaptation test
- Added boundary eval: email sequence should defer to email-sequence
- Added Quick Quality Check eval (buzzword/jargon/exclamation detection)
- Added meta content assertion to homepage eval

seo-audit (5 -> 8 evals):
- Added structured assertions array to all evals
- Added Local Business eval (NAP, GBP, location pages)
- Added Core Web Vitals / site speed eval
- Added boundary eval: FAQ schema should defer to schema-markup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…2 skills)

Each skill now has 5-8 evals covering:
- Core framework usage with realistic prompts
- Casual trigger phrase variants
- Sub-type and section-specific coverage
- Boundary tests (skill deferral to related skills)
- Structured assertions for grading

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
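
The "5-8 evals per skill" claim can be checked with a small tally. This is a sketch under the assumption that each eval is a dict with an `assertions` list, loaded elsewhere into a mapping of skill name to evals; the function name and return shape are illustrative.

```python
def summarize(evals_by_skill: dict[str, list[dict]], low: int = 5, high: int = 8) -> dict:
    """Count evals and assertions, and list skills outside the low..high range."""
    total_evals = sum(len(evals) for evals in evals_by_skill.values())
    total_assertions = sum(
        len(entry.get("assertions", []))
        for evals in evals_by_skill.values()
        for entry in evals
    )
    out_of_range = sorted(
        skill
        for skill, evals in evals_by_skill.items()
        if not low <= len(evals) <= high
    )
    return {
        "total_evals": total_evals,
        "total_assertions": total_assertions,
        "out_of_range": out_of_range,
    }
```

The totals give a quick cross-check against the figures quoted in the PR, and `out_of_range` names any skill that falls below 5 or above 8 evals.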
Fixes 5 issues identified by independent Codex review:
- product-marketing-context: match auto-draft workflow, section flexibility
- marketing-psychology: replace phantom models with actual SKILL.md models
- ad-creative: correct RSA pinning guidance to match skill
- free-tool-strategy: boundary test now defers to related skill (page-cro)
- paywall-upgrade-cro: boundary test references only related skills

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Audit all 32 SKILL.md files against Anthropic's skill-creator writing
guidance ("why > MUST" pattern). Replaces ALWAYS/NEVER/MUST/IMPORTANT
imperatives with explanations of WHY the guidance matters.

17 edits across 14 skills:
- ad-creative: character limits reasoning, CTA headline reasoning
- seo-audit: schema detection warning softened, reasoning added
- programmatic-seo: subfolder vs subdomain reasoning
- paid-ads: exclusion list reasoning
- copywriting: honesty principle reasoning
- cold-email: follow-up value reasoning
- ai-seo: freshness signal reasoning
- churn-prevention: post-cancel path reasoning
- product-marketing-context: verbatim language reasoning
- popup-cro: close button visibility reasoning
- signup-flow-cro: label visibility reasoning
- form-cro: label visibility reasoning
- revops: fallback owner reasoning
- ab-test-setup: DON'T → Avoid

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
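
A scan for the imperatives named in this commit might look like the sketch below. The word list comes from the commit message; the case-sensitive, whole-word match is an assumption about how the audit searched, and other forms such as `DON'T` are not covered.

```python
import re

# Words named in the commit message; matching rules are an assumption.
IMPERATIVES = re.compile(r"\b(ALWAYS|NEVER|MUST|IMPORTANT)\b")


def find_imperatives(markdown_text: str) -> list[tuple[int, str]]:
    """Return (line_number, word) for each ALL-CAPS imperative found."""
    hits = []
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        for match in IMPERATIVES.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits
```

An empty list for every SKILL.md body would correspond to the "zero ALL-CAPS imperatives remaining" item in the test plan. Lowercase uses such as "must" are deliberately ignored.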
- sales-enablement: "Never demo without discovery" → "Demo after discovery, not before"
- site-architecture: "No exceptions" → explains why (backlink equity, broken pages)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- sales-enablement: "ROI calculator" → "deal-specific ROI analysis" to avoid
  conflict with free-tool-strategy which also claims "ROI calculator"
- sales-enablement: clarified scope boundary to competitor-alternatives for
  battle cards (competitor-alternatives owns battle card creation)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
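
Conflicts like the shared "ROI calculator" phrase can be caught mechanically. The sketch below assumes the trigger phrases have already been extracted from each description into a list per skill; how that extraction is done is left open, and the case-insensitive comparison is an assumption.

```python
from collections import defaultdict


def trigger_conflicts(triggers_by_skill: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each trigger phrase claimed by more than one skill to its claimants."""
    owners = defaultdict(set)
    for skill, phrases in triggers_by_skill.items():
        for phrase in phrases:
            # Normalize so "ROI calculator" and "roi calculator" collide.
            owners[phrase.strip().lower()].add(skill)
    return {
        phrase: sorted(skills)
        for phrase, skills in owners.items()
        if len(skills) > 1
    }
```

For example, listing "ROI calculator" under both sales-enablement and free-tool-strategy would surface that phrase with both skill names, which is the conflict this commit resolves by rewording one side.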
- marketing-psychology eval 4: BJ Fogg assertion did not match expected_output
  which lists Goal-Gradient Effect. Fixed.
- sales-enablement eval 2: all 6 categories assertion contradicted expected_output
  which only categorizes the 3 given objections. Fixed.
- ad-creative eval 5: TikTok hard limit corrected to recommended (80 chars
  recommended, 100 max) per SKILL.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ptimization

Optimize all 32 skill descriptions for better triggering
Add evals for all 32 skills (197 total evals, 1261 assertions)
Audit skill bodies: replace rigid imperatives with reasoning
coreyhaines31 merged commit 2f5db8d into main on Mar 5, 2026
33 checks passed