Release: skill descriptions, evals, and writing audit for all 32 skills #76
Merged
coreyhaines31 merged 13 commits into main on Mar 5, 2026
Following Anthropic skill-creator guidance that Claude undertriggers skills, make descriptions pushier across all 32 skills:
- Add casual/frustrated user phrases
- Add implicit need triggers where users need the skill but don't name it
- Add catch-all sentences explaining when to use
- Add missing cross-references between related skills
- Ensure consistent format across all descriptions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5 eval prompts per skill testing realistic user scenarios:
- page-cro: landing page audit, pricing page, homepage, feature page, redesign regression
- copywriting: homepage copy, headline rewrite, pricing page, landing page, CTA improvement
- seo-audit: full site audit, ranking diagnosis, migration recovery, e-commerce technical, blog content
Follows the skill-creator eval format with prompt + expected_output assertions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changes per skill:
page-cro (5 -> 7 evals):
- Added structured assertions array to all evals
- Added Blog Post CRO eval (was missing)
- Added boundary eval: signup form should defer to signup-flow-cro
- Replaced feature page A/B test eval with fuller feature page eval
copywriting (5 -> 7 evals):
- Added structured assertions array to all evals
- Added About Page eval with voice/tone adaptation test
- Added boundary eval: email sequence should defer to email-sequence
- Added Quick Quality Check eval (buzzword/jargon/exclamation detection)
- Added meta content assertion to homepage eval
seo-audit (5 -> 8 evals):
- Added structured assertions array to all evals
- Added Local Business eval (NAP, GBP, location pages)
- Added Core Web Vitals / site speed eval
- Added boundary eval: FAQ schema should defer to schema-markup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
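The structured assertions array is not shown in this PR description. As a rough sketch only, assuming each eval is a JSON object with `prompt`, `expected_output`, and `assertions` keys (the first two names appear in the commit messages; the exact schema, the sample text, and the `validate_eval` helper are hypothetical), a shape check for a boundary eval might look like this:

```python
import json

# Hypothetical eval entry. Field names follow the commit messages,
# but the real evals/evals.json schema may differ.
SAMPLE_EVAL = json.loads("""
{
  "prompt": "Can you fix my signup form? Nobody finishes it.",
  "expected_output": "Defers to signup-flow-cro instead of running a page audit.",
  "assertions": [
    "Recommends the signup-flow-cro skill",
    "Does not produce a full landing page audit"
  ]
}
""")


def validate_eval(entry: dict) -> list[str]:
    """Return a list of problems found in one eval entry (empty if none)."""
    problems = []
    # prompt and expected_output are assumed to be plain non-empty strings.
    for key in ("prompt", "expected_output"):
        value = entry.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} must be a non-empty string")
    # assertions is assumed to be a non-empty list of gradeable statements.
    assertions = entry.get("assertions")
    if not isinstance(assertions, list) or not assertions:
        problems.append("assertions must be a non-empty list")
    elif not all(isinstance(a, str) and a.strip() for a in assertions):
        problems.append("every assertion must be a non-empty string")
    return problems


print(validate_eval(SAMPLE_EVAL))
```

Keeping each assertion as a separate short statement is what would let a grader score them one at a time, rather than judging the whole `expected_output` at once.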
…2 skills)
Each skill now has 5-8 evals covering:
- Core framework usage with realistic prompts
- Casual trigger phrase variants
- Sub-type and section-specific coverage
- Boundary tests (skill deferral to related skills)
- Structured assertions for grading
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
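A coverage claim like "5-8 evals per skill" is easy to check mechanically. The sketch below is an illustration, not the repository's tooling: it assumes the evals have already been loaded into a mapping from skill name to a list of eval objects, and the `summarize` helper, the placeholder data, and the `toy-skill` entry are all hypothetical.

```python
def summarize(evals_by_skill: dict, low: int = 5, high: int = 8):
    """Return (total evals, total assertions, skills outside the low..high range)."""
    total_evals = sum(len(evals) for evals in evals_by_skill.values())
    total_assertions = sum(
        len(entry.get("assertions", []))
        for evals in evals_by_skill.values()
        for entry in evals
    )
    out_of_range = {
        skill: len(evals)
        for skill, evals in evals_by_skill.items()
        if not low <= len(evals) <= high
    }
    return total_evals, total_assertions, out_of_range


# Placeholder data: each eval carries two dummy assertions.
def placeholder_evals(count: int) -> list:
    return [{"assertions": ["a", "b"]} for _ in range(count)]


evals_by_skill = {
    "page-cro": placeholder_evals(7),
    "seo-audit": placeholder_evals(8),
    "toy-skill": placeholder_evals(3),  # hypothetical skill with too few evals
}

print(summarize(evals_by_skill))
```

Run against real data, the first two numbers could be compared with the totals reported for the release, and the third would list any skill that falls outside the intended range.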
Fixes 5 issues identified by independent Codex review:
- product-marketing-context: match auto-draft workflow, section flexibility
- marketing-psychology: replace phantom models with actual SKILL.md models
- ad-creative: correct RSA pinning guidance to match skill
- free-tool-strategy: boundary test now defers to related skill (page-cro)
- paywall-upgrade-cro: boundary test references only related skills
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Audit all 32 SKILL.md files against Anthropic's skill-creator writing
guidance ("why > MUST" pattern). Replaces ALWAYS/NEVER/MUST/IMPORTANT
imperatives with explanations of WHY the guidance matters.
17 edits across 14 skills:
- ad-creative: character limits reasoning, CTA headline reasoning
- seo-audit: schema detection warning softened, reasoning added
- programmatic-seo: subfolder vs subdomain reasoning
- paid-ads: exclusion list reasoning
- copywriting: honesty principle reasoning
- cold-email: follow-up value reasoning
- ai-seo: freshness signal reasoning
- churn-prevention: post-cancel path reasoning
- product-marketing-context: verbatim language reasoning
- popup-cro: close button visibility reasoning
- signup-flow-cro: label visibility reasoning
- form-cro: label visibility reasoning
- revops: fallback owner reasoning
- ab-test-setup: DON'T → Avoid
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
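An audit like the one above starts with finding the rigid imperatives. A minimal sketch, assuming a plain case-sensitive keyword scan over the text of a SKILL.md file (the `find_imperatives` helper and the sample text are hypothetical, and the PR does not say how the audit was actually performed):

```python
import re

# The four all-caps imperatives named in the commit message.
IMPERATIVES = re.compile(r"\b(ALWAYS|NEVER|MUST|IMPORTANT)\b")


def find_imperatives(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, keyword, stripped line) for each all-caps imperative."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in IMPERATIVES.finditer(line):
            hits.append((lineno, match.group(1), line.strip()))
    return hits


# Hypothetical SKILL.md excerpt: lines 1 and 3 use rigid imperatives,
# line 2 explains the reasoning instead.
SAMPLE = """\
ALWAYS keep labels visible.
Keep labels visible so users never lose context after typing.
You MUST include a close button.
"""

for lineno, keyword, line in find_imperatives(SAMPLE):
    print(f"line {lineno}: {keyword}: {line}")
```

The scan only flags candidates. Rewriting each hit into an explanation of why the guidance matters remains a manual judgment call.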
- sales-enablement: "Never demo without discovery" → "Demo after discovery, not before"
- site-architecture: "No exceptions" → explains why (backlink equity, broken pages)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- sales-enablement: "ROI calculator" → "deal-specific ROI analysis" to avoid conflict with free-tool-strategy which also claims "ROI calculator"
- sales-enablement: clarified scope boundary to competitor-alternatives for battle cards (competitor-alternatives owns battle card creation)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- marketing-psychology eval 4: BJ Fogg assertion did not match expected_output which lists Goal-Gradient Effect. Fixed.
- sales-enablement eval 2: all 6 categories assertion contradicted expected_output which only categorizes the 3 given objections. Fixed.
- ad-creative eval 5: TikTok hard limit corrected to recommended (80 chars recommended, 100 max) per SKILL.md.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ptimization Optimize all 32 skill descriptions for better triggering
Add evals for all 32 skills (197 total evals, 1261 assertions)
Audit skill bodies: replace rigid imperatives with reasoning
5 tasks
Summary
Three improvements inspired by Anthropic's skill-creator guidance, applied across all 32 skills:
All three PRs were independently reviewed and fixes applied before merge.
What changed
- evals/evals.json files (one per skill)
- SKILL.md description fields
- SKILL.md body sections (writing audit)
Test plan
🤖 Generated with Claude Code