Security coverage to 95%+ (real tests) + embedded frontend-design skill by tzone85 · Pull Request #112 · tzone85/vortex-dispatch

tzone85 · 2026-07-02T10:51:10Z

Summary

Two requested improvements, both TDD'd and independently reviewed.

1. Security coverage: 73.0% → 98.6% / 95.2% — real behavior pins, not checkbox tests

internal/security: 73.0% → 98.6%. Centerpiece: a fake-scanner harness (scanners_exec_test.go) — executable shell scripts named gosec/gitleaks/semgrep/govulncheck/npm on a controlled PATH emit canned real-format tool output, driving RunScanners end-to-end with zero network: language applicability, PATH availability, per-tool parsing, dedup, and the three-way ran/skipped/failed classification (a tool emitting garbage lands in failed and never masquerades as a clean run; missing tools are skipped visibly; inapplicable tools appear nowhere).
engine/security_gate.go → 95.2% (statements). All LLM review paths driven via llm.ReplayClient: LLM findings merge with scanner findings, a critical diff finding blocks a story, the diff is asserted to travel inside <diff> data tags (injection mitigation), an LLM failure degrades to scanners-only, garbage responses pass cleanly.
Previously-uncovered load-bearing logic now pinned: Covers was at 0% — the self-upskilling dedup that decides whether the agent re-learns a vulnerability class (rule-ID match, CWE alias, learning trigger, receiver immutability). Plus corrupt-KB-is-a-loud-error at both gate entry points, read-only-KB persist failure (best-effort, never aborts), vulnClassID fallback chain, cweOf extraction bounds, language-manifest branches, and parser error paths for all 5 tools.
Remaining uncovered lines are error-log statements requiring failing event/projection stores — acknowledged, not gamed.
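
The fake-scanner harness described above can be sketched roughly as follows. This is a minimal illustration, not the code in scanners_exec_test.go: `writeFakeTool`, `classify`, `runDemo`, and the "output starts with `{`" parse check are hypothetical stand-ins, and the canned outputs are placeholders. It assumes a POSIX system with /bin/sh.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// result mirrors the three-way ran/skipped/failed classification (hypothetical shape).
type result struct{ ran, skipped, failed []string }

// writeFakeTool writes an executable script that prints output and exits with
// exitCode. It uses the shell's printf builtin so the script still works when
// PATH contains nothing but the fake-tool directory.
func writeFakeTool(dir, name, output string, exitCode int) error {
	quoted := strings.ReplaceAll(output, "'", `'\''`)
	script := fmt.Sprintf("#!/bin/sh\nprintf '%%s\\n' '%s'\nexit %d\n", quoted, exitCode)
	return os.WriteFile(filepath.Join(dir, name), []byte(script), 0o755)
}

// classify is a stand-in for the real orchestration: a tool missing from PATH
// is skipped, a tool whose output does not look like JSON is failed, and
// everything else counts as ran.
func classify(tools []string) result {
	var r result
	for _, tool := range tools {
		bin, err := exec.LookPath(tool)
		if err != nil {
			r.skipped = append(r.skipped, tool)
			continue
		}
		// Scanners often exit non-zero when they report findings, so the
		// exit status alone does not decide the bucket in this sketch.
		out, _ := exec.Command(bin).Output()
		if !strings.HasPrefix(strings.TrimSpace(string(out)), "{") {
			r.failed = append(r.failed, tool)
			continue
		}
		r.ran = append(r.ran, tool)
	}
	return r
}

// runDemo builds a throwaway bin directory, points PATH at it, and classifies
// one healthy tool, one garbage-emitting tool, and one absent tool.
func runDemo() (result, error) {
	dir, err := os.MkdirTemp("", "fakebin")
	if err != nil {
		return result{}, err
	}
	defer os.RemoveAll(dir)
	if err := writeFakeTool(dir, "gosec", `{"Issues":[]}`, 0); err != nil {
		return result{}, err
	}
	if err := writeFakeTool(dir, "semgrep", "not json at all", 2); err != nil {
		return result{}, err
	}
	oldPath := os.Getenv("PATH")
	defer os.Setenv("PATH", oldPath)
	if err := os.Setenv("PATH", dir); err != nil {
		return result{}, err
	}
	return classify([]string{"gosec", "semgrep", "gitleaks"}), nil
}

func main() {
	r, err := runDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("ran=%v skipped=%v failed=%v\n", r.ran, r.skipped, r.failed)
}
```

Restricting PATH to the temporary directory is what keeps the run hermetic: a scanner that happens to be installed on the developer's machine cannot leak into the "missing tool" case.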

2. Embedded frontend-design skill — vxd-built UIs stop looking AI-generated

Research: Anthropic's frontend-design skill (read in full) + a web-research sweep (Anthropic's design-skills blog, anti-"AI slop" guides, WCAG 2026 checklists, LLM UI stack analyses).

agent.FrontendDesignBrief — single-source design-standards block injected into every UI-facing story's goal prompt: two-pass token-first process (palette/type/layout/signature plan → self-critique for genericness → code derived from the plan), named banned defaults (Inter/Roboto, purple-gradient-on-white, cream #F4F1EA+serif+terracotta, acid-green-on-black, the template feature-card page, scattered animation), a non-negotiable accessibility floor (responsive to 360px, visible keyboard focus, WCAG AA contrast, prefers-reduced-motion, 44px touch targets, designed empty/loading/error states, CSS specificity discipline), and copy-as-design-material rules ("Save changes", never "Submit"). Size budget pinned at ≤6 KB.
detectFrontend — owned-file extensions (strongest signal) + whole-word UI vocabulary regex. Word boundaries pinned by table tests: "pagination"≠page, "performance"≠form, "review"≠view; per review, "html" and "responsive" excluded (server-side HTML emails and "responsive API gateway" are backend work).
Threaded into both dispatch paths (first attempt + retry) with a dangling-wire source-scan guard (TestExecutor_WiresFrontendDetection).
Planner: ENGINEERING STANDARDS now require the first UI story to establish a design-token foundation (CSS custom properties / framework theme) consumed by later UI stories, and UI acceptance criteria to carry the accessibility floor.
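
The detection rule above (owned-file extensions first, then a whole-word keyword match) might look something like the sketch below. The keyword list, the extension set, and the `detectFrontend` signature are assumptions made for illustration; the PR only confirms the name `frontendKeywordRe`, the word-boundary behavior, and that "html", "responsive", and "style" are not keywords.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// frontendKeywordRe matches UI vocabulary as whole words only.
// The keyword list here is hypothetical.
var frontendKeywordRe = regexp.MustCompile(`(?i)\b(page|form|view|button|modal|dashboard)\b`)

// frontendExts is a hypothetical set of owned-file extensions treated as the
// strongest UI signal.
var frontendExts = map[string]bool{
	".html": true, ".css": true, ".tsx": true, ".jsx": true, ".vue": true, ".svelte": true,
}

// detectFrontend reports whether a story looks UI-facing: any owned file with
// a frontend extension wins outright, otherwise the description must contain
// a whole-word UI keyword.
func detectFrontend(description string, ownedFiles []string) bool {
	for _, f := range ownedFiles {
		if frontendExts[strings.ToLower(filepath.Ext(f))] {
			return true
		}
	}
	return frontendKeywordRe.MatchString(description)
}

func main() {
	cases := []struct {
		desc  string
		files []string
		want  bool
	}{
		{"Add pagination to the orders endpoint", nil, false},
		{"Improve performance of the scanner", nil, false},
		{"Address review comments in the parser", nil, false},
		{"Send HTML email reports", nil, false},
		{"Build a responsive API gateway", nil, false},
		{"Build the settings page", nil, true},
		{"Render the weekly report", []string{"web/report.html"}, true},
	}
	for _, c := range cases {
		got := detectFrontend(c.desc, c.files)
		fmt.Printf("%-42q want=%-5v got=%v\n", c.desc, c.want, got)
	}
}
```

Go's regexp package implements `\b` as an ASCII word boundary, which is what keeps "form" from matching inside "performance" and "view" from matching inside "review".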

Review

everything-claude-code:go-reviewer: approve, no CRITICAL/HIGH; 2 MEDIUM + 2 LOW all applied in the final commit (keyword false-positives removed with new negative cases; fake-tool harness exit-code fidelity + collision-resistant sentinel + honest Bin values). The suggested style keyword was deliberately not added — it would trade a minor miss for "code style" false positives.

Test plan

go test ./... -count=1 — all packages pass
go vet ./... clean; golangci-lint run — 0 issues
internal/security 98.6% coverage; security_gate.go 95.2%
Doc-coverage wiring tests pass (CLAUDE.md section + README bullet added; stale "12 checks" corrected to 16)
NXD port noted as pending in CLAUDE.md

… tests

internal/security: 73.0% → 98.6%. engine/security_gate.go: → 95.2% (statements).

The centerpiece is a fake-scanner harness (scanners_exec_test.go): executable shell scripts named gosec/gitleaks/semgrep/govulncheck/npm on a controlled PATH emit canned real-format output, so RunScanners' orchestration is exercised end-to-end: language applicability, PATH availability, per-tool parse, dedup, and the three-way ran/skipped/failed classification (a tool that emits garbage lands in failed, never in ran; missing tools are skipped visibly; inapplicable tools appear nowhere).

Previously-uncovered load-bearing logic now pinned:
- Covers (0% → 100%): the self-upskilling dedup (rule-ID match, CWE-alias match, unknown class triggers learning, learned rules extend coverage, immutability of the receiver).
- Gate LLM paths (0% → 100%): llmReview/llmReviewDiff/callLLM driven via llm.ReplayClient. LLM findings merge with scanner findings, a critical diff finding blocks a story, the diff travels inside <diff> data tags, an LLM failure degrades to scanners-only, garbage responses pass cleanly.
- vulnClassID fallback chain (28.6% → 100%) + cweOf extraction bounds.
- Knowledge-base failure modes: corrupt KB is a loud error at both gate entry points (never silently replaced by the baseline); a read-only KB makes upskill persistence fail without aborting the scan.
- DetectLanguages manifest branches (rust/php/ruby/python/ts-beats-js), extension fallback, node_modules skipping.
- Parser error paths for all 5 tools, gosec line-range + missing-CWE, npm-audit map-key fallback, govulncheck malformed-line skipping.

Remaining uncovered lines are error-log statements requiring failing event/projection stores; these are tracked, not gamed.
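
The Covers behaviors listed in this commit (rule-ID match, CWE-alias match, learning on an unknown class, receiver immutability) can be illustrated with a small sketch. The `Finding` and `KnowledgeBase` types, their fields, and the `Learn` method are hypothetical; only the name Covers and the pinned behaviors come from the PR.

```go
package main

import "fmt"

// Finding is a hypothetical scanner finding: a tool-specific rule ID plus an
// optional CWE identifier.
type Finding struct {
	RuleID string // e.g. gosec's "G101"
	CWE    string // e.g. "CWE-798"
}

// KnowledgeBase is a hypothetical store of vulnerability classes the agent
// has already learned.
type KnowledgeBase struct {
	ruleIDs map[string]bool
	cwes    map[string]bool
}

// NewKnowledgeBase builds a knowledge base from already-known findings.
func NewKnowledgeBase(known ...Finding) KnowledgeBase {
	kb := KnowledgeBase{}
	for _, f := range known {
		kb = kb.Learn(f)
	}
	return kb
}

// Covers reports whether the class is already known, either by exact rule ID
// or through a shared CWE. It only reads from the receiver.
func (kb KnowledgeBase) Covers(f Finding) bool {
	if f.RuleID != "" && kb.ruleIDs[f.RuleID] {
		return true
	}
	return f.CWE != "" && kb.cwes[f.CWE]
}

// Learn returns a new knowledge base extended with f. The maps are copied,
// so the receiver is left untouched.
func (kb KnowledgeBase) Learn(f Finding) KnowledgeBase {
	next := KnowledgeBase{ruleIDs: map[string]bool{}, cwes: map[string]bool{}}
	for id := range kb.ruleIDs {
		next.ruleIDs[id] = true
	}
	for c := range kb.cwes {
		next.cwes[c] = true
	}
	if f.RuleID != "" {
		next.ruleIDs[f.RuleID] = true
	}
	if f.CWE != "" {
		next.cwes[f.CWE] = true
	}
	return next
}

func main() {
	base := NewKnowledgeBase(Finding{RuleID: "G101", CWE: "CWE-798"})
	sqli := Finding{RuleID: "G201", CWE: "CWE-89"}

	fmt.Println("known by rule ID:", base.Covers(Finding{RuleID: "G101"}))
	fmt.Println("known by CWE alias:", base.Covers(Finding{RuleID: "other.rule", CWE: "CWE-798"}))
	fmt.Println("unknown class:", base.Covers(sqli))

	learned := base.Learn(sqli)
	fmt.Println("after learning:", learned.Covers(sqli))
	fmt.Println("original unchanged:", !base.Covers(sqli))
}
```

Copying the maps in `Learn` is the design choice that makes the immutability property testable: a caller holding the old value never sees a class it did not learn.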

…n-intent planning + a distinctive-design brief

vxd-built web UIs came out as generic AI-default design. The factory now carries a frontend-design skill at both ends of the pipeline:
- agent.FrontendDesignBrief (internal/agent/frontend.go): single-source design-standards block synthesized from Anthropic's frontend-design skill and current anti-'AI slop' research: two-pass token-first process (palette/type/layout/signature plan, self-critique for genericness, code derived from the plan), named banned defaults (Inter/Roboto, purple-gradient-on-white, cream #F4F1EA + serif + terracotta, acid-green-on-black, the template feature-card page, scattered animation), a non-negotiable accessibility floor (responsive to 360px, visible keyboard focus, WCAG AA contrast, prefers-reduced-motion, 44px targets, designed empty/loading/error states, CSS specificity discipline), and copy-as-design-material rules. Size budget pinned (≤6 KB).
- detectFrontend (engine/detect.go): owned-file extensions + whole-word UI vocabulary regex (word boundaries: 'pagination'/'performance'/'review' must not false-positive; pinned by 21 table cases).
- Executor threads IsFrontend into BOTH the first-dispatch PromptContext and the retry TemplateContext; TestExecutor_WiresFrontendDetection guards the wire (dangling-wire pattern).
- Planner ENGINEERING STANDARDS now require the first UI story to establish a design-token foundation consumed by later UI stories, and UI acceptance criteria to carry the accessibility floor.

Docs: CLAUDE.md section + README feature bullet (+ stale '12 checks' corrected to 16). NXD port pending (offline-first mirror).

…anch

- detect.go: drop 'html' and 'responsive' from frontendKeywordRe. Server-side HTML (email reports, text/html rendering) and 'responsive API gateway' are backend work; real UI stories carry .html in owned files or another keyword. Three new table cases pin the distinction.
- scanners_exec_test.go: fakeTool now honors the exit code verbatim (strconv.Itoa, not a 0/1 collapse) and uses a collision-resistant heredoc sentinel with an explicit guard; TestScannerRun_PerKindDispatch uses the real registry Bin values ('npm' for npm-audit) and documents that Run dispatches on Kind with hardcoded binaries; Bin serves only LookPath.

tzone85 added 3 commits July 2, 2026 12:26

tzone85 merged commit e2f3218 into main Jul 2, 2026
5 checks passed

tzone85 deleted the feat/security-coverage-frontend-skill branch July 2, 2026 10:51

tzone85 merged 3 commits into main from feat/security-coverage-frontend-skill
