Security agent: self-upskilling scanner + LLM review embedded in vxd core by tzone85 · Pull Request #109 · tzone85/vortex-dispatch

tzone85 · 2026-06-26T19:45:40Z

What

A self-upskilling security agent built into vxd's core: every build is reviewed for vulnerabilities, and every future build inherits what past ones taught it.

Three layers

internal/security/ (LLM-free, unit-tested) — a growable KnowledgeBase seeded with the OWASP Top 10 (2021) + high-value CWEs (immutable/versioned Add, Covers matches ID or CWE, Checklist renders for prompts; persisted at <state_dir>/security/knowledge.json). A scanner runner orchestrating gosec, govulncheck, gitleaks, semgrep, npm audit with language-aware applicability + PATH detection + graceful degrade (skipped tools are listed, never silently dropped); pure per-tool parsers → real findings (no hallucinations).
engine/security_gate.go SecurityGate — ScanRepo (standalone) + ReviewStory (per-story pre-merge). Deterministic scanners ∪ optional LLM threat-model review against the KB. A finding ≥ gate severity pauses the requirement (human decision), never escalates. Self-upskilling: confirmed high+ findings of a new vuln class become learned KB rules (SECURITY_RULE_LEARNED).
Forward-embedded in core — planner ENGINEERING STANDARDS now spell out the OWASP Top 10 (every story designed secure); per-story gate enforces the live KB; vxd security scan / vxd security kb CLI; security.* config.
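The KnowledgeBase contract described above (immutable, versioned Add; Covers matching on ID or CWE) can be sketched as follows. This is a minimal illustration and not the code in internal/security/: the `Rule` fields and the case-insensitive matching are assumptions, and persistence to knowledge.json is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a hypothetical shape for one knowledge-base entry.
type Rule struct {
	ID    string // e.g. "A03:2021" or "CWE-190"
	CWE   string // optional CWE identifier
	Title string
}

// KnowledgeBase is a versioned rule set that is never mutated in place.
type KnowledgeBase struct {
	Version int
	Rules   []Rule
}

// Covers reports whether class matches any rule's ID or CWE.
func (kb KnowledgeBase) Covers(class string) bool {
	for _, r := range kb.Rules {
		if strings.EqualFold(r.ID, class) || (r.CWE != "" && strings.EqualFold(r.CWE, class)) {
			return true
		}
	}
	return false
}

// Add returns a new KnowledgeBase with r appended and the version bumped.
// The receiver is left untouched; a rule whose ID already exists is a no-op.
func (kb KnowledgeBase) Add(r Rule) KnowledgeBase {
	for _, existing := range kb.Rules {
		if existing.ID == r.ID {
			return kb
		}
	}
	rules := make([]Rule, len(kb.Rules), len(kb.Rules)+1)
	copy(rules, kb.Rules) // copy so the old and new versions never share a backing array
	return KnowledgeBase{Version: kb.Version + 1, Rules: append(rules, r)}
}

func main() {
	base := KnowledgeBase{Version: 1, Rules: []Rule{{ID: "A03:2021", CWE: "CWE-89", Title: "Injection"}}}
	grown := base.Add(Rule{ID: "CWE-190", CWE: "CWE-190", Title: "Integer overflow"})

	fmt.Println(base.Version, len(base.Rules), grown.Version, len(grown.Rules)) // 1 1 2 2
	fmt.Println(grown.Covers("CWE-89"), grown.Covers("CWE-367"))                // true false
}
```

Returning a fresh value from Add keeps earlier versions readable while a scan is in flight, which is one plausible reason to prefer an immutable store here.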

Calibration

Default security.gate_severity = critical — the build-pausing gate fires only on leaked secrets / LLM-confirmed injection, not context-dependent SAST noise. Standalone vxd security scan (default --min high) stays thorough for audits.
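To make the calibration concrete, here is a hedged sketch of a severity threshold check. The rank table, the function name `MeetsGate`, and the fallback for unknown gate values are assumptions for illustration; only the severity names and the "meets or exceeds" rule come from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// severityRank orders the severities named in the PR. The numeric ranks
// are an assumption of this sketch.
var severityRank = map[string]int{"low": 1, "medium": 2, "high": 3, "critical": 4}

// MeetsGate reports whether any finding severity is at or above the gate.
// Unknown finding severities rank 0 and never trip the gate; an unknown
// gate value falls back to "critical" (also an assumption).
func MeetsGate(severities []string, gate string) bool {
	threshold, ok := severityRank[strings.ToLower(gate)]
	if !ok {
		threshold = severityRank["critical"]
	}
	for _, s := range severities {
		if severityRank[strings.ToLower(s)] >= threshold {
			return true
		}
	}
	return false
}

func main() {
	findings := []string{"HIGH", "medium"}
	fmt.Println(MeetsGate(findings, "critical")) // false: the default gate ignores SAST highs
	fmt.Println(MeetsGate(findings, "high"))     // true: a tightened gate would pause the build
}
```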

Proven live

Self-upskilling fired in production — scanning this repo grew the KB v1→v5 (learned CWE-190/338/367/400).
Scanned the 20 most-recently-modified repos (Go/Python/TS/Rust/JS/PHP/Shell). crit=0 everywhere (no leaked secrets, no confirmed criticals).
Fixed what was real: 9 reachable Go stdlib CVEs across shiftsync (7) and bounty-dispatch (2) — toolchain bump to 1.26.4, govulncheck now clean (pushed to those repos). Every other "high" verified as a false positive (header-name G101, parameterized SQL, allowlisted property access) or gosec noise against already-hardened code.

Tests

internal/security/*_test.go (16) + engine/security_gate_test.go (7) + new events in the projection exhaustiveness guard + TestResume_WiresSecurityGate.
Full suite 32 pkgs green, go vet clean, golangci-lint 0 issues.

Follow-ups

NXD port (mirror to nexus-dispatch).
Optional: per-rule severity calibration map; LLM-triage on standalone scans by default.

…security)

New package backing vxd's security agent:

- knowledge.go: growable, JSON-persisted KnowledgeBase seeded with OWASP Top 10 (2021) + high-value CWEs (hardcoded secrets, path traversal, XSS), each with detection + remediation guidance. Add() is immutable + version-bumping + dedup-by-ID (the self-upskilling store). Checklist() renders for prompts.
- scanners.go: orchestrates gosec/govulncheck/gitleaks/semgrep/npm-audit with language-aware applicability + PATH detection (graceful degrade). Pure parsers per tool turn real scanner output into Findings, so no hallucinated vulns.
- languages.go: manifest + extension language detection (ts vs js aware).
- finding.go/severity.go/report.go: findings model, severity ranking with scanner-synonym parsing, dedup, and an operator-facing markdown report.

TDD: 16 tests (KB roundtrip/immutability/lang-filter/checklist, scanner applicability, all 5 parsers against representative output, report counts/format). vet + golangci-lint clean.
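The language-aware applicability and PATH detection described for scanners.go might look roughly like this. It is a sketch under stated assumptions: the `Scanner` struct, the `knownScanners` table, the language names, and the `installed` predicate are hypothetical stand-ins, and the real package may also record tools that do not apply to the repo. Only `exec.LookPath` is a real standard-library call.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Scanner is a hypothetical registry entry: a binary plus the languages it
// applies to. An empty Langs slice means the tool applies to every repo.
type Scanner struct {
	Binary string
	Langs  []string
}

// knownScanners is an illustrative table, not vxd's actual registry.
var knownScanners = []Scanner{
	{Binary: "gosec", Langs: []string{"go"}},
	{Binary: "govulncheck", Langs: []string{"go"}},
	{Binary: "gitleaks"},
	{Binary: "semgrep"},
	{Binary: "npm", Langs: []string{"javascript", "typescript"}}, // for `npm audit`
}

func applies(s Scanner, repoLangs map[string]bool) bool {
	if len(s.Langs) == 0 {
		return true
	}
	for _, l := range s.Langs {
		if repoLangs[l] {
			return true
		}
	}
	return false
}

// Plan splits the applicable scanners into those that can run and those
// skipped because the binary is missing. Skipped tools are returned to the
// caller so they can be reported rather than silently dropped.
func Plan(scanners []Scanner, repoLangs map[string]bool, installed func(string) bool) (run, skipped []string) {
	for _, s := range scanners {
		if !applies(s, repoLangs) {
			continue
		}
		if installed(s.Binary) {
			run = append(run, s.Binary)
		} else {
			skipped = append(skipped, s.Binary)
		}
	}
	return run, skipped
}

// onPath is the production predicate; tests can inject a fake instead.
func onPath(bin string) bool {
	_, err := exec.LookPath(bin)
	return err == nil
}

func main() {
	run, skipped := Plan(knownScanners, map[string]bool{"go": true}, onPath)
	fmt.Println("run:", run)
	fmt.Println("skipped (not on PATH):", skipped)
}
```

Injecting the lookup as a function keeps the planning logic testable without touching the real PATH.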

…lf-upskilling)

engine/security_gate.go: vxd's security agent, two entry points:

- ScanRepo: standalone whole-repo scan (deterministic scanners ∪ LLM threat-model review against the KB checklist), emits SECURITY_SCAN_COMPLETED.
- ReviewStory: per-story pre-merge gate; blocks when any finding meets/exceeds the configured gate severity; emits STORY_SECURITY_PASSED/FAILED.

Continuous upskilling: confirmed high+ findings whose vuln CLASS (CWE, else OWASP category, else tool rule) isn't already covered are added to the knowledge base as learned rules (KnowledgeBase.Covers matches ID or CWE so OWASP-indexed baseline classes aren't re-learned), persisted, and announced via SECURITY_RULE_LEARNED, so every future build inherits classes found in past ones.

New events STORY_SECURITY_PASSED/FAILED + SECURITY_SCAN_COMPLETED/RULE_LEARNED wired into the projection switch (TestProject_AllDeclaredEventsHandled passes).

TDD: 7 tests (scan aggregation+event, block-on-critical, pass-below-threshold, self-upskill on new class, no-relearn known class, LLM findings parse). Injectable scan + now seams; nil client ⇒ scanner-only. vet + golangci-lint clean.
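The class-selection rule in this commit (CWE, else OWASP category, else tool rule; learn only high+ findings that are not yet covered) can be sketched like this. The `Finding` fields, the `tool:rule` fallback format, and the sample findings are assumptions for illustration, and the sketch treats every finding passed in as already confirmed.

```go
package main

import (
	"fmt"
	"strings"
)

// Finding is a hypothetical, trimmed-down finding record.
type Finding struct {
	Tool, RuleID, CWE, OWASP, Severity string
}

// VulnClass picks the class key: CWE first, then OWASP category, then the
// tool-specific rule. The "tool:rule" format is an assumption.
func VulnClass(f Finding) string {
	switch {
	case f.CWE != "":
		return f.CWE
	case f.OWASP != "":
		return f.OWASP
	default:
		return f.Tool + ":" + f.RuleID
	}
}

// LearnableClasses returns, in first-seen order, the classes of high or
// critical findings that the covered predicate does not already know.
func LearnableClasses(findings []Finding, covered func(string) bool) []string {
	seen := map[string]bool{}
	var out []string
	for _, f := range findings {
		sev := strings.ToLower(f.Severity)
		if sev != "high" && sev != "critical" {
			continue
		}
		class := VulnClass(f)
		if covered(class) || seen[class] {
			continue
		}
		seen[class] = true
		out = append(out, class)
	}
	return out
}

func main() {
	// Stand-in for KnowledgeBase.Covers with a single baseline class.
	covered := func(class string) bool { return class == "CWE-89" }

	findings := []Finding{ // made-up sample data
		{Tool: "gosec", RuleID: "G115", CWE: "CWE-190", Severity: "high"},
		{Tool: "semgrep", RuleID: "sqli", CWE: "CWE-89", Severity: "high"},
		{Tool: "gosec", RuleID: "G404", CWE: "CWE-338", Severity: "medium"},
		{Tool: "gitleaks", RuleID: "generic-api-key", Severity: "critical"},
	}
	fmt.Println(LearnableClasses(findings, covered)) // [CWE-190 gitleaks:generic-api-key]
}
```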

…, config, docs

- Pipeline: SecurityGate.ReviewStory runs per-story after QA, before merge (monitor_post_execution.go). A finding >= gate severity PAUSES the requirement (human decision) instead of escalating; a scanner failure never blocks merge. Monitor.SetSecurityGate + resume.go wiring (TestResume_WiresSecurityGate).
- CLI: `vxd security scan [path]` (scanners + optional --llm review, --min for CI exit code, --json) and `vxd security kb` (inspect baseline + learned rules).
- Forward-embedded: planner ENGINEERING STANDARDS now spells out the OWASP Top 10 so every planned story is designed secure; the live (growable) KB is enforced at the per-story gate.
- Config: security.{disable_gate, gate_severity (default high), auto_learn (default true), kb_path}; DefaultConfig seeds the defaults.
- Events STORY_SECURITY_PASSED/FAILED + SECURITY_SCAN_COMPLETED/RULE_LEARNED projected (exhaustiveness guard passes).
- Docs: README config table + CLAUDE.md (CLI table, vxd.yaml block, events, security-agent knowledge section). Doc-coverage tests pass.

Full suite (32 pkgs) + vet + golangci-lint (0 issues) green. Binary rebuilt.

…ld-usable)

Validated the agent on real repos (Go/Python/TS): it surfaces real findings (gosec perms/path-traversal, semgrep CWE-89 SQLi patterns) and proved self-upskilling in production (KB grew v1→v5, learned CWE-190/338/367/400 from the vortex-dispatch scan). But gosec/semgrep HIGH severity is context-dependent (non-crypto rand in a Bayesian sampler, taint on operator-controlled $HOME paths, parameterized SQL flagged as concatenation), so gating builds on it would stall the pipeline on noise.

Default security.gate_severity: high → critical. The per-story gate now pauses a build only on CRITICAL findings (leaked secrets via gitleaks, LLM-confirmed injection/hardcoded credentials), which is high-signal where it counts. The standalone `vxd security scan` still reports high/medium (default --min high) for thorough audits, and operators can tighten the gate to "high".

Docs updated. Full suite (32 pkgs) + vet + golangci-lint (0 issues) green. Binary rebuilt.

…r check (#111)

* chore(security): dogfood scan hardening - pin GH Actions to SHAs + annotate accepted findings

Ran vxd security scan on vxd itself (346 findings) and closed the high-severity set:

- Pin all 14 GitHub Actions references in ci.yml to full commit SHAs (mutable-tag supply-chain class, CWE-1357) with version comments.
- Annotate the 29 accepted-by-design findings with #nosec / nosemgrep and a one-line rationale each: sampler seed conversion + math/rand (statistical sampling, crypto seed), server shutdown contexts (fresh ctx after parent cancel is the graceful-shutdown idiom), G703 path-taint sites (paths derive from $HOME/worktrees inside the operator trust boundary), and the 15 dangerous-exec-command sites that ARE the orchestrator's core function (each with its upstream validation named).

vxd security scan . now reports 0 high+ findings on its own tree, so the scan is usable as a self-gate (--min high) once CI billing is restored.

* feat(preflight): security_scanners check - surface missing SAST/secret tools

The per-story security gate degrades gracefully when a scanner binary is absent (skipped, never fatal), which left operators with no signal that scan coverage was reduced. New CheckSecurityScanners (WARNING tier) lists missing binaries from the security.KnownScanners registry with install hints (security.InstallHint) and joins AllChecks, so vxd preflight now runs 16 checks. lookPath is injected for testability, matching CheckBinaryPath's pattern. 4 new tests including a dangling-wire guard (AllChecks must include the check). Docs updated (CLAUDE.md + README check counts, security-agent section).

* fix(watch): vxd watch silently dropped every event for real requirements + test coverage restore

Two matcher bugs made `vxd watch` a silent no-op tail in production:

1. Story events: eventMatchesReq compared evt.StoryID[:8] against reqID[:8], but story IDs are namespaced with storyIDPrefix(reqID), which is sha256(reqID)[:8] for any reqID longer than 8 chars, and every real ULID reqID is. The prefixes never matched, so no story event ever printed. The matcher now uses the exported engine.StoryIDPrefix (single source of truth).
2. Requirement events: the code commented 'REQ_* events get routed via payload below' but no payload routing existed, so REQ_SUBMITTED/PLANNED/COMPLETED/BLOCKED never printed. Now matched via the payload req_id/id keys (the two keys real emitters use: planner uses 'id', the planning heartbeat 'req_id').

The old TestEventMatchesReq_PrefixMatch pinned the broken raw-prefix behavior and was replaced (test was wrong, not the spec). New tests: hashed-prefix match, short-reqID verbatim match, payload routing for both key spellings, cross-requirement rejection, and an end-to-end tailRequirementEvents run against real file+sqlite stores that pins print-and-exit-on-terminal.

Also restores the internal/cli coverage regression from PR #109 (68.0% → 72.9%): the security scan/kb commands, dashboard status/stop daemon commands, and watch were all untested. New: security_test.go (9 tests: pure helpers, kb text/json, scan with empty PATH pinning graceful degradation + skipped-scanner reporting), dashboard_daemon_test.go (7 tests: not-running status, idempotent stop, malformed/stale pidfiles, watch unknown-req error).

* fix(review): apply go-reviewer findings - complete G124 suppression, kill dead branch + inert assertions

- auth.go: the cookie site had the nosemgrep half of the annotation but gosec G124 still fired; add the #nosec with the same rationale.
- watch.go: drop the unreachable evt.StoryID == prefix branch; the planner always emits <prefix>-<suffix> IDs, so HasPrefix covers every real case.
- checks_security_test.go: the installed-scanner negative assertions matched a substring ('missing: gitleaks') that could never occur in the real message format; assert on the bare scanner names so a regression can actually fire.

---------

Co-authored-by: Thando Mini <tzone85@users.noreply.github.com>
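The `vxd watch` story-event fix described above hinges on deriving the story ID prefix the same way the planner does. A minimal sketch of that idea follows; it is not the engine.StoryIDPrefix source. In particular, the hex encoding of the digest, the trailing "-" in the prefix check, and the sample ULID are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// storyIDPrefix mirrors the rule quoted in the commit message: short
// requirement IDs are used verbatim, longer ones are replaced by the first
// 8 characters of their SHA-256 digest (hex encoding assumed here).
func storyIDPrefix(reqID string) string {
	if len(reqID) <= 8 {
		return reqID
	}
	sum := sha256.Sum256([]byte(reqID))
	return hex.EncodeToString(sum[:])[:8]
}

// storyMatchesReq reports whether a story ID of the form <prefix>-<suffix>
// belongs to the given requirement.
func storyMatchesReq(storyID, reqID string) bool {
	return strings.HasPrefix(storyID, storyIDPrefix(reqID)+"-")
}

func main() {
	reqID := "01HZX3K9ABCDEFGHJKMNPQRSTV" // made-up ULID-shaped ID
	story := storyIDPrefix(reqID) + "-001"

	fmt.Println(storyMatchesReq(story, reqID))             // true: hashed prefix
	fmt.Println(storyMatchesReq(reqID[:8]+"-001", reqID))  // false: the old raw-prefix comparison
	fmt.Println(storyMatchesReq("req1-001", "req1"))       // true: short IDs match verbatim
}
```

Keeping the derivation in one exported function, as the fix does, avoids the matcher and the planner drifting apart again.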

tzone85 added 4 commits June 26, 2026 20:08

tzone85 merged commit ebc4157 into main Jun 26, 2026
6 checks passed

tzone85 deleted the feat/security-agent branch June 26, 2026 19:48

tzone85 mentioned this pull request Jul 2, 2026

Security dogfood hardening + vxd watch matcher fix + preflight scanner check #111

Merged

6 tasks
