For any agent reading this: This document tells you WHERE this project is going, WHAT has been decided, and HOW to contribute without conflicting with the plan.
For the complete philosophical foundation — the three cognitive branches, the DO/DON'T mandates, and the growth flywheel — see COGNITIVE_TREE.md.
defense-in-depth is a governance middleware layer that bridges AI coding agents into human/enterprise operational workflows.
- AI handles: artifact generation, execution plans, mechanical checks
- Humans handle: business logic, ground truth, architecture decisions
- defense-in-depth handles: the gap between them (validation, enforcement, growth)
| Decision | Rationale |
|---|---|
| Git hooks only | No servers, databases, or cloud services required by default |
| YAML and JSON interfaces | Minimal attack surface, maximum portability |
| Cross-platform CI (3 OS × 3 Node) | Must work everywhere agents work |
| Pluggable Providers | Bridging to external systems (Jira, Linear) works via adapters, never bloated core |
Implication for agents: Do NOT introduce external dependencies. If a feature requires infrastructure, it must be opt-in via a TicketStateProvider or similar extension, keeping the core lightweight.
| Decision | Rationale |
|---|---|
| Pluggable Guard interface | Users can add custom validators |
| Pure functions only | No side effects → deterministic, testable |
| Engine runs guards sequentially | Predictable order, clear error attribution |
| PASS/WARN/BLOCK severity | Simple tri-state for clear decisions |
Implication for agents: Every new check = new guard file. No checking logic inside the engine or CLI.
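The decisions above can be sketched as follows. This is a minimal illustration, not the project's actual code: the field names on `Finding` and `GuardContext`, the `envFileGuard` example, and the `runGuards` helper are all assumptions made for this sketch. The real definitions live in src/core/types.ts.

```typescript
// Hypothetical shapes for illustration only.
type Severity = "PASS" | "WARN" | "BLOCK";

interface Finding {
  guard: string;
  severity: Severity;
  message: string;
}

interface GuardContext {
  stagedFiles: string[]; // assumed field: paths staged for commit
}

interface Guard {
  name: string;
  check(ctx: GuardContext): Finding[]; // pure: no I/O, no mutation of ctx
}

// One check = one guard. This example guard flags staged .env files.
const envFileGuard: Guard = {
  name: "envFile",
  check: (ctx) =>
    ctx.stagedFiles
      .filter((file) => file.endsWith(".env"))
      .map((file) => ({
        guard: "envFile",
        severity: "BLOCK" as const,
        message: `${file} should not be committed`,
      })),
};

// The engine only orchestrates: it runs guards in order and aggregates findings.
function runGuards(
  guards: Guard[],
  ctx: GuardContext,
): { verdict: Severity; findings: Finding[] } {
  const findings: Finding[] = [];
  for (const guard of guards) {
    findings.push(...guard.check(ctx));
  }
  // The worst severity wins: BLOCK over WARN over PASS.
  const verdict: Severity = findings.some((f) => f.severity === "BLOCK")
    ? "BLOCK"
    : findings.some((f) => f.severity === "WARN")
      ? "WARN"
      : "PASS";
  return { verdict, findings };
}
```

Because each guard is a pure function of its context, a failing verdict can be attributed to a single guard file and reproduced in a unit test without touching Git.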
| Decision | Rationale |
|---|---|
| EvidenceLevel enum (CODE/RUNTIME/INFER/HYPO) | Forces agents to tag how they verified |
| Finding.evidence field | Guards can attach proof level |
| Future: Lesson.wrongApproach + correctApproach | Án Lệ (case law) records concrete context |
Implication for agents: When reporting findings, ALWAYS specify evidence level. Untagged findings are treated as HYPO.
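A small sketch of the HYPO default, assuming an optional `evidence` field on findings. The enum member names come from the table above; the per-member comments are one plausible reading, and the `effectiveEvidence` helper is hypothetical.

```typescript
// Member names are from the document; the meanings in comments are assumptions.
enum EvidenceLevel {
  CODE = "CODE",       // assumed: verified by reading the code
  RUNTIME = "RUNTIME", // assumed: verified by running it
  INFER = "INFER",     // assumed: inferred from indirect signals
  HYPO = "HYPO",       // unverified hypothesis
}

interface Finding {
  message: string;
  evidence?: EvidenceLevel; // optional so that untagged findings can exist
}

// Hypothetical helper: an untagged finding is treated as HYPO.
function effectiveEvidence(finding: Finding): EvidenceLevel {
  return finding.evidence ?? EvidenceLevel.HYPO;
}
```

Resolving the default in one place keeps consumers from having to handle a missing tag themselves.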
| Decision | Rationale |
|---|---|
| Guards never auto-merge PRs | Human judgment is irreplaceable for semantics |
| Automated Gateways as first-pass reviewer | Reduces the human review burden without replacing it |
| Phase gates require plan files | Prevents "code first, think later" |
Implication for agents: You are NOT autonomous. You propose. Humans approve. For automated first-pass reviews, refer to internal operational rules like rule-coderabbit-integration.md to handle feedback metadata.
| Decision | Rationale |
|---|---|
| Lesson type with recall-friendly fields | Growth requires searchable memory |
| searchTerms + tags + relatedLessons | Enables semantic recall across projects |
| GrowthMetric tracking | Measures learning velocity over time |
| wrongApproach is MANDATORY in lessons | Generic lessons are useless |
Implication for agents: When recording lessons, be SPECIFIC. "Always test code" is rejected. "Guard X missed BOM-prefixed files because regex lacked BOM strip" is accepted.
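As a rough sketch of the mandatory-field rule, the check below rejects a lesson draft that has no `wrongApproach`. The `Lesson` shape is assembled from the field names in the table; `title`, the `validateLesson` helper, and the requirement of at least one search term are assumptions. Structural validation like this cannot judge specificity, which the text above treats as a separate concern.

```typescript
// Hypothetical Lesson shape built from the field names in the table above.
interface Lesson {
  title: string;            // assumed field
  wrongApproach: string;    // mandatory: what was tried and why it failed
  correctApproach: string;
  searchTerms: string[];
  tags: string[];
  relatedLessons: string[]; // assumed to hold identifiers of other lessons
}

// Returns a list of problems; an empty list means the draft passes these checks.
function validateLesson(draft: Partial<Lesson>): string[] {
  const problems: string[] = [];
  if (!draft.wrongApproach?.trim()) {
    problems.push("wrongApproach is mandatory");
  }
  // Assumption: a lesson with no search terms cannot be recalled later.
  if (!draft.searchTerms?.length) {
    problems.push("add at least one search term");
  }
  return problems;
}
```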
| File | Platform | Purpose |
|---|---|---|
| GEMINI.md | Gemini CLI | Bootstrap chain + cognitive framework |
| CLAUDE.md | Claude Code / Antigravity | Bootstrap chain + memory priming |
| .cursorrules | Cursor AI | Comment-based ruleset |
Implication: Any AI agent entering this project has ZERO onboarding friction. They immediately receive: laws, coding standards, quick reference, and cognitive framework. This is meta-prompting — not telling agents what to do, but teaching them how to teach themselves.
| Layer | Type | What it measures |
|---|---|---|
| 0: Guards | Guard, Finding | Is this commit clean? (SHIPPED) |
| 1: Memory | Lesson, GrowthMetric | What did we learn? (DESIGNED) |
| 2: Meta Memory | LessonOutcome, RecallMetric | Are lessons recalled and helpful? |
| 3: Meta Growth | MetaGrowthSnapshot | Is the growth system improving? |
| F: Federation | TelemetryPayload | Bidirectional Internal ↔ OSS data flow |
All types are published in src/core/types.ts — compiled, documented, importable.
See docs/vision/meta-architecture.md for the full vision.
| Phase | Version | Focus | Key Types |
|---|---|---|---|
| Foundation | v0.1 | Core guards + CLI + OSS + prebuilt configs | Guard, Severity, Finding |
| Ecosystem | v0.2 | .agents/ scaffold + 19 rules + 5 skills | GuardContext, config schema |
| Identity | v0.3 | Ticket-aware guards (TKID Lite) | TicketRef |
| Memory | v0.4 | Lesson recording + growth metrics | Lesson, GrowthMetric |
| Intelligence | v0.5 | DSPy adapter + semantic evaluation | EvaluationScore |
| Federation | v0.6 | Parent↔child governance guards | FederationGuardConfig, HttpTicketProvider |
| Meta Memory | v0.7 | Recall quality measurement | LessonOutcome, RecallMetric |
| Meta Growth | v0.8 | Growth acceleration tracking | MetaGrowthSnapshot |
| Telemetry Sync | v0.9 | Bidirectional Internal ↔ OSS data flow | TelemetryPayload |
| Stable | v1.0 | Public API freeze + npm publish | All types frozen |
Status Update (v0.4): Foundation (v0.1), Ecosystem (v0.2), Identity (v0.3) shipped. Memory Layer & Root Pollution Guard (v0.4) shipped:
- `TicketRef` added to `GuardContext`: the engine extracts the TKID from the branch name, commit message, or directory name.
- `TicketIdentityGuard` enforces non-contradiction: if the branch declares TKID `TK-xxx`, the commit must not reference a different ticket. Severity: `WARN` (advisory, not blocking).
- Key architectural insight: Git worktree IS the Dependency Injection mechanism. `DefendEngine(projectRoot)` receives CWD as the scope boundary. All git operations (`branch`, `staged files`, `config`) resolve relative to this root. When an AAOS worktree (`.worktrees/TK-xxx/`) is the CWD, identity and isolation come free from Git. When a standalone project is the CWD, the same code works without modification. Zero lock-in by design.
- Lesson: the `.worktrees` path was initially hardcoded in `extractTicketRef` and has since been removed. Branch name is the canonical TKID source; directory name is a generic fallback.
- Review Ecosystem Enhancement: End-user Gateway profiles should align with AAOS guidelines, integrating assertive architectural analysis and preserving the Git-ignored `.agents/records/reviews/` flow.
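The extraction order and the non-contradiction rule described above might look roughly like this. The function signatures, the `TicketRef` fields, and the TKID pattern (`TK-` followed by letters or digits) are assumptions for this sketch; the source only shows the placeholder form `TK-xxx`.

```typescript
// Assumed TKID format: "TK-" followed by one or more letters or digits.
const TKID_SOURCE = "TK-[A-Za-z0-9]+";

interface TicketRef {
  id: string;
  source: "branch" | "directory";
}

interface Finding {
  severity: "WARN";
  message: string;
}

// Branch name is the canonical source; the directory name is a generic fallback.
// No ".worktrees" path is hardcoded: only the last path segment is inspected.
function extractTicketRef(branch: string, projectRoot: string): TicketRef | undefined {
  const pattern = new RegExp(TKID_SOURCE);
  const fromBranch = branch.match(pattern)?.[0];
  if (fromBranch) return { id: fromBranch, source: "branch" };

  const dirName = projectRoot.split(/[\\/]/).filter(Boolean).pop() ?? "";
  const fromDir = dirName.match(pattern)?.[0];
  return fromDir ? { id: fromDir, source: "directory" } : undefined;
}

// Advisory check: warn when the commit message names a ticket other than the declared one.
function checkTicketIdentity(ref: TicketRef | undefined, commitMessage: string): Finding[] {
  if (!ref) return [];
  const mentioned = commitMessage.match(new RegExp(TKID_SOURCE, "g")) ?? [];
  return mentioned
    .filter((id) => id !== ref.id)
    .map((id) => ({
      severity: "WARN" as const,
      message: `commit references ${id} but ${ref.source} declares ${ref.id}`,
    }));
}
```

Both functions take plain strings, so the caller decides where the branch name and project root come from. That matches the idea that the working directory, not the code, defines the scope.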
Status Update (v0.5): Foundation (v0.1), Ecosystem (v0.2), Identity (v0.3), and Memory (v0.4) shipped. Intelligence (v0.5) shipped:
- `EvaluationScore` type already published in `src/core/types.ts` since v0.1.
- `hollowArtifact` guard enhanced with opt-in DSPy semantic evaluation (`useDspy: true`, `dspyEndpoint`, `dspyTimeoutMs`).
- New `eval` CLI subcommand for standalone artifact quality analysis with DSPy.
- Shared DSPy Client (`src/core/dspy-client.ts`): Extracted and generalized to serve as the universal DSPy integration point across ALL DiD layers. Supports `artifact`, `lesson`, `search`, and `recall` evaluation types.
- Lesson Quality Gate (v0.5.1): `recordLesson()` now optionally evaluates lesson quality via DSPy before persisting. Generic lessons (score < 0.5) are REJECTED. CLI: `--quality-gate` flag.
- Semantic Lesson Search (v0.5.2): `searchLessons()` supports DSPy-powered semantic ranking, replacing `String.includes()` for dramatically better recall. Falls back to string matching when DSPy is unavailable. CLI: `--semantic` flag.
- Guard F1 Metrics: `GuardF1Metric` type + `computeF1()` utility for measuring guard precision, recall, and F1 score. Applies Information Retrieval scoring to the guard pipeline.
- Key architectural insight: DSPy is integrated as an enhancement OF the existing guard, not a separate evaluation subsystem. The zero-infrastructure default is preserved: DSPy is fully opt-in and degrades gracefully when the service is unreachable. Tagline: "Works without AI. Excels WITH AI."
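For reference, the Information Retrieval scoring mentioned above uses the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 as their harmonic mean. The sketch below is one way to compute them. The field names on `GuardF1Metric`, the `computeF1` signature, and the choice to return 0 when a denominator is zero are assumptions, not the published API.

```typescript
// Assumed shape; the real GuardF1Metric type may carry more fields.
interface GuardF1Metric {
  precision: number;
  recall: number;
  f1: number;
}

// truePositives: real problems the guard flagged
// falsePositives: clean inputs the guard flagged anyway
// falseNegatives: real problems the guard missed
function computeF1(
  truePositives: number,
  falsePositives: number,
  falseNegatives: number,
): GuardF1Metric {
  const flagged = truePositives + falsePositives;
  const actual = truePositives + falseNegatives;
  // Assumption: an undefined ratio (zero denominator) is reported as 0.
  const precision = flagged === 0 ? 0 : truePositives / flagged;
  const recall = actual === 0 ? 0 : truePositives / actual;
  const sum = precision + recall;
  const f1 = sum === 0 ? 0 : (2 * precision * recall) / sum;
  return { precision, recall, f1 };
}
```

A guard that flags everything scores high recall but low precision, and F1 penalizes that imbalance. This is why F1 is more informative for guard tuning than either number alone.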
Each phase builds on the previous. Agents MUST NOT implement future-phase features unless explicitly tasked.
Status Update (v0.6): Foundation through Intelligence (v0.1–v0.5) shipped. Federation (v0.6) shipped:
- Federation Guard (`federationGuard`): Pure guard that cross-validates child project execution against parent ticket lifecycle phase. Configurable `blockedParentPhases`, severity modes (block/warn).
- HttpTicketProvider: Network-aware provider using `globalThis.fetch` with `AbortController` for timeout enforcement. Enables cross-project federation via REST endpoints.
- Engine enrichment pipeline: `enrichParentTicket()` resolves parent state as a second-stage enrichment BEFORE the guard pipeline runs, preserving guard purity.
- Graceful degradation: All provider failures (timeout, network error, 404) degrade to WARN findings, never crash the pipeline.
- Key architectural insight: Guard purity is enforced by architecture, not discipline. The engine's enrichment phase handles ALL I/O. Guards only read `ctx.ticket.*` fields. This eliminates an entire class of provider-crash-kills-pipeline bugs.
- Bugs caught by TDD: (1) `FileTicketProvider` leaked an empty-string `parentId` through a `!= null` check. (2) `HttpTicketProvider` silently dropped `parentId` from JSON responses. Both caught by edge/integration tests before release.
- Test suite: 99 tests total, 37 federation-specific (17 guard unit, 8 HTTP provider, 6 file provider, 6 engine integration).
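The timeout and graceful-degradation behavior described above can be sketched with `AbortController`. Everything here is illustrative: the `TicketState` fields, the `FetchLike` type, the `fetchTicketWithTimeout` helper, and this signature for `enrichParentTicket` are assumptions, and the fetch function is injected so the sketch can run without a network.

```typescript
// Hypothetical shapes for illustration only.
interface TicketState {
  id: string;
  phase: string;
  parentId?: string;
}

interface Finding {
  severity: "WARN";
  message: string;
}

// Minimal subset of the fetch contract used by this sketch.
type FetchLike = (
  url: string,
  init: { signal: AbortSignal },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Aborts the request if it has not settled within timeoutMs.
async function fetchTicketWithTimeout(
  url: string,
  timeoutMs: number,
  fetchImpl: FetchLike,
): Promise<TicketState> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { signal: controller.signal });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return (await res.json()) as TicketState;
  } finally {
    clearTimeout(timer); // always release the timer, on success or failure
  }
}

// Enrichment stage: all I/O happens here, before any guard runs.
// Any provider failure becomes a WARN finding instead of an exception.
async function enrichParentTicket(
  url: string,
  timeoutMs: number,
  fetchImpl: FetchLike,
): Promise<{ parent?: TicketState; findings: Finding[] }> {
  try {
    const parent = await fetchTicketWithTimeout(url, timeoutMs, fetchImpl);
    return { parent, findings: [] };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return {
      findings: [{ severity: "WARN", message: `parent ticket unavailable: ${reason}` }],
    };
  }
}
```

In real use the caller would pass `globalThis.fetch` as `fetchImpl`. Guards downstream then read only the resolved `parent` value, so a slow or failing endpoint cannot crash them.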
Multi-Agent Operations (v0.6.1): Established formal strategy for leveraging external AI tools alongside operational agents:
- Agent taxonomy: Operational Agents (Main Agent: human-commanded, core builders) vs External Agents (Jules, CodeRabbit: third-party tools leveraged for optimization).
- Jules Integration: External async builder for routine tasks (tests, bug fixes, docs). Constrained by `.agents/contracts/jules.md`. NOT a core dependency.
- CodeRabbit Hardening: External PR reviewer. `.coderabbit.yaml` expanded with path instructions for `tests/**`, `docs/**`, `.agents/**`.
- Key clarification: This multi-agent setup is DiD's internal development strategy for optimizing its own workflow. It is NOT a requirement imposed on DiD package users.
- Architectural insight: The distinction between "operational agent" (trusted, human-commanded) and "external agent" (constrained, config-bounded) is itself a defense-in-depth principle applied to AI governance.