
# 📋 STRATEGY.md — Strategic Metadata for defense-in-depth

For any agent reading this: This document tells you WHERE this project is going, WHAT has been decided, and HOW to contribute without conflicting with the plan.


## Mission

For the complete philosophical foundation — the three cognitive branches, the DO/DON'T mandates, and the growth flywheel — see COGNITIVE_TREE.md.

defense-in-depth is a governance middleware layer that bridges AI coding agents into human/enterprise operational workflows.

- AI handles: artifact generation, execution plans, mechanical checks
- Humans handle: business logic, ground truth, architecture decisions
- defense-in-depth handles: the gap between them (validation, enforcement, growth)

## Strategic Pillars

### 1. CLI-First, Zero-Infrastructure (Depends-On Philosophy)

| Decision | Rationale |
| --- | --- |
| Git hooks only | No servers, databases, or cloud services required by default |
| YAML and JSON interfaces | Minimal attack surface, maximum portability |
| Cross-platform CI (3 OS × 3 Node) | Must work everywhere agents work |
| Pluggable providers | Bridging to external systems (Jira, Linear) works via adapters, never a bloated core |

Implication for agents: Do NOT introduce external dependencies. If a feature requires infrastructure, it must be opt-in via a TicketStateProvider or similar extension, keeping the core lightweight.
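As a sketch of what such an opt-in extension could look like: the `TicketRef` fields, the `resolve` signature, and the Jira endpoint below are illustrative assumptions, not the published API. The point is that all external I/O lives in the adapter, never in the core.

```typescript
// Hypothetical sketch — field and method names are assumptions for illustration.
interface TicketRef {
  id: string;
  phase?: string;
}

interface TicketStateProvider {
  resolve(ticketId: string): Promise<TicketRef | null>;
}

// Opt-in adapter: all Jira-specific I/O is confined to this class.
class JiraTicketProvider implements TicketStateProvider {
  constructor(
    private baseUrl: string,
    // fetch is injectable so the adapter is testable without a network.
    private fetchFn: typeof fetch = globalThis.fetch,
  ) {}

  async resolve(ticketId: string): Promise<TicketRef | null> {
    try {
      const res = await this.fetchFn(`${this.baseUrl}/rest/api/2/issue/${ticketId}`);
      if (!res.ok) return null;
      const issue: any = await res.json();
      return { id: ticketId, phase: issue?.fields?.status?.name };
    } catch {
      return null; // degrade gracefully: a provider failure never crashes the core
    }
  }
}
```

Users who never configure such a provider pay zero infrastructure cost; the default path remains local git hooks only.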

### 2. Guard Pipeline Architecture

| Decision | Rationale |
| --- | --- |
| Pluggable `Guard` interface | Users can add custom validators |
| Pure functions only | No side effects → deterministic, testable |
| Engine runs guards sequentially | Predictable order, clear error attribution |
| PASS/WARN/BLOCK severity | Simple tri-state for clear decisions |

Implication for agents: Every new check = new guard file. No checking logic inside the engine or CLI.
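A minimal sketch of this shape, under the assumption that a guard is a pure function from context to findings (the exact published signatures in `src/core/types.ts` may differ; the `rootPollutionGuard` body and its allow-list are illustrative):

```typescript
// Illustrative types following the pillar table above.
type Severity = "PASS" | "WARN" | "BLOCK";

interface Finding {
  guard: string;
  severity: Severity;
  message: string;
}

interface GuardContext {
  stagedFiles: string[];
}

// A guard is a pure function: same context in, same findings out, no I/O.
type Guard = (ctx: GuardContext) => Finding[];

// Hypothetical guard: flag unexpected files at the project root.
const rootPollutionGuard: Guard = (ctx) => {
  const allowed = ["README.md", "package.json", "LICENSE"];
  const offenders = ctx.stagedFiles.filter(
    (f) => !f.includes("/") && !allowed.includes(f),
  );
  return offenders.map((f): Finding => ({
    guard: "rootPollution",
    severity: "WARN",
    message: `Unexpected file at project root: ${f}`,
  }));
};
```

Because the guard touches nothing outside its argument, the engine can run it in any environment and a test can exercise it with a plain object literal.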

### 3. Trust-but-Verify (Evidence System)

| Decision | Rationale |
| --- | --- |
| `EvidenceLevel` enum (CODE/RUNTIME/INFER/HYPO) | Forces agents to tag how they verified |
| `Finding.evidence` field | Guards can attach a proof level |
| Future: `Lesson.wrongApproach` + `correctApproach` | Án Lệ (case law) records concrete context |

Implication for agents: When reporting findings, ALWAYS specify evidence level. Untagged findings are treated as HYPO.
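The tagging rule can be sketched as follows; the enum values come from the table above, and the default-to-HYPO normalization mirrors the rule just stated (the `Finding` shape and helper name are illustrative assumptions):

```typescript
// Evidence levels, strongest to weakest.
enum EvidenceLevel {
  CODE = "CODE",       // verified by reading the code
  RUNTIME = "RUNTIME", // verified by executing it
  INFER = "INFER",     // inferred from related evidence
  HYPO = "HYPO",       // hypothesis, unverified
}

interface Finding {
  guard: string;
  message: string;
  evidence?: EvidenceLevel; // optional on the wire...
}

// ...but never optional in interpretation: untagged findings are treated as HYPO.
function effectiveEvidence(f: Finding): EvidenceLevel {
  return f.evidence ?? EvidenceLevel.HYPO;
}
```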

### 4. HITL as Supreme Rule

| Decision | Rationale |
| --- | --- |
| Guards never auto-merge PRs | Human judgment is irreplaceable for semantics |
| Automated Gateways as first-pass reviewer | Reduces the human review burden, never replaces it |
| Phase gates require plan files | Prevents "code first, think later" |

Implication for agents: You are NOT autonomous. You propose. Humans approve. For automated first-pass reviews, refer to internal operational rules like rule-coderabbit-integration.md to handle feedback metadata.

### 5. Growth Engine (Future)

| Decision | Rationale |
| --- | --- |
| `Lesson` type with recall-friendly fields | Growth requires searchable memory |
| `searchTerms` + `tags` + `relatedLessons` | Enables semantic recall across projects |
| `GrowthMetric` tracking | Measures learning velocity over time |
| `wrongApproach` is MANDATORY in lessons | Generic lessons are useless |

Implication for agents: When recording lessons, be SPECIFIC. "Always test code" is rejected. "Guard X missed BOM-prefixed files because regex lacked BOM strip" is accepted.
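A minimal sketch of such a lesson record and a crude specificity gate: the fields beyond those listed in the table above, the generic-phrase list, and the length threshold are all assumptions for illustration, not the designed v0.4 behavior.

```typescript
// Hypothetical Lesson shape built from the recall-friendly fields above.
interface Lesson {
  title: string;
  wrongApproach: string;   // mandatory: what concretely failed
  correctApproach: string;
  searchTerms: string[];
  tags: string[];
  relatedLessons: string[];
}

// Crude gate in the spirit of the rule: reject lessons whose wrongApproach
// is too short or too generic to be recalled usefully later.
function isSpecificEnough(lesson: Lesson): boolean {
  const genericPhrases = ["always test", "be careful", "write good code"];
  const text = lesson.wrongApproach.toLowerCase();
  return text.length >= 40 && !genericPhrases.some((g) => text.startsWith(g));
}
```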

### 6. Prebuilt Agent Configs (Meta Prompting Materialized)

| File | Platform | Purpose |
| --- | --- | --- |
| `GEMINI.md` | Gemini CLI | Bootstrap chain + cognitive framework |
| `CLAUDE.md` | Claude Code / Antigravity | Bootstrap chain + memory priming |
| `.cursorrules` | Cursor AI | Comment-based ruleset |

Implication: Any AI agent entering this project has ZERO onboarding friction. It immediately receives the laws, coding standards, quick reference, and cognitive framework. This is meta-prompting — not telling agents what to do, but teaching them how to teach themselves.

### 7. Meta Layers (Vision — Published as Types)

| Layer | Type | What it measures |
| --- | --- | --- |
| 0: Guards | `Guard`, `Finding` | Is this commit clean? (SHIPPED) |
| 1: Memory | `Lesson`, `GrowthMetric` | What did we learn? (DESIGNED) |
| 2: Meta Memory | `LessonOutcome`, `RecallMetric` | Are lessons recalled and helpful? |
| 3: Meta Growth | `MetaGrowthSnapshot` | Is the growth system improving? |
| F: Federation | `TelemetryPayload` | Bidirectional Internal ↔ OSS data flow |

All types are published in `src/core/types.ts` — compiled, documented, importable. See `docs/vision/meta-architecture.md` for the full vision.


## Roadmap (Tactical)

| Phase | Version | Focus | Key Types |
| --- | --- | --- | --- |
| Foundation | v0.1 | Core guards + CLI + OSS + prebuilt configs | `Guard`, `Severity`, `Finding` |
| Ecosystem | v0.2 | `.agents/` scaffold + 19 rules + 5 skills | `GuardContext`, config schema |
| Identity | v0.3 | Ticket-aware guards (TKID Lite) | `TicketRef` |
| Memory | v0.4 | Lesson recording + growth metrics | `Lesson`, `GrowthMetric` |
| Intelligence | v0.5 | DSPy adapter + semantic evaluation | `EvaluationScore` |
| Federation | v0.6 | Parent↔child governance guards | `FederationGuardConfig`, `HttpTicketProvider` |
| Meta Memory | v0.7 | Recall quality measurement | `LessonOutcome`, `RecallMetric` |
| Meta Growth | v0.8 | Growth acceleration tracking | `MetaGrowthSnapshot` |
| Telemetry Sync | v0.9 | Bidirectional Internal ↔ OSS data flow | `TelemetryPayload` |
| Stable | v1.0 | Public API freeze + npm publish | All types frozen |

Status Update (v0.4): Foundation (v0.1), Ecosystem (v0.2), and Identity (v0.3) shipped. Memory Layer & Root Pollution Guard (v0.4) shipped:

- `TicketRef` added to `GuardContext` — the engine extracts the TKID from the branch name, commit message, or directory name.
- `TicketIdentityGuard` enforces non-contradiction: if the branch declares TKID `TK-xxx`, the commit must not reference a different ticket. Severity: WARN (advisory, not blocking).
- Key architectural insight: the Git worktree IS the dependency-injection mechanism. `DefendEngine(projectRoot)` receives the CWD as the scope boundary, and all git operations (branch, staged files, config) resolve relative to this root. When an AAOS worktree (`.worktrees/TK-xxx/`) is the CWD, identity and isolation come free from Git. When a standalone project is the CWD, the same code works without modification. Zero lock-in by design.
- Lesson: the `.worktrees` path was initially hardcoded in `extractTicketRef` — removed. The branch name is the canonical TKID source; the directory name is a generic fallback.
- Review Ecosystem Enhancement: end-user Gateway profiles should align with AAOS guidelines, integrating assertive architectural analysis and preserving the Git-ignored `.agents/records/reviews/` flow.
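The branch-first extraction order from that lesson can be sketched as follows. The `TK-\d+` pattern and the function signature are assumptions for illustration; the structural point is that no `.worktrees` prefix is hardcoded, so worktrees and standalone projects share one code path.

```typescript
// Illustrative TKID pattern — the real guard may accept other ticket formats.
const TKID = /TK-\d+/;

function extractTicketRef(branch: string, cwd: string): string | null {
  // 1. Canonical source: the branch name (e.g. "feat/TK-42-add-guard").
  const fromBranch = branch.match(TKID);
  if (fromBranch) return fromBranch[0];
  // 2. Generic fallback: any segment of the CWD path. No hardcoded
  //    ".worktrees" prefix, so a standalone project works unchanged.
  const fromDir = cwd.match(TKID);
  return fromDir ? fromDir[0] : null;
}
```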

Status Update (v0.5): Foundation (v0.1), Ecosystem (v0.2), Identity (v0.3), and Memory (v0.4) shipped. Intelligence (v0.5) shipped:

- The `EvaluationScore` type has been published in `src/core/types.ts` since v0.1.
- `hollowArtifact` guard enhanced with opt-in DSPy semantic evaluation (`useDspy: true`, `dspyEndpoint`, `dspyTimeoutMs`).
- New `eval` CLI subcommand for standalone artifact quality analysis with DSPy.
- Shared DSPy Client (`src/core/dspy-client.ts`): extracted and generalized to serve as the universal DSPy integration point across ALL DiD layers. Supports artifact, lesson, search, and recall evaluation types.
- Lesson Quality Gate (v0.5.1): `recordLesson()` now optionally evaluates lesson quality via DSPy before persisting. Generic lessons (score < 0.5) are REJECTED. CLI: `--quality-gate` flag.
- Semantic Lesson Search (v0.5.2): `searchLessons()` supports DSPy-powered semantic ranking, replacing `String.includes()` for dramatically better recall. Falls back to string matching when DSPy is unavailable. CLI: `--semantic` flag.
- Guard F1 Metrics: `GuardF1Metric` type + `computeF1()` utility for measuring guard precision, recall, and F1 score. Applies Information Retrieval scoring to the guard pipeline.
- Key architectural insight: DSPy is integrated as an enhancement OF the existing guard, not a separate evaluation subsystem. The zero-infrastructure default is preserved — DSPy is fully opt-in and degrades gracefully when the service is unreachable. Tagline: "Works without AI. Excels WITH AI."
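The IR-style scoring behind `computeF1()` is standard precision/recall arithmetic; this sketch shows the math, though the published signature of the utility may differ:

```typescript
interface GuardF1Metric {
  precision: number; // of the findings the guard raised, how many were real issues
  recall: number;    // of the real issues, how many the guard caught
  f1: number;        // harmonic mean of the two
}

function computeF1(
  truePositives: number,
  falsePositives: number,
  falseNegatives: number,
): GuardF1Metric {
  // Math.max(..., 1) guards against division by zero when counts are empty.
  const precision = truePositives / Math.max(truePositives + falsePositives, 1);
  const recall = truePositives / Math.max(truePositives + falseNegatives, 1);
  const f1 =
    precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}
```

For example, a guard with 8 true findings, 2 false alarms, and 2 misses scores 0.8 on precision, recall, and F1.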

Each phase builds on the previous. Agents MUST NOT implement future-phase features unless explicitly tasked.

Status Update (v0.6): Foundation through Intelligence (v0.1–v0.5) shipped. Federation (v0.6) shipped:

- Federation Guard (`federationGuard`): pure guard that cross-validates child-project execution against the parent ticket's lifecycle phase. Configurable `blockedParentPhases` and severity modes (block/warn).
- `HttpTicketProvider`: network-aware provider using `globalThis.fetch` with `AbortController` for timeout enforcement. Enables cross-project federation via REST endpoints.
- Engine enrichment pipeline: `enrichParentTicket()` resolves parent state as a second-stage enrichment BEFORE the guard pipeline runs, preserving guard purity.
- Graceful degradation: all provider failures (timeout, network error, 404) degrade to WARN findings and never crash the pipeline.
- Key architectural insight: guard purity is enforced by architecture, not discipline. The engine's enrichment phase handles ALL I/O; guards only read `ctx.ticket.*` fields. This eliminates an entire class of provider-crash-kills-pipeline bugs.
- Bugs caught by TDD: (1) `FileTicketProvider` leaked an empty-string `parentId` through a `!= null` check. (2) `HttpTicketProvider` silently dropped `parentId` from JSON responses. Both were caught by edge/integration tests before release.
- Test suite: 99 tests total, 37 federation-specific (17 guard unit, 8 HTTP provider, 6 file provider, 6 engine integration).
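The timeout-plus-degradation pattern described above can be sketched like this. The helper name, the `WarnFinding` shape, and the endpoint are illustrative assumptions, but the mechanism — injectable `fetch`, `AbortController`-based timeout, every failure mode mapped to a WARN finding — mirrors the design:

```typescript
// Illustrative finding shape: degraded lookups are always advisory, never fatal.
type WarnFinding = { guard: string; severity: "WARN"; message: string };

async function fetchParentTicket(
  url: string,
  timeoutMs: number,
  fetchFn: typeof fetch = globalThis.fetch, // injectable for tests
): Promise<{ ticket?: unknown; finding?: WarnFinding }> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchFn(url, { signal: controller.signal });
    if (!res.ok) {
      // 404 and friends degrade to WARN instead of throwing.
      return {
        finding: {
          guard: "federation",
          severity: "WARN",
          message: `Parent ticket lookup failed: HTTP ${res.status}`,
        },
      };
    }
    return { ticket: await res.json() };
  } catch (err) {
    // Timeout, DNS failure, refused connection — all degrade to WARN too.
    return {
      finding: {
        guard: "federation",
        severity: "WARN",
        message: `Parent ticket unreachable: ${String(err)}`,
      },
    };
  } finally {
    clearTimeout(timer);
  }
}
```

Because the engine performs this call during enrichment, guards downstream stay pure: they see either an enriched ticket or a WARN finding, never an exception.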

Multi-Agent Operations (v0.6.1): Established a formal strategy for leveraging external AI tools alongside operational agents:

- Agent taxonomy: Operational Agents (Main Agent — human-commanded, core builders) vs External Agents (Jules, CodeRabbit — third-party tools leveraged for optimization).
- Jules Integration: external async builder for routine tasks (tests, bug fixes, docs). Constrained by `.agents/contracts/jules.md`. NOT a core dependency.
- CodeRabbit Hardening: external PR reviewer. `.coderabbit.yaml` expanded with path instructions for `tests/**`, `docs/**`, `.agents/**`.
- Key clarification: this multi-agent setup is DiD's internal development strategy for optimizing its own workflow. It is NOT a requirement imposed on DiD package users.
- Architectural insight: the distinction between "operational agent" (trusted, human-commanded) and "external agent" (constrained, config-bounded) is itself a defense-in-depth principle applied to AI governance.