In fact, forget the park. And the blackjack.
claude-mem works. The ideas are proven: hook-based lifecycle capture, AI-powered compression, relational profiling, context injection. But the implementation is buried under an ecosystem tax that makes it fragile and hostile to new environments:
- 5 runtimes to do one job: Node.js, Bun, Python (for Chroma), uv, SQLite CLI
- bun-runner.js exists solely to find bun when it's not in PATH
- smart-install.js exists solely to install the runtimes that run the actual code
- devbox exists solely to make the runtimes reproducible
- SessionStore has a duplicate migration chain because two subsystems evolved independently
- Stop hooks don't reliably fire because the bun→node→worker relay chain has too many failure modes
The architecture is sound. The substrate is the problem.
Replace the entire JS/Python stack with Continuity — a single statically-linked Go binary. Same concepts, clean-room implementation, zero runtime dependencies.
$ continuity
Usage:
continuity serve # Start HTTP API + worker
continuity hook <event> # Handle Claude Code hook (reads stdin)
continuity search <query> # Search memories from CLI
continuity profile # Show relational profile
continuity tree [path] # Browse memory tree
continuity import # Migrate from claude-mem SQLite DB
continuity version # Print version
One binary. Ships as a GitHub release. brew install continuity or curl | sh. No npm, no bun, no devbox, no smart-install. Hook scripts become:
{
"type": "command",
"command": "continuity hook stop --transcript=${CLAUDE_TRANSCRIPT}",
"timeout": 120
}

┌─────────────────────────────────────────────────────┐
│ continuity binary │
├──────────┬──────────┬───────────┬───────────────────┤
│ CLI │ Hooks │ HTTP API │ Worker (goroutines)│
│ search │ start │ /context │ compress │
│ profile │ submit │ /search │ extract │
│ tree │ tool │ /health │ decay │
│ import │ stop │ /profile │ vectorize │
│ │ end │ /tree │ │
├──────────┴──────────┴───────────┴───────────────────┤
│ Memory Engine │
│ ┌─────────┐ ┌──────────┐ ┌────────┐ ┌───────────┐ │
│ │MemTree │ │Extractor │ │Relator │ │ Retriever │ │
│ │(L0/L1/L2)│ │(taxonomy)│ │(profile)│ │(hybrid) │ │
│ └─────────┘ └──────────┘ └────────┘ └───────────┘ │
├─────────────────────────────────────────────────────┤
│ Storage Layer │
│ ┌──────────────────┐ ┌──────────────────────────┐ │
│ │ SQLite │ │ Embedded Vector Index │ │
│ │ (modernc.org) │ │ (pure Go, no CGO) │ │
│ │ ~/.continuity/ │ │ HNSW in same .db file │ │
│ └──────────────────┘ └──────────────────────────┘ │
├─────────────────────────────────────────────────────┤
│ LLM Clients │
│ Claude CLI │ Anthropic API │ Gemini │ Ollama │
└─────────────────────────────────────────────────────┘
Flat vector databases treat all content as equal-weight chunks floating in embedding space. This sucks for two reasons: (1) you burn tokens retrieving irrelevant detail, and (2) you can't browse — you can only search.
We steal OpenViking's filesystem paradigm but implement it in SQLite, not on disk.
CREATE TABLE mem_nodes (
id INTEGER PRIMARY KEY,
uri TEXT NOT NULL UNIQUE, -- 'mem://user/profile/coding-style'
parent_uri TEXT, -- 'mem://user/profile/'
node_type TEXT NOT NULL, -- 'dir' | 'leaf'
category TEXT NOT NULL, -- taxonomy category (see §5)
-- Three-tier content
l0_abstract TEXT, -- ~100 tokens. Vector search surface.
l1_overview TEXT, -- ~2K tokens. Structured summary.
l2_content TEXT, -- Full content. Loaded on demand.
-- Merge control
mergeable INTEGER NOT NULL DEFAULT 0, -- Can this be updated in place?
merged_from TEXT, -- JSON array of source node IDs if merged
-- Decay
relevance REAL NOT NULL DEFAULT 1.0, -- Decays over time
last_access INTEGER, -- epoch ms, boosted on retrieval
access_count INTEGER NOT NULL DEFAULT 0,
-- Metadata
source_session TEXT, -- session ID that created this
created_at INTEGER NOT NULL, -- epoch ms
updated_at INTEGER NOT NULL, -- epoch ms
FOREIGN KEY (parent_uri) REFERENCES mem_nodes(uri)
);
CREATE INDEX idx_nodes_parent ON mem_nodes(parent_uri);
CREATE INDEX idx_nodes_category ON mem_nodes(category);
CREATE INDEX idx_nodes_relevance ON mem_nodes(relevance DESC);

Every memory gets three representations, auto-generated by LLM:
| Tier | Size | Purpose | When loaded |
|---|---|---|---|
| L0 | ~100 tokens | One-line abstract. This is the vector search surface. | Always — it's what gets embedded and searched |
| L1 | ~2K tokens | Structured overview. Enough to decide if L2 is needed. | Context injection, browsing |
| L2 | Unlimited | Full original content. Raw observations, full transcripts. | On-demand deep retrieval only |
Why this matters: Current claude-mem injects compressed observations at full fidelity. With L0/L1/L2, Continuity injects L1s — the agent gets the shape of what happened without the weight. If it needs detail, it can pull specific L2s via the search skill.
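As a rough illustration of progressive loading, the sketch below models a node's three content columns and returns the requested tier. The `Tier` type, the `Content` method, and the fallback to a shorter tier when the requested one is empty are assumptions made for this example, not part of the design.

```go
package main

import "fmt"

// Tier identifies one of the three representations of a memory.
type Tier int

const (
	L0 Tier = iota // ~100-token abstract (vector search surface)
	L1             // ~2K-token structured overview
	L2             // full content, loaded on demand
)

// Node holds the three-tier content columns of a mem_nodes row.
type Node struct {
	URI        string
	L0Abstract string
	L1Overview string
	L2Content  string
}

// Content returns the requested tier. If that tier is empty it falls
// back to the next shorter one (an assumption of this sketch).
func (n Node) Content(t Tier) string {
	switch {
	case t >= L2 && n.L2Content != "":
		return n.L2Content
	case t >= L1 && n.L1Overview != "":
		return n.L1Overview
	default:
		return n.L0Abstract
	}
}

func main() {
	n := Node{
		URI:        "mem://sessions/abc123/summary",
		L0Abstract: "Fixed a migration bug",
		L1Overview: "Overview: the migration failed, then was fixed",
	}
	fmt.Println(n.Content(L1)) // what context injection would use
	fmt.Println(n.Content(L2)) // L2 is empty here, so L1 is returned
}
```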
mem://user/profile/ # User identity & preferences
mem://user/profile/coding-style # Mergeable leaf
mem://user/profile/communication # Mergeable leaf (relational)
mem://user/entities/ # People, projects, services
mem://user/entities/acme-api # Immutable leaf
mem://user/events/ # Things that happened
mem://user/events/2026-02-25-deploy # Immutable leaf
mem://agent/patterns/ # Reusable techniques learned
mem://agent/patterns/error-handling # Mergeable leaf
mem://agent/cases/ # Problem→solution pairs
mem://agent/cases/sqlite-migration # Immutable leaf
mem://sessions/ # Session archive
mem://sessions/abc123/ # Per-session directory
mem://sessions/abc123/summary # L1 session summary
mem://sessions/abc123/observations # L2 full observations
Deterministic access: continuity tree mem://user/profile/ shows all profile nodes without search. Debugging is trivial — you can browse the tree.
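Browsing relies on being able to derive a node's parent from its URI. A minimal sketch of that step follows; the function names are hypothetical, and treating a trailing slash as the directory marker is an assumption read off the example URIs.

```go
package main

import (
	"fmt"
	"strings"
)

const scheme = "mem://"

// IsDir reports whether a URI names a directory node (trailing slash).
func IsDir(uri string) bool {
	return strings.HasPrefix(uri, scheme) && strings.HasSuffix(uri, "/")
}

// ParentURI returns the parent directory URI, including its trailing
// slash, or "" for a top-level node or a URI without the mem:// scheme.
func ParentURI(uri string) string {
	if !strings.HasPrefix(uri, scheme) {
		return ""
	}
	path := strings.TrimSuffix(strings.TrimPrefix(uri, scheme), "/")
	i := strings.LastIndex(path, "/")
	if i < 0 {
		return "" // e.g. mem://sessions/ has no parent
	}
	return scheme + path[:i+1]
}

func main() {
	for _, u := range []string{
		"mem://user/profile/coding-style",
		"mem://user/profile/",
		"mem://sessions/",
	} {
		fmt.Printf("%-34s dir=%-5v parent=%q\n", u, IsDir(u), ParentURI(u))
	}
}
```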
Six categories with explicit merge rules. Stolen from OpenViking's insight that profile data should be updated but historical events should be preserved.
| Category | Owner | Mergeable | Description |
|---|---|---|---|
| profile | user | yes | Identity attributes: coding style, tool preferences, communication patterns. Updated as understanding deepens. |
| preferences | user | yes | Changeable choices: "uses bun not npm", "prefers Go over Rust". Overwritten when preferences change. |
| entities | user | no | People, projects, services, APIs the user works with. Each is a distinct node. Never merged — acme-api and beta-service stay separate. |
| events | user | no | Completed actions with timestamps: deployments, bug fixes, decisions. Immutable historical record. |
| patterns | agent | yes | Reusable techniques: "this codebase uses X pattern for Y". Merged as understanding refines. |
| cases | agent | no | Problem→solution pairs: "SQLite migration failed because SessionStore has its own chain". Immutable reference. |
After each session (Stop hook), the extractor:
- Reads the transcript
- Condenses it (all user messages, assistant bookends, drop tool_use/tool_result)
- Sends to LLM with the taxonomy definitions
- LLM returns structured candidates: { category, uri_hint, l0, l1, l2, merge_target? }
- Vector dedup against existing L0s
- For each candidate: skip (duplicate), create (new node), or merge (update existing mergeable node)
- Persist to SQLite + update vector index
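The skip/create/merge decision in the steps above could look roughly like the sketch below. The `Candidate` fields follow the structure listed in the steps and the mergeable set follows the taxonomy table; the two similarity thresholds and the `Decide` function are illustrative assumptions, not values from the design.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Candidate mirrors the structured output returned by the LLM.
type Candidate struct {
	Category    string `json:"category"`
	URIHint     string `json:"uri_hint"`
	L0          string `json:"l0"`
	L1          string `json:"l1"`
	L2          string `json:"l2"`
	MergeTarget string `json:"merge_target,omitempty"`
}

// mergeable lists the categories the taxonomy marks as mergeable.
var mergeable = map[string]bool{"profile": true, "preferences": true, "patterns": true}

type Action string

const (
	Skip   Action = "skip"
	Create Action = "create"
	Merge  Action = "merge"
)

// Assumed thresholds on cosine similarity between L0 abstracts.
const (
	duplicateSim = 0.95 // at or above: treat as a duplicate
	relatedSim   = 0.80 // at or above: same topic, merge if allowed
)

// Decide picks an action from the best similarity between the
// candidate's L0 and any existing L0.
func Decide(c Candidate, bestSim float64) Action {
	switch {
	case bestSim >= duplicateSim:
		return Skip
	case bestSim >= relatedSim && mergeable[c.Category]:
		return Merge
	default:
		return Create
	}
}

func main() {
	raw := `{"category":"patterns","uri_hint":"mem://agent/patterns/error-handling",
	         "l0":"Wrap errors with context","l1":"overview","l2":"full text"}`
	var c Candidate
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		panic(err)
	}
	fmt.Println(Decide(c, 0.85)) // related and mergeable
	fmt.Println(Decide(c, 0.30)) // unrelated
}
```

Immutable categories never reach the merge branch, so a related event or case becomes a new node.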
The relational profile from the current implementation maps to mem://user/profile/communication — a mergeable node in the profile category. Same extraction prompt (feedback calibration, working dynamic, corrections received, earned signals), same compounding behavior. But now it's a first-class node in the tree, not a separate table.
Memories are not forever. Stolen from supermemory's insight that human memory fades.
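// DecayMemories runs daily (or on serve startup).
//   - Half-life: 90 days without access.
//   - Each access resets the decay clock.
//   - Relevance floor: 0.1 (never fully forgotten, just deprioritized).
//
// Relevance is recomputed from the time since last access instead of
// being multiplied into the stored value, so daily runs do not compound
// and the 90-day half-life holds. The communication profile node is
// excluded because it is exempt from decay.
func (e *Engine) DecayMemories() error {
	_, err := e.db.Exec(`
		UPDATE mem_nodes
		SET relevance = MAX(0.1, pow(0.5,
			(strftime('%s','now') * 1000 - last_access) / (90.0 * 86400000)
		))
		WHERE last_access IS NOT NULL
		  AND node_type = 'leaf'
		  AND uri <> 'mem://user/profile/communication'
	`)
	return err
}

Retrieval boosts relevance: when a memory is included in context injection or returned from search, its last_access and access_count update. Frequently useful memories stay vivid. Stale ones fade toward the 0.1 floor but never disappear; a search can still surface them.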
Decay exemptions: Relational profile nodes (mem://user/profile/communication) don't decay. How you work with someone doesn't become less relevant over time.
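The half-life arithmetic can be checked outside SQL. The sketch below is a pure-Go version of the same rule, assuming relevance is measured from the value a node had at its last access; `Decayed` and its parameters are names invented for this example.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

const (
	halfLife = 90 * 24 * time.Hour // 90 days without access
	floor    = 0.1                 // never fully forgotten
)

// Decayed returns the relevance of a memory that had relevance `base`
// at its last access and has been idle for `idle` since then.
func Decayed(base float64, idle time.Duration) float64 {
	r := base * math.Pow(0.5, float64(idle)/float64(halfLife))
	return math.Max(floor, r)
}

func main() {
	day := 24 * time.Hour
	fmt.Println(Decayed(1.0, 0))       // 1
	fmt.Println(Decayed(1.0, 90*day))  // 0.5
	fmt.Println(Decayed(1.0, 180*day)) // 0.25
	fmt.Println(Decayed(1.0, 360*day)) // 0.1 (0.0625 clamped to the floor)
}
```

With a floor of 0.1, a fully relevant memory reaches the floor after roughly 299 idle days, since 0.5^(299/90) is about 0.1.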
Two modes, matching OpenViking's find/search split:
find(): direct vector similarity against L0 abstracts. Returns top-k nodes ranked by similarity * relevance. Good for direct lookups: "what do I know about the acme API?"
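A brute-force version of this lookup fits in a few lines. The sketch below scores every entry by cosine similarity times relevance and keeps the top k; the `Entry`, `Hit`, and `Find` names are hypothetical, and the toy two-dimensional embeddings stand in for real ones.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Entry pairs a node with the embedding of its L0 abstract.
type Entry struct {
	URI       string
	Embedding []float32
	Relevance float64
}

// Hit is one ranked result.
type Hit struct {
	URI   string
	Score float64
}

// Cosine returns the cosine similarity of a and b, or 0 when the
// vectors differ in length or either one has zero magnitude.
func Cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Find ranks entries by similarity * relevance and returns the top k.
func Find(query []float32, entries []Entry, k int) []Hit {
	hits := make([]Hit, 0, len(entries))
	for _, e := range entries {
		hits = append(hits, Hit{e.URI, Cosine(query, e.Embedding) * e.Relevance})
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	entries := []Entry{
		{"mem://user/entities/acme-api", []float32{1, 0}, 0.2},       // exact match, but faded
		{"mem://agent/cases/sqlite-migration", []float32{0.6, 0.8}, 1.0}, // weaker match, fresh
	}
	for _, h := range Find([]float32{1, 0}, entries, 2) {
		fmt.Printf("%.2f %s\n", h.Score, h.URI)
	}
}
```

Multiplying by relevance means a faded exact match can rank below a fresher partial match, which is the intended effect of decay.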
search(): LLM-assisted intent decomposition followed by tree-aware retrieval:
- Intent analysis (LLM): decompose query into 1-3 typed sub-queries, each tagged MEMORY/RESOURCE/PATTERN
- Tree-aware retrieval: for each sub-query:
  - Vector search against L0s to find candidate directories
  - Walk up to parent, check sibling relevance (contextual gravity)
  - Score = 0.5 * embedding_similarity + 0.3 * relevance + 0.2 * parent_score
- Progressive loading: return L1s by default, let the agent request specific L2s
- Convergence: stop after 3 rounds of stable top-k
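The scoring formula and the stopping rule can be sketched as below. The weights come from the list above; reading "3 rounds of stable top-k" as three consecutive rounds with an identical ordered top-k is an assumption, and the function names are hypothetical.

```go
package main

import "fmt"

// Score combines the three signals with the weights given above.
func Score(embeddingSimilarity, relevance, parentScore float64) float64 {
	return 0.5*embeddingSimilarity + 0.3*relevance + 0.2*parentScore
}

// sameOrder reports whether two URI lists are identical, in order.
func sameOrder(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// Converged reports whether the most recent `rounds` entries of
// history (one top-k URI list per retrieval round) are all identical.
func Converged(history [][]string, rounds int) bool {
	if rounds < 1 || len(history) < rounds {
		return false
	}
	last := history[len(history)-1]
	for _, prev := range history[len(history)-rounds : len(history)-1] {
		if !sameOrder(prev, last) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Printf("%.2f\n", Score(0.8, 0.5, 0.6)) // 0.40 + 0.15 + 0.12

	history := [][]string{
		{"mem://user/entities/acme-api", "mem://agent/cases/sqlite-migration"},
		{"mem://agent/cases/sqlite-migration", "mem://user/entities/acme-api"},
		{"mem://agent/cases/sqlite-migration", "mem://user/entities/acme-api"},
	}
	fmt.Println(Converged(history, 3)) // the first round differs from the rest
}
```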
The main context payload sent to Claude at session start:
## Working With You
{L1 of mem://user/profile/communication}
## Your Profile
{L1 of mem://user/profile/coding-style}
{L1 of mem://user/profile/preferences}
## Recent Activity
{L1s of recent session nodes, sorted by date, filtered by relevance > 0.3}
## Active Entities
{L1s of frequently-accessed entity nodes}

Total budget: ~4K tokens for context injection. L0s used for selection, L1s used for content. L2s never injected automatically.
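One way to keep the payload inside its budget is to add L1s section by section and skip anything that does not fit. The sketch below does that; the four-characters-per-token estimate is a crude assumption (a real tokenizer would differ), and `Section` and `BuildContext` are names invented here.

```go
package main

import (
	"fmt"
	"strings"
)

// Section is one heading of the payload with its candidate L1s,
// ordered from most to least important.
type Section struct {
	Heading string
	Bodies  []string
}

// estimateTokens is a rough heuristic: about four characters per token.
func estimateTokens(s string) int { return (len(s) + 3) / 4 }

// BuildContext renders sections in order, skipping any body that would
// exceed the token budget. A heading is only written (and only
// charged) once one of its bodies fits.
func BuildContext(sections []Section, budget int) string {
	var b strings.Builder
	used := 0
	for _, s := range sections {
		head := "## " + s.Heading + "\n"
		wroteHead := false
		for _, body := range s.Bodies {
			cost := estimateTokens(body)
			if !wroteHead {
				cost += estimateTokens(head)
			}
			if used+cost > budget {
				continue
			}
			if !wroteHead {
				b.WriteString(head)
				wroteHead = true
			}
			b.WriteString(body + "\n")
			used += cost
		}
	}
	return b.String()
}

func main() {
	sections := []Section{
		{"Working With You", []string{"Prefers terse answers."}},
		{"Your Profile", []string{"Writes Go; avoids CGO."}},
		{"Recent Activity", []string{strings.Repeat("long session overview ", 1000)}},
	}
	fmt.Print(BuildContext(sections, 4000)) // the oversized L1 is dropped
}
```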
Stolen from supermemory. Certain phrases trigger immediate high-priority capture:
| Trigger | Action |
|---|---|
| "remember this", "don't forget" | Capture surrounding context as high-relevance memory |
| "always use X", "never do Y" | Capture as preference node (mergeable) |
| "architecture decision", "we decided" | Capture as event node (immutable) |
| "this pattern", "the trick is" | Capture as pattern node (mergeable) |
| "bug was", "root cause" | Capture as case node (immutable) |
These are processed at the UserPromptSubmit hook — no need to wait for session end. The extracted memory is immediately available in subsequent context injections within the same session.
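The trigger table lends itself to a simple substring scan at UserPromptSubmit. The sketch below is one way to write it, assuming case-insensitive matching; the "memory" label for the first row is a placeholder chosen here, since the table does not assign that row a category.

```go
package main

import (
	"fmt"
	"strings"
)

// signal maps trigger phrases to the kind of node to capture.
type signal struct {
	phrases  []string
	category string
}

// signals follows the trigger table, in table order.
var signals = []signal{
	{[]string{"remember this", "don't forget"}, "memory"}, // high-relevance capture
	{[]string{"always use", "never do"}, "preferences"},   // mergeable
	{[]string{"architecture decision", "we decided"}, "events"},
	{[]string{"this pattern", "the trick is"}, "patterns"},
	{[]string{"bug was", "root cause"}, "cases"},
}

// DetectSignals returns the categories whose trigger phrases appear in
// the prompt, in table order, each at most once.
func DetectSignals(prompt string) []string {
	lower := strings.ToLower(prompt)
	var out []string
	for _, s := range signals {
		for _, p := range s.phrases {
			if strings.Contains(lower, p) {
				out = append(out, s.category)
				break
			}
		}
	}
	return out
}

func main() {
	fmt.Println(DetectSignals("We decided to always use Go for the CLI"))
	fmt.Println(DetectSignals("The root cause was a duplicate migration chain"))
	fmt.Println(DetectSignals("What time is it?"))
}
```

Plain substring matching will produce false positives ("never do" inside an unrelated sentence), so a real implementation would probably pass matches to the LLM for confirmation.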
Continuity needs to call an LLM for: L0/L1 generation, memory extraction, relational profiling, intent analysis, and merge decisions.
Priority order:
- Claude CLI: claude -p subprocess. Uses existing Claude Max subscription. No API key needed. Free with what you're already paying for. This is what claude-mem used successfully.
- Ollama: Local models. Zero cost, works offline. Good enough for L0/L1 generation and potentially extraction.
- Anthropic API direct: ANTHROPIC_API_KEY env var. Fastest and cleanest, but costs extra; API billing is completely separate from Claude Max.
- Gemini / OpenRouter: Alternatives if you have keys.
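If the priority order is read as a fallback chain (an interpretation, since the configuration also lets a single provider be selected), the client layer might look like the sketch below. The `Client` interface, `CompleteFirst`, and the `stub` type are hypothetical; `errors.Join` needs Go 1.20 or later.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Client is the minimal surface each provider would implement.
type Client interface {
	Name() string
	Complete(ctx context.Context, prompt string) (string, error)
}

var errUnavailable = errors.New("provider unavailable")

// stub stands in for a real provider client in this sketch.
type stub struct {
	name string
	out  string
	err  error
}

func (s stub) Name() string { return s.name }

func (s stub) Complete(context.Context, string) (string, error) { return s.out, s.err }

// CompleteFirst tries clients in priority order and returns the first
// successful completion. If all fail, every error is reported.
func CompleteFirst(ctx context.Context, clients []Client, prompt string) (string, error) {
	if len(clients) == 0 {
		return "", errors.New("no LLM clients configured")
	}
	var errs []error
	for _, c := range clients {
		out, err := c.Complete(ctx, prompt)
		if err == nil {
			return out, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", c.Name(), err))
	}
	return "", errors.Join(errs...)
}

func main() {
	clients := []Client{
		stub{name: "claude-cli", err: errUnavailable}, // e.g. CLI not installed
		stub{name: "ollama", out: "one-line abstract"},
	}
	out, err := CompleteFirst(context.Background(), clients, "Summarize this session")
	fmt.Println(out, err)
}
```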
Configuration in ~/.continuity/config.toml:
[llm]
provider = "claude-cli" # claude-cli | ollama | anthropic | gemini | openrouter
model = "claude-haiku-4-5-20251001"
[llm.extraction]
model = "claude-haiku-4-5-20251001" # cheap/fast model for bulk extraction
[llm.merge]
model = "claude-sonnet-4-6" # smarter model for merge decisions
[llm.ollama]
endpoint = "http://localhost:11434"
model = "llama3.2"
[llm.anthropic]
api_key_env = "ANTHROPIC_API_KEY" # only needed if provider = "anthropic"

Claude Code hooks communicate via stdin (JSON) and exit codes. Continuity handles this natively:
continuity hook start # stdin: session info → inject context → stdout: JSON
continuity hook submit # stdin: user message → signal keyword scan → stdout: JSON
continuity hook tool # stdin: tool result → buffer observation
continuity hook stop # stdin: session info → async: extract + relational + decay
continuity hook end # stdin: session info → finalize session archive
Exit codes (unchanged from current):
- 0: Success
- 1: Non-blocking error (stderr shown to user)
- 2: Blocking error (stderr fed to Claude)
No bun-runner. No node subprocess. No finding executables. The binary IS the executable.
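A hook entry point then reduces to decoding stdin, dispatching on the event name, and mapping the result to an exit code. The sketch below shows that shape; the `session_id` field, the response JSON, and the `ErrBlocking` sentinel are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// HookInput lists the stdin fields this sketch reads. The field name
// is an assumption about the hook payload.
type HookInput struct {
	SessionID string `json:"session_id"`
}

// ErrBlocking marks errors whose stderr should be fed back to Claude.
var ErrBlocking = errors.New("blocking hook error")

// Dispatch routes one hook event. The response shape is hypothetical.
func Dispatch(event string, raw []byte) ([]byte, error) {
	var in HookInput
	if err := json.Unmarshal(raw, &in); err != nil {
		return nil, fmt.Errorf("decode stdin: %w", err)
	}
	switch event {
	case "start", "submit", "tool", "stop", "end":
		return json.Marshal(map[string]string{"event": event, "session_id": in.SessionID})
	default:
		return nil, fmt.Errorf("unknown hook event %q", event)
	}
}

// ExitCode maps an error to the exit codes listed above:
// 0 success, 1 non-blocking error, 2 blocking error.
func ExitCode(err error) int {
	switch {
	case err == nil:
		return 0
	case errors.Is(err, ErrBlocking):
		return 2
	default:
		return 1
	}
}

func main() {
	// A real hook would read os.Stdin; a literal keeps the sketch runnable.
	out, err := Dispatch("start", []byte(`{"session_id":"abc123"}`))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(ExitCode(err))
	}
	fmt.Println(string(out))
}
```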
Same port (37777), same endpoints, for viewer UI and search skill compatibility:
GET /api/health
GET /api/context/:sessionId
GET /api/search?q=<query>&mode=find|search
GET /api/profile
GET /api/tree?uri=<path>
GET /api/sessions
GET /api/sessions/:id
POST /api/memories # Manual memory creation
GET / # Embedded viewer UI (SPA)
The viewer UI HTML/JS/CSS is embedded in the binary via go:embed. No separate build step.
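For a sense of scale, the health endpoint needs nothing beyond the standard library. The sketch below registers it on a `ServeMux` and exercises it in-process; the response body is a hypothetical shape, and the `get` helper exists only to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// NewMux registers the health endpoint. The JSON body is an assumed
// shape, not the one the viewer UI expects.
func NewMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/health", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			w.Header().Set("Allow", http.MethodGet)
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
	})
	return mux
}

// get performs an in-process GET against the mux and returns the
// status code and body.
func get(path string) (int, string) {
	rec := httptest.NewRecorder()
	NewMux().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	// `continuity serve` would instead call something like
	// http.ListenAndServe(":37777", NewMux()).
	code, body := get("/api/health")
	fmt.Print(code, " ", body)
}
```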
Single file: ~/.continuity/continuity.db
SQLite via modernc.org/sqlite — pure Go, no CGO, cross-compiles to every platform without a C toolchain.
Vector index: HNSW implementation in pure Go, stored as a separate table in the same SQLite database. Alternatives to evaluate:
- github.com/viterin/vek: SIMD-accelerated vector operations
- Custom HNSW with SQLite-backed adjacency lists
- Or just brute-force cosine similarity if the corpus stays under 10K nodes (it will for a long time)
Embeddings: generated locally via Ollama, or via API (Anthropic/OpenAI/Voyage). Stored in a mem_vectors table alongside the node ID.
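Storing embeddings in SQLite means serializing them. A minimal sketch, assuming the mem_vectors table keeps each embedding in a BLOB column as little-endian float32 values (the column layout is not specified above, so this is a guess):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// EncodeVector packs a float32 slice into little-endian bytes,
// four bytes per component.
func EncodeVector(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(f))
	}
	return buf
}

// DecodeVector reverses EncodeVector. It rejects input whose length
// is not a multiple of four bytes.
func DecodeVector(b []byte) ([]float32, error) {
	if len(b)%4 != 0 {
		return nil, fmt.Errorf("vector blob has %d bytes, want a multiple of 4", len(b))
	}
	v := make([]float32, len(b)/4)
	for i := range v {
		v[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[4*i:]))
	}
	return v, nil
}

func main() {
	blob := EncodeVector([]float32{0.5, -1.25, 3})
	fmt.Println(len(blob)) // 12
	v, err := DecodeVector(blob)
	fmt.Println(v, err)
}
```

At four bytes per dimension, a 1,024-dimension embedding takes 4 KB, so 10K nodes come to about 40 MB, which keeps a brute-force scan practical.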
continuity import --from=claude-mem
Reads the existing ~/.claude-mem/claude-mem.db (JS schema), extracts:
- Session summaries → mem://sessions/ nodes
- Observations → fed through extraction pipeline to create categorized memories
- Relational profile → mem://user/profile/communication
- Settings → ~/.continuity/config.toml
The import is non-destructive — it reads the old DB and writes to the new schema. Old DB is preserved.
# Build
go build -o continuity ./cmd/continuity
# Cross-compile
GOOS=darwin GOARCH=arm64 go build -o continuity-darwin-arm64
GOOS=linux GOARCH=amd64 go build -o continuity-linux-amd64
GOOS=windows GOARCH=amd64 go build -o continuity-windows-amd64.exe
# Install
brew install continuity # macOS
curl -fsSL https://... | sh # Linux
scoop install continuity # Windows

GitHub Actions builds all platforms on tag push. Single binary, ~15-25MB with embedded UI.
continuity/
├── cmd/
│ └── continuity/
│ └── main.go # CLI entry point (cobra)
├── internal/
│ ├── engine/
│ │ ├── engine.go # Memory engine orchestrator
│ │ ├── extractor.go # Taxonomy-based memory extraction
│ │ ├── relator.go # Relational profile extraction
│ │ ├── decay.go # Smart decay implementation
│ │ └── retriever.go # Hybrid find/search retrieval
│ ├── hooks/
│ │ ├── handler.go # Hook dispatcher
│ │ ├── start.go # SessionStart: context injection
│ │ ├── submit.go # UserPromptSubmit: signal keywords
│ │ ├── tool.go # PostToolUse: buffer observations
│ │ ├── stop.go # Stop: trigger extraction
│ │ └── end.go # SessionEnd: finalize
│ ├── llm/
│ │ ├── client.go # LLM client interface
│ │ ├── anthropic.go # Anthropic API direct
│ │ ├── claude_cli.go # Claude CLI subprocess
│ │ ├── ollama.go # Ollama local
│ │ └── prompts.go # All prompt templates
│ ├── server/
│ │ ├── server.go # HTTP API (net/http or chi)
│ │ ├── routes.go # Route handlers
│ │ └── middleware.go # CORS, logging
│ ├── store/
│ │ ├── db.go # SQLite connection + migrations
│ │ ├── nodes.go # mem_nodes CRUD
│ │ ├── vectors.go # Vector index operations
│ │ └── sessions.go # Session tracking
│ ├── tree/
│ │ ├── tree.go # Virtual filesystem operations
│ │ ├── uri.go # mem:// URI parsing
│ │ └── walk.go # Tree traversal
│ └── transcript/
│ ├── parser.go # JSONL transcript parser
│ └── condenser.go # Transcript condensation
├── ui/
│ └── viewer/ # Embedded SPA (go:embed)
├── plugin/
│ └── hooks.json # Claude Code hook definitions
├── go.mod
├── go.sum
└── Makefile
- go mod init github.com/chuck/continuity, cobra CLI, SQLite connection, health endpoint
- Prove: binary compiles, starts HTTP server, creates DB
- Transcript parser (port from TS)
- Hook handlers (start/submit/tool/stop/end)
- Basic observation buffering (no extraction yet, just capture)
- Context injection (port existing format)
- mem_nodes schema + CRUD
- LLM client (Anthropic API first, Claude CLI second)
- Extraction pipeline: transcript → candidates → dedup → create/merge
- L0/L1/L2 generation
- Relational profiler (port from TS)
- Vector embeddings (brute-force cosine first, HNSW later if needed)
- find(): direct vector search
- search(): LLM-assisted intent decomposition
- Smart decay
- Embed existing viewer UI (or minimal replacement)
- continuity import migration tool
- Signal keyword detection
- Tree browsing CLI
- GitHub Actions cross-compilation
- Homebrew formula
- Install script
- Update the Claude Code plugin hooks.json to point at continuity
- Cloud sync / multi-device (local-first, period)
- Multi-user / team features (single user, single machine)
- Plugin marketplace packaging (Continuity is a standalone binary that Claude Code hooks call)
- MCP server (hooks are simpler and sufficient; can add later if needed)
- Web scraping / URL ingestion (memories come from sessions, not the internet)
Continuity is done when:
- continuity serve replaces the entire bun worker service
- Hook scripts are one-liners calling continuity hook <event>
- Context injection at SessionStart works identically to current
- Memory extraction produces categorized L0/L1/L2 nodes from real sessions
- Relational profiling compounds across sessions
- continuity search returns relevant results from CLI
- A fresh machine goes from zero to working in: brew install continuity
- No node. No bun. No python. No devbox. No npm. Just the binary.