Commit 1ff79d8

rahlk and claude committed
docs: document Neo4j property-graph output (scale, Kubernetes, enterprise)
Reframe the docs around the new `--emit neo4j` target: project the analysis into a labeled Neo4j property graph that scales beyond a single JSON file. Introduce the producer/consumer split — `canpy` runs out-of-band as a CI/Kubernetes job pushing app-scoped subgraphs to Neo4j over Bolt, while the CLDK Python SDK and agents read the graph read-only via Neo4jConnectionConfig.

- New Neo4j guide + graph-schema reference pages; sidebar entries
- Neo4j property-graph hero visual (src/components/Neo4jPropertyGraph.astro)
- Document --emit json|neo4j|schema, --app-name, --neo4j-* flags and NEO4J_* env
- graph.cypher snapshot vs incremental Bolt push; per-app scoping; schema_version 1.1.0
- CLDK read-back examples; extended architecture/concept mermaid diagrams
- Fix stale CLI facts and cross-links; verified with `astro build`

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent c8a6e27 commit 1ff79d8

15 files changed

Lines changed: 1494 additions & 133 deletions

astro.config.mjs

Lines changed: 2 additions & 1 deletion
@@ -21,7 +21,7 @@ export default defineConfig({
 title: "codeanalyzer-python",
 tagline: "Static analysis for Python your agents can call.",
 description:
-"codeanalyzer-python turns a Python project into one typed artifact — symbol table, call graph, and framework entrypoints — using Jedi, CodeQL, and Tree-sitter. The Python backend behind CLDK.",
+"codeanalyzer-python turns a Python project into a typed symbol table and call graph — emitted as one analysis JSON artifact or a queryable Neo4j property graph — using Jedi, CodeQL, and Tree-sitter. The Python backend behind CLDK.",
 logo: {
 src: "./src/assets/logo.png",
 replacesTitle: true,
@@ -97,6 +97,7 @@ export default defineConfig({
 { label: "CLI usage", slug: "guides/cli-usage" },
 { label: "Core concepts", slug: "guides/concepts" },
 { label: "CodeQL analysis", slug: "guides/codeql" },
+{ label: "Neo4j graph", slug: "guides/neo4j" },
 ],
 },
 {

src/components/Neo4jPropertyGraph.astro

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
+---
+// AUTO-GENERATED property-graph hero for python. Encodes the real Neo4j schema
+// (node labels, typed relationships, key properties). Theme-aware via Starlight
+// CSS custom properties; renders as static SVG (no client JS).
+---
+<figure class="pgraph-figure">
+<svg class="pgraph" viewBox="0 0 1080 600" role="img" aria-label="codeanalyzer-python · Neo4j property graph" xmlns="http://www.w3.org/2000/svg">
+<defs>
+<marker id="arrow" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+<path d="M 0 0 L 10 5 L 0 10 z" fill="var(--pg-edge)"/>
+</marker>
+<marker id="arrow-accent" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="8" markerHeight="8" orient="auto-start-reverse">
+<path d="M 0 0 L 10 5 L 0 10 z" fill="#DA7194"/>
+</marker>
+<radialGradient id="gloss" cx="35%" cy="30%" r="75%">
+<stop offset="0%" stop-color="#ffffff" stop-opacity="0.35"/>
+<stop offset="55%" stop-color="#ffffff" stop-opacity="0.0"/>
+</radialGradient>
+<filter id="nshadow" x="-30%" y="-30%" width="160%" height="160%">
+<feDropShadow dx="0" dy="3" stdDeviation="4" flood-color="#0b1020" flood-opacity="0.28"/>
+</filter>
+</defs>
+<path d="M 155.8 271.1 L 292.7 160.2" class="rel" fill="none" marker-end="url(#arrow)"/>
+<g transform="translate(224.2,215.6)"><rect x="-49.5" y="-11" width="99.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">PY_HAS_MODULE</text></g>
+<path d="M 330.0 176.0 L 330.0 422.0" class="rel" fill="none" marker-end="url(#arrow)"/>
+<g transform="translate(330.0,333.4)"><rect x="-42.5" y="-11" width="85.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">PY_DECLARES</text></g>
+<path d="M 284.0 470.0 L 168.0 470.0" class="rel" fill="none" marker-end="url(#arrow)"/>
+<g transform="translate(226.0,470.0)"><rect x="-60.0" y="-11" width="120.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">PY_HAS_ATTRIBUTE</text></g>
+<path d="M 367.0 442.7 L 521.4 328.5" class="rel" fill="none" marker-end="url(#arrow)"/>
+<g transform="translate(444.2,385.6)"><rect x="-49.5" y="-11" width="99.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">PY_HAS_METHOD</text></g>
+<path d="M 560.0 346.0 L 560.0 422.0" class="rel" fill="none" marker-end="url(#arrow)"/>
+<g transform="translate(560.0,394.6)"><rect x="-56.5" y="-11" width="113.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">PY_DECORATED_BY</text></g>
+<path d="M 597.0 272.7 L 751.4 158.5" class="rel" fill="none" marker-end="url(#arrow)"/>
+<g transform="translate(674.2,215.6)"><rect x="-56.5" y="-11" width="113.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">PY_HAS_CALLSITE</text></g>
+<path d="M 823.9 161.1 L 939.7 267.5" class="rel" fill="none" marker-end="url(#arrow)"/>
+<g transform="translate(881.8,214.3)"><rect x="-53.0" y="-11" width="106.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">PY_RESOLVES_TO</text></g>
+<path d="M 606.0 300.0 Q 734.4 170.0 927.0 300.0" class="rel-accent" fill="none" marker-end="url(#arrow-accent)"/>
+<g transform="translate(725.2,236.7)"><rect x="-32.0" y="-11" width="64.0" height="22" rx="11" class="pill-accent"/><text x="0" y="4" class="rtype" text-anchor="middle">PY_CALLS</text></g>
+<circle cx="120" cy="300" r="46" fill="#C990C0" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
+<circle cx="120" cy="300" r="46" fill="url(#gloss)"/>
+<text x="120" y="304" class="nlabel" text-anchor="middle">:PyApplication</text>
+<text x="120" y="362" class="prop" text-anchor="middle">name</text>
+<text x="120" y="378" class="prop" text-anchor="middle">schema_version</text>
+<circle cx="330" cy="130" r="46" fill="#4C8EDA" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
+<circle cx="330" cy="130" r="46" fill="url(#gloss)"/>
+<text x="330" y="134" class="nlabel" text-anchor="middle">:PyModule</text>
+<text x="330" y="192" class="prop" text-anchor="middle">module_name</text>
+<text x="330" y="208" class="prop" text-anchor="middle">content_hash</text>
+<circle cx="330" cy="470" r="46" fill="#F79767" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
+<circle cx="330" cy="470" r="46" fill="url(#gloss)"/>
+<text x="330" y="466" class="nlabel" text-anchor="middle">:PyClass</text>
+<text x="330" y="483" class="nsub" text-anchor="middle">:PySymbol</text>
+<text x="330" y="532" class="prop" text-anchor="middle">name</text>
+<text x="330" y="548" class="prop" text-anchor="middle">base_classes</text>
+<circle cx="120" cy="470" r="46" fill="#57C7E3" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
+<circle cx="120" cy="470" r="46" fill="url(#gloss)"/>
+<text x="120" y="474" class="nlabel" text-anchor="middle">:PyAttribute</text>
+<text x="120" y="532" class="prop" text-anchor="middle">name</text>
+<text x="120" y="548" class="prop" text-anchor="middle">type</text>
+<circle cx="560" cy="300" r="46" fill="#8DCC93" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
+<circle cx="560" cy="300" r="46" fill="url(#gloss)"/>
+<text x="560" y="296" class="nlabel" text-anchor="middle">:PyCallable</text>
+<text x="560" y="313" class="nsub" text-anchor="middle">:PySymbol</text>
+<text x="560" y="362" class="prop" text-anchor="middle">signature</text>
+<text x="560" y="378" class="prop" text-anchor="middle">cyclomatic_complexity</text>
+<circle cx="560" cy="470" r="46" fill="#ECB5C9" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
+<circle cx="560" cy="470" r="46" fill="url(#gloss)"/>
+<text x="560" y="474" class="nlabel" text-anchor="middle">:PyDecorator</text>
+<text x="560" y="532" class="prop" text-anchor="middle">name</text>
+<circle cx="790" cy="130" r="46" fill="#FFC454" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
+<circle cx="790" cy="130" r="46" fill="url(#gloss)"/>
+<text x="790" y="134" class="nlabel" text-anchor="middle">:PyCallSite</text>
+<text x="790" y="192" class="prop" text-anchor="middle">method_name</text>
+<text x="790" y="208" class="prop" text-anchor="middle">receiver_type</text>
+<circle cx="975" cy="300" r="46" fill="#569480" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
+<circle cx="975" cy="300" r="46" fill="url(#gloss)"/>
+<text x="975" y="304" class="nlabel" text-anchor="middle">:PyExternal</text>
+<text x="975" y="362" class="prop" text-anchor="middle">name</text>
+<text x="975" y="378" class="prop" text-anchor="middle">module</text>
+</svg>
+<figcaption>
+The analysis is a <strong>Neo4j property graph</strong>: every node carries a
+<em>label</em> (its color) and <em>properties</em>; every relationship carries a
+<em>type</em>. The dashed ring marks an <code>entrypoint</code>;
+the <span class="calls">PY_CALLS</span> edge is the resolved call graph.
+</figcaption>
+</figure>
+
+<style>
+.pgraph-figure {
+margin: 1.5rem 0 2rem;
+padding: 0.5rem 0.25rem 0.25rem;
+border: 1px solid var(--sl-color-gray-5);
+border-radius: 14px;
+background:
+radial-gradient(120% 80% at 15% 0%, color-mix(in srgb, var(--sl-color-accent-low) 55%, transparent), transparent 60%),
+var(--sl-color-black);
+overflow: hidden;
+}
+.pgraph { width: 100%; height: auto; display: block; }
+/* edges + labels adapt to theme */
+.pgraph { --pg-edge: var(--sl-color-gray-3); }
+.pgraph .rel { stroke: var(--pg-edge); stroke-width: 2.2; }
+.pgraph .rel-accent { stroke: #DA7194; stroke-width: 3.2; }
+.pgraph .nlabel { fill: #fff; font: 700 14px/1.1 ui-sans-serif, system-ui, sans-serif; letter-spacing: .2px; }
+.pgraph .nsub { fill: rgba(255,255,255,.92); font: 600 12px ui-sans-serif, system-ui, sans-serif; }
+.pgraph .prop { fill: var(--sl-color-gray-2); font: 500 12px ui-monospace, "SFMono-Regular", monospace; }
+.pgraph .rtype { fill: #fff; font: 700 11px ui-monospace, monospace; letter-spacing: .3px; }
+.pgraph .pill { fill: #3a4252; stroke: rgba(255,255,255,.18); }
+.pgraph .pill-accent { fill: #DA7194; }
+.pgraph .badge { fill: #ffd98a; font: 700 11px ui-monospace, monospace; }
+.pgraph-figure figcaption {
+color: var(--sl-color-gray-2);
+font-size: 0.85rem; line-height: 1.5;
+padding: 0.4rem 1rem 0.75rem; text-align: center;
+}
+.pgraph-figure .calls { color: #DA7194; font-weight: 700; font-family: ui-monospace, monospace; }
+</style>

src/content/docs/extending/analysis-passes.mdx

Lines changed: 14 additions & 2 deletions
@@ -45,6 +45,8 @@ Passes declare capability tokens — free-form strings — in `provides` and `re
 
 Passes hand derived facts to each other through `ctx.shared`, keyed by capability token: the provider writes `ctx.shared["odoo.model_identity"] = ...`; the consumer reads it back. `ctx.shared` is the one mutable part of the otherwise-frozen context.
 
+The pipeline produces a single enriched `PyApplication`. That in-memory model is what `canpy` then emits — as `analysis.json` by default, or projected into a Neo4j property graph with `--emit neo4j`. Either way it is the *same* model, so your pass's entrypoints and synthetic edges flow through both targets unchanged:
+
 ```mermaid
 flowchart LR
 A["pass A
@@ -53,6 +55,12 @@ requires: model_identity"]
 A --> M[merge into app]
 B --> M
 M --> N[next pass sees both]
+N --> IR["enriched PyApplication (IR)"]
+IR -->|"--emit json"| J["analysis.json"]
+IR -->|"--emit neo4j"| G["labeled property graph
+(:PyApplication / PY_CALLS)"]
+G -.->|"no --neo4j-uri"| SNAP["graph.cypher snapshot"]
+G -.->|"--neo4j-uri (Bolt)"| LIVE["live incremental push"]
 ```
 
 ## The AnalysisContext
@@ -144,6 +152,10 @@ class OdooDispatchPass(AnalysisPass):
 
 Core never interprets your `provenance` tokens, `detection_source` values, or `tags` keys — they round-trip through `analysis.json` untouched, so downstream consumers (or an LLM) can reason over them.
 
+<Aside type="note" title="Provenance survives the Neo4j projection">
+When you emit to Neo4j with `--emit neo4j`, each synthetic `PyCallEdge` becomes a `PY_CALLS` relationship, and your `provenance` token is preserved on its `provenance` property (alongside the integer `weight`) — so an extension's edges stay attributable whether a consumer reads `analysis.json` or queries the property graph. See [Output schema](/codeanalyzer-python/reference/schema/) for the projected node and relationship types.
+</Aside>
+
 ## Registering your pass
 
 Declare the entry point in your package's `pyproject.toml`. Each entry point must resolve to an `AnalysisPass` subclass (or an instance).
@@ -158,9 +170,9 @@ odoo-dispatch = "my_package.passes:OdooDispatchPass"
 
 1. **Install your package** into the same environment as codeanalyzer (`pip install -e .`).
 
-2. **Run codeanalyzer normally.** `discover_passes()` finds your entry point, instantiates it, and the registry orders it among the built-ins.
+2. **Run `canpy` normally.** `discover_passes()` finds your entry point, instantiates it, and the registry orders it among the built-ins.
 
-3. **Check the artifact.** Your entrypoints appear under `app.entrypoints[framework]`; your edges appear in `app.call_graph` with your provenance token.
+3. **Check the artifact.** Your entrypoints appear under `app.entrypoints[framework]`; your edges appear in `app.call_graph` with your provenance token — and likewise on the `PY_CALLS` edges if you emit to Neo4j.
 
 </Steps>
 
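
Reviewer note: the `ctx.shared` handoff described in the hunks above can be sketched in plain Python. This is only an illustration under stated assumptions: `SketchContext`, `ModelIdentityPass`, `DispatchPass`, and `order_passes` are hypothetical stand-ins, not the real `AnalysisPass` or registry API, and the edge dictionary keys are invented for the example. Only the `provides`/`requires` token idea and the `ctx.shared["odoo.model_identity"]` key come from the docs.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchContext:
    """Hypothetical stand-in for the analysis context; only `shared` is modeled."""
    shared: dict[str, Any] = field(default_factory=dict)


class ModelIdentityPass:
    """Provider: writes a derived fact under its capability token."""
    provides = ("odoo.model_identity",)
    requires = ()

    def run(self, ctx):
        ctx.shared["odoo.model_identity"] = {"res.partner": "addons.base.Partner"}
        return []


class DispatchPass:
    """Consumer: reads the fact back and emits synthetic edges (invented shape)."""
    provides = ()
    requires = ("odoo.model_identity",)

    def run(self, ctx):
        identity = ctx.shared["odoo.model_identity"]
        return [
            {"source": "controllers.create", "target": target, "provenance": ["odoo-dispatch"]}
            for target in identity.values()
        ]


def order_passes(passes):
    """Order passes so every required token has been provided by an earlier pass."""
    ordered, available, pending = [], set(), list(passes)
    while pending:
        ready = [p for p in pending if set(p.requires) <= available]
        if not ready:
            raise ValueError("unsatisfied capability requirement")
        for p in ready:
            ordered.append(p)
            available.update(p.provides)
            pending.remove(p)
    return ordered
```

Passing the passes in the "wrong" order, for example `order_passes([DispatchPass(), ModelIdentityPass()])`, still schedules the provider first, which is the property the capability tokens exist to guarantee. How the real registry resolves ordering is not shown in this commit.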

src/content/docs/extending/overview.mdx

Lines changed: 5 additions & 1 deletion
@@ -10,7 +10,7 @@ import { Badge, CardGrid, LinkCard, Aside } from "@astrojs/starlight/components"
 <Badge text="Help wanted" variant="note" />
 </p>
 
-After the symbol table and base call graph are built, codeanalyzer runs a pipeline of **analysis passes** — whole-application steps that contribute framework-dispatched **entrypoints** and **synthetic call edges** the static graph can't observe. Out-of-tree packages register their own through the `codeanalyzer.analysis_passes` entry-point group, so you can teach codeanalyzer a new framework or a new dispatch mechanism **without forking it**.
+After the symbol table and base call graph are built, `canpy` runs a pipeline of **analysis passes** — whole-application steps that contribute framework-dispatched **entrypoints** and **synthetic call edges** the static graph can't observe. Out-of-tree packages register their own through the `codeanalyzer.analysis_passes` entry-point group, so you can teach `canpy` a new framework or a new dispatch mechanism **without forking it**.
 
 This is the youngest part of codeanalyzer-python: the **mechanism** exists, but **no concrete framework finder ships yet**. `BUILTIN_PASS_FACTORIES` is empty, so until a finder is written, `PyApplication.entrypoints` comes back empty. The frameworks named on this page (Flask, FastAPI, Celery, Click, gRPC, …) are the **roadmap** — the shapes finders will target — not detection that runs today. Writing the first finder for a framework is itself the contribution.
 
@@ -37,6 +37,10 @@ This is the youngest part of codeanalyzer-python: the **mechanism** exists, but
 **Not built yet:** any concrete finder. No framework — Flask, FastAPI, Celery, Click, gRPC, Django, or otherwise — is detected out of the box. Synthetic-edge passes are likewise scaffolding-only. Everything framework-specific is roadmap.
 </Aside>
 
+<Aside type="note" title="Synthetic edges carry into the graph">
+A synthetic call edge contributed by a pass is tagged with a **provenance** token (the pass's framework/extension name). That token survives projection: when you run `canpy --emit neo4j`, it lands in the `provenance` array on the `PY_CALLS` relationship, exactly as it appears in `analysis.json`. So a consumer can filter the call graph by who contributed each edge — `jedi`, `codeql`, or your extension — whether it reads `analysis.json` or queries the Neo4j graph.
+</Aside>
+
 <Aside type="tip" title="Help wanted">
 The first concrete finder is the contribution. Good starting points: an `AbstractEntrypointFinder` for a framework you use, a synthetic-edge pass for an ORM or dispatch pattern the static graph misses, or test fixtures to pin a finder's behavior. Open an issue or PR on [GitHub](https://github.com/codellm-devkit/codeanalyzer-python), or come talk it through on [Discord](https://discord.gg/zEjz9YrmqN).
 </Aside>
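
Reviewer note: the "filter the call graph by who contributed each edge" claim in the new aside might look like the following on the consumer side. This is a hedged sketch, not the CLDK SDK: the edge dictionaries and their `source`/`target` keys are invented for illustration, and the Cypher text assumes `PY_CALLS` starts at a `:PyCallable` node with a `signature` property and carries `provenance` (a list) and `weight`, as the hero graph and asides in this commit suggest. The function only builds a query string and parameters; running it against a live database is left to whichever Neo4j client is in use.

```python
def edges_by_provenance(edges, token):
    """Keep the in-memory call edges whose provenance list contains `token`."""
    return [edge for edge in edges if token in edge.get("provenance", [])]


def calls_by_provenance_query(token):
    """Build a parameterized Cypher query for the same filter on the graph.

    Label and property names are assumptions taken from this commit's docs.
    """
    query = (
        "MATCH (caller:PyCallable)-[r:PY_CALLS]->(callee) "
        "WHERE $token IN r.provenance "
        "RETURN caller.signature AS caller, r.weight AS weight"
    )
    return query, {"token": token}


# Invented sample edges, shaped only for this example.
sample_edges = [
    {"source": "app.main", "target": "app.load", "provenance": ["jedi"]},
    {"source": "app.main", "target": "orm.create", "provenance": ["jedi", "odoo-dispatch"]},
    {"source": "app.view", "target": "orm.write", "provenance": ["codeql"]},
]

extension_edges = edges_by_provenance(sample_edges, "odoo-dispatch")
query, params = calls_by_provenance_query("odoo-dispatch")
```

Passing the token as a query parameter rather than formatting it into the Cypher string keeps the same query reusable for `jedi`, `codeql`, or any extension name.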
