Commit b751482

rahlk and claude committed
docs: document Neo4j property-graph output (scale, Kubernetes, enterprise)
Reframe the docs around the new `--emit neo4j` target: project the analysis into a labeled Neo4j property graph that scales beyond a single JSON file. Introduce the producer/consumer split: `codeanalyzer` runs out-of-band as a CI/Kubernetes job pushing app-scoped subgraphs to Neo4j over Bolt, while the CLDK Python SDK and agents read the graph read-only via Neo4jConnectionConfig.

- New Neo4j guide + graph-schema reference pages; sidebar entries
- Neo4j property-graph hero visual (src/components/Neo4jPropertyGraph.astro)
- Document --emit json|neo4j|schema, --app-name, --neo4j-* flags and NEO4J_* env
- graph.cypher snapshot vs incremental Bolt push; per-app scoping; schema_version 1.0.0
- CLDK read-back examples; extended architecture/concept mermaid diagrams
- Fix stale CLI facts and cross-links; verified with `astro build`

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent ee9e3f3 commit b751482

20 files changed

Lines changed: 1743 additions & 96 deletions

astro.config.mjs

Lines changed: 4 additions & 2 deletions
@@ -19,9 +19,9 @@ export default defineConfig({
1919
}),
2020
starlight({
2121
title: "codeanalyzer-java",
22-
tagline: "WALA + Javaparser static analysis for enterprise Java, as one JSON schema.",
22+
tagline: "WALA + Javaparser static analysis for enterprise Java, as one JSON artifact or a queryable Neo4j graph.",
2323
description:
24-
"codeanalyzer-java is the JVM static-analysis backend behind CodeLLM-DevKit's Java support: a standalone JAR that turns a Java project into a symbol table and call graph, emitted as one versioned JSON schema.",
24+
"codeanalyzer-java is the JVM static-analysis backend behind CodeLLM-DevKit's Java support: a standalone JAR that turns a Java project into a symbol table and call graph, emitted as one versioned analysis JSON artifact or projected into a queryable Neo4j property graph.",
2525
logo: {
2626
src: "./src/assets/logo.png",
2727
replacesTitle: true,
@@ -97,6 +97,7 @@ export default defineConfig({
9797
{ label: "Analysis levels", slug: "guides/analysis-levels" },
9898
{ label: "Build integration", slug: "guides/build-integration" },
9999
{ label: "Incremental analysis", slug: "guides/incremental-analysis" },
100+
{ label: "Neo4j output", slug: "guides/neo4j-output" },
100101
],
101102
},
102103
{
@@ -112,6 +113,7 @@ export default defineConfig({
112113
{ label: "Overview", slug: "schema" },
113114
{ label: "Symbol table", slug: "schema/symbol-table" },
114115
{ label: "Call graph", slug: "schema/call-graph" },
116+
{ label: "Neo4j graph", slug: "schema/neo4j-graph" },
115117
],
116118
},
117119
{
src/components/Neo4jPropertyGraph.astro

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
1+
---
2+
// AUTO-GENERATED property-graph hero for java. Encodes the real Neo4j schema
3+
// (node labels, typed relationships, key properties). Theme-aware via Starlight
4+
// CSS custom properties; renders as static SVG (no client JS).
5+
---
6+
<figure class="pgraph-figure">
7+
<svg class="pgraph" viewBox="0 0 1080 600" role="img" aria-label="codeanalyzer-java · Neo4j property graph" xmlns="http://www.w3.org/2000/svg">
8+
<defs>
9+
<marker id="arrow" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
10+
<path d="M 0 0 L 10 5 L 0 10 z" fill="var(--pg-edge)"/>
11+
</marker>
12+
<marker id="arrow-accent" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="8" markerHeight="8" orient="auto-start-reverse">
13+
<path d="M 0 0 L 10 5 L 0 10 z" fill="#DA7194"/>
14+
</marker>
15+
<radialGradient id="gloss" cx="35%" cy="30%" r="75%">
16+
<stop offset="0%" stop-color="#ffffff" stop-opacity="0.35"/>
17+
<stop offset="55%" stop-color="#ffffff" stop-opacity="0.0"/>
18+
</radialGradient>
19+
<filter id="nshadow" x="-30%" y="-30%" width="160%" height="160%">
20+
<feDropShadow dx="0" dy="3" stdDeviation="4" flood-color="#0b1020" flood-opacity="0.28"/>
21+
</filter>
22+
</defs>
23+
<path d="M 155.8 271.1 L 292.7 160.2" class="rel" fill="none" marker-end="url(#arrow)"/>
24+
<g transform="translate(224.2,215.6)"><rect x="-39.0" y="-11" width="78.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">J_HAS_UNIT</text></g>
25+
<path d="M 330.0 176.0 L 330.0 422.0" class="rel" fill="none" marker-end="url(#arrow)"/>
26+
<g transform="translate(330.0,333.4)"><rect x="-56.5" y="-11" width="113.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">J_DECLARES_TYPE</text></g>
27+
<path d="M 284.0 470.0 L 168.0 470.0" class="rel" fill="none" marker-end="url(#arrow)"/>
28+
<g transform="translate(226.0,470.0)"><rect x="-42.5" y="-11" width="85.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">J_HAS_FIELD</text></g>
29+
<path d="M 367.0 442.7 L 521.4 328.5" class="rel" fill="none" marker-end="url(#arrow)"/>
30+
<g transform="translate(444.2,385.6)"><rect x="-53.0" y="-11" width="106.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">J_HAS_CALLABLE</text></g>
31+
<path d="M 560.0 346.0 L 560.0 422.0" class="rel" fill="none" marker-end="url(#arrow)"/>
32+
<g transform="translate(560.0,394.6)"><rect x="-56.5" y="-11" width="113.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">J_HAS_PARAMETER</text></g>
33+
<path d="M 597.0 272.7 L 751.4 158.5" class="rel" fill="none" marker-end="url(#arrow)"/>
34+
<g transform="translate(674.2,215.6)"><rect x="-53.0" y="-11" width="106.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">J_HAS_CALLSITE</text></g>
35+
<path d="M 823.9 161.1 L 939.7 267.5" class="rel" fill="none" marker-end="url(#arrow)"/>
36+
<g transform="translate(881.8,214.3)"><rect x="-49.5" y="-11" width="99.0" height="22" rx="11" class="pill"/><text x="0" y="4" class="rtype" text-anchor="middle">J_RESOLVES_TO</text></g>
37+
<path d="M 606.0 300.0 Q 734.4 170.0 927.0 300.0" class="rel-accent" fill="none" marker-end="url(#arrow-accent)"/>
38+
<g transform="translate(725.2,236.7)"><rect x="-28.5" y="-11" width="57.0" height="22" rx="11" class="pill-accent"/><text x="0" y="4" class="rtype" text-anchor="middle">J_CALLS</text></g>
39+
<circle cx="120" cy="300" r="46" fill="#C990C0" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
40+
<circle cx="120" cy="300" r="46" fill="url(#gloss)"/>
41+
<text x="120" y="304" class="nlabel" text-anchor="middle">:JApplication</text>
42+
<text x="120" y="362" class="prop" text-anchor="middle">name</text>
43+
<text x="120" y="378" class="prop" text-anchor="middle">schema_version</text>
44+
<circle cx="330" cy="130" r="46" fill="#4C8EDA" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
45+
<circle cx="330" cy="130" r="46" fill="url(#gloss)"/>
46+
<text x="330" y="134" class="nlabel" text-anchor="middle">:JCompilationUnit</text>
47+
<text x="330" y="192" class="prop" text-anchor="middle">file_path</text>
48+
<text x="330" y="208" class="prop" text-anchor="middle">package_name</text>
49+
<circle cx="330" cy="470" r="46" fill="#F79767" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
50+
<circle cx="330" cy="470" r="46" fill="url(#gloss)"/>
51+
<text x="330" y="466" class="nlabel" text-anchor="middle">:JType</text>
52+
<text x="330" y="483" class="nsub" text-anchor="middle">:JSymbol</text>
53+
<text x="330" y="532" class="prop" text-anchor="middle">fqn</text>
54+
<text x="330" y="548" class="prop" text-anchor="middle">is_interface</text>
55+
<circle cx="120" cy="470" r="46" fill="#57C7E3" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
56+
<circle cx="120" cy="470" r="46" fill="url(#gloss)"/>
57+
<text x="120" y="474" class="nlabel" text-anchor="middle">:JField</text>
58+
<text x="120" y="532" class="prop" text-anchor="middle">name</text>
59+
<text x="120" y="548" class="prop" text-anchor="middle">type</text>
60+
<circle cx="560" cy="300" r="53" fill="none" stroke="#FFC454" stroke-width="3" stroke-dasharray="5 5" opacity="0.95"/>
61+
<circle cx="560" cy="300" r="46" fill="#8DCC93" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
62+
<circle cx="560" cy="300" r="46" fill="url(#gloss)"/>
63+
<text x="560" y="296" class="nlabel" text-anchor="middle">:JCallable</text>
64+
<text x="560" y="313" class="nsub" text-anchor="middle">:JSymbol</text>
65+
<g transform="translate(560,238)"><rect x="-52" y="-12" width="104" height="20" rx="10" fill="#1b1b2b" opacity="0.92"/><text x="0" y="2" class="badge" text-anchor="middle">★ :JEntrypoint</text></g>
66+
<text x="560" y="362" class="prop" text-anchor="middle">signature</text>
67+
<text x="560" y="378" class="prop" text-anchor="middle">cyclomatic_complexity</text>
68+
<circle cx="560" cy="470" r="46" fill="#ECB5C9" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
69+
<circle cx="560" cy="470" r="46" fill="url(#gloss)"/>
70+
<text x="560" y="474" class="nlabel" text-anchor="middle">:JParameter</text>
71+
<text x="560" y="532" class="prop" text-anchor="middle">name</text>
72+
<text x="560" y="548" class="prop" text-anchor="middle">type</text>
73+
<circle cx="790" cy="130" r="46" fill="#FFC454" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
74+
<circle cx="790" cy="130" r="46" fill="url(#gloss)"/>
75+
<text x="790" y="134" class="nlabel" text-anchor="middle">:JCallSite</text>
76+
<text x="790" y="192" class="prop" text-anchor="middle">method_name</text>
77+
<text x="790" y="208" class="prop" text-anchor="middle">receiver_type</text>
78+
<circle cx="975" cy="300" r="46" fill="#569480" stroke="#ffffff" stroke-width="2.5" filter="url(#nshadow)"/>
79+
<circle cx="975" cy="300" r="46" fill="url(#gloss)"/>
80+
<text x="975" y="304" class="nlabel" text-anchor="middle">:JCallable</text>
81+
<text x="975" y="362" class="prop" text-anchor="middle">name</text>
82+
<text x="975" y="378" class="prop" text-anchor="middle">is_entrypoint</text>
83+
</svg>
84+
<figcaption>
85+
The analysis is a <strong>Neo4j property graph</strong>: every node carries a
86+
<em>label</em> (its color) and <em>properties</em>; every relationship carries a
87+
<em>type</em>. The dashed ring marks an <code>:JEntrypoint</code>;
88+
the <span class="calls">J_CALLS</span> edge is the resolved call graph.
89+
</figcaption>
90+
</figure>
91+
92+
<style>
93+
.pgraph-figure {
94+
margin: 1.5rem 0 2rem;
95+
padding: 0.5rem 0.25rem 0.25rem;
96+
border: 1px solid var(--sl-color-gray-5);
97+
border-radius: 14px;
98+
background:
99+
radial-gradient(120% 80% at 15% 0%, color-mix(in srgb, var(--sl-color-accent-low) 55%, transparent), transparent 60%),
100+
var(--sl-color-black);
101+
overflow: hidden;
102+
}
103+
.pgraph { width: 100%; height: auto; display: block; }
104+
/* edges + labels adapt to theme */
105+
.pgraph { --pg-edge: var(--sl-color-gray-3); }
106+
.pgraph .rel { stroke: var(--pg-edge); stroke-width: 2.2; }
107+
.pgraph .rel-accent { stroke: #DA7194; stroke-width: 3.2; }
108+
.pgraph .nlabel { fill: #fff; font: 700 14px/1.1 ui-sans-serif, system-ui, sans-serif; letter-spacing: .2px; }
109+
.pgraph .nsub { fill: rgba(255,255,255,.92); font: 600 12px ui-sans-serif, system-ui, sans-serif; }
110+
.pgraph .prop { fill: var(--sl-color-gray-2); font: 500 12px ui-monospace, "SFMono-Regular", monospace; }
111+
.pgraph .rtype { fill: #fff; font: 700 11px ui-monospace, monospace; letter-spacing: .3px; }
112+
.pgraph .pill { fill: #3a4252; stroke: rgba(255,255,255,.18); }
113+
.pgraph .pill-accent { fill: #DA7194; }
114+
.pgraph .badge { fill: #ffd98a; font: 700 11px ui-monospace, monospace; }
115+
.pgraph-figure figcaption {
116+
color: var(--sl-color-gray-2);
117+
font-size: 0.85rem; line-height: 1.5;
118+
padding: 0.4rem 1rem 0.75rem; text-align: center;
119+
}
120+
.pgraph-figure .calls { color: #DA7194; font-weight: 700; font-family: ui-monospace, monospace; }
121+
</style>

src/content/docs/frameworks/crud.mdx

Lines changed: 28 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -79,15 +79,42 @@ These come from `EntityManager.createQuery(String)` and `EntityManager.createNam
7979
CRUD data is part of the symbol table, so it's available at [analysis level 1](/codeanalyzer-java/guides/analysis-levels/) — you don't need a call graph to enumerate persistence operations across the app.
8080
</Aside>
8181

82+
## In the Neo4j graph
83+
84+
When you project the analysis with `--emit neo4j`, CRUD detection is not flattened into the callable — it becomes **first-class graph structure**. Each detected operation is a `:JCrudOperation` node and each query a `:JCrudQuery` node, hung off its owning `:JCallable` or `:JCallSite`:
85+
86+
```cypher
87+
(:JCallable | :JCallSite)-[:J_HAS_CRUD_OPERATION]->(:JCrudOperation)
88+
(:JCallable | :JCallSite)-[:J_HAS_CRUD_QUERY]->(:JCrudQuery)
89+
```
90+
91+
That turns "where does this application write to persistent storage?" into a Cypher traversal across the whole graph — and, once many applications share one database, across the entire portfolio. For example, every method that issues a write:
92+
93+
```cypher
94+
MATCH (c:JCallable)-[:J_HAS_CRUD_OPERATION]->(op:JCrudOperation)
95+
WHERE op.operation_type IN ['CREATE', 'UPDATE', 'DELETE']
96+
RETURN c.signature, op.operation_type
97+
```
98+
99+
`JCrudOperation` exposes `operation_type` along with the `target_table`, `involved_columns`, `condition`, and `joined_tables` properties. The last four are reserved (see above), so for now filter only on `operation_type`. See the [Neo4j graph-schema reference](/codeanalyzer-java/schema/neo4j-graph/) for the full node and relationship inventory.
100+
82101
## Using it downstream
83102

84103
```python
85-
analysis = CLDK(language="java").analysis(project_path="my-app")
104+
from cldk import CLDK
105+
from cldk.analysis import AnalysisLevel
106+
107+
analysis = CLDK.java(
108+
project_path="my-app",
109+
analysis_level=AnalysisLevel.symbol_table,
110+
)
86111

87112
for cls in analysis.get_classes():
88113
for sig, m in analysis.get_methods_in_class(cls).items():
89114
for op in m.crud_operations:
90115
print(f"{op.operation_type} at {cls}:{op.line_number}")
91116
```
92117

118+
The same loop works unchanged against a graph that was produced out of band: pass a `Neo4jConnectionConfig` instead of `project_path` and the read-only Neo4j backend reconstructs the identical models, with no JDK, native binary, or project source required. Set `application_name` to the `--app-name` the graph was loaded with. See the [Neo4j graph output guide](/codeanalyzer-java/guides/neo4j-output/) for the full read-back flow.
119+
93120
Combine with [entry-point](/codeanalyzer-java/frameworks/entry-points/) and call-graph data to answer questions like "which externally-reachable methods perform writes?"
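
As a rough illustration of that combination, the sketch below answers "which externally-reachable methods perform writes?" with a breadth-first walk over call edges. It is plain Python over toy data, not the CLDK API: the `reachable_from` helper, the method signatures, and the `writers` set are all hypothetical stand-ins for what you would collect from the entry-point flags, the call graph, and `crud_operations`.

```python
from collections import deque


def reachable_from(seeds, call_edges):
    """Return every method signature reachable from the seed signatures."""
    # Build a caller -> [callees] adjacency map from (caller, callee) pairs.
    adjacency = {}
    for caller, callee in call_edges:
        adjacency.setdefault(caller, []).append(callee)

    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        for callee in adjacency.get(current, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical toy data standing in for real analysis output.
entrypoints = {"OrderResource.create()"}
call_edges = [
    ("OrderResource.create()", "OrderService.place()"),
    ("OrderService.place()", "OrderDao.insert()"),
    ("AdminJob.purge()", "OrderDao.deleteAll()"),
]
# Methods that own at least one CREATE, UPDATE, or DELETE operation.
writers = {"OrderDao.insert()", "OrderDao.deleteAll()"}

# Writers that some entry point can reach through the call graph.
exposed_writers = reachable_from(entrypoints, call_edges) & writers
print(sorted(exposed_writers))
```

Here only `OrderDao.insert()` is reported, because `OrderDao.deleteAll()` is called solely from a method that is not an entry point. Call edges require an analysis that includes the call graph.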

src/content/docs/frameworks/entry-points.mdx

Lines changed: 36 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -59,7 +59,7 @@ WALA builds the call graph by traversing outward from entry points. If a project
5959

6060
## Using entry points downstream
6161

62-
Once analyzed, you can filter on the flags to drive reachability. Via the Python SDK:
62+
Once analyzed, you can filter on the flags to drive reachability. Via the Python SDK, against an in-process analysis:
6363

6464
```python
6565
analysis = CLDK(language="java").analysis(
@@ -77,3 +77,38 @@ entrypoints = [
7777
```
7878

7979
Seed a `networkx` reachability query from these to ask whether a sink is reachable from any externally-invocable method.
80+
81+
## Entry points in the Neo4j projection
82+
83+
When you project to a Neo4j property graph with [`--emit neo4j`](/codeanalyzer-java/guides/neo4j-output/), the `is_entrypoint` / `is_entrypoint_class` properties are still carried on the `:JCallable` and `:JType` nodes — but the projection *also* layers a marker label, `:JEntrypoint`, onto the owning callable (and entry-point class). That turns reachability seeding into a one-line `MATCH` instead of a property filter:
84+
85+
```cypher
86+
// Every entry-point method in one application — the reachability seeds
87+
MATCH (a:JApplication {name: 'daytrader8'})-[:J_HAS_UNIT]->(:JCompilationUnit)
88+
-[:J_DECLARES_TYPE]->(:JType)-[:J_HAS_CALLABLE]->(c:JCallable:JEntrypoint)
89+
RETURN c.signature
90+
```
91+
92+
From there you traverse `J_CALLS` edges (present once you analyze at [level 2](/codeanalyzer-java/guides/analysis-levels/)) to ask the same reachability question entirely in Cypher, across every application in the database. Reading the seeds back through the SDK is identical to the in-process path above — point the facade at the graph and the same typed `JCallable` objects come back, with no JDK, native binary, or project source on the consumer:
93+
94+
```python
95+
from cldk import CLDK
96+
from cldk.analysis import AnalysisLevel
97+
from cldk.analysis.commons.backend_config import Neo4jConnectionConfig
98+
99+
analysis = CLDK.java(
100+
analysis_level=AnalysisLevel.call_graph,
101+
backend=Neo4jConnectionConfig(
102+
uri="bolt://localhost:7687",
103+
username="neo4j",
104+
password="neo4j", # or set NEO4J_PASSWORD
105+
application_name="daytrader8", # must match the --app-name the graph was loaded with
106+
),
107+
)
108+
109+
entrypoints = analysis.get_entry_point_methods()
110+
```
111+
112+
<Aside type="note" title="Where the labels and properties are defined">
113+
`:JEntrypoint` and the `is_entrypoint` / `is_entrypoint_class` properties are part of the versioned graph contract. For the full node-label and relationship inventory, see the [Neo4j graph schema](/codeanalyzer-java/schema/neo4j-graph/).
114+
</Aside>

src/content/docs/guides/analysis-levels.mdx

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,12 @@ See [Build integration](/codeanalyzer-java/guides/build-integration/) for how th
4545
WALA needs entry points to anchor the graph. If a project has no `main(String[])` and no recognized framework entry points, the call graph can come back empty. See [Entry points](/codeanalyzer-java/frameworks/entry-points/) and [Troubleshooting](/codeanalyzer-java/troubleshooting/).
4646
</Aside>
4747

48+
## The level also shapes the Neo4j graph
49+
50+
The analysis level governs the [Neo4j projection](/codeanalyzer-java/guides/neo4j-output/) the same way it governs `analysis.json`. Level 1 emits the lossless symbol-table subgraph — compilation units, types, callables, fields, and the rest — with **no `J_CALLS` edges**. Level 2 adds the call graph as `(:JCallable)-[:J_CALLS]->(:JCallable)` relationships on top of it.
51+
52+
This carries through the `-t` downgrade: because passing `-t` with `-a 2` forces level 1, a targeted incremental Bolt push (`--emit neo4j -t ...`) replaces only the changed units' symbol-table subgraphs and **carries no call edges**. Refresh `J_CALLS` with a full level-2 run.
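
The downgrade rule can be restated as a small Python sketch. The names `effective_level` and `emits_call_edges` are hypothetical and not part of codeanalyzer or CLDK; they only encode the rule described here, that any `-t` target forces level 1 and that `J_CALLS` edges appear only at level 2.

```python
from typing import Sequence


def effective_level(requested_level: int, target_files: Sequence[str]) -> int:
    """Level actually used: any -t target forces level 1 (assumed rule from the text)."""
    return 1 if target_files else requested_level


def emits_call_edges(requested_level: int, target_files: Sequence[str]) -> bool:
    """J_CALLS edges are produced only when the effective level is 2."""
    return effective_level(requested_level, target_files) == 2


# A targeted push requested at level 2 still carries no call edges.
print(emits_call_edges(2, ["src/main/java/Foo.java"]))
# A full level-2 run refreshes them.
print(emits_call_edges(2, []))
```

The first call prints `False` and the second prints `True`, which is why a full level-2 run is needed after targeted pushes if the call edges must stay current.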
53+
4854
## Choosing a level
4955

5056
- **Use level 1** when you need the program's *structure*: listing classes and methods, reading method bodies, finding fields, extracting Javadoc, surveying imports, or doing incremental per-file updates.
@@ -57,4 +63,5 @@ WALA needs entry points to anchor the graph. If a project has no `main(String[])
5763
<LinkCard title="Symbol table schema" description="Everything level 1 produces." href="/codeanalyzer-java/schema/symbol-table/" />
5864
<LinkCard title="Call graph schema" description="The edge shape level 2 adds." href="/codeanalyzer-java/schema/call-graph/" />
5965
<LinkCard title="Incremental analysis" description="Level-1 target-file updates to an existing analysis.json." href="/codeanalyzer-java/guides/incremental-analysis/" />
66+
<LinkCard title="Neo4j graph output" description="How each level projects into the property graph — and why level 2 adds J_CALLS." href="/codeanalyzer-java/guides/neo4j-output/" />
6067
</CardGrid>
