feat(neo4j): namespace all node labels (Py*) and relationship types (PY_*)
In a shared Neo4j instance, unprefixed labels and relationship types from
different language analyzers collide: `MERGE (:Application {name})` and
`:Symbol`/`HAS_MODULE` from a future Java/JS backend would fuse with Python's.
Labels and relationship types are separate Neo4j namespaces, so both are
prefixed — every node label gets `Py` (e.g. `:PyClass`, shared MERGE label
`:PySymbol`) and every relationship type gets `PY_` (e.g. `PY_CALLS`).
Constraint/index names are also globally unique per-DB, so they get a `py_`
prefix too.
- catalog.py: the source-of-truth labels, merge labels, and rel types
- schema.py: DDL label refs + constraint/index names
- project.py, cypher.py, bolt.py, rows.py: emitter + both writers
- tests, sample app, README, CHANGELOG, --app-name help, schema.neo4j.json
- neo4j-schema.drawio: new property-graph diagram; schema-uml.drawio: relayout
SCHEMA_VERSION stays 1.0.0 (the schema is new on this branch — no released
consumer has seen the unprefixed 1.0.0).
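
The prefixing scheme described above could be sketched roughly as follows. This is only an illustration and not the actual contents of `catalog.py` or `schema.py`: the helper names (`node_label`, `rel_type`, `unique_constraint`, `merge_application`), the constraint naming pattern, and the `signature` property are assumptions made for the example.

```python
# Hypothetical sketch of the three namespaces; not the real catalog.py / schema.py.
LABEL_PREFIX = "Py"   # node labels, e.g. :PyClass, :PySymbol
REL_PREFIX = "PY_"    # relationship types, e.g. PY_CALLS
DDL_PREFIX = "py_"    # constraint / index names, unique per database


def node_label(base: str) -> str:
    """Return the namespaced node label for a base name such as 'Class'."""
    return f"{LABEL_PREFIX}{base}"


def rel_type(base: str) -> str:
    """Return the namespaced relationship type for a base name such as 'CALLS'."""
    return f"{REL_PREFIX}{base}"


def unique_constraint(base_label: str, prop: str) -> str:
    """Build a uniqueness-constraint DDL statement with a prefixed name (assumed pattern)."""
    name = f"{DDL_PREFIX}{base_label.lower()}_{prop}_unique"
    return (
        f"CREATE CONSTRAINT {name} IF NOT EXISTS "
        f"FOR (n:{node_label(base_label)}) REQUIRE n.{prop} IS UNIQUE"
    )


def merge_application(param: str = "name") -> str:
    """MERGE on the prefixed anchor label so other analyzers' anchors stay separate."""
    return f"MERGE (a:{node_label('Application')} {{name: ${param}}})"


if __name__ == "__main__":
    print(node_label("Class"))
    print(rel_type("CALLS"))
    print(unique_constraint("Symbol", "signature"))
    print(merge_application())
```

Because every label, relationship type, and DDL name passes through one helper per namespace, an analyzer for another language could reuse the same base names with its own prefixes without touching Python's subgraph.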
CHANGELOG.md (2 additions, 2 deletions)
@@ -8,10 +8,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [0.2.0] - 2026-06-20
 
 ### Added
-- **Neo4j property-graph output** (`--emit neo4j`). The same in-memory analysis (`PyApplication`) is projected to a labeled property graph, mirroring the `codeanalyzer-typescript` backend. Two writers:
+- **Neo4j property-graph output** (`--emit neo4j`). The same in-memory analysis (`PyApplication`) is projected to a labeled property graph, mirroring the `codeanalyzer-typescript` backend. Node labels are `Py`-prefixed and relationship types are `PY_`-prefixed (e.g. `:PyClass`, `PY_CALLS`) so multiple language analyzers can coexist in one database without label or relationship-type collisions. Two writers:
   - **`graph.cypher` snapshot** (default) — a self-contained Cypher script (constraints + indexes, a scoped wipe of the project's prior subgraph, then batched `UNWIND … MERGE`). Load it with `cypher-shell < graph.cypher`. Needs no extra dependencies.
   - **Live Bolt push** (`--neo4j-uri`) — an **incremental** writer: only modules whose `content_hash` changed are rewritten, and on a full run modules whose source file vanished are pruned. Requires the optional `neo4j` driver (`pip install 'codeanalyzer-python[neo4j]'`).
-- **`--emit schema`** — emit the machine-readable, version-stamped Neo4j schema contract (`schema.json`: node labels, relationships, properties, constraints, indexes). Needs no project; bundled in every release as a GitHub Release asset and checked in as `schema.neo4j.json`. A `schema_version` (`1.0.0`) is stamped onto every graph's `:Application` node.
+- **`--emit schema`** — emit the machine-readable, version-stamped Neo4j schema contract (`schema.json`: node labels, relationships, properties, constraints, indexes). Needs no project; bundled in every release as a GitHub Release asset and checked in as `schema.neo4j.json`. A `schema_version` (`1.0.0`) is stamped onto every graph's `:PyApplication` node.
 - **New CLI options** mirroring the TypeScript analyzer's entrypoints: `--emit {json,neo4j,schema}`, `--app-name`, `--neo4j-uri`, `--neo4j-user`, `--neo4j-password`, `--neo4j-database`. `-i/--input` is now optional (not required for `--emit schema`). The four Neo4j connection options also read from the standard `NEO4J_URI` / `NEO4J_USERNAME` / `NEO4J_PASSWORD` / `NEO4J_DATABASE` environment variables when the flag is omitted (an explicit flag wins), so the password need not appear in shell history or the process list.
 - **`codeanalyzer.neo4j`** package: `catalog` (the single source-of-truth schema catalog), `project` (pure IR → graph rows), `cypher` (snapshot writer), `bolt` (incremental writer), and `rows` (the output-agnostic intermediate).
 - **Schema conformance test** (`test/test_neo4j_schema.py`, always runs) — asserts the emitter never produces a label/relationship/property the catalog doesn't declare, and that the checked-in `schema.neo4j.json` is regenerated.
-│ --app-name TEXT Logical application name for the graph :Application anchor. │
+│ --app-name TEXT Logical application name for the graph :PyApplication anchor. │
 │ --neo4j-uri TEXT Push the graph to a live Neo4j over Bolt. [env: NEO4J_URI] │
 │ --neo4j-user TEXT Neo4j username. [env: NEO4J_USERNAME] [default: neo4j] │
 │ --neo4j-password TEXT Neo4j password. [env: NEO4J_PASSWORD] [default: neo4j] │
@@ -176,12 +176,12 @@ By default this is printed to stdout in JSON; with `--output` it is written to `
 
 ### Neo4j graph
 
-`--emit neo4j` projects the same analysis into a labeled property graph (declarations keyed by their signature under a shared `:Symbol` label; calls, imports, inheritance, decorators, and call sites as relationships):
+`--emit neo4j` projects the same analysis into a labeled property graph. Every node label is `Py`-prefixed and every relationship type is `PY_`-prefixed (e.g. `:PyClass`, `PY_CALLS`) so multiple language analyzers can share one database without label or relationship-type collisions. Declarations are keyed by their signature under a shared `:PySymbol` label; calls, imports, inheritance, decorators, and call sites are relationships:
 
 - **Without `--neo4j-uri`** — writes a self-contained `graph.cypher` (constraints + indexes, a scoped wipe, then batched `MERGE`s). Load it with `cypher-shell < graph.cypher`. Needs no extra dependencies.
-- **With `--neo4j-uri`** — pushes to a live Neo4j over Bolt **incrementally**: only modules whose content hash changed are rewritten, and on a full run modules whose source file vanished are pruned. Requires the `neo4j` extra. Every graph carries a `schema_version` on its `:Application` node.
+- **With `--neo4j-uri`** — pushes to a live Neo4j over Bolt **incrementally**: only modules whose content hash changed are rewritten, and on a full run modules whose source file vanished are pruned. Requires the `neo4j` extra. Every graph carries a `schema_version` on its `:PyApplication` node.
 
-Call-graph endpoints that aren't present in the symbol table (third-party / framework / RPC targets) are materialized as `:External` ghost nodes, mirroring the analyzer's own ghost-node behaviour.
+Call-graph endpoints that aren't present in the symbol table (third-party / framework / RPC targets) are materialized as `:PyExternal` ghost nodes, mirroring the analyzer's own ghost-node behaviour.
 
 The connection options also read from the standard Neo4j environment variables — `NEO4J_URI`, `NEO4J_USERNAME`, `NEO4J_PASSWORD`, `NEO4J_DATABASE` — when the corresponding flag is omitted (an explicit flag wins). Prefer the env var for the password so it doesn't land in shell history or the process list: