|
2 | 2 |
|
3 | 3 | Native WALA implementation of a source code analysis tool for Enterprise Java Applications. |
4 | 4 |
|
| 5 | +`codeanalyzer` extracts a comprehensive **symbol table** and **call graph** from Java applications |
| 6 | +and emits them either as the canonical `analysis.json`, or as a **Neo4j property graph** |
| 7 | +(`--emit neo4j`) — a `graph.cypher` snapshot or a live, incremental push over Bolt. See |
| 8 | +[§4. Neo4j graph output](#4-neo4j-graph-output). |
| 9 | + |
| 10 | +## Quick install |
| 11 | + |
| 12 | +Grab the latest release jar and a `codeanalyzer` launcher (requires a Java 11+ runtime): |
| 13 | + |
| 14 | +```sh |
| 15 | +curl --proto '=https' --tlsv1.2 -LsSf https://github.com/codellm-devkit/codeanalyzer-java/releases/latest/download/codeanalyzer-installer.sh | sh |
| 16 | +# or with wget: |
| 17 | +wget -qO- https://github.com/codellm-devkit/codeanalyzer-java/releases/latest/download/codeanalyzer-installer.sh | sh |
| 18 | +``` |
| 19 | + |
| 20 | +Overrides: `CODEANALYZER_INSTALL_DIR` (default `~/.local/bin`), `CODEANALYZER_VERSION` (default `latest`). |
| 21 | +Prefer to build from source? See [§2. Building `codeanalyzer`](#2-building-codeanalyzer). |
| 22 | + |
5 | 23 | ## 1. Prerequisites |
6 | 24 |
|
7 | 25 | Before you begin, ensure you have met the following requirements: |
@@ -68,30 +86,42 @@ Run the Gradle wrapper script to build the project. This will compile the projec |
68 | 86 |
|
69 | 87 | ### 2.3. Using `codeanalyzer` |
70 | 88 |
|
71 | | -The jar will be built at `build/libs/codeanalyzer-1.0.jar`. It may be used as follows: |
| 89 | +The jar will be built at `build/libs/codeanalyzer-<version>.jar`. It may be used as follows: |
72 | 90 |
|
73 | 91 | ```help |
74 | | -Usage: java -jar /path/to/codeanalyzer.jar [-hvV] [--no-build] [-a=<analysisLevel>] [-b=<build>] |
| 92 | +Usage: codeanalyzer [-hvV] [--no-build] [--no-clean-dependencies] |
| 93 | + [-a=<analysisLevel>] [-b=<build>] [-f=<projectRootPom>] |
75 | 94 | [-i=<input>] [-o=<output>] [-s=<sourceAnalysis>] |
76 | | -Convert java binary into a comprehensive system dependency graph. |
77 | | - -i, --input=<input> Path to the project root directory. |
78 | | - -s, --source-analysis=<sourceAnalysis> |
79 | | - Analyze a single string of java source code instead |
80 | | - the project. |
81 | | - -o, --output=<output> Destination directory to save the output graphs. By |
82 | | - default, the SDG formatted as a JSON will be |
83 | | - printed to the console. |
84 | | - -b, --build-cmd=<build> Custom build command. Defaults to auto build. |
85 | | - --no-build Do not build your application. Use this option if |
86 | | - you have already built your application. |
87 | | - -a, --analysis-level=<analysisLevel> |
88 | | - Level of analysis to perform. Options: 1 (for just |
89 | | - symbol table) or 2 (for call graph). Default: 1 |
90 | | - -v, --verbose Print logs to console. |
91 | | - -h, --help Show this help message and exit. |
92 | | - -V, --version Print version information and exit. |
93 | | - -t, --target-files For each file user wants to perform source analysis on top of existing analysis.json |
94 | | -
|
| 95 | + [--emit=<emit>] [--app-name=<appName>] |
| 96 | + [--neo4j-uri=<uri>] [--neo4j-user=<user>] |
| 97 | + [--neo4j-password=<password>] [--neo4j-database=<db>] |
| 98 | + [-t=<targetFiles>]... |
| 99 | +Analyze java application. |
| 100 | + -i, --input=<input> Path to the project root directory. |
| 101 | + -s, --source-analysis=<s> Analyze a single string of java source code instead |
| 102 | + of the project. |
| 103 | + -o, --output=<output> Destination directory to save the output graphs. By |
| 104 | + default, the analysis JSON is printed to the console. |
| 105 | + -b, --build-cmd=<build> Custom build command. Defaults to auto build. |
| 106 | + --no-build Do not build your application (use if already built). |
| 107 | + -a, --analysis-level=<n> Level of analysis: 1 (symbol table) or 2 (call graph). |
| 108 | + Default: 1. Level 2 adds J_CALLS edges to the graph. |
| 109 | + -t, --target-files=<f>... Restrict analysis to specific files (incremental). |
| 110 | + --emit=<emit> Output target: json (analysis.json, default) | |
| 111 | + neo4j (graph.cypher or live Bolt push) | |
| 112 | + schema (the Neo4j schema.neo4j.json contract). |
| 113 | + --app-name=<name> Logical application name for the graph :JApplication |
| 114 | + anchor (default: input dir name). |
| 115 | + --neo4j-uri=<uri> Push the graph to a live Neo4j over Bolt (incremental); |
| 116 | + omit to write graph.cypher. Falls back to the |
| 117 | + NEO4J_URI environment variable. |
| 118 | + --neo4j-user=<user> Neo4j username (env: NEO4J_USERNAME, default: neo4j). |
| 119 | + --neo4j-password=<pw> Neo4j password (env: NEO4J_PASSWORD, default: neo4j). |
| 120 | + --neo4j-database=<db> Neo4j database name (env: NEO4J_DATABASE, default: |
| 121 | + server default). |
| 122 | + -v, --verbose Print logs to console. |
| 123 | + -h, --help Show this help message and exit. |
| 124 | + -V, --version Print version information and exit. |
95 | 125 | ``` |
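The Neo4j options in the help text above fall back from the command-line flag to an environment variable and then to a default. The following is a minimal sketch of that precedence, not `codeanalyzer`'s actual code: the class and method names are hypothetical, and treating blank values as unset is an assumption.

```java
import java.util.Map;

// Hypothetical illustration of "flag, then environment variable, then default".
// Not codeanalyzer's real implementation; blank values are ASSUMED to count as unset.
class Neo4jOptionSketch {

    // Returns the first usable value among the flag, the environment variable, and the default.
    public static String resolve(String flagValue, String envValue, String defaultValue) {
        if (flagValue != null && !flagValue.isBlank()) return flagValue;
        if (envValue != null && !envValue.isBlank()) return envValue;
        return defaultValue;
    }

    // With a URI the graph is pushed over Bolt; without one, graph.cypher is written.
    public static String outputMode(String resolvedUri) {
        return resolvedUri == null ? "graph.cypher" : "bolt";
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        // No flags are parsed in this sketch, so the flag value is always null.
        String uri = resolve(null, env.get("NEO4J_URI"), null);
        String user = resolve(null, env.get("NEO4J_USERNAME"), "neo4j");
        System.out.println("mode=" + outputMode(uri) + " user=" + user);
    }
}
```

Under these assumptions, exporting `NEO4J_URI` alone is enough to switch from the `graph.cypher` snapshot to a Bolt push.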
96 | 126 |
|
97 | 127 |
|
@@ -157,6 +187,66 @@ There is a sample application in `src/test/resources/sample_apps/daytrader8/bina |
157 | 187 |
|
158 | 188 | This will print the SDG on the console. Explore other flags to save the output to a JSON file. |
159 | 189 |
|
| 190 | +## 4. Neo4j graph output |
| 191 | + |
| 192 | +`codeanalyzer` can project the analysis IR into a [Neo4j](https://neo4j.com/) property graph instead |
| 193 | +of `analysis.json`. The graph is a **lossless** projection of the IR: compilation units, types, |
| 194 | +callables, fields, parameters, call sites, variables, enum constants, record components, |
| 195 | +initialization blocks, CRUD operations/queries, comments, annotations and packages are all |
| 196 | +first-class nodes and relationships, and (at `-a 2`) it adds `J_CALLS` edges from the call graph. |
| 197 | +Every field of the Lombok entity model is represented. Scalars become node properties; maps, such as a |
| 198 | +field's per-variable initializers, are kept as a `*_json` property because Neo4j properties cannot hold maps; |
| 199 | +comments are `:JComment` nodes in addition to the convenience `docstring` property. |
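To make the `*_json` convention concrete, here is a minimal sketch of flattening a field map into node properties. It is an illustration under stated assumptions, not the projector's code: the class, the method names, and the `initializers` key are hypothetical, maps are assumed to be string-to-string with non-null entries, and the hand-rolled serializer stands in for whatever JSON library the tool actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: scalars stay as properties, maps become "<name>_json" strings.
class JsonPropertySketch {

    // Minimal JSON string quoting: escapes quotes, backslashes, and control characters.
    public static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"') sb.append("\\\"");
            else if (c == '\\') sb.append("\\\\");
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.append('"').toString();
    }

    // Serializes a string-to-string map as a JSON object with keys in sorted order.
    public static String toJson(Map<String, String> map) {
        StringBuilder sb = new StringBuilder("{");
        String sep = "";
        for (Map.Entry<String, String> e : new TreeMap<>(map).entrySet()) {
            sb.append(sep).append(quote(e.getKey())).append(":").append(quote(e.getValue()));
            sep = ",";
        }
        return sb.append("}").toString();
    }

    // Copies scalar values as-is and stores map values under "<key>_json".
    @SuppressWarnings("unchecked")
    public static Map<String, Object> toNodeProperties(Map<String, Object> fields) {
        Map<String, Object> props = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (e.getValue() instanceof Map) {
                props.put(e.getKey() + "_json", toJson((Map<String, String>) e.getValue()));
            } else {
                props.put(e.getKey(), e.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("name", "counter");
        fields.put("initializers", Map.of("x", "1")); // hypothetical map-valued field
        System.out.println(toNodeProperties(fields));
    }
}
```

Sorting the keys keeps the serialized string stable across runs, which matters if the property is ever compared or hashed; whether the real projector does this is not stated in the source.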
| 200 | + |
| 201 | +The full contract (node labels, their keys and typed properties, relationship types and endpoints, |
| 202 | +plus the constraint/index DDL) lives in [`schema.neo4j.json`](./schema.neo4j.json) and is visualized |
| 203 | +in [`neo4j-schema.drawio`](./neo4j-schema.drawio). All node labels are `J`-prefixed and relationship |
| 204 | +types `J_`-prefixed (e.g. `:JType`, `:JCallable`, `J_CALLS`) so a Java graph can share a Neo4j |
| 205 | +database with the Python (`Py*`/`PY_*`) and TypeScript (`TS*`/`TS_*`) backends without colliding. |
| 206 | +`SCHEMA_VERSION` is stamped onto the `:JApplication` node of every emitted graph. |
| 207 | + |
| 208 | +### 4.1. Cypher snapshot (no database required) |
| 209 | + |
| 210 | +```sh |
| 211 | +codeanalyzer -i /path/to/project -a 2 --emit neo4j -o ./out |
| 212 | +# → writes ./out/graph.cypher (a self-contained, re-runnable script) |
| 213 | +cypher-shell -u neo4j -p <password> < ./out/graph.cypher |
| 214 | +``` |
| 215 | + |
| 216 | +The snapshot is **not** incremental: it creates the constraints, wipes only this application's prior subgraph, |
| 217 | +then `UNWIND … MERGE`-loads the complete current graph. |
| 218 | + |
| 219 | +### 4.2. Live incremental push over Bolt |
| 220 | + |
| 221 | +```sh |
| 222 | +codeanalyzer -i /path/to/project -a 2 --emit neo4j \ |
| 223 | + --neo4j-uri bolt://localhost:7687 --neo4j-user neo4j --neo4j-password <password> |
| 224 | +``` |
| 225 | + |
| 226 | +The Bolt writer reads the database's current state and updates **only what changed**: it diffs each |
| 227 | +compilation unit's `content_hash`, replaces just the changed units' subgraphs (idempotent |
| 228 | +`MERGE` upserts), and — on a full run — prunes units whose source file vanished. Combine with |
| 229 | +`--target-files` for a targeted, partial re-push (orphan pruning is then skipped). |
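The diff described above can be sketched as a comparison of two path-to-hash maps. This is an illustration only, not the Bolt writer's implementation: the class and method names are hypothetical, and it assumes each compilation unit is keyed by its file path with `content_hash` as the value.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a content-hash diff; not the actual Bolt writer.
class ContentHashDiffSketch {

    // Units that are new, or whose hash differs from the one stored in the database.
    public static Set<String> changedUnits(Map<String, String> stored, Map<String, String> current) {
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, String> e : current.entrySet()) {
            if (!e.getValue().equals(stored.get(e.getKey()))) {
                changed.add(e.getKey());
            }
        }
        return changed;
    }

    // Units present in the database but missing from the source tree.
    // Only computed on a full run: a --target-files run sees a partial tree,
    // so absent paths do not imply deleted files.
    public static Set<String> orphanedUnits(Map<String, String> stored,
                                            Map<String, String> current,
                                            boolean fullRun) {
        Set<String> orphans = new TreeSet<>();
        if (fullRun) {
            orphans.addAll(stored.keySet());
            orphans.removeAll(current.keySet());
        }
        return orphans;
    }

    public static void main(String[] args) {
        Map<String, String> stored = Map.of("A.java", "h1", "B.java", "h2", "C.java", "h3");
        Map<String, String> current = Map.of("A.java", "h1", "B.java", "h2-edited", "D.java", "h4");
        System.out.println("replace: " + changedUnits(stored, current));        // [B.java, D.java]
        System.out.println("prune:   " + orphanedUnits(stored, current, true)); // [C.java]
    }
}
```

Unchanged units (`A.java` here) are never touched, which is what keeps a re-push cheap and idempotent.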
| 230 | + |
| 231 | +### 4.3. Schema contract |
| 232 | + |
| 233 | +```sh |
| 234 | +codeanalyzer --emit schema -o ./out # → ./out/schema.neo4j.json (no project analysis needed) |
| 235 | +codeanalyzer --emit schema # → prints the contract to stdout |
| 236 | +``` |
| 237 | + |
| 238 | +### 4.4. Verifying the writers |
| 239 | + |
| 240 | +A no-container conformance test (`Neo4jSchemaConformanceTest`) asserts the projector never emits a |
| 241 | +label/relationship/property the catalog doesn't declare, and that `schema.neo4j.json` is current. A |
| 242 | +Testcontainers-backed integration test (`Neo4jBoltWriterTest`) spins up a real Neo4j and exercises |
| 243 | +the Bolt writer (full push, idempotent re-push, orphan pruning). The container suite is **opt-in** |
| 244 | +(it needs Docker/Podman) and runs only when `RUN_CONTAINER_TESTS` is set: |
| 245 | + |
| 246 | +```sh |
| 247 | +RUN_CONTAINER_TESTS=1 ./gradlew test |
| 248 | +``` |
| 249 | + |
160 | 250 | ## FAQ |
161 | 251 |
|
162 | 252 | 1. After making a few code changes, my native binary gives random exceptions. But, my code works perfectly with `java -jar`. |
|