test(DON'T MERGE): develop-v2.0.0-rc.2 by shuklaayush · Pull Request #2846 · openvm-org/openvm

shuklaayush · 2026-06-05T17:41:54Z

No description provided.

Use fixed 32-row blocks with a full 31-bit decomposition and terminal row. Enforce canonical BabyBear representatives for bit_src.
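
A 31-bit decomposition alone does not pin down a unique field element: BabyBear's modulus is p = 2^31 - 2^27 + 1, so every value in [p, 2^31) fits in 31 bits yet aliases a smaller element. The sketch below shows the canonicity check outside of any circuit. It is only an illustration of the idea, not the AIR's constraints, and the names `canonical_bits` and `recompose` are hypothetical.

```rust
/// BabyBear modulus: p = 2^31 - 2^27 + 1.
const BABY_BEAR_P: u32 = 2_013_265_921;

/// Decompose `x` into 31 little-endian bits.
/// Returns `None` when `x` is not the canonical representative, i.e. `x >= p`.
/// (Hypothetical helper for illustration; not taken from the PR.)
fn canonical_bits(x: u32) -> Option<[bool; 31]> {
    if x >= BABY_BEAR_P {
        return None;
    }
    let mut bits = [false; 31];
    for (i, bit) in bits.iter_mut().enumerate() {
        *bit = (x >> i) & 1 == 1;
    }
    Some(bits)
}

/// Recompose a 31-bit little-endian decomposition into an integer.
fn recompose(bits: &[bool; 31]) -> u32 {
    bits.iter()
        .enumerate()
        .map(|(i, &b)| (b as u32) << i)
        .sum()
}
```

Without the `x >= p` rejection, the bit patterns of `0` and `p` would both be accepted for the same field element, which is the ambiguity the canonical-representative check rules out.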

…2532)

Resolves INT-6719.

Removes the **compression layer** from the aggregation pipeline. The compression circuit was a specialized verifier circuit that wrapped a single internal-recursive proof *without* a cached trace (`has_cached: false`). With it gone, all verifier circuits unconditionally use a cached trace, so the `has_cached` flag, the `DagCommitSubAir`, and the `CachedTraceCtx` enum are all dead code and are removed. The final STARK proof is now always an internal-recursive proof.

- **`continuations-v2/src/prover/compression/`**: the `CompressionProver` and its trace generation.
- **`recursion/src/batch_constraint/expr_eval/dag_commit.rs`**: the `DagCommitSubAir`, `DagCommitCols`, onion-hash logic, and helper functions (`generate_dag_commit_info`, `cached_symbolic_expr_cols_to_digest`, etc.).
- **`recursion/cuda/src/batch_constraint/expr_eval/dag_commit.cuh`**: CUDA counterpart of the DAG commit tracegen.
- **`sdk-v2/src/prover/compression.rs`**: SDK-level `CompressionProver` wrapper.

- **`VerifierConfig.has_cached` removed**: was `true` for every circuit except compression. Now the field doesn't exist; cached trace is always assumed.
- **`CachedTraceCtx` enum removed**: distinguished `PcsData` (cached) vs `Records` (no-cached/compression). The `generate_proving_ctxs` API now takes `CommittedTraceData` directly.
- **`SymbolicExpressionAir`** simplified:
  - No longer generic over `F` (was `SymbolicExpressionAir<F>`).
  - Always emits a cached trace column and never emits `DagCommitCols` in the common main trace.
  - Zero public values (previously exposed `DagCommitPvs` when `has_cached` was false).
- **`BatchConstraintModule`** no longer carries `has_cached`; its `ModuleSpecificCtx` simplifies from a tuple `(&CachedTraceRecord, &PowerChecker)` to just `PowerChecker`.
- **CUDA kernel `symbolic_expression_tracegen`**: `CachedRecord` pointer argument removed; `DagCommitCols` write paths removed.

- `CompressionProver` and its lazy initialization removed from `Sdk`.
- `compression_pk` removed from `AggProvingKey`.
- `StarkProver` no longer stores or invokes a compression prover.
- `agg_vk()` now always returns the internal-recursive VK directly.
- `compression_commit` removed from `VerificationBaseline`.
- Verification no longer checks for or validates compression commit public values.
- All prover constructors (`InnerAggregationProver`, `RootProver`, `DeferredVerifyProver`, `DeferralHookProver`, `DeferralInnerProver`) drop `has_cached: true` from their `VerifierConfig` since the field no longer exists.
- All trace generation methods pass `child_vk_pcs_data` directly instead of wrapping it in `CachedTraceCtx::PcsData(...)`.
- Compression test (`test_compression_prover`) removed.
- `CompressionCpuProver` / `CompressionGpuProver` type aliases removed.
- Minor: `trace_heights_tracing_info` now also logs total width.
- Bumps `openvm-stark-backend` (and related crates) to a newer `develop-v2` revision.

- **`crates/continuations-v2/README.md`**: removed the "Compression Basic Prover" section.
- **`crates/verify-stark/README.md`**: removed the compression layer from the aggregation chain, `compression_commit` from the public values list, and its baseline check from verification.
- **`docs/vocs/docs/pages/specs/architecture/continuations.mdx`**: removed the compression layer type, `CompressionSdkProver`, the compression aggregation subcircuit section, the `has_cached` verifier parameter, the `DagCommitSubAir` description, and `compression_commit` references throughout. The STARK proof pipeline now terminates at the internal-recursive layer.

1. Start with `crates/recursion/src/system/mod.rs`: the removal of `CachedTraceCtx` and `has_cached` from `VerifierConfig` is the core change that cascades everywhere.
2. Then `crates/recursion/src/batch_constraint/expr_eval/symbolic_expression/air.rs`: the `SymbolicExpressionAir` simplification (always cached, no `DagCommitSubAir` branch).
3. The prover files in `crates/continuations-v2/src/prover/` are mechanical: drop `has_cached: true` and replace `CachedTraceCtx::PcsData(x)` with `x`.
4. `crates/sdk-v2/` and `crates/verify-stark/` are straightforward removals of compression-related fields and code paths.
5. The deleted files (`dag_commit.rs`, `compression/mod.rs`, `compression.rs`, `dag_commit.cuh`) can be skimmed to confirm they are only used by the compression path.
6. Documentation changes in the READMEs and `continuations.mdx` remove all compression references and reflect that the STARK proof is now an internal-recursive proof.

closes INT-6535 --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Thread tidx_final_poly_start through internal bus so that it is constant across the evaluation tree.

Not exploitable without breaking Poseidon2, but still nice to enforce.

resolves int-6773

I hope this works, but the idea is that changes merged to develop-v2.0.0-beta will then trigger a rebase of develop-v2.0.0-rc.1. On main, we do not include develop-v2.0.0-rc.1, so a merge to main should set off a chain of rebases: main -> develop-v2.0.0-beta -> develop-v2.0.0-rc.1

Resolves INT-6673. When any dimension has a size of 1, the T6 odometer carry constraint degenerates because "wrap" and "stay" produce the same diff (0), breaking the completeness guarantee of lexicographic enumeration. Add an assert in the Chip constructor to reject this configuration, and a test to verify that it panics. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
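
The degenerate case can be seen in a plain software odometer. The sketch below is an analogue only: `Odometer`, `wrap_diff`, and `stay_diff` are hypothetical names, and the real constraint lives in an AIR rather than in code like this. For a dimension of size `s`, a digit that wraps goes from `s - 1` to `0` (diff `-(s - 1)`), while a digit that stays has diff `0`; with `s = 1` the two coincide.

```rust
/// Hypothetical software odometer over a multi-dimensional index space.
/// The last dimension advances fastest (lexicographic order).
struct Odometer {
    sizes: Vec<usize>,
    index: Vec<usize>,
}

impl Odometer {
    /// Rejects size-1 dimensions, mirroring the constructor assert described above.
    fn new(sizes: Vec<usize>) -> Self {
        assert!(
            sizes.iter().all(|&s| s > 1),
            "every dimension must have size greater than 1"
        );
        let index = vec![0; sizes.len()];
        Self { sizes, index }
    }

    /// Advance to the next index. Returns `false` once the odometer wraps fully.
    fn step(&mut self) -> bool {
        for d in (0..self.sizes.len()).rev() {
            if self.index[d] + 1 < self.sizes[d] {
                self.index[d] += 1;
                return true;
            }
            self.index[d] = 0; // wrap this digit and carry into the next one
        }
        false
    }
}

/// Digit difference when a dimension of the given size wraps from `size - 1` to 0.
fn wrap_diff(size: i64) -> i64 {
    -(size - 1)
}

/// Digit difference when a dimension keeps its value.
fn stay_diff() -> i64 {
    0
}
```

With `size = 3`, `wrap_diff` is `-2` and is distinguishable from `stay_diff`; with `size = 1` both are `0`, so a check based on the diff alone cannot tell whether a carry happened.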

This resolves INT-6777

This resolves INT-6774. `Eq3bAir` now also maintains `n_logup`, constrains both `n_logup` and `n_lift` to be constant across the AIR, and receives them once per AIR.

## Summary

- Removed `openvm/` prefix from source paths across 8 documentation files (21 occurrences)
- Updated stale `v2-proof-system` repository reference in `docs/crates/recursion/README.md` to reflect the current openvm repo
- Updated `stark-backend` path in `docs/crates/recursion/verifier-mapping.md` to link to the [stark-backend GitHub repo](https://github.com/openvm-org/stark-backend)

Fixes issues identified in #2553 (comment)

## Test plan

- [x] Verified no remaining `openvm/crates/recursion/src/` prefixes exist
- [x] Verified no remaining `v2-proof-system` references exist
- [x] Verified no remaining `stark-backend/crates/` local path references exist

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Jonathan Wang <jonathanpwang@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

This resolves INT-6814

This resolves INT-6828. Now `Eq3bAir` propagates the `row_idx` by the correct amount for each AIR. Note that there is no need for this air to know `air_idx` or anything other than the number of interactions and the `n_lift` for this air, so no new columns are added (but we now do the interaction with proof shape air on the last row of the air instead of the first one).

…2563)

## Summary

- Add a `changes` detection job using `dorny/paths-filter@v3` to skip the `lint-cuda` job when no CUDA-related files (`*.cu`, `*.cuh`, `**/cuda/**`, `**/cuda*.rs`, etc.) are modified
- Request at least 8 CPUs on the GPU runner (`/cpu=8`) for faster builds
- The workflow file and CUDA cache action are also included as triggers so changes to CI itself still run the CUDA lint

## Test plan

- [ ] Open a PR that doesn't touch any CUDA files and verify `lint-cuda` is skipped
- [ ] Open a PR that touches a `.cu` or `.cuh` file and verify `lint-cuda` runs

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

## Summary

- replace deprecated RunsOn v2 disk labels with explicit v3 volume labels
- remove the custom .github/runs-on.yml profile dependency and inline the GPU runner labels in workflows
- keep default GPU runner families aligned with the previous profile behavior while allowing future Blackwell/G7e support via CUDA_ARCH=75,86,89,120
- fix the RISC-V CUDA matrix branch so the GPU job still runs the prover test

## Validation

- parsed all workflow YAML files locally
- checked for stale test-gpu-nvidia, .github/runs-on.yml, disk=, and disk: references under .github
- checked for undefined matrix field references across workflows

## Summary

- Emit a `segmentation_trigger` metric when metered execution creates a segment, labeled by trigger reason.
- Include AIR id/name labels for height-triggered segmentation so the metric identifies the chip that crossed the trace-height limit.
- Drop `g4dn`/SM75 CUDA CI coverage from this branch now that G4dn is no longer a supported runner family.

## Validation

- Parsed all workflow YAML files locally.
- Checked `.github` for remaining `g4dn` references and `CUDA_ARCH` values containing `75`; none remain.
- Rust tests were not run locally for this metadata/CI cleanup pass.

The docs step took 10–20 min per PR: `cargo doc --workspace` without `--no-deps` documents ~425 third-party crates, and rustdoc output is not cached by sccache.

- Split the docs check into a parallel job that only documents workspace crates with `RUSTDOCFLAGS=-D warnings`; full docs with dependencies are still published by `docs.yml` on main/tags.
- Fix all 37 rustdoc intra-doc link warnings (including stale references to renamed items).
- Re-enable `cargo shear`: remove 50+ unused dependencies and dead feature entries, move dev-only deps to `[dev-dependencies]`, and ignore false positives (deps referenced via derive expansions, `cfg_if!`, and feature forwarding). Pinned to 1.1.11; 1.13+ falsely flags `ff_derive`'s `num-bigint03` because two renames of the same package collide in its package→import map.

Verified: `cargo check --workspace --all-targets`, `cargo shear`, and the docs build all pass locally.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

socket-security · 2026-06-10T16:04:10Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	cargo/tokio@1.47.1 ⏵ 1.50.0	^-3
	cargo/inferno@0.12.6
	cargo/aws-sdk-s3@1.119.0 ⏵ 1.123.0	⁺¹
	cargo/ndarray@0.16.1
	cargo/struct-reflection@0.1.0
	cargo/zstd@0.13.3
	cargo/aws-config@1.8.12 ⏵ 1.8.14

View full report

This tightens several static verifier invariants and trims redundant Halo2 cells in the verifier circuit.

- Require `SafeBool` for BabyBear field select APIs
- Add explicit shape/width checks for BabyBear reduction, base-2^31 packing, GKR claims, WHIR sumcheck polys, and unsupported rotations
- Keep the constraints-only prover harness private to the integration test instead of exposing it on `StaticVerifierCircuit`
- Remove unused Poseidon selector code and reuse the shared packing helper in the transcript
- Avoid redundant BabyBear reductions in `add`, `sub`, `mul`, and `mul_add`
- Assert verifier equalities directly instead of materializing residuals first

- rename `prove_unwrapped` to `prove_root`
- move the looping logic to root prover

`state` is a byte sub-slice of a record and is not guaranteed to be 8-byte aligned, so we cannot reinterpret it in place as `&mut [u64; 8]`. Copy through an aligned buffer (using native-endian bytes). --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
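
A minimal sketch of the copy-through-an-aligned-buffer pattern, assuming a 64-byte state as implied by `[u64; 8]`. The function name `with_state_words` and the closure-based shape are hypothetical; only `u64::from_ne_bytes` and `u64::to_ne_bytes` are standard-library calls. The byte slice is never reinterpreted in place, so its alignment does not matter.

```rust
/// Run `f` on a 64-byte state viewed as eight native-endian `u64` words,
/// without assuming the byte slice is 8-byte aligned.
/// (Hypothetical helper for illustration; not the code from the PR.)
fn with_state_words(state: &mut [u8], f: impl FnOnce(&mut [u64; 8])) {
    assert_eq!(state.len(), 64, "state must be exactly 64 bytes");

    // Copy the bytes into a properly aligned local buffer.
    let mut words = [0u64; 8];
    for (word, chunk) in words.iter_mut().zip(state.chunks_exact(8)) {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        *word = u64::from_ne_bytes(buf);
    }

    // Operate on the aligned copy.
    f(&mut words);

    // Write the result back using the same native-endian byte order.
    for (chunk, word) in state.chunks_exact_mut(8).zip(words.iter()) {
        chunk.copy_from_slice(&word.to_ne_bytes());
    }
}
```

Using native-endian conversions in both directions keeps the byte layout identical to what an in-place reinterpretation would have seen on the same machine, at the cost of two 64-byte copies.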

Fixes INT-5507
Fixes INT-5505
Fixes INT-5506

## Summary

Reject public-values accesses past the configured `num_public_values` limit instead of silently accepting them. Neither execution path rejected this before.

Non-AOT now gets the expected bounds check because `MmapMemory` exposes the configured size through `size()` and slices, rather than the page-rounded mmap length.

AOT public-values load/store instructions now fall back to the normal executor path, so they use the same checked memory access instead of emitting unchecked x86 memory access.

## Tests

Adds regressions for direct public-values memory writes and `reveal` past the configured public-values limit.
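
The non-AOT part of the fix comes down to checking accesses against the configured length rather than the length of the backing allocation. The sketch below illustrates that distinction with a hypothetical `PublicValues` type backed by a `Vec<u8>` standing in for an mmap region; the 4096-byte page size, the type, and its methods are assumptions for illustration and are not the `MmapMemory` API.

```rust
/// Page size assumed for this sketch only.
const PAGE_SIZE: usize = 4096;

#[derive(Debug, PartialEq, Eq)]
struct OutOfBounds;

/// Hypothetical stand-in for a page-rounded memory region.
struct PublicValues {
    backing: Vec<u8>, // page-rounded allocation
    len: usize,       // configured number of public-value bytes
}

impl PublicValues {
    fn new(num_public_values: usize) -> Self {
        let pages = (num_public_values + PAGE_SIZE - 1) / PAGE_SIZE;
        Self {
            backing: vec![0; pages.max(1) * PAGE_SIZE],
            len: num_public_values,
        }
    }

    /// Reports the configured size, not the page-rounded allocation length.
    fn size(&self) -> usize {
        self.len
    }

    /// Slices are also limited to the configured size.
    fn as_slice(&self) -> &[u8] {
        &self.backing[..self.len]
    }

    /// Rejects any write that would extend past the configured size.
    fn write(&mut self, offset: usize, data: &[u8]) -> Result<(), OutOfBounds> {
        let end = offset.checked_add(data.len()).ok_or(OutOfBounds)?;
        if end > self.size() {
            return Err(OutOfBounds);
        }
        self.backing[offset..end].copy_from_slice(data);
        Ok(())
    }
}
```

If `size()` returned `backing.len()` instead, a write at offset 32 into a 32-byte region would succeed silently because the allocation is a full page, which is the kind of acceptance the change removes.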

Resolves a batch of TODOs from the rc.2 TODO audit, one commit per ticket.

## Summary

- **Stale TODO cleanups**: dropped the boundary-memory-image TODO in `online.rs` (per discussion, the type change touches too much) and the stale error-handling TODO in `branch_eq` AOT execution (the flagged path already returns `Err(AotError::InvalidInstruction)`).
- **`sw_declare!` docs**: replaced the `[TODO]` placeholder with a real `Secp256k1Point` example.
- **API rename**: `get_*_step` → `get_*_executor` for the six algebra/ecc constructor functions, matching the `*Executor` types they return.
- **SDK examples**: re-enabled `sdk_app`, `sdk_stark`, and `sdk_evm` (gated on `evm-verify`); the sources were already ported to the v2 API, only the `[[example]]` wiring was stale.
- **Test util consolidation**: extracted `assert_vm_states_equivalent` (pc + Merkle-root memory equality) into `openvm_circuit::arch::testing` and replaced all six hand-rolled copies (jalr/mul/mulh tests, `check_aot_equivalence`, riscv test vectors, sha2/keccak256 guest-lib tests).
- **Docs**: removed the stale `guest-libs/ruint` entry from `layout.md`, updated README links from deprecated `book.openvm.dev` to `docs.openvm.dev/book`, and fixed a broken rustdoc intra-doc link in `docs/crates/vm.md`.

Resolves INT-8171
Resolves INT-8181
Resolves INT-8190
Resolves INT-8192
Resolves INT-8195
Resolves INT-8196
Resolves INT-8236
Resolves INT-8238
Resolves INT-8239

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

## Summary

- Allocate the system Poseidon2 GPU record scratch buffer per segment from exact memory tracegen counts instead of `max_trace_height`.
- Drop the scratch buffer before Poseidon2 trace allocation and guard release builds against OOB writes.
- Save GPU memory by decoupling this scratch allocation from `max_segment_length` for memory-bound segments.

## Memory Impact

With `max_segment_length = 2^24`, the standalone reth benchmarks on block `23992138`, `g7e.2xlarge`, `prove-stark` show the sampled whole-process GPU peak dropping by 1.40 GB while keeping the same 85 memory-triggered segments:

| Run | Commit | Segments | Peak GPU memory (`nvidia-smi`) |
| --- | --- | ---: | ---: |
| [Before](https://github.com/axiom-crypto/openvm-eth/actions/runs/27349752147) | `d519aa6` | 85 | 17.30 GB |
| [After](https://github.com/axiom-crypto/openvm-eth/actions/runs/27349786356) | `d3a30b6` | 85 | 15.90 GB |
| Savings | | | -1.40 GB |

The same runs show the expected drop in OpenVM tracked GPU allocation peaks from removing the persistent `max_trace_height`-sized Poseidon2 scratch buffer:

| Module | Before | After | Delta |
| --- | ---: | ---: | ---: |
| `generate mem proving ctxs` | 7.09 GB | 5.20 GB | -1.89 GB |
| `set initial memory` | 6.87 GB | 4.87 GB | -2.00 GB |
| `prover.rs_code_matrix` | 10.69 GB | 8.70 GB | -1.99 GB |
| `prover.batch_constraints.before_round0` | 16.51 GB | 14.51 GB | -2.00 GB |

A separate [openvm-eth comparison run 27329017888](https://github.com/axiom-crypto/openvm-eth/actions/runs/27329017888) with a smaller resolved segment length also shows the affected tracked peaks dropping:

| Module | Before | After | Delta |
| --- | ---: | ---: | ---: |
| `generate mem proving ctxs` | 5.59 GB | 5.20 GB | -0.39 GB |
| `set initial memory` | 5.37 GB | 4.87 GB | -0.50 GB |
| `prover.rs_code_matrix` | 9.19 GB | 8.70 GB | -0.49 GB |
| `prover.batch_constraints.before_round0` | 14.70 GB | 14.25 GB | -0.45 GB |

## Testing

- `cargo build --profile fast -p openvm-circuit`
- `cargo +nightly fmt --all -- --check`
- `cargo build --profile fast -p openvm-circuit --features cuda`
- `cargo nextest run --cargo-profile=fast -p openvm-circuit --features cuda test_empty_touched_memory_uses_full_chunk_values test_touched_memory_updates_memory_address_space test_cuda_merkle_tree_cpu_gpu_root_equivalence`

Resolves int-8291

depends on #2875

resolves int-8290

… bigint examples (#2831)

## Problem

The `#[cfg(not(target_os = "zkvm"))]` fallback branches in three extension examples contain only placeholder comments instead of `unimplemented!()`, causing a compile error on non-zkvm targets.

## Changes

- keccak.mdx: `// Regular Keccak-256 implementation` → `unimplemented!("native keccak256 is only available on zkvm target")`
- sha-256.mdx: `// Regular SHA-256 implementation` → `unimplemented!("native sha256 is only available on zkvm target")`
- big-integer.mdx: `// Regular wrapping add implementation` → `unimplemented!("native bigint ops are only available on zkvm target")`

## Test plan

Documentation-only change; no library code paths affected.

Fixes #2830
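
The compile error follows from Rust's typing rules: a function body that contains only a comment evaluates to `()`, which does not match a non-unit return type, whereas `unimplemented!()` has the never type and is accepted for any return type. The sketch below shows only the non-zkvm fallback. The function name and signature are hypothetical and are not copied from the documentation examples; the zkvm branch is deliberately left out rather than guessed.

```rust
/// Hypothetical host-side fallback, compiled only when not targeting the zkVM.
/// A body consisting of just a comment would have type `()` and fail to
/// type-check against `[u8; 32]`; `unimplemented!` diverges, so it compiles.
#[cfg(not(target_os = "zkvm"))]
pub fn keccak256(input: &[u8]) -> [u8; 32] {
    let _ = input; // the input is unused on the host in this sketch
    unimplemented!("native keccak256 is only available on zkvm target")
}
```

Calling the fallback on a host target panics with the message above instead of failing the build, which keeps the example compilable everywhere while making the limitation explicit at run time.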

## Summary

- Derive test `SystemParams` from `segmentation_limits.max_trace_height` instead of the old hard-coded `2^22` cap.
- Keep the max trace-height power-of-two invariant at the `SegmentationLimits` config boundary.
- Verified the SHA2 CUDA guest-lib proving tests pass with the new `2^24` default.

## Testing

- `cargo +nightly fmt --all -- --check`
- `cargo build --profile fast -p openvm-circuit`
- `CUDA_OPT_LEVEL=3 OPENVM_SKIP_DEBUG=1 cargo nextest run --cargo-profile=fast --features=cuda --run-ignored=all --no-tests=pass --test-threads=1` in `guest-libs/sha2` on `ayush-gpu`
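
Enforcing an invariant "at the config boundary" usually means validating once in the constructor so that downstream code can rely on it. The sketch below shows that pattern with a hypothetical `Limits` type; it is not the `SegmentationLimits` definition, and the method names are assumptions made for illustration.

```rust
/// Hypothetical stand-in for a limits config that carries a max trace height.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Limits {
    max_trace_height: u32,
}

impl Limits {
    /// Validate the power-of-two invariant once, when the config is built.
    fn new(max_trace_height: u32) -> Result<Self, String> {
        if !max_trace_height.is_power_of_two() {
            return Err(format!(
                "max_trace_height must be a power of two, got {max_trace_height}"
            ));
        }
        Ok(Self { max_trace_height })
    }

    /// Derived parameters can then take the log without re-checking.
    fn log_max_trace_height(&self) -> u32 {
        self.max_trace_height.trailing_zeros()
    }
}
```

Deriving test parameters from the validated value, for example `Limits::new(1 << 24)`, avoids a second hard-coded constant that can drift out of sync with the default.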

Resolves: INT-8261 --------- Co-authored-by: Allan Lin <allanl@intrinsictech.xyz>

github-actions · 2026-06-12T21:39:35Z

| group | app.proof_time_ms | app.cycles | leaf.proof_time_ms |
| --- | ---: | ---: | ---: |
| fibonacci | 3,980 | 12,000,265 | (-3331 [-74.3%]) 1,155 |
| keccak | 22,122 | 18,655,329 | 4,685 |
| sha2_bench | 9,579 | 14,793,960 | 1,851 |
| regex | 1,503 | 4,137,067 | (-11571 [-96.4%]) 426 |
| ecrecover | 603 | 123,583 | (-5580 [-95.3%]) 276 |
| pairing | 948 | 1,745,757 | (-6074 [-95.2%]) 306 |
| kitchen_sink | 4,147 | 2,579,903 | 887 |
| fibonacci_e2e | 1,709 | 12,000,265 | 494 |
| regex_e2e | 720 | 4,137,067 | 198 |
| ecrecover_e2e | 367 | 123,583 | 142 |
| pairing_e2e | 503 | 1,745,757 | 147 |
| kitchen_sink_e2e | 2,167 | 2,579,903 | 387 |

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: a35d130

Benchmark Workflow

## Summary

- Remove configurable segmentation trace-height and interaction-limit knobs.
- Derive segmentation trace-height limits from the engine stacked height and interaction limits from the proving field order.
- Keep only segmentation max memory configurable through `SystemConfig` / CLI.
- Remove stale max segment height plumbing from benchmarks, workflows, and tests.
- Add sccache startup-timeout configuration to reduce CI server startup races.

## Testing

- `cargo +nightly fmt --all -- --check`
- `cargo clippy --profile fast -p openvm-circuit --all-targets --tests -- -D warnings`
- `cargo check --profile fast -p cargo-openvm`
- `cargo check --profile fast --no-default-features -p openvm-benchmarks-prove --bin keccak_par --features metrics,parallel,jemalloc`
- `cargo clippy --profile fast --no-default-features -p openvm-benchmarks-prove --bin keccak_par --features metrics,parallel,jemalloc -- -D warnings`
- `python3 -m json.tool ci/benchmark-config.json >/dev/null && python3 -m json.tool ci/benchmark-config.example.json >/dev/null && bash -n ci/scripts/utils.sh && python3 -m py_compile ci/scripts/bench.py`
- `actionlint -ignore 'label ".*" is unknown' -ignore '"github.head_ref" is potentially untrusted' -ignore 'object, array, and null values should not be evaluated' .github/workflows/*.yml`
- `ruby -e 'require "yaml"; YAML.load_file(".github/actions/sccache/action.yml"); puts "ok"'`

zlangley and others added 30 commits May 18, 2026 11:42

fix(recursion): fully constrain exp_bits_len in BabyBear (#2530)

f3362f5

Use fixed 32-row blocks with a full 31-bit decomposition and terminal row. Enforce canonical BabyBear representatives for bit_src.

feat: incorporate child vk pre-hash into inner verifier pvs (#2527)

8de0854

fix: MerkleVerifyAir constrains the Merkle path taken to merkle_idx (#…

6aa2f2d

…2532)

chore: rename v2 crates and folders (#2534)

fbf686b

perf(refactor): use row-major cpu backend with SIMD (#2516)

c320f9b

closes INT-6535 --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

fix: tidx_final_poly_start constraint (#2535)

f168f06

Thread tidx_final_poly_start through internal bus so that it is constant across the evaluation tree.

fix: constrain WHIR merkle index across query rows (#2536)

5a66fd2

fix: chunk len smaller than chunk (#2537)

5cb1e12

Not exploitable without breaking Poseidon2, but still nice to enforce.

fix: constrain omega in whir query air (#2538)

54c26af

fix: accumulate eq_partial over proof, not just round (#2541)

07af592

fix: WhirRoundAir missing constraints (#2540)

b6f13af

fix: constrain proof_idx to start from 0 (#2543)

dc20cfa

resolves int-6773

fix: various proof shape + stacking + transcript fixes (#2542)

ed42400

docs: update stale NestedForLoopSubAir docs (#2539)

fb6c505

fix: zero boundary condition in ConstraintsFoldingAir (#2546)

3dbae02

This resolves INT-6777

fix: make Eq3bAir receive n_logup and n_lift (#2547)

115a956

This resolves INT-6774. `Eq3bAir` now also maintains `n_logup`, constrains both `n_logup` and `n_lift` to be constant across the AIR, and receives them once per AIR.

chore: Remove the now redundant DagCommitBus (#2551)

27ad873

docs: migrate recursion docs to OpenVM (#2553)

c765f75

fix: constrain is_last in ProofShapeAir (#2557)

25a9a80

feat: verify-stark guest library and SDK integration (#2555)

275397b

fix: Make it impossible for a row to be first and second (#2556)

fc79f34

This resolves INT-6814

fix: assert there is one row when flag in {0, 2} (#2561)

e76003f

fix: deduplicate MerkleTreeSubAir and UserPvsCommitSubAir (#2559)

fddc85a

fix: constrain row_idx_flags to row_idx in UserPvsCommitAir (#2562)

400c0b1