Skip to content

feat: reuse compiled artifacts in rvr#2833

Draft
mansur20478 wants to merge 60 commits into
develop-v2.1.0-rv64from
feat/rvr-reuse-compiled-artifacts
Draft

feat: reuse compiled artifacts in rvr#2833
mansur20478 wants to merge 60 commits into
develop-v2.1.0-rv64from
feat/rvr-reuse-compiled-artifacts

Conversation

@mansur20478

@mansur20478 mansur20478 commented Jun 1, 2026

Copy link
Copy Markdown

Part 2 of the original ticket split into two. Part 1: link

Part 2 consist of reusing persisted artifacts whenever possible instead of recompiling. The idea is to compile in rvr-mode till getting .c files, and instead of invoking make, check whether already existing .so are fit for reuse. Artifact fingerprints must match for .so file to be reused. During compilation

Artifact fingerprint is a SHA-256 fingerprint of:

  1. native_debug_info flag
  2. toolchain.compiler, toolchain.linker, toolchain.make, toolchain.host_os
  3. host_cpu_features: affects -march=native (unnecessary?)
  4. make_args: make variables except EXT_LIBS and EXT_SRCS
  5. name + content of every file under generated C project_dir in sorted order. The folder includes external static libs and external source files too.

New methods:
compile_pure_cached, compile_metered_cached, compile_metered_cost_cached that take cache_dir as an additional argument - a place on disk for storing artifacts. Invokable from sdk.

The following loading functions do not perform validation:
load_compiled_pure, load_compiled_metered, load_compiled_metered_cost methods don't have .so validation.

Modified build.rs of extensions to rebuild rvr FFI staticlibs on Cargo.lock and ffi-common source changes.

towards INT-7843

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 34s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact cache implementation
  • Review compile.rs changes
  • Review SDK integration
  • Review tests
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. The fingerprint is a truncated SHA-256 of the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit, the .so is loaded directly from the cache directory; on miss, it's compiled and stored. The design is clean and the approach is sound.

Findings

1. Potential race condition on concurrent cache writes (Medium)

compile_impl at crates/vm/src/arch/rvr/compile.rs:470-474 does save_artifact which calls fs::copy directly to the cache path. If two processes compile the same artifact concurrently, they can race on writing the same file — one may read a half-written .so. The standard mitigation is write-to-temp-then-rename (atomic on the same filesystem):

// write to a temp file in cache_dir, then rename
let tmp = cache_dir.join(format!(".rvr-{fingerprint}.tmp.{}", std::process::id()));
fs::copy(&self.lib_path, &tmp)?;
fs::rename(&tmp, &dest_path)?;

This is especially relevant since the PR description mentions SDK-level usage, where parallel compilation processes are likely.

Fix this →

2. host_cpu_features() on Linux only reads the first CPU's flags (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:70-77 — On multi-socket systems, /proc/cpuinfo contains multiple flags: lines (one per logical CPU). The code uses .find() which returns only the first match. This is fine for the common case (all sockets identical), but on heterogeneous systems it could lead to a wrong fingerprint. A comment noting this assumption would be helpful, or alternatively the code could read all flags: lines.

3. Fingerprint truncation to 128 bits (Nit)

crates/vm/src/arch/rvr/artifact_cache.rs:50 — The SHA-256 is truncated to 128 bits (&result[..16]). For a content-addressed cache this is more than sufficient — collision probability is negligible. Just noting this is intentional (the doc comment says "32-character lowercase hex string (128 bits of SHA-256)").

4. hash_dir_into doesn't hash relative paths, only filenames (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:96-114 — The function hashes the filename (file_name()) of each entry, not the path relative to the root project_dir. This means if a directory name changes but all file names inside stay the same, the fingerprint won't change. However, since this is called recursively and directory names are hashed (line 103-104), this is actually correct — directory names serve as path separators in the hash stream. Good.

5. Redundant project generation on cache hit (Medium — Performance)

crates/vm/src/arch/rvr/compile.rs:415setup_project (which includes IR conversion, block building, C project generation, and writing extension staticlibs) runs before the cache-hit check. This means even on a cache hit, all the C code generation work is performed. The fingerprint needs the project files to hash, so this is somewhat unavoidable with the current design. However, if performance is a concern, a two-phase approach could be considered: first compute a cheaper "pre-fingerprint" from just the VmExe + extensions + toolchain (skipping C codegen), and only do full codegen on miss. This is a future optimization, not a blocker.

6. Missing compile_cached variant for compile_metered_segment_boundary (Low)

There's no compile_metered_segment_boundary_cached function, and metered_segment_rvr_instance in vm.rs doesn't have a cached counterpart. This may be intentional if segment-boundary mode isn't used in the target workflow, but it's worth noting for completeness.

7. Cache miss check also uses exists() (Low)

crates/vm/src/arch/rvr/compile.rs:437cache_path.exists() is a TOCTOU check (file could be deleted between the check and the load). In practice this is unlikely for a build cache, but using load_compiled_from_path directly and handling the error would be more robust. Low priority.

8. load_compiled_from_path fingerprint recovery is fragile (Low)

crates/vm/src/arch/rvr/compile.rs:378-383 — The fingerprint is recovered from the filename by stripping "rvr-" prefix. If someone manually names a file rvr-foobar.so, it would get a bogus fingerprint "foobar". This is fine for the cache (which always writes proper names), but the function is public and could be called with arbitrary paths. The warning log on line 368 already signals this — acceptable.

9. Test coverage is thorough

The 7 new tests in crates/sdk/src/tests.rs cover all three modes (pure, metered, metered_cost) for both save/load roundtrips and cache hit/miss roundtrips, plus an interleaving test. The test_sdk_cache_hit_after_interleaved_program test verifying mtime is unchanged on cache hit is a nice touch.

10. API surface expansion — many near-identical functions (Design)

The PR adds compile_cached, compile_metered_cached, compile_metered_cost_cached which are identical to their non-cached counterparts except for cache_dir: Some(cache_dir). Similarly in vm.rs, each mode gets a _cached variant. This is the standard pattern in the existing codebase (e.g., compile, compile_metered, compile_metered_cost already existed), so it's consistent. However, since CompileOptions already has cache_dir: Option<&Path>, callers who use compile_with_options can just set it there — the convenience functions are optional sugar. The growing combinatorial surface could become a maintenance burden. Not blocking, but worth considering whether the convenience wrappers are all needed vs. directing callers to compile_with_options.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. The main actionable item is the race condition on concurrent cache writes (finding #1). The rest are low-priority observations. The test coverage is good. The code follows existing patterns in the codebase.

@github-actions

This comment has been minimized.

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 42s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact cache implementation (artifact_cache.rs)
  • Review compile.rs changes
  • Review SDK integration (lib.rs, vm.rs)
  • Review tests
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit, the .so is loaded directly; on miss, it's compiled and stored. The race condition from the previous review has been fixed with write-to-temp-then-rename.

Findings

1. setup_project runs on every call, even cache hits (Medium — Performance)

crates/vm/src/arch/rvr/compile.rs:425-452setup_project (IR conversion, block building, C code generation, writing extension staticlibs) runs before the fingerprint is computed and the cache is checked. This means even on a cache hit, all the expensive C codegen work is performed and then thrown away.

This is somewhat inherent to the current design since the fingerprint needs the generated files to hash. However, it's worth calling out as it limits the wall-clock benefit of caching to just the make invocation. If the codegen step is a significant fraction of total time, a two-phase approach could help: compute a cheaper pre-fingerprint from the VmExe + config (skip codegen), and only do full codegen on miss. Not blocking, but worth tracking as a follow-up optimization.

2. Fingerprint domain separation between hash inputs is minimal (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:33-47 — The fields are separated by \0 bytes, but there's no length-prefixing or structured delimiter between the different categories of input. For example, if toolchain.compiler ends with \0 followed by bytes that look like the start of toolchain.linker, the hash could be confused. In practice this won't happen with real paths, but a more robust approach would use length-prefixed fields or a structured format. Very low risk for real-world usage.

3. hash_dir_into silently skips unreadable entries via flatten() (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:97fs::read_dir(dir)?.flatten().collect() silently drops entries that fail to read (e.g., permission errors). This means two project directories that differ only in an unreadable file would produce the same fingerprint. Given the controlled temp directory context this is very unlikely, but using .collect::<Result<Vec<_>, _>>()? instead of .flatten().collect() would be more correct:

let mut entries: Vec<_> = fs::read_dir(dir)?
    .collect::<Result<Vec<_>, _>>()?;

4. load_compiled_from_path warning fires on every cache hit (Low)

crates/vm/src/arch/rvr/compile.rs:378-382load_compiled_from_path logs a tracing::warn! about "loading rvr artifact without compatibility validation." On a cache hit (compile.rs:449), the code calls load_compiled_from_path(&cache_path), which triggers this warning even though the artifact was just fingerprint-validated. Consider either:

  • Adding a separate load_from_cache path that skips the warning
  • Or downgrading to debug! with a note that cache-loaded artifacts are fingerprint-validated

Fix this →

5. host_cpu_features() on Linux reads only the first CPU's flags (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:70-77 — On multi-socket systems, /proc/cpuinfo has one flags: line per logical CPU. The .find() call only returns the first match. For homogeneous systems (the common case) this is fine. On heterogeneous systems or certain virtualized environments, different CPUs can expose different flag sets. Low risk in practice.

6. Temp file name doesn't guard against PID reuse (Nit)

crates/vm/src/arch/rvr/compile.rs:95 — The temp file uses {file_name}.tmp.{pid}. If a process crashes and its PID is reused, the stale temp file could collide. Adding a timestamp or random suffix would be more robust, though PID reuse within the same cache directory is extremely unlikely.

7. API surface growth (Design)

The PR adds 3 new compile_*_cached functions in compile.rs, 3 *_cached methods on VmExecutor, 3 get_*_cached methods on VirtualMachine, and 3 SDK-level methods — 12 new functions that are near-identical to their non-cached counterparts except for cache_dir. Since CompileOptions already has cache_dir: Option<&Path>, callers using compile_with_options get caching for free. The convenience wrappers are consistent with the existing pattern but the combinatorial surface will grow with each new mode. Not blocking, but consider whether directing callers to compile_with_options could replace some of these.

8. Race condition fix is solid

crates/vm/src/arch/rvr/compile.rs:93-107 — The write-to-temp-then-rename pattern is correctly implemented. The cleanup on rename failure (fs::remove_file(&tmp_path)) is a nice touch. fs::rename is atomic on the same filesystem (which is guaranteed since both files are in cache_dir).

9. Test coverage is thorough

The 7 new tests in crates/sdk/src/tests.rs cover:

  • Save/load roundtrips for all 3 modes (pure, metered, metered_cost)
  • Cold cache → warm cache roundtrips for all 3 modes
  • Interleaved programs with mtime-based cache hit verification

The mtime assertion in test_sdk_cache_hit_after_interleaved_program is a nice touch for verifying no unnecessary work on cache hit.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. The race condition fix from the prior review is properly implemented. The main actionable item is finding #4 (spurious warning on cache hit). The rest are low-priority observations. Good test coverage across all modes.

@github-actions

This comment has been minimized.

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 38s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (new file)
  • Review compile.rs changes
  • Review vm.rs changes
  • Review SDK integration (lib.rs, compiled.rs)
  • Review tests
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. The fingerprint is a truncated SHA-256 of the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly; on miss it's compiled and stored. The race condition from the first review round has been properly fixed with write-to-temp-then-rename. Previous load_*_instance methods were removed as redundant.

Overall the implementation is clean and well-tested. Here are my findings, in priority order:


Findings

1. hash_dir_into silently drops unreadable directory entries (Medium)

crates/vm/src/arch/rvr/artifact_cache.rs:97.flatten() silently swallows io::Errors from read_dir entries. If a file can't be stat'd (e.g. permission error, broken symlink), it's simply skipped, meaning two directories that differ only by an unreadable entry produce the same fingerprint. Given this is a controlled temp directory the practical risk is low, but correctness dictates propagating errors:

// Before:
let mut entries: Vec<_> = fs::read_dir(dir)?.flatten().collect();

// After:
let mut entries: Vec<_> = fs::read_dir(dir)?
    .collect::<Result<Vec<_>, _>>()?;

Fix this →

2. Full setup_project runs on every call, even cache hits (Medium — Performance)

crates/vm/src/arch/rvr/compile.rs:415-442 — The entire C codegen pipeline (convert_vmexe_to_ir_with_debug, build_blocks, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked. This means even on a cache hit, all the expensive IR conversion and C generation work is performed and discarded.

This is inherent to the current design since the fingerprint needs the generated files. Worth tracking as a follow-up optimization: a two-phase fingerprint (cheap pre-fingerprint from VmExe bytes + config to skip codegen on hit, full fingerprint on miss for validation) could eliminate this overhead.

3. compile_generated_project log files (make.stdout.log, make.stderr.log) are included in the fingerprint (Low — Correctness on recompile)

crates/vm/src/arch/rvr/compile.rs:615-623compile_generated_project writes make.stdout.log and make.stderr.log into output_dir (the same directory passed to hash_dir_into). However, the fingerprint is computed before compile_generated_project is called (line 418-427), so on the initial compile pass these files don't exist yet and don't affect the fingerprint. This is correct for the current flow. But if compile_impl were ever restructured to recompute the fingerprint after compilation (e.g. for validation), the log files would pollute the hash. Consider writing logs to a subdirectory or outside the hashed project directory for robustness.

4. No domain separation between the initial flag byte and toolchain strings (Nit)

crates/vm/src/arch/rvr/artifact_cache.rs:33-41 — The native_debug_info flag is hashed as a single byte (0x00 or 0x01) with no trailing separator before toolchain.compiler. This means (debug=false, compiler="...") hashes \x00... while (debug=true, compiler="...") hashes \x01.... Since the flag byte is fixed-length (1 byte), there's technically no ambiguity — but adding a \0 after it (like the other fields) would make the format more uniform and defensive:

h.update([native_debug_info as u8]);
h.update(b"\0"); // separator for consistency

5. Temp file name doesn't fully guard against concurrent PID reuse (Nit)

crates/vm/src/arch/rvr/compile.rs:95 — The temp file is named {file_name}.tmp.{pid}. If a process crashes and its PID is reused before the stale temp is cleaned up, there's a collision. Extremely unlikely in practice, but appending a random suffix or timestamp would close the hole completely. Low priority.

6. load_compiled_from_path is a public building block but only used internally (Design)

crates/vm/src/arch/rvr/compile.rs:372load_compiled_from_path is fn (crate-private), which is appropriate. Previously there was concern about a warn! log firing on cache hits (from the earlier review), but that warn! has been removed — the current version just loads and recovers the fingerprint from the filename. This is clean.

7. API surface growth (Design)

The PR adds 3 compile_*_cached functions in compile.rs, 3 *_cached methods on VmExecutor, 3 get_*_cached methods on VirtualMachine, and 3 SDK-level methods — 12 new functions that are near-identical to their non-cached counterparts except for cache_dir. Since CompileOptions already has cache_dir: Option<&Path>, callers using compile_with_options get caching for free. The convenience wrappers are consistent with the existing pattern (compile, compile_metered, compile_metered_cost already existed), so this follows the established convention. Not blocking, but the growing combinatorial surface may warrant directing advanced callers to compile_with_options in the future.

8. Test coverage is thorough

The 4 new tests in crates/sdk/src/tests.rs (lines 189-350) cover:

  • Pure, metered, and metered-cost cold→warm cache roundtrips with result verification
  • Interleaved programs with mtime-based cache hit verification (no unnecessary rewrite)

The mtime assertion in test_sdk_cache_hit_after_interleaved_program (line 343) is a nice touch for verifying no unnecessary work on cache hit.

9. Race condition fix is solid

crates/vm/src/arch/rvr/compile.rs:93-107 — The write-to-temp-then-rename pattern is correctly implemented. The cleanup on rename failure (fs::remove_file(&tmp_path)) handles the error path well. fs::rename is atomic on the same filesystem (guaranteed since both files are in cache_dir).


Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. The main actionable items are finding #1 (error propagation in hash_dir_into) and finding #2 (tracking the performance optimization opportunity for cache hits). The rest are low-priority observations. Good test coverage across all modes.

@github-actions

This comment has been minimized.

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 56s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (new fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs)
  • Review tests
  • Post review findings

Summary

This PR replaces the unvalidated load_*_instance methods with content-addressed caching. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly; on miss it's compiled and stored. The race condition fix from earlier review rounds (write-to-temp-then-rename) is correctly in place. The old load_compiled_from_path is now crate-private and the warning log is gone.

The implementation is clean and well-tested. I have one medium-priority finding and several low-priority observations.

Findings

1. Duplicate code in _cached variants at VmExecutor level (Medium — Maintainability)

crates/vm/src/arch/vm.rs:459-493 (metered_rvr_instance_cached) and crates/vm/src/arch/vm.rs:555-582 (metered_cost_rvr_instance_cached) are near-exact copies of their non-cached counterparts, differing only in compile_metered vs compile_metered_cached. The same pattern exists at the compile.rs level (6 functions, 3 cached + 3 non-cached, differing only by cache_dir: None vs cache_dir: Some(cache_dir)).

Since CompileOptions already has cache_dir: Option<&Path> and compile_with_options passes it through correctly, the convenience functions could be collapsed. For example, each non-cached variant could call through to a single internal helper that takes Option<&Path>:

pub fn compile<F: PrimeField32>(
    exe: &VmExe<F>,
    extensions: &ExtensionRegistry<F>,
) -> Result<RvrCompiled, CompileError> {
    compile_inner(exe, extensions, None)
}

pub fn compile_cached<F: PrimeField32>(
    exe: &VmExe<F>,
    extensions: &ExtensionRegistry<F>,
    cache_dir: &Path,
) -> Result<RvrCompiled, CompileError> {
    compile_inner(exe, extensions, Some(cache_dir))
}

This isn't blocking — the current approach follows the established pattern and each function is straightforward. But with 12 new functions across 3 layers (compile.rsVmExecutorVirtualMachine → SDK), the maintenance surface is significant. Worth considering whether advanced callers can just use compile_with_options directly.

2. setup_project runs on every call, even cache hits (Low — Performance, tracking item)

crates/vm/src/arch/rvr/compile.rs:415 — The full C codegen pipeline (IR conversion → block building → CProject::write_allwrite_extension_staticlibs) runs before the fingerprint is computed and the cache is checked. On a cache hit all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead. The comment on line 417-418 shows this was a deliberate design choice.

3. TracerMode not explicitly in fingerprint — relies on generated file content (Low)

The TracerMode isn't directly hashed in compute_fingerprint (artifact_cache.rs:26-51). Instead it's captured indirectly because different modes produce different C code (via CProject::new(…, opts.tracer_mode)), which changes the file hashes. This works correctly — the interleaved test (test_sdk_cache_hit_after_interleaved_program) proves pure and metered modes get distinct cache entries. However, explicitly hashing the mode would be more defensive against future changes where two modes could produce identical C but different runtime behavior.

4. Platform extension string repeated in 3 places (Nit)

The if cfg!(target_os = "macos") { "dylib" } else { "so" } pattern appears at compile.rs:74, compile.rs:432, and compile.rs:525 (the last for a different purpose). Consider extracting a const or small helper to avoid drift:

fn shared_lib_ext() -> &'static str {
    if cfg!(target_os = "macos") { "dylib" } else { "so" }
}

5. Extra blank line at vm.rs:989-990 (Nit)

There's a double blank line between metered_cost_rvr_instance_cached and preflight_interpreter. Nightly rustfmt may or may not catch this depending on config.

6. host_cpu_features() on Linux reads only the first CPU's flags (Low)

artifact_cache.rs:70-77.find() returns the first flags: line from /proc/cpuinfo. On homogeneous systems this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice — worth a brief comment noting the assumption.

7. Test coverage is thorough

The 4 tests in crates/sdk/src/tests.rs cover:

  • Cold → warm cache roundtrips for pure, metered, and metered-cost modes with result verification
  • Interleaved programs with mtime-based cache hit verification (no unnecessary rewrite)

The mtime assertion in test_sdk_cache_hit_after_interleaved_program (line 343) is a particularly nice touch — it proves the cache hit path doesn't copy a new file.

8. Race condition fix is solid

compile.rs:93-107 — The write-to-temp-then-rename pattern is correctly implemented. Cleanup on rename failure is handled. fs::rename is atomic on the same filesystem, which is guaranteed since both files are in cache_dir.

Overall Assessment

The implementation is well-structured. The fingerprinting approach is sound and the previous review findings (race condition, hash_dir_into error propagation, warn! on cache hit) have all been addressed. The main observation is the growing API surface (finding #1), which isn't blocking but worth tracking. Good test coverage across all modes.

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 2m 4s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (new fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs)
  • Review tests
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly; on miss it's compiled and stored. Previous reviews' findings (race condition, hash_dir_into error propagation, warn! on cache hit, load_*_instance removal) have all been addressed. The code is in good shape.

Findings

1. Non-cached functions now delegate through Option<&Path> — clean but compile_cached takes Option, not &Path (Low — API consistency)

crates/vm/src/arch/rvr/compile.rs:261-265compile_cached accepts cache_dir: Option<&Path>, making it strictly a superset of compile(). This is fine internally, but at the VmExecutor layer (vm.rs:307-316) and VirtualMachine layer, the _cached methods take cache_dir: &Path (non-optional). The inconsistency between layers is minor but could confuse callers about whether None is a valid input at the compile level. Consider either making compile_cached take &Path (since compile already covers the None case) or documenting that the Option is an internal convenience.

2. setup_project runs on every call, even cache hits (Medium — Performance, tracking item)

crates/vm/src/arch/rvr/compile.rs:371 — The full C codegen pipeline (IR conversion, block building, CProject::write_all, extension staticlibs) runs before the fingerprint is computed and the cache is checked (line 375-397). On a cache hit, all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up optimization: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead. The benefit of caching is currently limited to skipping the make compilation only.

3. No domain separator after native_debug_info flag byte (Nit)

crates/vm/src/arch/rvr/artifact_cache.rs:33 — The native_debug_info flag is hashed as a single byte (0x00 or 0x01) without a trailing \0 separator before toolchain.compiler. Since the flag is fixed-length (1 byte) there's technically no ambiguity, but adding a \0 after it would make the format consistent with all other fields:

h.update([native_debug_info as u8]);
h.update(b"\0"); // separator for consistency

4. host_cpu_features() reads only the first CPU's flags on Linux (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:70-77.find() returns the first flags: line from /proc/cpuinfo. On homogeneous systems this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption would be sufficient.

5. Temp file name uses PID only (Nit)

crates/vm/src/arch/rvr/compile.rs:95{file_name}.tmp.{pid} as a temp file name is fine for the common case, but if a process crashes and its PID is reused before cleanup, there's a collision. Extremely unlikely in practice. Adding a random suffix (e.g., via rand or a timestamp) would close this gap entirely, but not blocking.

6. TracerMode not explicitly in fingerprint (Low)

The TracerMode isn't directly hashed in compute_fingerprint (artifact_cache.rs:26-51). It's captured indirectly because different modes produce different C code. The interleaved test (test_sdk_cache_hit_after_interleaved_program) proves this works. However, explicitly hashing the mode as a byte would be more defensive against future changes where two modes could produce identical C but different runtime behavior.

7. Double blank line at vm.rs:989-990 (Nit)

There's a double blank line between metered_cost_rvr_instance_cached and preflight_interpreter.

8. API surface growth (Design)

The PR adds ~12 new functions across 4 layers (compile.rsVmExecutorVirtualMachine → SDK) that are near-identical to their non-cached counterparts, differing only in cache_dir. Since CompileOptions already has cache_dir: Option<&Path> and compile_with_options passes it through, callers who use compile_with_options get caching for free. The convenience wrappers are consistent with the existing pattern (compile, compile_metered, compile_metered_cost already existed pre-PR). Not blocking, but the combinatorial surface may warrant directing advanced callers to compile_with_options in the future.

What looks good

  • Race condition fix (compile.rs:93-107): Write-to-temp-then-rename is correctly implemented. Cleanup on rename failure is handled. Atomic on the same filesystem.
  • Error propagation in hash_dir_into (artifact_cache.rs:97): Uses .collect::<Result<Vec<_>, _>>()? instead of .flatten() — properly propagates IO errors.
  • Non-cached functions delegate to cached with None (compile.rs:212-257): Clean refactor that avoids code duplication in compile_impl.
  • Test coverage (tests.rs:189-350): Four tests covering cold/warm cache roundtrips for all 3 modes plus interleaved program cache isolation with mtime verification.
  • Fingerprint design: Covers all relevant inputs (files, toolchain, CPU features, make args, debug flag). Truncation to 128 bits is more than sufficient for content-addressing.
  • load_compiled_from_path (compile.rs:328-345): Now crate-private, warning removed, fingerprint recovered from filename — clean.

Overall Assessment

The implementation is well-structured and all previous review findings have been addressed. The fingerprinting approach is sound and the test coverage is thorough. The main tracking item is finding #2 (performance: codegen runs even on cache hits). The remaining findings are low-priority nits. Ship it.

@github-actions

This comment has been minimized.

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 49s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (new fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs)
  • Review tests
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded from the cache directory; on miss it's compiled and stored. Previous review findings (race condition, hash_dir_into error propagation) have been addressed. The implementation is clean and well-tested.

Findings

1. load_compiled_from_path warning fires on every cache hit (Medium)

crates/vm/src/arch/rvr/compile.rs:334-338load_compiled_from_path logs a tracing::warn! about "loading rvr artifact without compatibility validation." On a cache hit (compile.rs:406), the code calls load_compiled_from_path(&cache_path), which triggers this warning even though the artifact was just fingerprint-validated. This will be noisy in production logs where cache hits are the common case.

Options:

  • Add a separate internal load_from_cache_path that skips the warning
  • Or add a validated: bool parameter and downgrade to debug! when true
  • Or simply downgrade the log level from warn! to debug!, since the doc comment already states the caller's responsibility

Fix this →

2. setup_project runs on every call, even cache hits (Medium — Performance)

crates/vm/src/arch/rvr/compile.rs:381 — The full C codegen pipeline (convert_vmexe_to_ir_with_debug, build_blocks, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked (line 385-408). On a cache hit all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead. The benefit of caching is currently limited to skipping the make compilation step only.

3. compile_cached accepts Option<&Path> while VmExecutor _cached methods accept Option<&Path> too — but compile delegates to compile_cached with None (Low — API clarity)

crates/vm/src/arch/rvr/compile.rs:261-265compile_cached takes cache_dir: Option<&Path>, making it a strict superset of compile(). This is clean internally (no code duplication in compile_impl), but the public API has both compile (no cache) and compile_cached (with optional cache) where the latter with None is identical to the former. This is fine as-is for discoverability, just noting the relationship.

4. TracerMode not explicitly in fingerprint (Low)

The TracerMode isn't directly hashed in compute_fingerprint (artifact_cache.rs:26-51). It's captured indirectly because different modes produce different C code (via CProject::new(…, opts.tracer_mode)), which changes the file hashes. The interleaved test (test_sdk_cache_hit_after_interleaved_program) proves this works. However, explicitly hashing the mode as a discriminant byte would be more defensive against future changes where two modes could produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

5. host_cpu_features() on Linux reads only the first CPU's flags (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:70-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems (the common case) this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption would help.

6. Temp file name uses PID only (Nit)

crates/vm/src/arch/rvr/compile.rs:95{file_name}.tmp.{pid} as a temp file name works for the common case. If a process crashes and its PID is reused before cleanup, there's a collision. Extremely unlikely in practice. Adding a random suffix (e.g., a timestamp or counter) would close this gap entirely but isn't blocking.

7. API surface growth (Design observation)

The PR adds ~12 new functions across 4 layers (compile.rsVmExecutorVirtualMachine → SDK) that are near-identical to their non-cached counterparts, differing only in cache_dir. Since CompileOptions already has cache_dir: Option<&Path> and compile_with_options passes it through, callers who use compile_with_options get caching for free. The convenience wrappers are consistent with the existing pattern (compile, compile_metered, compile_metered_cost already existed pre-PR). Not blocking, but worth considering whether directing advanced callers to compile_with_options could replace some of these.

8. No compile_metered_segment_boundary_cached variant (Nit)

crates/vm/src/arch/rvr/compile.rs:229-248compile_metered_segment_boundary and metered_segment_rvr_instance (vm.rs:482-506) don't have _cached counterparts. This may be intentional if segment-boundary mode isn't used in the caching workflow, just noting for completeness.

What looks good

  • Race condition fix (compile.rs:93-107): Write-to-temp-then-rename is correctly implemented. Cleanup on rename failure is handled. fs::rename is atomic on the same filesystem.
  • Error propagation in hash_dir_into (artifact_cache.rs:98): Uses .collect::<Result<Vec<_>, _>>()? — properly propagates IO errors. Previous review finding addressed.
  • Non-cached functions delegate to compile_impl with cache_dir: None (compile.rs:212-257): Clean refactor avoiding code duplication.
  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs (files, toolchain, CPU features, make args, debug flag). Separator bytes between fields. Truncation to 128 bits is more than sufficient for content-addressing.
  • Test coverage (tests.rs:189-440): Seven new tests covering save/load roundtrips for all 3 modes, cold/warm cache roundtrips for all 3 modes, and interleaved program cache isolation with mtime verification. The mtime assertion in test_sdk_cache_hit_after_interleaved_program (line 433) is a nice touch.
  • load_compiled_from_path is crate-private (fn not pub) — wait, it is actually pub. It's re-exported in mod.rs:16. This is fine since the doc comment clearly documents the caller's responsibility.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. The main actionable item is finding #1 (spurious warn! on cache hit). Finding #2 is a performance optimization worth tracking as a follow-up. The rest are low-priority observations. Good test coverage across all modes.

@github-actions

This comment has been minimized.

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 40s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (new fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs)
  • Review tests
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries (.so/.dylib). A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly; on miss it's compiled and stored. The design is clean, the race condition fix (write-to-temp-then-rename) is solid, and hash_dir_into properly propagates IO errors. Previous review findings have been addressed well.

The code is in good shape. I have one medium finding and several low-priority observations.


Findings

1. load_compiled_from_path warning fires on every cache hit (Medium)

crates/vm/src/arch/rvr/compile.rs:337-343load_compiled_from_path takes a validated: bool parameter, but the conditional only gates the warn! log. On a cache hit (compile.rs:411), the code calls load_compiled_from_path(&cache_path, true), which correctly suppresses the warning. This is good — the previous review concern about the spurious warn! has been addressed.

However, the validated parameter name is misleading at the call site. It reads as "this artifact has been validated" when really it just means "suppress the warning." Consider renaming to suppress_warning or inverting to warn_unvalidated for clarity. Alternatively, a brief doc comment on the parameter would suffice.

2. Full setup_project runs on every call, even cache hits (Medium — Performance, tracking item)

crates/vm/src/arch/rvr/compile.rs:384-399 — The entire C codegen pipeline (convert_vmexe_to_ir_with_debug, build_blocks, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked. On a cache hit, all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead significantly. The current caching benefit is limited to skipping the make step.

3. TracerMode not explicitly in fingerprint — relies on generated file content (Low)

The TracerMode isn't directly hashed in compute_fingerprint (artifact_cache.rs:26-51). It's captured indirectly because different modes produce different C code. The test_sdk_cache_hit_after_interleaved_program test validates this works in practice. However, explicitly hashing the mode as a discriminant byte would be more defensive against future changes where two modes could produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

4. host_cpu_features() reads only the first CPU's flags on Linux (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:71-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems (the common case) this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption would help future readers.

5. Temp file name uses PID only (Nit)

crates/vm/src/arch/rvr/compile.rs:95{file_name}.tmp.{pid} works for the common case. If a process crashes and its PID is reused before cleanup, there's a collision. Extremely unlikely in practice. Adding a random suffix or timestamp would close this gap entirely, but not blocking.

6. API surface growth (Design observation)

The PR adds ~12 new functions across 4 layers (compile.rsVmExecutorVirtualMachine → SDK). Each _cached variant is near-identical to its non-cached counterpart, differing only in cache_dir. At the compile.rs level the non-cached functions already delegate to the cached variants with None (e.g., compile() calls compile_cached(…, None)), which is a clean refactor. The convenience wrappers at higher layers follow the established pattern — not blocking, but the combinatorial surface will grow if new modes are added. Worth considering if compile_with_options could replace some higher-layer wrappers in the future.

7. No compile_metered_segment_boundary_cached variant (Nit)

crates/vm/src/arch/rvr/compile.rs:229-248compile_metered_segment_boundary and metered_segment_rvr_instance don't have _cached counterparts. This may be intentional if segment-boundary mode isn't used in the caching workflow, just noting for completeness.


What looks good

  • Race condition fix (compile.rs:93-107): Write-to-temp-then-rename is correctly implemented. Cleanup on rename failure is handled. fs::rename is atomic on the same filesystem.
  • Error propagation in hash_dir_into (artifact_cache.rs:98): Uses .collect::<Result<Vec<_>, _>>()? — properly propagates IO errors from read_dir entries.
  • Non-cached functions delegate through cached with None (compile.rs:212-257): Clean refactor avoiding code duplication in compile_impl.
  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs (files, toolchain, CPU features, make args, debug flag). Separator bytes between fields. Truncation to 128 bits is more than sufficient for content-addressing.
  • validated parameter on load_compiled_from_path (compile.rs:334-343): Properly gates the warning, and cache-hit path passes true.
  • Test coverage (tests.rs:189-440): Seven new tests covering save/load roundtrips for all 3 modes, cold/warm cache roundtrips for all 3 modes, and interleaved program cache isolation with mtime verification. The mtime assertion is a nice touch.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. All actionable findings from previous review rounds have been addressed. The main tracking item is finding #2 (performance: codegen runs even on cache hits). The remaining findings are low-priority. Good test coverage across all modes. Ship it.

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 30s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (new fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs)
  • Review tests
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly from cache; on miss it's compiled and stored. The implementation addresses prior review feedback: race condition (write-to-temp-then-rename), error propagation in hash_dir_into, and warning suppression on cache hits.

The code is in good shape overall. The fingerprinting approach is sound and the test coverage is thorough. Here are my findings:


Findings

1. Full setup_project runs on every call, even cache hits (Medium — Performance)

crates/vm/src/arch/rvr/compile.rs:384-399 — The entire C codegen pipeline (convert_vmexe_to_ir_with_debug, build_blocks, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked (line 402-414). On a cache hit, all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead. Currently the caching benefit is limited to skipping the make compilation step only.

2. TracerMode not explicitly in fingerprint — relies on generated file content divergence (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:26-51TracerMode isn't directly hashed in compute_fingerprint. It's captured indirectly because different modes produce different C code. The test_sdk_cache_hit_after_interleaved_program test validates this works today. However, explicitly hashing the mode discriminant would be more defensive against future changes where two modes could produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

3. host_cpu_features() reads only the first CPU's flags on Linux (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:71-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems (the common case) this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption would help future readers.

4. Temp file name uses PID only — no guard against PID reuse after crash (Nit)

crates/vm/src/arch/rvr/compile.rs:95{file_name}.tmp.{pid} works for the common case. If a process crashes and its PID is reused before cleanup, there's a collision on the temp file. Extremely unlikely in practice. Appending a random suffix or timestamp would close the hole entirely, but not blocking.

5. No compile_metered_segment_boundary_cached variant (Nit — Completeness)

crates/vm/src/arch/rvr/compile.rs:229-248compile_metered_segment_boundary and metered_segment_rvr_instance (vm.rs:482-506) don't have _cached counterparts. This may be intentional if segment-boundary mode isn't used in the caching workflow, but worth noting for completeness.

6. API surface growth — 12+ new near-identical functions (Design observation)

The PR adds ~12 new functions across 4 layers (compile.rsVmExecutorVirtualMachine → SDK) that are near-identical to their non-cached counterparts, differing only in cache_dir. At the compile.rs level the non-cached functions cleanly delegate to cached variants with None (e.g., compile()compile_cached(…, None)), which avoids duplication at that layer. The convenience wrappers at higher layers follow the existing codebase pattern. Not blocking, but worth considering if compile_with_options could replace some higher-layer wrappers in the future as the combinatorial surface grows.


What looks good

  • Race condition fix (compile.rs:93-107): Write-to-temp-then-rename is correctly implemented. Cleanup on rename failure is handled. fs::rename is atomic on the same filesystem, which is guaranteed since both files are in cache_dir.
  • Error propagation in hash_dir_into (artifact_cache.rs:98): Uses .collect::<Result<Vec<_>, _>>()? — properly propagates IO errors from read_dir entries.
  • Non-cached functions delegate through cached with None (compile.rs:212-257): Clean refactor avoiding code duplication in compile_impl.
  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs (files, toolchain, CPU features, make args, debug flag). Separator bytes between fields. Truncation to 128 bits is more than sufficient for content-addressing.
  • suppress_warning parameter on load_compiled_from_path (compile.rs:333-343): Properly gates the warning log, and the cache-hit path passes true to suppress it.
  • Deterministic directory hashing (artifact_cache.rs:97-114): Entries are sorted by path, directories are recursed into with a /\0 separator — correct and deterministic.
  • Test coverage (tests.rs:189-440): Seven new tests covering save/load roundtrips for all 3 modes, cold/warm cache roundtrips for all 3 modes, and interleaved program cache isolation with mtime verification. The mtime assertion in test_sdk_cache_hit_after_interleaved_program is a nice touch for proving the cache-hit path doesn't unnecessarily rewrite.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. All prior review findings have been addressed. The main tracking item is finding #1 (performance: codegen runs even on cache hits). The remaining findings are low-priority observations. Good test coverage across all modes. Ship it.

@mansur20478 mansur20478 self-assigned this Jun 1, 2026
@mansur20478 mansur20478 marked this pull request as ready for review June 1, 2026 21:00
@mansur20478 mansur20478 requested a review from shuklaayush June 1, 2026 21:00
@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 53s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (new fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs)
  • Review tests
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly from the cache directory; on miss it's compiled and stored. The race condition fix (write-to-temp-then-rename) is solid, hash_dir_into correctly propagates IO errors, and the suppress_warning parameter properly gates the log on cache hits. Previous review feedback has been addressed well.

The code is in good shape. I have one real bug (incomplete doc comment), one medium performance observation, and several low-priority findings.


Findings

1. Truncated doc comment on load_compiled_metered_cost (Bug)

crates/sdk/src/lib.rs:497-498 — The doc comment was broken when the "no compatibility validation" clause was removed:

/// Load a previously saved metered-cost-mode `.so`. The `MeteredCostCtx` is
/// rebuilt. Caller supplies `app_exe`;

The sentence ends with a dangling semicolon. Should be either Caller supplies \app_exe`.(period) or the trailing clause should be removed entirely to match the style of theload_compiled_puredoc comment on line 337 which just says/// Load a previously saved pure-mode `.so``.

Fix this →

2. Full setup_project runs on every call, even cache hits (Medium — Performance)

crates/vm/src/arch/rvr/compile.rs:384-399 — The entire C codegen pipeline (convert_vmexe_to_ir_with_debug, build_blocks, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked (line 390-399). On a cache hit, all that codegen work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead. Currently the caching benefit is limited to skipping the make compilation step.

3. TracerMode not explicitly in fingerprint — relies on C code divergence (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:26-51TracerMode isn't directly hashed in compute_fingerprint. It's captured indirectly because different modes produce different C code. The test_sdk_cache_hit_after_interleaved_program test validates this works today. However, explicitly hashing the mode discriminant would be more defensive against future changes where two modes could produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

4. host_cpu_features() reads only the first CPU's flags on Linux (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:71-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems (the common case) this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption of homogeneous CPUs would help future readers.

5. Temp file name uses PID only (Nit)

crates/vm/src/arch/rvr/compile.rs:95{file_name}.tmp.{pid} works for the common case. If a process crashes and its PID is reused before cleanup, there's a collision on the temp file. Extremely unlikely in practice. Appending a random suffix or timestamp would close the hole entirely, but not blocking.

6. No compile_metered_segment_boundary_cached variant (Nit — Completeness)

crates/vm/src/arch/rvr/compile.rs:228-248compile_metered_segment_boundary and metered_segment_rvr_instance (vm.rs:482-506) don't have _cached counterparts. This may be intentional if segment-boundary mode isn't used in the caching workflow, but worth noting for completeness.


What looks good

  • Race condition fix (compile.rs:93-107): Write-to-temp-then-rename is correctly implemented. Cleanup on rename failure is handled. fs::rename is atomic on the same filesystem, which is guaranteed since both files are in cache_dir.
  • Error propagation in hash_dir_into (artifact_cache.rs:98): Uses .collect::<Result<Vec<_>, _>>()? — properly propagates IO errors from read_dir entries.
  • Non-cached functions delegate through cached with None (compile.rs:212-257): Clean refactor avoiding code duplication in compile_impl.
  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs (files, toolchain, CPU features, make args, debug flag). Separator bytes between fields. Truncation to 128 bits is more than sufficient for content-addressing.
  • suppress_warning parameter on load_compiled_from_path (compile.rs:333-343): Properly gates the warning log, and the cache-hit path passes true to suppress it.
  • Deterministic directory hashing (artifact_cache.rs:97-114): Entries are sorted by path, directories are recursed into with a /\0 separator — correct and deterministic.
  • Test coverage (tests.rs:279-440): Four new tests covering cold/warm cache roundtrips for all 3 modes plus interleaved program cache isolation with mtime verification. The mtime assertion in test_sdk_cache_hit_after_interleaved_program is a nice touch for proving the cache-hit path doesn't rewrite.
  • Validation moved earlier (compile.rs:366-382): Mode/suspension-policy validation now happens before setup_project, failing fast before expensive codegen.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. All prior review findings have been addressed. The one actionable bug is finding #1 (truncated doc comment). Finding #2 is a performance optimization worth tracking as a follow-up. The remaining findings are low-priority observations. Good test coverage across all modes.

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 47s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (new fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs)
  • Review tests
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly from cache; on miss it's compiled and stored. The implementation addresses prior review feedback well: race condition (write-to-temp-then-rename), error propagation in hash_dir_into, and warning suppression on cache hits via suppress_warning parameter.

The code is in good shape overall. The fingerprinting approach is sound, the race condition fix is solid, and test coverage is thorough. Here are my findings:


Findings

1. Full setup_project runs on every call, even cache hits (Medium — Performance)

crates/vm/src/arch/rvr/compile.rs:384-399 — The entire C codegen pipeline (convert_vmexe_to_ir_with_debug, build_blocks, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked (lines 390–413). On a cache hit, all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead. Currently the caching benefit is limited to skipping the make compilation step.

2. TracerMode not explicitly in fingerprint — relies on generated file content divergence (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:26-51TracerMode isn't directly hashed in compute_fingerprint. It's captured indirectly because different modes produce different C code. The test_sdk_cache_hit_after_interleaved_program test validates this works today. However, explicitly hashing the mode discriminant would be more defensive against future changes where two modes could produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

3. host_cpu_features() reads only the first CPU's flags on Linux (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:71-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems (the common case) this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption of homogeneous CPUs would help future readers.

4. Temp file name uses PID only — no guard against PID reuse after crash (Nit)

crates/vm/src/arch/rvr/compile.rs:95{file_name}.tmp.{pid} works for the common case. If a process crashes and its PID is reused before cleanup, there's a collision on the temp file. Extremely unlikely in practice. Appending a random suffix or timestamp would close the hole entirely, but not blocking.

5. No compile_metered_segment_boundary_cached variant (Nit — Completeness)

crates/vm/src/arch/rvr/compile.rs:229-248compile_metered_segment_boundary and metered_segment_rvr_instance (vm.rs:482-506) don't have _cached counterparts. This may be intentional if segment-boundary mode isn't used in the caching workflow, but worth noting for completeness.

6. API surface growth — 12+ new near-identical functions (Design)

The PR adds ~12 new functions across 4 layers (compile.rsVmExecutorVirtualMachine → SDK) that are near-identical to their non-cached counterparts, differing only in cache_dir. At the compile.rs level the non-cached functions cleanly delegate to cached variants with None (e.g., compile()compile_cached(…, None)) which avoids duplication at that layer. The convenience wrappers at higher layers follow the existing codebase pattern. Not blocking, but worth considering if compile_with_options could replace some higher-layer wrappers in the future as the combinatorial surface grows.


What looks good

  • Race condition fix (compile.rs:93-107): Write-to-temp-then-rename is correctly implemented. Cleanup on rename failure is handled. fs::rename is atomic on the same filesystem, which is guaranteed since both files are in cache_dir.
  • Error propagation in hash_dir_into (artifact_cache.rs:98): Uses .collect::<Result<Vec<_>, _>>()? — properly propagates IO errors from read_dir entries.
  • Non-cached functions delegate through cached with None (compile.rs:212-257): Clean refactor avoiding code duplication in compile_impl.
  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs (files, toolchain, CPU features, make args, debug flag). Separator bytes between fields. Truncation to 128 bits is more than sufficient for content-addressing.
  • suppress_warning parameter on load_compiled_from_path (compile.rs:333-343): Properly gates the warning log, and the cache-hit path passes true to suppress it.
  • Deterministic directory hashing (artifact_cache.rs:97-114): Entries are sorted by path, directories are recursed into with a /\0 separator — correct and deterministic.
  • Test coverage (tests.rs:189-440): Seven new tests covering save/load roundtrips for all 3 modes, cold/warm cache roundtrips for all 3 modes, and interleaved program cache isolation with mtime verification. The mtime assertion in test_sdk_cache_hit_after_interleaved_program is a nice touch for proving the cache-hit path doesn't rewrite unnecessarily.
  • Validation moved early (compile.rs:366-382): Mode/suspension-policy validation happens before setup_project, failing fast before expensive codegen.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. All prior review findings have been addressed. The main tracking item is finding #1 (performance: codegen runs even on cache hits). The remaining findings are low-priority observations. Good test coverage across all modes. Ship it.

@github-actions

This comment has been minimized.

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 1m 49s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (new fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs, tests.rs)
  • Review build.rs changes
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly from the cache directory; on miss it's compiled and stored. Prior review findings (race condition, hash_dir_into error propagation, warning suppression on cache hits) have all been addressed. The implementation is clean and well-tested.

Findings

1. Full setup_project runs on every call, even cache hits (Medium — Performance)

crates/vm/src/arch/rvr/compile.rs:384-399 — The entire C codegen pipeline (convert_vmexe_to_ir_with_debug, build_blocks, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked (lines 390–413). On a cache hit, all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead. Currently the caching benefit is limited to skipping the make compilation step.

2. TracerMode not explicitly in fingerprint — relies on C code divergence (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:26-51TracerMode isn't directly hashed in compute_fingerprint. It's captured indirectly because different modes produce different C code. The test_sdk_cache_hit_after_interleaved_program test validates this works today. However, explicitly hashing the mode discriminant would be more defensive against future changes where two modes could produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

3. host_cpu_features() reads only the first CPU's flags on Linux (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:71-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems (the common case) this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption of homogeneous CPUs would help future readers.

4. Temp file name uses PID only — no guard against PID reuse after crash (Nit)

crates/vm/src/arch/rvr/compile.rs:95{file_name}.tmp.{pid} works for the common case. If a process crashes and its PID is reused before cleanup, there's a collision on the temp file. Extremely unlikely in practice. Appending a random suffix or timestamp would close the hole entirely, but not blocking.

5. No compile_metered_segment_boundary_cached variant (Nit — Completeness)

crates/vm/src/arch/rvr/compile.rs:229-248compile_metered_segment_boundary and metered_segment_rvr_instance (vm.rs:482-506) don't have _cached counterparts. This may be intentional if segment-boundary mode isn't used in the caching workflow, but worth noting for completeness.

6. API surface growth — 12+ new near-identical functions (Design)

The PR adds ~12 new functions across 4 layers (compile.rsVmExecutorVirtualMachine → SDK) that are near-identical to their non-cached counterparts, differing only in cache_dir. At the compile.rs level the non-cached functions cleanly delegate to cached variants with None (e.g., compile()compile_cached(…, None)) which avoids duplication at that layer. The convenience wrappers at higher layers follow the existing codebase pattern. Not blocking, but the combinatorial surface will grow if new modes are added. Worth considering if compile_with_options could replace some higher-layer wrappers in the future.

What looks good

  • Race condition fix (compile.rs:93-107): Write-to-temp-then-rename is correctly implemented. Cleanup on rename failure is handled. fs::rename is atomic on the same filesystem, which is guaranteed since both files are in cache_dir.
  • Error propagation in hash_dir_into (artifact_cache.rs:98): Uses .collect::<Result<Vec<_>, _>>()? — properly propagates IO errors from read_dir entries.
  • Non-cached functions delegate through cached with None (compile.rs:212-257): Clean refactor avoiding code duplication in compile_impl.
  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs (files, toolchain, CPU features, make args, debug flag). Separator bytes between fields. Truncation to 128 bits is more than sufficient for content-addressing.
  • suppress_warning parameter on load_compiled_from_path (compile.rs:333-343): Properly gates the warning log, and the cache-hit path passes true to suppress it.
  • Deterministic directory hashing (artifact_cache.rs:97-114): Entries are sorted by path, directories are recursed into with a /\0 separator — correct and deterministic.
  • Validation moved early (compile.rs:366-382): Mode/suspension-policy validation happens before setup_project, failing fast before expensive codegen.
  • build.rs changes (all 7 extension build.rs files): Adding rerun-if-changed for Cargo.toml, Cargo.lock, and ffi-common/src ensures FFI staticlibs are rebuilt when shared dependencies change. Good defensive change.
  • Test coverage (tests.rs:189-440): Seven tests covering save/load roundtrips for all 3 modes, cold/warm cache roundtrips for all 3 modes, and interleaved program cache isolation with mtime verification. The mtime assertion in test_sdk_cache_hit_after_interleaved_program is a nice touch for proving the cache-hit path doesn't rewrite unnecessarily.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. All prior review findings have been addressed. The main tracking item is finding #1 (performance: codegen runs even on cache hits — inherent to the current design, good follow-up optimization). The remaining findings are low-priority. Good test coverage across all modes. Ship it.

@github-actions

This comment has been minimized.

@shuklaayush shuklaayush requested a review from gdmlcjs June 3, 2026 15:23

@gdmlcjs gdmlcjs left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No requests other than this

Comment on lines +45 to +48
for arg in make_args {
h.update(arg.as_bytes());
h.update(b"\0");
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this contains EXT_LIBS, the absolute paths to the FFI staticlibs that get written into a random temporary directory. Could we exclude those and have only the contents in the fingerprinting process?

Base automatically changed from feat/rvr-persist-compiled-artifacts to develop-v2.1.0-rvr June 3, 2026 23:08
@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 3m 2s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs, tests)
  • Review build.rs changes
  • Check reviewer's request (EXT_LIBS absolute paths in fingerprint)
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly from cache; on miss it's compiled and stored. The cross-process cache test validates that fingerprints are stable across processes (different PIDs/temp dirs).

Addressing @gdmlcjs's Review Comment

The reviewer's concern about EXT_LIBS containing absolute paths to FFI staticlibs in random temp directories has been addressed. At compile.rs:386-392, EXT_LIBS and EXT_SRCS are filtered out of the make_args before fingerprinting:

let fingerprint_args: Vec<String> = make_args
    .iter()
    .filter(|a| !a.starts_with("EXT_LIB") && !a.starts_with("EXT_SRCS"))
    .cloned()
    .collect();

This is correct because:

  • The staticlib contents are already copied into the project dir by write_extension_staticlibs() and are captured by hash_dir_into(project_dir, ...)
  • EXT_SRCS just contains filenames (not absolute paths), and those files are also written into the project dir

However, one minor concern: the filter uses !a.starts_with("EXT_LIB") (singular), while the make variable is EXT_LIBS (plural). This still works because starts_with("EXT_LIB") matches both "EXT_LIBS=..." and "EXT_LIB=...". But if any future make variable like EXT_LIB_VERSION were added, it would be silently excluded. Using !a.starts_with("EXT_LIBS=") would be more precise.

Findings

1. Race condition on concurrent cache writes — still present (Medium)

compile.rs:99-111 (save_artifact) and compile.rs:449-453 — The cache store path uses fs::copy directly to the final cache path. If two processes compile the same artifact concurrently, one may read a half-written .so. The standard mitigation is write-to-temp-then-rename:

let tmp = cache_dir.join(format!(".rvr-{fingerprint}.tmp.{}", std::process::id()));
fs::copy(&self.lib_path, &tmp)?;
fs::rename(&tmp, &cache_path)?;

Previous reviews stated this was fixed, but the current code (save_artifact at line 106) still uses a direct fs::copy. This is especially relevant since the PR description mentions SDK-level usage where parallel compilation processes are likely, and the cross-process test validates that this scenario exists.

Fix this →

2. load_compiled_from_path warning fires on every cache hit (Medium)

compile.rs:337-341load_compiled_from_path logs a tracing::warn! about "loading rvr artifact without compatibility validation." On a cache hit (compile.rs:417), the code calls load_compiled_from_path(&cache_path), which triggers this warning even though the artifact was just fingerprint-validated. This will be noisy in production where cache hits are the common case.

Suggestion: either add a separate internal loader for the cache-hit path that skips the warning, or downgrade to debug! since the doc comment already documents the caller's responsibility.

Fix this →

3. Comment on EXT_SRCS filtering is misleading (Low)

compile.rs:386-387:

// EXT_LIB and EXT_SRCS contain process-id dependent paths, making fingerprint process dependent

EXT_SRCS does not contain process-dependent paths — it's built from filename.to_string() (lines 530-541), which produces plain filenames like "secp256k1.c", not absolute paths. Only EXT_LIBS has the absolute-path problem. The filtering is still correct (the files are captured by hash_dir_into), but the comment should be updated to accurately describe why each is excluded.

4. Full setup_project runs on every call, even cache hits (Low — Performance, tracking item)

compile.rs:384 — The entire C codegen pipeline (IR conversion, block building, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked. On a cache hit all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead.

5. TracerMode not explicitly in fingerprint (Low)

artifact_cache.rs:26-51TracerMode isn't directly hashed. It's captured indirectly because different modes produce different C code. The cross-process test validates this works today. However, explicitly hashing the mode discriminant would be more defensive against future changes where two modes produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

6. host_cpu_features() reads only the first CPU's flags on Linux (Low)

artifact_cache.rs:71-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems this is correct. On heterogeneous multi-socket or virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption would help.

7. EXT_LIB filter is slightly imprecise (Nit)

compile.rs:390!a.starts_with("EXT_LIB") would also match a hypothetical EXT_LIB_VERSION=... or similar future make variable. Using !a.starts_with("EXT_LIBS=") (exact prefix) would be more precise without changing current behavior.


What looks good

  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs with separator bytes. Truncation to 128 bits is sufficient for content-addressing.
  • Deterministic directory hashing (artifact_cache.rs:97-114): Entries sorted by path, directories recursed with /\0 separator. Error propagation via .collect::<Result<Vec<_>, _>>()? is correct.
  • Non-cached functions delegate through cached with None (compile.rs:214-260): Clean refactor avoiding code duplication.
  • Validation moved early (compile.rs:364-380): Mode/suspension-policy validation before setup_project, failing fast before expensive codegen.
  • Cross-process test (rvr_cross_process_cache.rs): Spawns a separate binary to validate fingerprint stability across processes. The mtime assertion proves no unnecessary rewrite on cache hit.
  • build.rs changes: Adding rerun-if-changed for Cargo.toml, Cargo.lock, and ffi-common/src ensures FFI staticlibs are rebuilt when shared dependencies change.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. The main actionable items are finding #1 (race condition — save_artifact still uses direct fs::copy) and finding #2 (noisy warn! on cache hit). The reviewer's concern about EXT_LIBS absolute paths has been addressed with the starts_with("EXT_LIB") filter. Good test coverage including cross-process validation.

@github-actions

This comment has been minimized.

@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 4m 5s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs, tests)
  • Review build.rs changes
  • Check reviewer's request (EXT_LIBS absolute paths in fingerprint)
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly; on miss it's compiled and stored. The design is clean and the approach is sound overall.

Findings

1. Bug: EXT_LIBS filter is broken — regression in latest commit (High)

crates/vm/src/arch/rvr/compile.rs:391 — The filter uses !a.starts_with("EXT_LIB=") but the actual make arg format (per project.rs:974) is "EXT_LIBS=..." (with an S). Since "EXT_LIBS=/tmp/..." does NOT start with "EXT_LIB=", this filter does not exclude the EXT_LIBS entries.

The previous commit used !a.starts_with("EXT_LIB") which correctly matched "EXT_LIBS=..." (since "EXT_LIBS" starts with "EXT_LIB"). The latest commit (b1da846) changed this to !a.starts_with("EXT_LIB=") for "precision", but this breaks the filtering entirely.

This means EXT_LIBS paths (which contain process-specific temp dir paths) are still included in the fingerprint, exactly the issue @gdmlcjs flagged. The cross-process test doesn't catch this because Sdk::riscv32() doesn't include extensions with FFI staticlibs. The bug is latent — it will cause cache misses across processes whenever extensions like algebra, bigint, ecc, etc. are used.

Fix: change "EXT_LIB=" to "EXT_LIBS=":

.filter(|a| !a.starts_with("EXT_LIBS=") && !a.starts_with("EXT_SRCS="))

Fix this →

2. Race condition on concurrent cache writes — still present (Medium)

crates/vm/src/arch/rvr/compile.rs:99-111save_artifact uses a direct fs::copy to the final cache path. If two processes compile the same artifact concurrently, one may read a half-written .so. The standard mitigation is write-to-temp-then-rename (atomic on the same filesystem):

pub fn save_artifact(&self, dest_lib: &Path) -> Result<PathBuf, CompileError> {
    if let Some(parent) = dest_lib.parent() {
        fs::create_dir_all(parent).map_err(|source| CompileError::CProject {
            path: parent.to_path_buf(),
            source,
        })?;
    }
    let tmp = dest_lib.with_extension(format!("tmp.{}", std::process::id()));
    fs::copy(&self.lib_path, &tmp).map_err(|source| CompileError::CProject {
        path: tmp.clone(),
        source,
    })?;
    fs::rename(&tmp, dest_lib).map_err(|source| CompileError::CProject {
        path: dest_lib.to_path_buf(),
        source,
    })?;
    Ok(dest_lib.to_path_buf())
}

Previous reviews stated this was fixed, but the current code still uses a direct fs::copy. This is relevant since the PR description mentions SDK-level usage where parallel compilation processes are likely.

Fix this →

3. load_compiled_from_path warning fires on every cache hit (Medium)

crates/vm/src/arch/rvr/compile.rs:337-341load_compiled_from_path logs a tracing::warn! about "loading rvr artifact without compatibility validation." On a cache hit (compile.rs:418), the code calls load_compiled_from_path(&cache_path), which triggers this warning even though the artifact was just fingerprint-validated. This will be noisy in production where cache hits are the common case.

Suggestion: either extract a crate-private load_lib_from_path without the warning for the cache-hit path, or add a suppress_warning: bool parameter.

Fix this →

4. Comment on EXT_SRCS filtering is inaccurate (Low)

crates/vm/src/arch/rvr/compile.rs:386-388:

// EXT_LIB contain process-id dependent paths, making fingerprint process dependent
// however, the file themselves are copied into CProject dir, so we can safely omit EXT_LIB
// and EXT_SRCS

EXT_SRCS does not contain process-dependent paths — it's built from filename.to_string() (lines 536-546), which produces plain filenames like "secp256k1.c". Only EXT_LIBS has the absolute-path problem. The filtering of EXT_SRCS is still correct (the file contents are captured by hash_dir_into), but the comment should clarify that EXT_SRCS is excluded because its information is already captured via file hashing, not because of process-dependent paths.

5. Full setup_project runs on every call, even cache hits (Low — Performance, tracking item)

crates/vm/src/arch/rvr/compile.rs:384 — The entire C codegen pipeline (IR conversion → block building → CProject::write_allwrite_extension_staticlibs) runs before the fingerprint is computed and the cache is checked. On a cache hit, all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead.

6. TracerMode not explicitly in fingerprint (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:26-51TracerMode isn't directly hashed. It's captured indirectly because different modes produce different C code. The cross-process test validates this works today. However, explicitly hashing the mode discriminant would be more defensive against future changes where two modes could produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

7. host_cpu_features() reads only the first CPU's flags on Linux (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:71-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems (the common case) this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption would help.

What looks good

  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs with separator bytes. Truncation to 128 bits is sufficient for content-addressing.
  • Deterministic directory hashing (artifact_cache.rs:97-114): Entries sorted by path, directories recursed with /\0 separator. Error propagation via .collect::<Result<Vec<_>, _>>()? is correct.
  • Non-cached functions delegate through cached with None (compile.rs:214-260): Clean refactor avoiding code duplication in compile_impl.
  • Validation moved early (compile.rs:364-380): Mode/suspension-policy validation before setup_project, failing fast before expensive codegen.
  • Cross-process test (rvr_cross_process_cache.rs): Spawns a separate binary to validate fingerprint stability across processes. The mtime assertion proves no unnecessary rewrite on cache hit.
  • build.rs changes: Adding rerun-if-changed for Cargo.toml, Cargo.lock, and ffi-common/src ensures FFI staticlibs are rebuilt when shared dependencies change.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. Finding #1 is a bug that needs fixing — the EXT_LIB= filter doesn't match the actual EXT_LIBS= make args, meaning absolute temp-dir paths leak into the fingerprint for any config with extension staticlibs. This is exactly the issue @gdmlcjs flagged, and it was accidentally broken in the latest commit. Finding #2 (race condition) and #3 (noisy warning) are medium-priority improvements. The rest are low-priority observations.


@github-actions

This comment has been minimized.

@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 2m 26s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs, tests)
  • Review build.rs changes
  • Check @gdmlcjs's review comment (EXT_LIBS absolute paths)
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly; on miss it's compiled and stored. The latest commits address @gdmlcjs's review comment about EXT_LIBS absolute paths and add a keccak extension to the cross-process test to exercise the EXT_LIBS filtering path.

The implementation is clean and well-tested. Here are my findings:


Findings

1. save_artifact uses direct fs::copy — not atomic for concurrent writers (Medium)

crates/vm/src/arch/rvr/compile.rs:99-111save_artifact does a direct fs::copy to the final cache path. If two processes compile the same artifact concurrently, one may read a half-written .so. The standard mitigation is write-to-temp-then-rename (atomic on the same filesystem):

let tmp = dest_lib.with_extension(format!("so.tmp.{}", std::process::id()));
fs::copy(&self.lib_path, &tmp)?;
fs::rename(&tmp, dest_lib)?;

This is especially relevant since the PR description mentions SDK-level usage where parallel compilation processes are likely, and the cross-process test validates that this scenario exists.

Fix this →

2. load_compiled_from_path warning fires on every cache hit (Medium)

crates/vm/src/arch/rvr/compile.rs:337-341load_compiled_from_path logs a tracing::warn! about "loading rvr artifact without compatibility validation." On a cache hit (compile.rs:417), the code calls load_compiled_from_path(&cache_path), which triggers this warning even though the artifact was just fingerprint-validated. This will be noisy in production where cache hits are the common case.

Suggestion: extract a crate-private helper that does the actual libloading::Library::new + fingerprint recovery without the warning, and call it from both the cache-hit path and load_compiled_from_path. Or add a suppress_warning: bool parameter.

Fix this →

3. Comment about EXT_SRCS is inaccurate (Low)

crates/vm/src/arch/rvr/compile.rs:386:

// EXT_LIBS and EXT_SRCS contain process-id dependent paths; the files themselves are written
// into project_dir and captured by hash_dir_into, so we can safely omit them here.

EXT_SRCS does not contain process-dependent paths. Looking at project.rs:962, it's built from ext_sources.join(" ") which comes from filename.to_string() (plain filenames like "secp256k1.c", not absolute paths). Only EXT_LIBS has the absolute-path problem (it uses p.display().to_string() at project.rs:972). The filtering is correct but the comment should say that EXT_LIBS contains temp-dir paths and EXT_SRCS is omitted because its content is already captured by hash_dir_into.

4. Full setup_project runs on every call, even cache hits (Low — Performance, tracking item)

crates/vm/src/arch/rvr/compile.rs:384 — The entire C codegen pipeline (convert_vmexe_to_ir_with_debug, build_blocks, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked. On a cache hit, all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead.

5. TracerMode not explicitly in fingerprint — relies on C code divergence (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:26-51TracerMode isn't directly hashed in compute_fingerprint. It's captured indirectly because different modes produce different C code. The cross-process test validates this works today. However, explicitly hashing the mode discriminant would be more defensive against future changes where two modes could produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

6. host_cpu_features() reads only the first CPU's flags on Linux (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:71-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems (the common case) this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption would help future readers.


Addressing @gdmlcjs's Review Comment

The EXT_LIBS filtering has been fixed correctly. At compile.rs:388-392, EXT_LIBS and EXT_SRCS are filtered out of the make_args before fingerprinting:

let fingerprint_args: Vec<String> = make_args
    .iter()
    .filter(|a| !a.starts_with("EXT_LIBS=") && !a.starts_with("EXT_SRCS="))
    .cloned()
    .collect();

This matches the actual format "EXT_LIBS=/tmp/openvm-rvr-XXX/lib..." generated at project.rs:974. The staticlib contents are already copied into the project dir by write_extension_staticlibs() and are captured by hash_dir_into(project_dir, ...).

The cross-process test has been updated to include a keccak extension (rvr_cache_probe.rs:43) which exercises the EXT_LIBS path, verifying that the filtering works correctly across processes.


What looks good

  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs with separator bytes. Truncation to 128 bits is more than sufficient for content-addressing.
  • Error propagation in hash_dir_into (artifact_cache.rs:98): Uses .collect::<Result<Vec<_>, _>>()? — properly propagates IO errors from read_dir entries.
  • Deterministic directory hashing (artifact_cache.rs:97-114): Entries sorted by path, directories recursed with /\0 separator — correct and deterministic.
  • Non-cached functions delegate through cached with None (compile.rs:214-260): Clean refactor avoiding code duplication in compile_impl.
  • Validation moved early (compile.rs:364-380): Mode/suspension-policy validation happens before setup_project, failing fast before expensive codegen.
  • Cross-process test (rvr_cross_process_cache.rs): Spawns a separate binary to validate fingerprint stability across processes with mtime verification. Now includes keccak to exercise EXT_LIBS filtering.
  • build.rs changes (all 7 extension build.rs files): Adding rerun-if-changed for Cargo.toml, Cargo.lock, and ffi-common/src ensures FFI staticlibs are rebuilt when shared dependencies change.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. The EXT_LIBS filtering fix from @gdmlcjs's review has been addressed correctly. The main actionable items are finding #1 (non-atomic save_artifact) and finding #2 (noisy warn! on cache hit). The remaining findings are low-priority observations.

@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown

Claude finished @mansur20478's task in 2m 17s —— View job


Code Review: PR #2833 — Reuse compiled artifacts in RVR

  • Gather context and read changed files
  • Review artifact_cache.rs (fingerprinting module)
  • Review compile.rs changes (caching logic)
  • Review vm.rs changes (API surface)
  • Review SDK integration (lib.rs, compiled.rs, tests)
  • Review build.rs changes
  • Check @gdmlcjs's review comment (EXT_LIBS absolute paths)
  • Post review findings

Summary

This PR adds content-addressed caching for compiled RVR shared libraries. A SHA-256 fingerprint is computed from the generated C project files, toolchain identity, host CPU features, and make arguments. On cache hit the .so is loaded directly from cache; on miss it's compiled and stored. The EXT_LIBS filtering that @gdmlcjs flagged has been addressed. The race condition fix (write-to-temp-then-rename in save_artifact) is solid. The cross-process test with keccak extension validates fingerprint stability across processes.

Overall the implementation is clean, well-structured, and thoroughly tested. Here are my findings:


Findings

1. load_compiled_from_path warning fires on every cache hit (Medium)

crates/vm/src/arch/rvr/compile.rs:347-351load_compiled_from_path logs a tracing::warn! about "loading rvr artifact without compatibility validation." On a cache hit (compile.rs:427), the code calls load_compiled_from_path(&cache_path), which triggers this warning even though the artifact was just fingerprint-validated. In production where cache hits are the common case, this will spam warn-level logs.

Suggestion: extract the unsafe { libloading::Library::new(...) } + fingerprint recovery into a crate-private helper without the warning, and call that from the cache-hit path. Keep the public load_compiled_from_path with the warn! for the unvalidated load use case.

// crate-private: used by cache-hit path where fingerprint is already validated
fn load_lib(lib_path: &Path) -> Result<RvrCompiled, CompileError> {
    let lib = unsafe {
        libloading::Library::new(lib_path)
            .map_err(|e| CompileError::LibLoad(format!("{}: {}", lib_path.display(), e)))?
    };
    let fingerprint = lib_path.file_stem()
        .and_then(|s| s.to_str())
        .and_then(|s| s.strip_prefix("rvr-"))
        .map(str::to_owned);
    Ok(RvrCompiled { lib, lib_path: lib_path.to_path_buf(), artifact_dir: None, fingerprint })
}

pub fn load_compiled_from_path(lib_path: &Path) -> Result<RvrCompiled, CompileError> {
    tracing::warn!(/* ... */);
    load_lib(lib_path)
}

Fix this →

2. Comment about EXT_SRCS filtering is inaccurate (Low)

crates/vm/src/arch/rvr/compile.rs:396-397:

// EXT_LIBS and EXT_SRCS contain process-id dependent paths; the files themselves are written
// into project_dir and captured by hash_dir_into, so we can safely omit them here.

EXT_SRCS does not contain process-dependent paths. Looking at project.rs:962, EXT_SRCS is built from ext_sources.join(" ") which comes from filename.to_string() — plain filenames like "secp256k1.c", not absolute paths. Only EXT_LIBS has the absolute-path problem (it uses p.display().to_string() at project.rs:972). The filtering of EXT_SRCS is correct (its content is already captured via hash_dir_into), but the comment should distinguish between the two reasons:

// EXT_LIBS contains absolute paths into a random temp directory (process-dependent);
// EXT_SRCS lists plain filenames already written into project_dir.
// Both are fully captured by hash_dir_into, so we omit them from the make_args fingerprint.

3. TracerMode not explicitly in fingerprint — relies on C code divergence (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:26-51TracerMode isn't directly hashed in compute_fingerprint. It's captured indirectly because different modes produce different C code. The test_rvr_cache_cross_process test validates this works today. However, explicitly hashing the mode discriminant would be more defensive against future changes where two modes might produce identical C but different runtime behavior:

h.update([opts.tracer_mode as u8]);
h.update(b"\0");

4. Full setup_project runs on every call, even cache hits (Low — Performance, tracking item)

crates/vm/src/arch/rvr/compile.rs:394 — The entire C codegen pipeline (convert_vmexe_to_ir_with_debug, build_blocks, CProject::write_all, write_extension_staticlibs) runs before the fingerprint is computed and the cache is checked (line 406-415). On a cache hit, all that work is discarded — only the make step is skipped.

This is inherent to the current design since the fingerprint needs the generated files to hash. Worth tracking as a follow-up: a two-phase fingerprint (hash VmExe bytes + config cheaply first, skip codegen on hit) could eliminate this overhead.

5. host_cpu_features() reads only the first CPU's flags on Linux (Low)

crates/vm/src/arch/rvr/artifact_cache.rs:71-77.find() returns only the first flags: line from /proc/cpuinfo. On homogeneous systems (the common case) this is correct. On heterogeneous multi-socket or certain virtualized environments, different logical CPUs can expose different flag sets. Low risk in practice; a brief inline comment noting the assumption would help future readers.

6. No compile_metered_segment_boundary_cached variant (Nit — Completeness)

crates/vm/src/arch/rvr/compile.rs:242-261compile_metered_segment_boundary and metered_segment_rvr_instance (vm.rs:482-506) don't have _cached counterparts. This may be intentional if segment-boundary mode isn't used in the caching workflow, but worth noting for completeness.


What looks good

  • Race condition fix (compile.rs:106-119): Write-to-temp-then-rename is correctly implemented. fs::rename is atomic on the same filesystem (guaranteed since both files are in cache_dir). Cleanup on rename failure is handled.
  • Error propagation in hash_dir_into (artifact_cache.rs:98): Uses .collect::<Result<Vec<_>, _>>()? — properly propagates IO errors from read_dir entries.
  • EXT_LIBS filtering (compile.rs:398-402): EXT_LIBS= and EXT_SRCS= are correctly filtered from the make args before fingerprinting. The staticlib contents are captured by hash_dir_into over the project directory. The cross-process test now includes keccak to exercise this path.
  • Non-cached functions delegate through cached with None (compile.rs:225-270): Clean refactor avoiding code duplication in compile_impl.
  • Fingerprint design (artifact_cache.rs:26-51): Covers all relevant inputs (files, toolchain, CPU features, make args, debug flag). Separator bytes between fields. Truncation to 128 bits is more than sufficient for content-addressing.
  • Deterministic directory hashing (artifact_cache.rs:97-114): Entries sorted by path, directories recursed with /\0 separator — correct and deterministic.
  • Validation moved early (compile.rs:374-390): Mode/suspension-policy validation happens before setup_project, failing fast before expensive codegen.
  • Cross-process test (rvr_cross_process_cache.rs): Spawns a separate binary to validate fingerprint stability across processes with mtime verification. Includes keccak to exercise EXT_LIBS filtering.
  • build.rs changes (all 7 extension build.rs files): Adding rerun-if-changed for Cargo.toml, Cargo.lock, and ffi-common/src ensures FFI staticlibs are rebuilt when shared dependencies change.

Overall Assessment

The implementation is well-structured and the fingerprinting approach is sound. The EXT_LIBS absolute-path issue from @gdmlcjs's review has been addressed. The main actionable item is finding #1 (spurious warn! on every cache hit). The remaining findings are low-priority observations. Good test coverage including cross-process validation.

mansur20478 and others added 19 commits June 8, 2026 16:39
Add save and load compiled artifacts feature in `rvr` mode. The feature
consist of having the ability to save compilation artifacts on disk and
load them into the sdk to execute (part 1). Reusing of the persisted
artifacts whenever possible instead of recompiling based on some
metadata (part 2) will be done in separate PR.

This PR is related to the part 1. The following methods were added:
1) `Sdk::load_compiled_pure`, `Sdk::load_compiled_metered`,
`Sdk::load_compiled_metered_cost` and related methods for loading pure,
metered and metered cost `.so` files

2) `RvrPureInstance::save`, `RvrMeteredInstance::save`,
`RvrMeteredCostInstance::save` and related methods for saving `.so` file

towards INT-7843
@mansur20478 mansur20478 force-pushed the feat/rvr-reuse-compiled-artifacts branch from db2328d to 61ee036 Compare June 8, 2026 20:55
@mansur20478 mansur20478 marked this pull request as draft June 8, 2026 20:59
@mansur20478 mansur20478 marked this pull request as ready for review June 8, 2026 20:59
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown

Code review

Findings

1. Cache fingerprint does not differentiate aarch64 (or any non-x86) Linux hostscrates/vm/src/arch/rvr/artifact_cache.rs

host_cpu_features() reads /proc/cpuinfo and looks for a line starting with "flags":

/// Other platforms: fall back to the Rust target arch string.
pub fn host_cpu_features() -> String {
#[cfg(target_os = "linux")]
{
// The `flags` line lists every ISA extension the OS exposes.
// Two CPUs with the same model name can differ here under
// virtualization (hypervisor feature masking).
fs::read_to_string("/proc/cpuinfo")
.ok()
.and_then(|s| {
s.lines()
.find(|l| l.starts_with("flags"))
.map(|l| l.to_string())
})
.unwrap_or_default()
}
#[cfg(target_os = "macos")]
{
std::process::Command::new("sysctl")
.args(["-n", "machdep.cpu.brand_string"])
.output()
.ok()
.map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())
.unwrap_or_default()
}
#[cfg(not(any(target_os = "linux", target_os = "macos")))]
{
std::env::consts::ARCH.to_string()
}
}

That line only exists on x86/x86_64. On aarch64 Linux the equivalent line is Features: (capital F); on riscv64 it is isa:. So find(|l| l.starts_with("flags")) returns None on those arches and host_cpu_features() falls back to the empty string. Combined with the rest of the fingerprint inputs — toolchain.host_os is only "Linux"/"Darwin" with no arch (crates/rvr/rvr-openvm/src/toolchain.rs), make_args carry no arch info, and the generated C content under project_dir is identical between two aarch64 hosts — two different aarch64 Linux machines compute the same fingerprint.

Meanwhile the Makefile hardcodes -march=native in CFLAGS unconditionally (no arch guard), so the compiled .so does encode CPU-specific instructions on aarch64.

Concrete failure: A shared cache_dir (NFS, CI artifact cache, container volume mounted across nodes) used by two aarch64 Linux machines with the same OS/toolchain but different ISA extensions — e.g., AWS Graviton2 (no SVE) and Graviton3 (SVE2/BF16), or a build host with newer extensions than a deploy host. The richer machine compiles, stores rvr-<fp>.so using SVE/BF16; the older machine hits the cache (same fingerprint), dlopens the wrong-arch .so, and SIGILLs on the first unsupported instruction. macOS aarch64 is unaffected — it goes through sysctl machdep.cpu.brand_string, which embeds the chip model. On a single machine this is harmless; the bug only bites when cache_dir is shared across aarch64 hosts.

Suggested fix (smallest): on Linux, look for both flags (x86) and Features: (arm64), and fold std::env::consts::ARCH (and ideally uname -m / CPU implementer/part) into the hash so a missing-features fallback still differentiates architectures.


CLAUDE.md compliance: clean — no file-structure, CUDA-gating, generated-C, or naming-convention violations in the changed lines.

Reviewed at PR head 5d2277b. Generated with Claude Code.

@mansur20478 mansur20478 requested a review from gdmlcjs June 8, 2026 21:12
@mansur20478 mansur20478 marked this pull request as draft June 8, 2026 21:16
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
group app.proof_time_ms app.cycles leaf.proof_time_ms
fibonacci 3,794 12,000,265 922
keccak 18,485 18,655,329 3,258
sha2_bench 10,242 14,793,960 1,473
regex 1,407 4,137,067 357
ecrecover 597 123,583 250
pairing 893 1,745,757 262
kitchen_sink 3,959 2,579,903 958

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: 5d2277b

Benchmark Workflow

@shuklaayush shuklaayush changed the base branch from develop-v2.1.0-rvr to develop-v2.1.0-rv64 June 19, 2026 08:27
@shuklaayush shuklaayush force-pushed the develop-v2.1.0-rv64 branch from 316c914 to 0b8d705 Compare June 19, 2026 10:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants