
feat: 2^32 memory address #2850

Open
GunaDD wants to merge 16 commits into perf/merkle-tree-mem-opt from feat/2-pow-32-memory-addresses

@GunaDD GunaDD commented Jun 6, 2026

Contributor

feat: support 2^32 byte memory addresses

Raises the RV64 memory address space to the full 2^32 bytes. Since RV64_MEMORY_AS
stores u16 cells, that means cell pointers are now up to 31 bits wide
(POINTER_MAX_BITS: 28 → 31).
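
The width arithmetic above can be checked with a short sketch. The constant and function names here are illustrative only and are not taken from the codebase; the only inputs are the 2^32-byte address space and the u16 cell size stated in this description.

```rust
/// Size of the byte address space in bits (2^32 bytes), as stated in this PR.
const BYTE_ADDRESS_BITS: u32 = 32;
/// Bytes per memory cell, assuming each cell is a u16.
const CELL_BYTES: u32 = 2;

/// Bits needed to index every cell: log2(2^32 / 2) = 32 - 1 = 31.
const fn cell_pointer_bits() -> u32 {
    BYTE_ADDRESS_BITS - CELL_BYTES.trailing_zeros()
}

/// Number of addressable cells (2^31 under the assumptions above).
const fn cell_count() -> u64 {
    1u64 << cell_pointer_bits()
}
```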

Why pointers become two limbs

A 31-bit pointer no longer fits safely in a single BabyBear field element (the
modulus 2^31 - 2^27 + 1 is below 2^31, so distinct pointers could alias), so every
pointer on the memory bus is now sent as two little-endian 16-bit limbs.
MemoryAddress.pointer becomes pointer_limbs: [T; 2], and the bus payload is
[address_space, ptr_lo, ptr_hi, data..., timestamp].
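
As a rough illustration of the limb split and the payload order described above, here is a minimal sketch using plain u32 values in place of field elements. The helper names (split_pointer, compose_pointer, bus_payload, aliases_in_field) are hypothetical and are not the PR's API; only the payload order and the 16-bit little-endian limbs come from the description.

```rust
/// Width of each pointer limb in bits.
const LIMB_BITS: u32 = 16;
/// BabyBear modulus, 2^31 - 2^27 + 1.
const BABY_BEAR_P: u64 = (1 << 31) - (1 << 27) + 1;

/// Split a cell pointer into [low, high] 16-bit limbs (little-endian order).
fn split_pointer(ptr: u32) -> [u32; 2] {
    [ptr & 0xFFFF, ptr >> LIMB_BITS]
}

/// Recompose a pointer from its limbs; the inverse of `split_pointer`.
fn compose_pointer(limbs: [u32; 2]) -> u32 {
    limbs[0] | (limbs[1] << LIMB_BITS)
}

/// True if two integers map to the same BabyBear element. This is the
/// aliasing that a single-element pointer encoding would allow.
fn aliases_in_field(a: u64, b: u64) -> bool {
    a % BABY_BEAR_P == b % BABY_BEAR_P
}

/// Lay out a bus message as [address_space, ptr_lo, ptr_hi, data..., timestamp].
fn bus_payload(address_space: u32, ptr: u32, data: &[u32], timestamp: u32) -> Vec<u32> {
    let [ptr_lo, ptr_hi] = split_pointer(ptr);
    let mut payload = Vec::with_capacity(data.len() + 4);
    payload.push(address_space);
    payload.push(ptr_lo);
    payload.push(ptr_hi);
    payload.extend_from_slice(data);
    payload.push(timestamp);
    payload
}
```

For example, the pointers 5 and 5 + BABY_BEAR_P are both below 2^31 and reduce to the same field element, but their limb pairs differ, so the two-limb encoding keeps them distinct.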

Main changes

  • Memory system: MemoryAddress, memory bus, and offline checker switched to
    two-limb pointers. The persistent boundary AIR decomposes the merkle leaf label
    into range-checked limbs so the leaf pointer can be emitted without composing the
    full 31-bit value.
  • Pointer conversion helpers (extensions/riscv/circuit/src/adapters/mod.rs):
    shared AIR + tracegen helpers that convert RV64 byte pointers (read from
    registers) into cell-pointer limbs using a carry witness plus range checks, and
    add per-block offsets with carries. Register-AS pointers always fit in the low
    limb, so register accesses skip the extra columns and range checks.
  • Chip updates: every chip that accesses heap memory previously composed its
    pointer into a single field element; each now carries new witness columns and
    uses the shared helpers instead:
    • one carry column per base pointer for the byte→cell conversion (loadstore,
      hintstore, vec_heap family, keccak256 xorin, sha2);
    • one carry column per memory block for the base + block_offset addition in
      chips that do multi-block accesses (vec_heap family, keccak256, sha2);
    • register accesses take a cheap path: register pointers are tiny, so the high
      limb is constant zero and needs no carry or range check.
  • CUDA: GPU tracegen updated to mirror the CPU side for all of the above
    (column/record layouts kept identical).
  • Test utilities: gen_pointer now draws from the full 2^31 cell range. This
    also resolves jpw's TODO in the test config: tests used to call the general
    gen_pointer on the register address space, which forced the test MemoryConfig
    to artificially widen the register AS. Tests now use the new
    gen_register_pointer / gen_distinct_register_pointers, which draw from the
    real 32-slot register file (the distinct variant avoids collisions when a test
    writes several register operands), so the register-AS resizing workaround is
    removed.
  • SHA-512 state alignment UB (extensions/sha2/circuit/src/sha2_chips/config.rs):
    the incremental hasher cast the record's state byte slice in place to
    &mut [u64; 8]. The state lives inside the trace-generation record buffer, which
    is only guaranteed 4-byte alignment (align_of::<Sha2RecordHeader>() = 4, since
    the header is all u32 fields), while [u64; 8] requires 8-byte alignment — so the
    cast was undefined behavior on host memory. Fixed by copying through an aligned
    local [u64; 8] buffer instead of reinterpreting in place.
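
The byte-to-cell conversion and per-block offset addition from the "Pointer conversion helpers" item can be sketched on plain integers. This is not the code in adapters/mod.rs: it assumes the byte pointer arrives as two 16-bit limbs, is cell-aligned (even), and that a cell pointer is the byte pointer divided by 2. All function names are hypothetical.

```rust
/// Cell pointer as [low, high] 16-bit limbs.
type PointerLimbs = [u32; 2];

/// Halve a byte pointer given as two 16-bit limbs, returning the cell-pointer
/// limbs and a carry witness (the low bit of the high byte limb, which moves
/// into bit 15 of the low cell limb). Returns None if a limb is out of range
/// or the pointer is odd (assumed unsupported in this sketch).
///
/// One way the relation could be checked limb-wise, without composing the
/// full value: 2 * cell_lo = byte_lo + 2^16 * carry, byte_hi = 2 * cell_hi + carry.
fn byte_ptr_to_cell_limbs(byte_limbs: [u32; 2]) -> Option<(PointerLimbs, u32)> {
    let [lo, hi] = byte_limbs;
    if lo >= (1 << 16) || hi >= (1 << 16) || lo & 1 != 0 {
        return None;
    }
    let carry = hi & 1;
    let cell_lo = (lo >> 1) + (carry << 15);
    let cell_hi = hi >> 1;
    Some(([cell_lo, cell_hi], carry))
}

/// Add a block offset (in cells, assumed below 2^16) to a limb pair,
/// returning the new limbs and the carry out of the low limb.
fn add_block_offset(ptr: PointerLimbs, offset: u32) -> (PointerLimbs, u32) {
    let sum = ptr[0] + offset;
    let carry = sum >> 16;
    ([sum & 0xFFFF, ptr[1] + carry], carry)
}

/// Cheap path for register pointers: the value fits in the low limb,
/// so the high limb is constant zero and no carry is needed.
fn register_pointer_limbs(ptr: u32) -> PointerLimbs {
    debug_assert!(ptr < (1 << 16));
    [ptr, 0]
}
```

In the sketch, byte pointer 0x0001_FFFE maps to cell pointer 0xFFFF with carry 1, and adding an offset of 4 cells then carries into the high limb.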

Closes INT-8080
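
For the SHA-512 state alignment item above, the copy-through-a-local-buffer pattern looks roughly like the following. This is a minimal sketch, not the code in config.rs: the function name and closure-based shape are assumptions, and native-endian byte order is used because the original cast reinterpreted the bytes in place.

```rust
/// Run `update` on the 8-word state stored in `state_bytes` (64 bytes, any
/// alignment) by copying into an aligned local array and writing it back.
fn with_state_words(state_bytes: &mut [u8], update: impl FnOnce(&mut [u64; 8])) {
    assert_eq!(state_bytes.len(), 64, "SHA-512 state is 8 x u64 = 64 bytes");

    // Copy out: the local array has the alignment that [u64; 8] requires.
    let mut words = [0u64; 8];
    for (word, chunk) in words.iter_mut().zip(state_bytes.chunks_exact(8)) {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        *word = u64::from_ne_bytes(buf);
    }

    update(&mut words);

    // Copy back byte-wise; no reference to a misaligned u64 is ever formed.
    for (chunk, word) in state_bytes.chunks_exact_mut(8).zip(words.iter()) {
        chunk.copy_from_slice(&word.to_ne_bytes());
    }
}
```

The state slice may start at any offset inside a larger record buffer, for instance 4 bytes in, which is the situation where the in-place cast to &mut [u64; 8] was unsound.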

@GunaDD changed the title from "feat:2^32 memory address" to "feat: 2^32 memory address" Jun 6, 2026
@github-actions

github-actions Bot commented Jun 6, 2026


Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

@GunaDD force-pushed the perf/merkle-tree-mem-opt branch 3 times, most recently from 52efe21 to 4c9cea5, June 8, 2026 16:01
@GunaDD force-pushed the feat/2-pow-32-memory-addresses branch 2 times, most recently from e5623d6 to 50cec8e, June 8, 2026 19:59
@GunaDD marked this pull request as draft June 9, 2026 14:14
@GunaDD marked this pull request as ready for review June 9, 2026 14:14

@GunaDD force-pushed the perf/merkle-tree-mem-opt branch from 814c620 to 0e6d6a6, June 10, 2026 22:09
@GunaDD force-pushed the feat/2-pow-32-memory-addresses branch from 90b629c to 395639e, June 10, 2026 22:15

@GunaDD force-pushed the feat/2-pow-32-memory-addresses branch from efee83a to f338acf, June 11, 2026 16:19
@github-actions

| group | app.proof_time_ms | app.cycles | leaf.proof_time_ms |
|---|---|---|---|
| fibonacci | 5,485 | 4,000,051 | 542 |
| keccak | 20,362 | 14,365,133 | 3,046 |
| sha2_bench | 14,371 | 11,167,961 | 1,960 |
| regex | 3,807 | 4,090,656 | 438 |
| ecrecover | 1,975 | 112,210 | 311 |
| pairing | 2,095 | 592,827 | 299 |
| kitchen_sink | 5,624 | 1,979,971 | 872 |

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: f338acf

Benchmark Workflow

@GunaDD GunaDD requested a review from shuklaayush June 11, 2026 17:16