This repository was archived by the owner on Dec 5, 2021. It is now read-only.

[pull] develop from ethereum-optimism:develop#573

Open
pull[bot] wants to merge 10000 commits into omgnetwork:develop from ethereum-optimism:develop

@pull pull Bot commented Oct 13, 2021

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

Inphi and others added 30 commits May 20, 2026 16:15
* test(loadtest): fix deferred cleanup order

* test(loadtest): use explicit Wait to avoid ctxInvalid startup race

The deferred closure `defer func() { cancelInvalid(); wg.Wait() }()` fires
the moment TestRelayWithInvalidMessagesSteady returns from its body, which
is immediately after spawning the two worker goroutines. cancelInvalid()
runs before goroutine 1 has a chance to enter NewInvalidExecMsgSpammer, so
its l2.Include(t, ...) call sees an already-canceled ctxInvalid and
t.Require().NoError(err) aborts the test.

The F10 "GOOD" pattern assumes the worker goroutines do bounded work and
exit on cancellation. Here they're meant to run until the parent test
context expires (3-minute timeout in setupLoadTest), and goroutine 1's
spammer setup uses t.Require() — so the ctx must stay alive until the
goroutines have started.

Switch to an explicit wg.Wait() at the end of the test body. The function
blocks until both goroutines exit naturally on parent context expiry. The
F10 linter only fires when defer cancel() is paired with defer wg.Wait();
making the Wait explicit defuses it without losing the cancel as a safety
net.
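
A minimal Go sketch of the explicit-Wait pattern described above. The names `runWorkers` and `demo` are illustrative and not taken from the test, and the real workers call into the spammer setup rather than checking `ctx.Err()` directly.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runWorkers is a hypothetical stand-in for the test body. It returns how
// many workers finished their setup step before the context was canceled.
func runWorkers(parent context.Context, n int) int64 {
	ctx, cancel := context.WithCancel(parent)
	// Safety net only: this runs after the explicit wg.Wait() below, so it
	// cannot cancel ctx while a worker is still starting up.
	defer cancel()

	var ready atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Setup step (assumed): fails if ctx is already canceled.
			if ctx.Err() != nil {
				return
			}
			ready.Add(1)
			<-ctx.Done() // keep running until the parent context expires
		}()
	}

	// Explicit Wait at the end of the body instead of `defer wg.Wait()`.
	wg.Wait()
	return ready.Load()
}

// demo runs two workers under a short parent timeout.
func demo() int64 {
	parent, stop := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer stop()
	return runWorkers(parent, 2)
}

func main() {
	fmt.Println("workers that completed setup:", demo())
}
```

Because the function blocks in `wg.Wait()` until the parent context expires, the deferred `cancel()` only ever fires after both workers have exited.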

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pin test (#20837)

The build script previously wrote `[]` to etc/depsets.json on every build,
regardless of KONA_BIND. Combined with a committed `[]` snapshot, this meant
DEPENDENCY_SETS was always empty in default builds — making the embedded-first
lookup in BootInfo::load dead code outside the host-supplied preimage fallback,
which logs an "insecure in production" warning. Default `cargo test` now sees
the real registry-derived depsets, and `kona-client` prestates built without
KONA_BIND now embed them as well.

Build script:

- Move the depsets reset inside the `if kona_bind { ... }` branch so it runs
  alongside the re-derivation it pairs with, instead of clobbering the
  committed snapshot in every build. Default builds now use the committed
  snapshot directly, mirroring how configs.json and chainList.json work.

- Add unconditional `cargo:rerun-if-changed=etc/{chainList,configs,depsets}.json`
  directives. `include_str!` does not register file dependencies with cargo,
  so without these a regenerated snapshot is silently ignored by a cached
  compilation of lib.rs. Also drops the now-redundant gated copies inside
  merge_custom_configs.

Snapshots refreshed against current submodule pin (cc07e96d):
  - etc/depsets.json gains the rehearsal-0-bn cluster {420120009, 420120010}.
  - etc/configs.json gains the corresponding [interop] blocks plus the
    upstream rehearsal-0-bn L1 public_rpc URL change.

Tests:

- Add embedded_depset_for_rehearsal_0_bn_cluster, pinning the registry-derived
  interop cluster against the committed etc/depsets.json snapshot. Asserts
  cluster membership, cluster identity (both peers map to the same value),
  absence of expiry-window override, and the default 7-day MESSAGE_EXPIRY_WINDOW.

- Remove embedded_depsets_empty_by_default — its premise (default builds
  embed no depsets) no longer holds.

Behavior changes worth knowing:

- Custom-devnet builds that supply their own depsets.json now layer additively
  on top of the rehearsal cluster (previously they wrote a custom-only file).
  Overlapping chain ids with differing cluster contents will panic at build
  time via merge_custom_depsets, surfacing what would have been a runtime
  crash in lib.rs's reverse-index.

- Kona prestate hashes change (the embedded-first path now engages in
  production). Downstream pins (op-challenger, standard-prestates.toml) need
  a coordinated refresh in a follow-up PR.

Verified:
- `cargo nextest run -p kona-registry` (no envs): 15/15 pass, rehearsal test
  runs (no skips).
- `KONA_BIND=true cargo build -p kona-registry`: byte-idempotent against
  the committed snapshots (git diff clean after re-run).
- `just test-custom-embeds`: passes; etc/depsets.json is rehearsal +
  fixture clusters after the merge.
- `cargo nextest run -p kona-proof-interop -p kona-genesis -p kona-interop`:
  218/218 pass.
- `cargo +nightly fmt -p kona-registry` and
  `cargo clippy -p kona-registry --tests` clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Migrates the fuzz-golang job definition from xlarge (Gen1) to medium.gen2.
First W4 PR — see ethereum-optimism/core-team#2564.

All 6 matrix variants (op-challenger, op-node, op-service, op-chain-ops,
cannon, op-e2e) inherit the new class from this single job def.

Sizing rationale: across all 6 variants, peak CPU translates to ~0.95 vCPU
absolute and peak RAM to ~0.40 GB absolute (CircleCI telemetry × 1.5 safety
factor). medium.gen2 (2 vCPU / 4 GB) clears both constraints with headroom.
#20919)

Production chains (gasLimit >= 21M) are unaffected. Below that, the deploy
script derives a ResourceConfig whose maxResourceLimit + systemTxMaxGas
matches the requested gasLimit, so small chains (e.g. 5M) can deploy
without callers having to supply their own ResourceConfig.
…jobs across 3 files) (#20917)

* circleci: rust-e2e Gen2 swap (rust-e2e-sysgo-tests, rust-restart-sysgo-tests, kona-proof-action-tests)

3 docker jobs migrated from xlarge → xlarge.gen2 same-size.
op-reth-e2e-sysgo-tests left as xlarge (Gen1) — not in W4 inventory; revisit separately.

Part of W4 optimism Tier A migration — see ethereum-optimism/core-team#2564.

* circleci: rust-ci Gen2 swap (13 docker jobs)

13 docker job defs migrated to .gen2 (same-size):
  rust-ci-fmt, rust-ci-clippy, rust-ci-deny, rust-ci-zepter,
  rust-ci-typos, rust-ci-check-no-std, rust-ci-udeps,
  rust-ci-cargo-hack-build (medium → medium.gen2)
  rust-ci-docs, rust-ci-cargo-hack (xlarge → xlarge.gen2)
  rust-ci-doctest, rust-ci-cargo-tests (2xlarge → 2xlarge.gen2)
  kona-link-checker (medium → medium.gen2)

Held (machine executor, Gen2 perf unverified):
  kona-cargo-lint, kona-publish-prestate-artifacts, kona-build-fpvm, kona-host-client-offline

Not in W4 inventory (held separately):
  op-reth-compact-codec (xlarge, comment notes prior OOM at smaller class)

Part of W4 optimism Tier A migration — see ethereum-optimism/core-team#2564.

* circleci: main.yml Gen2 swap (25 docker job defs + 1 param-default + 1 wf-override)

Same-size .gen2 migration for these jobs in .circleci/continue/main.yml:
  small    -> small.gen2:    required-contracts-ci, check-op-geth-version,
                             prep-superchain, check-nut-locks, memory-all,
                             rust-binaries-for-sysgo
  medium   -> medium.gen2:   ai-contracts-test, go-binaries-for-sysgo,
                             publish-cannon-prestates, bedrock-go-tests,
                             diff-fetcher-forge-artifacts
  large    -> large.gen2:    initialize, contracts-bedrock-build,
                             sanitize-op-program, go-release
  xlarge   -> xlarge.gen2:   rust-build-binary, rust-build-vendored (x2 dup defs),
                             rust-lint-vendored, semgrep-scan
  2xlarge  -> 2xlarge.gen2:  contracts-bedrock-tests (A_forced — pinned by heavy-fuzz),
                             contracts-bedrock-heavy-fuzz-nightly, go-tests,
                             publish-contract-artifacts
  2xlarge+ -> 2xlarge+.gen2: op-acceptance-tests
  go-tests-with-fault-proof-deps: parameter default + develop-fault-proofs
                                  workflow override to 2xlarge.gen2

Excluded by design:
  fuzz-golang (pilot PR #20899 handled this with the medium.gen2 right-size)
  cannon-go-lint-and-test, fuzz-golang, analyze-op-program-client, kontrol-tests
    (not in W4 Tier A — separate review)
  All machine-executor jobs (kona-cargo-lint, kona-publish-prestate-artifacts,
    kona-build-fpvm, kona-host-client-offline, cannon-prestate, todo-issues,
    contracts-bedrock-upload, preimage-reproducibility, generate-flaky-report,
    stale-check, close-issue) — machine.gen2 perf unverified, held.

Part of W4 optimism Tier A migration — see ethereum-optimism/core-team#2564.

* ci: re-trigger CircleCI (no content changes)

Re-pushing rust-e2e.yml with identical content to produce a new commit
SHA and re-run the CircleCI pipeline. The prior run had suspected flakes
(contracts-bedrock-checks-fast-feature-tests, memory-all-opn-op-reth)
that need a retry to disambiguate from Gen2 issues.

No diff — same tree as parent.
…t map writes (#20920)

TestOperatorFeeConsistency spawns four StateRefund subtests in parallel
(HonestClaim/JunkClaim × isthmus/jovian fork variants via TestMatrix).
All four previously assigned testCfg.Allocs = actionsHelpers.DefaultAlloc,
sharing the same *AllocParams pointer, then raced to write
DefaultAlloc.L2Alloc concurrently — producing "fatal error: concurrent
map writes" and killing the test binary.

Root cause confirmed from CI artifact analysis: job 5075301 shard 1
logged the crash at operator_fee_test.go:62 with stack trace pointing
to internal/runtime/maps.fatal.

Fix: shallow-copy the struct before mutating L2Alloc.  AllocParams has
only two non-scalar fields (L1Alloc, L2Alloc), both nil in DefaultAlloc;
L2Alloc is immediately replaced by make(...) on the copy so no map is
shared between subtests.
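
The fix can be sketched in Go as follows. `AllocParams` here is a simplified stand-in with assumed field types, and `subtestAlloc` / `runSubtests` are hypothetical helpers, not the code in operator_fee_test.go.

```go
package main

import (
	"fmt"
	"sync"
)

// AllocParams is a simplified stand-in for the real struct; the map value
// types and the scalar field are assumptions made for illustration.
type AllocParams struct {
	L1Alloc          map[string]uint64
	L2Alloc          map[string]uint64
	PrefundTestUsers bool
}

// DefaultAlloc is shared by every subtest. Both map fields are nil.
var DefaultAlloc = &AllocParams{PrefundTestUsers: true}

// subtestAlloc shallow-copies the shared struct, then gives the copy its
// own L2Alloc map so that no map is shared between subtests.
func subtestAlloc(name string) *AllocParams {
	alloc := *DefaultAlloc // copy the struct value, not the pointer
	alloc.L2Alloc = make(map[string]uint64)
	alloc.L2Alloc[name] = 1
	return &alloc
}

// runSubtests mimics parallel subtests: each goroutine mutates only its copy.
func runSubtests(names []string) []*AllocParams {
	out := make([]*AllocParams, len(names))
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			out[i] = subtestAlloc(name)
		}(i, name)
	}
	wg.Wait()
	return out
}

func main() {
	allocs := runSubtests([]string{"honest-isthmus", "junk-isthmus", "honest-jovian", "junk-jovian"})
	fmt.Println(len(allocs), DefaultAlloc.L2Alloc == nil)
}
```

Assigning the shared pointer instead of copying the struct would make every goroutine write through `DefaultAlloc`, which is the race the commit removes.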

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…on is configured (#20894)

* fix(op-supernode): ignore interop.log-backfill-depth when no activation is configured

The flag defaults to a non-zero value, but the interop activity (and its
backfill) is only constructed when an activation timestamp is resolved.
Rejecting startup in that combination forced operators to set the flag
explicitly to 0 before interop activation, with no functional benefit.

* review: drop test-only interopLogBackfillEnabled predicate
…ssages (#20887)

Extracts the interop wire/message vocabulary out of
op-supervisor/supervisor/types into a new op-core/interop/messages
package, alongside op-core/interop/depset. These types are the
protocol surface that op-node, op-supernode, op-interop-{filter,mon},
op-acceptance-tests, rust/kona test harnesses, and op-service RPC
clients all speak — they don't belong inside a deprecated component.

Moved: BlockSeal (+ helpers), ExecutingMessage, Message,
MessageChecksum, Identifier, ContainsQuery, Access, ExecutingDescriptor,
ChecksumArgs, LogToMessagePayload, PayloadHashToLogHash,
EncodeAccessList, ParseAccess, ExecutingMessageEventTopic, and the
access-list prefix constants.

Left in op-supervisor/supervisor/types for now: SafetyLevel + its
constants (moving to op-service/eth in a follow-up), Revision,
DerivedBlock*Pair, BlockReplacement, IndexingEvent, and the supervisor
backend sentinel errors. supervisor/types now imports the new
messages package because DerivedBlockSealPair embeds BlockSeal.

Callers: 37 leaf-only swaps (import path + qualifier renamed to
`messages`), 66 mixed files (added second import) across op-supervisor
internals, op-supernode, op-interop-filter, op-interop-mon, op-service,
op-node, op-program, op-devstack, op-e2e, rust/kona tests, and
op-acceptance-tests.
Updates the fuzz-golang job definition from medium.gen2 to xlarge.gen2.
Ref: ethereum-optimism/core-team#2564.
* refactor(interop): move SafetyLevel into op-service/eth/safety

Extracts the SafetyLevel type and its 6 constants (Finalized, CrossSafe,
LocalSafe, CrossUnsafe, LocalUnsafe, Invalid) from
op-supervisor/supervisor/types into a new op-service/eth/safety
sub-package. SafetyLevel is the canonical interop safety lattice
spoken by op-node, op-supernode, RPC clients, and the entire test
tree — it doesn't belong inside a deprecated component.

A sub-package rather than top-level op-service/eth, because
eth/label.go already declares an untyped Finalized constant for the
BlockLabel namespace ("latest"/"safe"/"finalized"). Keeping the two
namespaces in separate packages avoids forcing a type alias that
would conflate the RPC-label and safety-lattice concepts.

Callers (66 files) across op-acceptance-tests, op-supervisor,
op-supernode, op-interop-filter, op-service, op-devstack, rust/kona
tests rewritten to import safety and qualify references as safety.X.
op-service/eth/label.go untouched.

* refactor(safety): rename safety.SafetyLevel to safety.Level

Drops the redundant package prefix from the type name — within the
safety package it's just "Level", read at call sites as safety.Level.
Delete the op-node/rollup/interop subsystem (IndexingMode RPC server,
managed L1 traversal, supervisor-driven Promote/Invalidate flows) and
strip the indexingMode/supervisorEnabled/ManagedBySupervisor booleans
from driver, engine, finalizer, and derivation pipeline. Move the
InvalidatedBlockSourceDepositTx helper (still used by op-program for
fault proofs) into op-node/rollup/derive/deposit_source.go and inline
the supervisor RPC error code in op-supervisor's syncnode/rpc.go (the
supervisor itself is slated for removal in a follow-up). SuperAuthority
(used by op-supernode) is untouched.

Also rips the now-vestigial InteropRPC() interface and tcpproxy
plumbing out of op-devstack and op-e2e, since op-supernode runs
op-node in-process and no longer needs an external interop RPC.
* feat(op-node): load dependency set from superchain-registry

When --interop.dependency-set is not set, fall back to
depset.FromRegistry using the rollup config's L2 chain ID, mirroring how
the rollup config itself is loaded from the registry. The registry
synthesises a self-only depset for chains without explicit interop info,
so any registry-known chain produces a usable value; unknown chains
return nil and the existing config.Check still errors iff InteropTime is
set.
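
The fallback order described above, as a rough Go sketch. Everything here is hypothetical: `DepSet`, `registry`, `fromRegistry`, `resolveDepSet`, and `check` are simplified stand-ins, not the op-node or `depset` package APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// DepSet is a hypothetical, simplified dependency set: a list of chain IDs.
type DepSet struct{ Chains []uint64 }

// registry is a made-up stand-in for the superchain-registry lookup.
// Chain 10 has explicit interop info; chain 20 is known but has none.
var registry = map[uint64][]uint64{
	10: {10, 11},
	20: nil,
}

// fromRegistry returns nil for unknown chains and synthesises a self-only
// set for known chains without explicit interop info.
func fromRegistry(chainID uint64) *DepSet {
	peers, known := registry[chainID]
	if !known {
		return nil
	}
	if len(peers) == 0 {
		return &DepSet{Chains: []uint64{chainID}}
	}
	return &DepSet{Chains: peers}
}

// resolveDepSet prefers an explicitly supplied set and otherwise falls back
// to the registry.
func resolveDepSet(explicit *DepSet, chainID uint64) *DepSet {
	if explicit != nil {
		return explicit
	}
	return fromRegistry(chainID)
}

// check errors only when interop is scheduled but no set could be resolved.
func check(ds *DepSet, interopTime *uint64) error {
	if interopTime != nil && ds == nil {
		return errors.New("interop time set but no dependency set available")
	}
	return nil
}

func main() {
	fmt.Println(resolveDepSet(nil, 20).Chains) // self-only set for a known chain
	fmt.Println(resolveDepSet(nil, 99) == nil) // unknown chain resolves to nil
}
```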

* fix(op-node): update depset import for op-core move
)

* fix(op-devstack): stabilize pre-genesis super game L1 block race

The dispute game creation receipt could land on the same L1 block selected
as the rollup start block (issue #20869). plannedRollupStartBlockNumber is
tied to plannedGenesisTime via the anchor committed during OPCM migration,
so neither can be re-derived from the receipt. Widen the budget to 20 L1
blocks and gate the create submission on a post-migration head check that
fails fast with a clear setup error before the late-stage assertion can
flake.

* fix(op-devstack): simplify pre-genesis super game flake fix

Address review feedback on #20874: drop the post-migration head check (the
existing strict-less-than assertion already explains the failure mode) and
trim the L1 block delay from 20 to 10 so the test doesn't idle 2x longer
than needed to clear the worst-case migration time.

* docs: shorten preGenesisRollupStartBlockDelay comment
…nonce (#20942)

The recover-mode inner loop in TestSequenceWindowExpired re-builds Bob's
L2 transaction every block while RecoverMode forces empty blocks, so the
on-chain nonce never advances and every iteration must produce a tx with
a fresh (incrementing) nonce in order to be accepted by the pool. The
test previously derived that nonce from PendingNonceAt, which ties
correctness to whatever the txpool's pending-nonce table happens to
report between chain-head events. In CI the test occasionally fails
with SendTransaction returning "already known" — i.e. the next nonce
the test computed collided with a tx still tracked by the pool.

Track Bob's pool nonce locally and pin it via a new ActSetTxNonce
helper. Each tx's nonce is then unique by construction, regardless of
how the pool's pending-nonce bookkeeping evolves under chain-head churn.
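
A small Go sketch of why a locally tracked nonce is unique by construction. The `pool` type is a toy model that rejects repeated nonces, and `localNonce` and `sendAll` are hypothetical names; none of this is the `ActSetTxNonce` helper itself.

```go
package main

import (
	"errors"
	"fmt"
)

var errAlreadyKnown = errors.New("already known")

// pool is a toy txpool for a single sender: it rejects a repeated nonce.
type pool struct{ seen map[uint64]bool }

func (p *pool) send(nonce uint64) error {
	if p.seen[nonce] {
		return errAlreadyKnown
	}
	p.seen[nonce] = true
	return nil
}

// localNonce hands out strictly increasing nonces without consulting the pool.
type localNonce struct{ next uint64 }

func (n *localNonce) take() uint64 {
	v := n.next
	n.next++
	return v
}

// sendAll submits count transactions, pinning each nonce from the local counter.
func sendAll(p *pool, start uint64, count int) error {
	nonces := localNonce{next: start}
	for i := 0; i < count; i++ {
		if err := p.send(nonces.take()); err != nil {
			return fmt.Errorf("tx %d: %w", i, err)
		}
	}
	return nil
}

func main() {
	p := &pool{seen: map[uint64]bool{}}
	fmt.Println(sendAll(p, 5, 10)) // <nil>
}
```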

Refs #20941
* feat(op-chain-ops,op-acceptance-tests): share post-Karst test logic via check-karst

Add `op-chain-ops/cmd/check-karst`, a CLI that runs post-Karst EIP
conformance checks against an external L2 RPC, and factor the
EIP-7823 modexp upper-bound check into a new `karsttest` package
consumed by both the CLI and the acceptance test.

`karsttest.NewBasePlan` builds the same `txplan.Option` an op-devstack
EOA composes (`dsl.EOA.Plan`) but from a plain `*ethclient.Client` and
private key — `WithAgainstLatestBlockEthClient` substitutes for
`WithAgainstLatestBlock`, and `WithBlockInclusionInfo` is omitted
because the per-tx checks only read receipt fields. Per-tx gas limits
via `txplan.WithGasLimit` reset the estimator, so reverting txs flow
through the same base plan as successful ones.

`TestEIP7823UpperBoundModExp`'s post-Karst sub-test now delegates to
`karsttest.CheckEIP7823`, which returns the two block numbers it
exercised; the test runs `sys.RunKonaNative` over the resulting
range. Pre-Karst sub-tests stay inline. The MODEXP/P256VERIFY
precompile addresses and `BuildModExpInput` move into `karsttest` so
both consumers reference one source of truth, setting up the
remaining EIPs (7883, 7825, 7951, 7939) to be ported by adding one
`Check…` function and one CLI subcommand each.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(op-chain-ops,op-acceptance-tests): port EIP-7883 to karsttest

Add `karsttest.CheckEIP7883`, wire it into `CheckAll`, and expose a new
`eip-7883` subcommand on `check-karst`, mirroring the structure used for
EIP-7823.

`CheckEIP7883` sends two empty-calldata MODEXP txs — 21,300 gas to land
exactly on the EIP-2565 200-gas floor (which OOGs against EIP-7883's
500-gas floor) and 21,600 gas to land within the new floor — and
returns the two block numbers it exercised.
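
The gas arithmetic behind those two limits can be checked with a short Go sketch. It assumes that a transaction sent straight to the precompile with empty calldata pays the 21,000 base cost, that the remainder is available to the precompile, and that the MODEXP price for empty input equals the floor; `precompileSucceeds` is a hypothetical helper.

```go
package main

import "fmt"

const (
	txIntrinsicGas = 21_000 // base cost of a transaction with empty calldata
	floorEIP2565   = 200    // MODEXP minimum gas before EIP-7883
	floorEIP7883   = 500    // MODEXP minimum gas under EIP-7883
)

// precompileSucceeds reports whether the gas left after the intrinsic cost
// covers the given precompile floor (assumption: nothing else is charged).
func precompileSucceeds(gasLimit, floor uint64) bool {
	return gasLimit >= txIntrinsicGas && gasLimit-txIntrinsicGas >= floor
}

func main() {
	fmt.Println(precompileSucceeds(21_300, floorEIP2565)) // true
	fmt.Println(precompileSucceeds(21_300, floorEIP7883)) // false
	fmt.Println(precompileSucceeds(21_600, floorEIP7883)) // true
}
```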

`TestEIP7883ModExpGasCostIncrease`'s post-Karst sub-test now delegates
to `CheckEIP7883` and feeds the returned range to `sys.RunKonaNative`.
The pre-Karst sub-test inlines the formerly-shared `planUnderGas` since
it's the only remaining caller; the gas-floor math comment moves there,
with the post-Karst rationale captured in `CheckEIP7883`'s docstring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(op-chain-ops,op-acceptance-tests): port EIP-7825 to karsttest

Add `karsttest.CheckEIP7825`, wire it into `CheckAll`, and expose a new
`eip-7825` subcommand on `check-karst`, mirroring the structure used for
EIP-7823 and EIP-7883.

EIP-7825 is a tx-validity rule (not an EVM rule), so post-Karst
op-reth's RPC rejects a tx with gas > 2^24 at submission time and the
tx never lands on chain. `CheckEIP7825` therefore returns just `error`
— there is no block range to feed to kona-host.

`TestEIP7825TxGasLimitCap`'s post-Karst sub-test now delegates to
`CheckEIP7825` instead of inlining the high-gas submission. The
pre-Karst sub-test is unchanged: it still table-tests kona's rollup
config selection (Jovian accepts, Karst rejects).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(op-chain-ops,op-acceptance-tests): port EIP-7951 to karsttest

Add `karsttest.CheckEIP7951`, wire it into `CheckAll`, and expose a new
`eip-7951` subcommand on `check-karst`, mirroring the structure used for
EIP-7823 and EIP-7883.

`CheckEIP7951` sends two empty-calldata P256VERIFY txs — 24,500 gas
(21,000 + 3,500) to land within RIP-7212's 3,450-gas budget but OOG
against EIP-7951's 6,900-gas cost, and 28,000 gas (21,000 + 7,000) to
land within the new cost — and returns the two block numbers it
exercised.

`TestEIP7951P256VerifyGasCostIncrease`'s post-Karst sub-test now
delegates to `CheckEIP7951` and feeds the returned range to
`sys.RunKonaNative`. The pre-Karst sub-test is still a kona table test
(jovian/karst configs), but switches from the local `p256VerifyPrecompile`
var to the shared `karsttest.P256VerifyPrecompile`; the var and the
formerly-shared `planUnderGas` are removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(op-chain-ops,op-acceptance-tests): port EIP-7939 to karsttest

Add `karsttest.CheckEIP7939`, wire it into `CheckAll`, and expose a new
`eip-7939` subcommand on `check-karst`, mirroring the structure used for
the other Karst-EIP checks.

`CheckEIP7939` deploys a contract whose init code computes CLZ(1) and
returns the 32-byte result, asserts the deployment receipt is
successful, and verifies the deployed code equals the 32-byte
left-padded CLZ(1) = 255 value. The deployed-code check needs an
`eth_getCode` call, so the function (and `CheckAction`/`CheckAll`)
take an `apis.EthCode` — the smallest interface satisfied by both the
CLI's `*ethclient.Client` and the acceptance test's `apis.EthClient`.
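
For reference, the expected value can be reproduced with a short Go sketch that counts leading zero bits of a 256-bit big-endian word. `clz256` and `leftPad32` are illustrative helpers, not part of karsttest.

```go
package main

import (
	"fmt"
	"math/bits"
)

// clz256 counts the leading zero bits of a 256-bit big-endian word.
// An all-zero word yields 256.
func clz256(word [32]byte) int {
	for i, b := range word {
		if b != 0 {
			return i*8 + bits.LeadingZeros8(b)
		}
	}
	return 256
}

// leftPad32 encodes a small value as a 32-byte, left-padded big-endian word.
func leftPad32(v uint64) [32]byte {
	var out [32]byte
	for i := 0; i < 8; i++ {
		out[31-i] = byte(v >> (8 * i))
	}
	return out
}

func main() {
	result := clz256(leftPad32(1))
	fmt.Println(result) // 255
	// The deployed code is expected to equal this 32-byte word.
	fmt.Printf("%x\n", leftPad32(uint64(result)))
}
```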

The CLZ init bytecode also moves into karsttest as `CLZBytecode`, used
both by `CheckEIP7939` and by the acceptance test's pre-Karst sub-test
(which still asserts that the same bytecode reverts on Jovian because
the CLZ opcode is invalid). `TestEIP7939CLZ`'s post-Karst sub-test now
delegates to `CheckEIP7939` and feeds its returned block to
`sys.RunKonaNative`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(op-chain-ops,op-acceptance-tests): port EIP-7825 deposit-bypass to karsttest

Add `karsttest.CheckEIP7825DepositBypass`, port the deposit-bypass
acceptance test to use it, and expose a new `eip-7825-deposit`
subcommand on `check-karst` (with new `--l1`, `--l1-account`, and
`--portal` flags).

`CheckEIP7825DepositBypass` submits an L1 `depositTransaction` call
with gas limit MaxTxGas+1 (above EIP-7825's 2^24 cap), asserts the
`TransactionDeposited` event carries the requested gas, polls the L2
side for the resulting deposit receipt, and asserts it succeeded.
Unlike the EVM-level checks, this one needs both L1 and L2 access; the
L1 plan (`txplan.Option`) carries the L1 client/key/nonce, so the only
explicit L2 dependency is `apis.ReceiptFetcher` for receipt polling.
The bindings package is used for ABI encoding only — `WithClient` and
`WithTest` aren't needed since `contractio.Plan` only reads `To()` and
`EncodeInput()`.

The CLI subcommand is not included in `CheckAll`: most karsttest
invocations don't have L1 access configured.

`TestEIP7825DepositBypassesTxGasLimitCap` shrinks to a setup +
`CheckEIP7825DepositBypass` call + `RunKonaNative`. Drops imports of
`dsl/contract`, `op-node/rollup/derive`, and `txintent/bindings` since
they're no longer used in the test file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(op-chain-ops,op-acceptance-tests): port EIP-7934 to karsttest

Add `karsttest.CheckEIP7934BlockSizeDisabled`, port the
block-size-disabled acceptance test to use it, and expose a new
`eip-7934` subcommand on `check-karst` (with a `--poll-interval` flag).

The check polls the unsafe head until it finds a block whose total
transaction-data size exceeds `params.MaxBlockSize` (8 MiB). Tx data
size is a strict lower bound on RLP-encoded block size, so observing
txData > 8 MiB proves the block exceeds the EIP-7934 limit.

Producing such a block requires sustained traffic. The acceptance test
keeps its `spamTxs` setup to drive the chain; the karsttest function
itself only does the polling. There is no internal timeout — callers
control the deadline via the context (the acceptance test framework's
deadline; Ctrl+C in the CLI). The CLI subcommand is not wired into
`CheckAll`, since on a quiet chain it would block forever.

To minimize coupling, `CheckEIP7934BlockSizeDisabled` takes a tiny
`LatestBlockFetcher` interface (just `InfoAndTxsByLabel`). The
acceptance test passes `apis.EthClient` directly; the CLI provides a
small adapter around `*ethclient.Client` that maps `Unsafe` →
`BlockByNumber(nil)`.
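
A hedged Go sketch of the polling shape described above. The `block` type, the one-method `latestBlockFetcher` interface, `waitForOversizedBlock`, and `fakeChain` are all hypothetical simplifications; the real function works with `InfoAndTxsByLabel` and `params.MaxBlockSize`.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

const maxBlockSize = 8 << 20 // 8 MiB, the limit quoted above

// block is a simplified block: a number plus raw transaction payloads.
type block struct {
	Number uint64
	TxData [][]byte
}

// latestBlockFetcher is a hypothetical one-method interface.
type latestBlockFetcher interface {
	Latest(ctx context.Context) (block, error)
}

// txDataSize sums the payload sizes, a lower bound on the encoded block size.
func txDataSize(b block) int {
	total := 0
	for _, data := range b.TxData {
		total += len(data)
	}
	return total
}

// waitForOversizedBlock polls until a block exceeds maxBlockSize. It has no
// internal timeout: the caller's context decides when to give up.
func waitForOversizedBlock(ctx context.Context, f latestBlockFetcher, interval time.Duration) (uint64, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		b, err := f.Latest(ctx)
		if err != nil {
			return 0, err
		}
		if txDataSize(b) > maxBlockSize {
			return b.Number, nil
		}
		select {
		case <-ctx.Done():
			return 0, ctx.Err()
		case <-ticker.C:
		}
	}
}

// fakeChain advances one block per call; block 3 is oversized.
type fakeChain struct{ head uint64 }

func (c *fakeChain) Latest(context.Context) (block, error) {
	c.head++
	size := 1024
	if c.head == 3 {
		size = maxBlockSize + 1
	}
	return block{Number: c.head, TxData: [][]byte{make([]byte, size)}}, nil
}

// demoPoll runs the poller against the fake chain with a one-second deadline.
func demoPoll() (uint64, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return waitForOversizedBlock(ctx, &fakeChain{}, time.Millisecond)
}

func main() {
	fmt.Println(demoPoll())
}
```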

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* eliminate dead code

* consistent timeout philosophy

* share code

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(deps): bump op-rbuilder and rollup-boost dependencies

Bumps the vendored op-rbuilder and rollup-boost workspaces onto:

- reth at rev `81c026181` (paradigmxyz/reth main HEAD after the FCU
  fix in paradigmxyz/reth#24159), matching the parent rust/ workspace
  pin so the path deps on `../op-reth/crates/*` resolve consistently
- alloy 1.x -> 2.x (alloy-primitives 1.5.6, others 2.0.4)
- revm 31.x -> 38.x
- op-alloy 0.22 -> 2.0, repointed onto the in-monorepo
  `../op-alloy/crates/*` path deps
- op-* crates (op-reth, op-revm, alloy-op-*) repointed onto their
  in-monorepo path deps in both workspace tomls

Updates the cargo-chef Dockerfile, justfile, and rust-toolchain pin
(now 1.94.0 to match the parent workspace) to support the bumped
graph. `rust/op-rbuilder/crates/op-rbuilder/Cargo.toml` adds a
`docker-tests` default feature so the testcontainers-based
integration tests can be opted out of in CI environments without a
docker socket; the parent `rollup-boost` workspace dep is declared
`default-features = false` so cargo's feature unification doesn't
re-enable it via `flashblocks-rpc` under
`cargo test --workspace --no-default-features`. Direct
`cargo test -p rollup-boost` (and upstream `make test`) still pick
up the crate's own defaults.

Source-level adaptations to make this compile, the new CI gates,
and the unrelated proxy-test flake fix are split into follow-up
commits to keep this one purely manifest/lock churn.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(rust): adapt op-rbuilder and rollup-boost to bumped deps

Source-level changes required to make op-rbuilder and rollup-boost
build and pass the upstream test suite against the bumped reth
(rev 81c026181), alloy 2.x, revm 38.x and op-alloy 2.0 pins from
the preceding commit.

Notable behaviour-bearing pieces (everything else is import-path
renames, new required trait methods, and mechanical fallout):

- bundle-eviction semantics (`op-rbuilder/src/tests/revert.rs`,
  `txpool.rs`): a bundle is now dropped when
  `current_block > block_number_max` (strict greater-than) to match
  the bumped pool semantics. Test assertions rewritten to walk
  through the off-by-one explicitly. `tests/txpool.rs` also bumps
  `max_account_slots` back up to 50 because the bumped `TxPoolArgs`
  default caps single-sender at 16 and the test deliberately
  saturates the pending pool from one signer.

- `new_payload_job` signature + `PayloadConfig.payload_id`
  (`payload_handler.rs` etc): `new_payload_job(attributes)` ->
  `new_payload_job(BuildNewPayload { parent_hash, attributes,
  cache, trie_handle }, id)`.

- `BuiltPayloadExecutedBlock.hashed_state` / `.trie_updates` are
  now `Arc<...>` directly (no `Either` wrapping): drop the
  `either::Either::Left(...)` wrappers in both
  `builders/flashblocks/payload.rs` and
  `builders/standard/payload.rs`. Required by the FCU-rev bump.

- `OpEngineApi` methods now require `BalProvider` on the Provider
  bound (per the upstream `block_access_list_hash` plumbing): add
  the import and the bound in
  `primitives/reth/engine_api_builder.rs`.

- flashblocks-rpc joined the bump: matching code adaptations in
  `cache.rs`, `flashblocks.rs`, `rpc.rs`, `tests/mod.rs`.

- `OpTypedTransaction::PostExec(_)` match arm added in
  `tx_signer.rs` purely for exhaustiveness — the bumped
  `op-alloy-consensus` added a new enum variant. No new SDM /
  PostExec semantics are introduced.

- `payload_tx.send(...).await` -> `try_send(...)` in flashblocks
  payload builder: a slow consumer now drops the new payload
  instead of stalling the builder. Same approach upstream takes.

- Test-flake fix in `rollup-boost/src/proxy.rs`: bump the realistic
  client timeout used by the `MockHttpServer`-backed forward tests
  so they don't intermittently fail on slow CI machines.

- Defensive test-scaffolding cleanup in
  `rollup-boost/src/flashblocks/inbound.rs`: name the previously-`_`
  ping_rx receiver bindings (`_ping_rx`) so the spawned server task
  doesn't panic when it tries to forward a Ping while the test is
  still running.

- `dynamic_with_full_block_lag` (`op-rbuilder/src/tests/flashblocks.rs`)
  assertions relaxed to lower bounds (`>= 2 txs`, `!flashblocks.is_empty()`).
  The bumped reth/alloy builder is fast enough to pack a full flashblock
  when the FCU arrives in the slot's last millisecond, so the original
  `== 2 txs, == 1 flashblock` invariant no longer holds. Mirrors
  upstream's `late_fcu_reduces_flashblocks` bound-based style.

- Rollup-boost source is reformatted via
  `cargo +nightly-2026-02-20 fmt` from the `rust/rollup-boost/`
  workspace. The empty `rust/rollup-boost/rustfmt.toml` sentinel makes
  rustfmt fall back to defaults (deliberately, to avoid inheriting the
  parent kona-tuned `rust/rustfmt.toml`), so source must be formatted
  with defaults applied from inside the vendored workspace — which is
  exactly what `make lint` in CI checks.
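
The `send(...).await` to `try_send(...)` change in the list above trades backpressure for dropping. The Rust code is not reproduced here; the following is only a Go analogue of the same idea, using a buffered channel and a non-blocking send, with `trySend` as a hypothetical helper name.

```go
package main

import "fmt"

// trySend attempts a non-blocking send. It returns false, dropping the
// value, when the consumer has not drained the buffer.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	payloads := make(chan int, 1)

	fmt.Println(trySend(payloads, 1)) // true: the buffer has room
	fmt.Println(trySend(payloads, 2)) // false: slow consumer, payload dropped

	<-payloads                        // the consumer catches up
	fmt.Println(trySend(payloads, 3)) // true
}
```

The producer never stalls, at the cost of the consumer possibly missing intermediate payloads.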

After this commit, `cargo check --workspace --all-targets` builds
clean for both `rust/op-rbuilder/` and
`rust/rollup-boost/ --no-default-features`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(rust): gate op-rbuilder and rollup-boost via vendored-checks

Adds a parameterised CircleCI job `rust-vendored-checks` and two
instantiations (`op-rbuilder-checks`, `rollup-boost-checks`) that
mirror the lint + test gates the upstream GitHub Actions enforce,
so the in-monorepo copies don't drift silently from their dep
contracts. Both invoke each crate's `make lint` and `make test`
targets so the lint toolchain pin and feature flags live in the
Makefile, not the CI config.

Three CI-specific accommodations are needed because the CircleCI
Docker executor differs from upstream's Warp 16-vCPU runner:

- `op-rbuilder-checks` runs on `2xlarge` (32 GB) and caps cargo
  test to `RUST_TEST_THREADS=4`. Each parallel test spawns an
  in-process op-reth via `LocalInstance`, so the default 16-thread
  fanout overruns 32 GB and the binary gets SIGKILLed. Upstream's Warp
  box has ~64 GB and doesn't need the cap.

- `rollup-boost-checks` uses `cargo test --no-default-features`.
  The `docker-tests` default feature added in the deps commit
  gates the 11 testcontainers-based integration tests under
  `src/tests/`, which require `/var/run/docker.sock` (not exposed
  by the CircleCI Docker executor).

- The workspace dep on `rollup-boost` in
  `rust/rollup-boost/Cargo.toml` is declared with
  `default-features = false` so cargo's feature unification
  doesn't transitively re-enable `docker-tests` via
  `flashblocks-rpc` under `--workspace --no-default-features`.
  Direct `cargo test -p rollup-boost` (and upstream `make test`)
  still pick up the crate's own defaults, so upstream behaviour
  is unchanged.

`rust/op-rbuilder/Makefile` and `rust/rollup-boost/Makefile` are
adjusted to pre-build the `rollup-boost` binary before
`cargo test` (because `test_invalid_args` shells out to
`target/debug/rollup-boost` via `assert_cmd::cargo_bin`, which
under the new larger compile graph no longer races the test
correctly) and to expose the same lint/test entry points the
CircleCI jobs invoke.

Also removes the now-redundant `rust-lint-op-rbuilder` /
`rust-lint-rollup-boost` jobs (and the `rust-lint-vendored`
template) from `.circleci/continue/main.yml`. Linting is now
exclusively driven by the new `*-checks` jobs above, which run
the canonical `make lint` — single source of truth, no
duplicated toolchain pin or feature flags.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…0835)

* docs: add interop prep notice for OP Sepolia and Unichain Sepolia

Adds notices/interop-prep.mdx as the node-operator action checklist for
the OP Sepolia + Unichain Sepolia interop activation (late June 2026).
The notice walks operators through standing up op-reth execution clients
per chain, an op-supernode covering both chains, and migrating their
op-node fleet to Light CL mode pointed at the supernode.

Updates the specialized op-node topology notice to reflect that the
source tier can be op-node or op-supernode depending on whether the
chain is interop-active, and reframes its top Warning as a concrete
requirement for OP Sepolia / Unichain Sepolia at activation.

Adds back-links from the supernode explainer, supernode configuration
guide, and specialized-topology notice to the new interop-prep notice,
and inserts the new notice at the top of the Notices sidebar group in
docs.json.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Update docs/public-docs/notices/interop-prep.mdx

Co-authored-by: soyboy <85043086+sbvegan@users.noreply.github.com>

* Update docs/public-docs/notices/interop-prep.mdx

Co-authored-by: soyboy <85043086+sbvegan@users.noreply.github.com>

* Update docs/public-docs/notices/interop-prep.mdx

Co-authored-by: soyboy <85043086+sbvegan@users.noreply.github.com>

* docs: apply review fixups to interop prep notice

Addresses sbvegan's review feedback on PR #20835:

- Simplify "follow the supernode tier's safe view" to "follow the
  supernode" — cleaner opening, less jargon.
- Drop the "recommended there, required here" framing on the
  topology-notice pointer — the topology notice's top Warning now
  calls the interop activation a hard requirement directly, so the
  contrast phrasing is redundant.
- Broaden the "Who this affects" line from "anyone running an op-node"
  to "anyone running OP Sepolia or Unichain Sepolia nodes" so the
  scope covers the full stack (op-node + op-reth + op-supernode).
- Refine the "other OP Stack chains aren't affected" line into a
  scoped statement that points forward at the mainnet activation and
  notes other chains will move to the supernode topology as they're
  added to interop sets.
- Rename `interop_time` rollup-config field references to `lagoon_time`
  and add a brief inline note that Lagoon is the hardfork name that
  carries the interop activation. The supernode CLI flags
  (`--interop.activation-timestamp`, env vars) stay feature-named
  pending any further rename downstream.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: address review feedback on interop prep notice

Applies the unresolved review threads from ajsutton and jelias2, assuming
the supernode bug fixes ajsutton tracked land before this notice ships
(superchain-registry will deliver activation timestamp and dependency
set via the rollup config).

- Reverse the EL/supernode bring-up framing: ELs start unsynced and wait
  for the supernode to connect, because ELs have no chain head to target
  on their own. The supernode then drives each EL through its initial
  sync and tolerates one chain being synced before the other.
- Reframe JWT-sharing wording: each EL has exactly one consensus client
  (the supernode's virtual node for that chain); sharing the JWT secret
  value across both ELs is independently fine for operational simplicity.
- Drop the OP_SUPERNODE_INTEROP_ACTIVATION_TIMESTAMP override block from
  the YAML and the log-backfill Warning callout. The post-fix supernode
  loads both pieces from the rollup config automatically; documenting
  the workaround for a transient bug would age poorly.
- Drop the observability env vars from the minimum-viable YAML to honor
  the action-checklist intro promise about minimal configs.
- Mark the beacon archive fallback as optional in both the intro prose
  and the YAML; comment in the YAML explains when it's required.
- Add a conditional pointer in Step 1 to the reth-historical-proofs
  tutorial for operators who also run op-challenger; pure node-operator
  fleets (RPC, exchange, indexer) skip it.
- Tighten the action-checklist intro to "Start the ELs first, then the
  supernode" and shorten the Light CL sequencing note.
- Drop the redundant "recommended there, required here" phrasing on the
  topology-notice pointer (topology notice now also says required).
- Broaden "Who this affects" line to "OP Sepolia or Unichain Sepolia
  nodes" rather than narrowly "an op-node" to cover the full stack.
- Refine the other-OP-Stack-chains framing into a scoped statement that
  points forward to mainnet and notes other chains move to the supernode
  topology only when they join an interop set.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(interop-prep): point at op-reth config reference for retention flags

ajsutton clarified that the supernode needs ~7 days of block + receipt
history but not a full archive — jelias2's RETH_FULL=false framing was
overkill (RETH_FULL=false is the archive flag).

The notice already described the requirement correctly. This change adds
a pointer to the op-reth configuration reference where operators look up
the granular --prune.* flags, instead of baking a specific flag name
into the notice (which would couple to op-reth's flag surface).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(interop-prep): clarify EL sizing rule

The previous wording asked operators to "match the supernode tier and
Light CL pool" — too compressed; the "match" verb forced the reader to
guess whether the EL count was 1:1 with the tier, 1:1 with the pool,
or some sum.

Replaces with a direct sizing rule: one EL per consensus client, with
both consumers (supernode virtual node + Light CL) named explicitly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(supernode-config): EL retention + HA pool recommendations

Two new entries in the supernode configuration guide's Recommendations
section, distilled from the interop-prep notice retrospective:

- "Configure EL retention for supernode backfill" — 7 days of block +
  receipt history is the supernode's minimum requirement for backfill
  after restarts or extended downtime. Full archive is not required.
  Conditional pointer to historical-proofs tutorial for operators who
  also run op-challenger.

- "Run an HA pool of supernodes behind a consensus-aware proxyd" —
  OP Labs recommends at least three op-supernode instances in the HA
  pool. Explains the failure-mode behavior: individual supernode failure
  is masked by proxyd; full tier outage leaves Light CLs on unsafe head
  via P2P until restored.

Both surfaced in the interop-prep notice (#20835) but apply to any
supernode deployment, so they belong as durable Recommendations on the
config guide rather than only in a single notice.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(interop-prep): drop fleet-wide JWT-sharing recommendation

Stating the architectural invariant (one CL per EL) covers the whole
fleet, including each Light CL's EL. The supernode's `vn.all.l2.jwt-secret`
inheritance is documented as a mechanic, not as a fleet-wide
recommendation — defense-in-depth across CL-EL pairs is left to the
operator.

* docs(interop-prep): drop fleet-wide JWT trailing editorial

The "operator choice" clause was itself prescriptive — it put
fleet-wide sharing on the menu by mentioning it. State the
architectural invariant and the two configuration mechanics,
then stop.

* docs(interop-prep): fix EL-startup troubleshooting bullet

ajsutton flagged that "still syncing" is not a failure mode — the
supernode drives the EL through sync. The actual failure is the EL
being unreachable at startup; the supernode retries a few times then
exits if the EL never comes up. Rewording the bullet to reflect that
and pulling "still syncing" out of the failure causes.

* docs(interop-prep): drop "chain container fails to start" bullet

The bullet diagnosed a basic connectivity error that the supernode
surfaces clearly in startup logs, and labelled one cause as "most
common" without data to back it. The other three bullets in the
section diagnose non-obvious failure modes; this one was the odd
one out. Removing it.

* docs(interop-prep): hedge dependency-set auto-load with manual flag

Activation timestamp loads from rollup config today; dependency set
does not yet. Reframe line 160 to state each fact separately and
document the manual --vn.all.interop.dependency-set flag plus the
JSON shape for the two-chain OP Sepolia + Unichain Sepolia set.

* Update docs/public-docs/notices/interop-prep.mdx

Co-authored-by: soyboy <85043086+sbvegan@users.noreply.github.com>

* Update docs/public-docs/notices/interop-prep.mdx

Co-authored-by: soyboy <85043086+sbvegan@users.noreply.github.com>

* Update docs/public-docs/notices/interop-prep.mdx

Co-authored-by: soyboy <85043086+sbvegan@users.noreply.github.com>

* docs(interop-prep): document per-chain JWT override alongside vn.all

ajsutton noted the supernode supports per-VN JWT secrets via
vn.<chainID>.l2.jwt-secret; previous wording implied vn.all was the
only path. Default to vn.all for simplicity, document the per-chain
override for operators who want different secrets per VN.

* docs(interop-prep): drop manual dependency-set workaround block

ajsutton confirmed the supernode will load dep set from
superchain-registry transparently before any chain in the registry
schedules interop. Users are already used to hardfork timestamps
being picked up from the rollup config, so no explicit callout is
needed for interop either.

* docs(interop-prep): reframe op-challenger callout as data requirement

ajsutton noted the previous wording ("additionally need historical
proofs enabled") understated the difference — permissionless proofs
need substantially more retention than the 7-day supernode-backfill
baseline. Restating as "additional historical data requirements that
go beyond the 7-day baseline above" so the contrast is explicit, and
the historical-proofs tutorial owns the actual numbers.

* docs(interop-prep): note authrpc bind-addr for Docker/cross-host

ajsutton flagged that the example's --authrpc.addr=127.0.0.1 silently
breaks for op-reth running in Docker or on a different host from the
supernode. Adding a sibling note covering both cases — 0.0.0.0 (or a
specific interface IP) plus network-layer access control — without
changing the conservative same-host default in the example itself.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: soyboy <85043086+sbvegan@users.noreply.github.com>
…20927)

* circleci: complete optimism Gen2 migration (22 jobs across 3 files)

Closes the Gen2 migration work in optimism by swapping the final 22 jobs
with explicit pre-Gen2 resource_class. Same-size mechanical replacement.

By file:
  main.yml     (16 jobs): 13 docker + 3 machine
  rust-ci.yml  ( 5 jobs):  1 docker + 4 machine
  rust-e2e.yml ( 1 job ):  1 docker

By executor:
  15 docker swaps — same pattern validated by #20917 (41 jobs, all green)
  7  machine swaps — machine.gen2 is GA per CircleCI changelog

Previously held items now included:
  - 7 machine jobs (kona-*, contracts-bedrock-upload, preimage-
    reproducibility, generate-flaky-report) — machine.gen2 GA validated
  - op-reth-compact-codec  — same-size swap orthogonal to prior OOM concern
  - op-reth-e2e-sysgo-tests — separate-review hold was administrative

Note: bedrock-go-tests (originally in the 23-job plan) was already migrated
in #20917 and is dropped from this batch; final count is 22, not 23.

Follows #20917 (41 jobs) and #20899 (pilot).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: drop zstd version pin (1.4.8*) so apt install works on Ubuntu 24.04

apt-get install zstd=1.4.8* worked on the Gen1 machine executor (Ubuntu
22.04 / jammy, which ships zstd 1.4.8+dfsg-3build1) but fails on the
Gen2 machine executor's default image (Ubuntu 24.04 / noble, which
ships zstd 1.5.5+dfsg2-2 — no 1.4.8 available).

zstd CLI is backward-compatible across 1.4 and 1.5 for archive
extraction, so removing the pin is safe. This unblocks the Gen2
machine.gen2 migration in #20927.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(op-node): support post-exec span batches

* feat(op-node): gate PostExec span batch txs on SDM toggle

Add an IsSDM rollup toggle and reject span batches that carry PostExec
transactions in any block where SDM is not active. Refactor
DeriveSpanBatch to take *rollup.Config so the gate has access to per-block
timestamps and the toggle.

The Interop association in IsSDM is a placeholder until the activation
fork is decided; replace with return false to disable during development.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: gather all config from generate nut bundle script to fix sync issues

* feat: unified predeploys source of truth

* fix: include deprecated predeploys deployments

* fix: proxy all predeploys

* refactor: use lib string to cmp strings

* fix: argument order in ImplementationConfig

* refactor: add useInterop to L2CM tests

* fix: remove skipStandardDeploy natspec

* test: add cgt and interop tests for l2genesis fork upgrade

* fix: set correct name for l1 block attributes

* feat: add getUpgradeableRecords to predeploys lib

* feat: add L2CM feature to ci matrix

* fix: deploy config comment

* fix: remove l2cm from ci matrix

* fix: update L2CM version

* fix: replace sysfeature gate for isCGT and isInterop booleans

* fix: remove console import

* fix: remove gas profiling comments

* fix: rename buildImplementationDeploymentConfigs without underscore

* fix: remove claude docs

* feat: replace notProxied hardcoded logic with queries to the authoritative registry

* refactor: remove hardcoded address from getName.

A Predeploy name can always be derived from the Predeploy registry,
regardless of whether it has a variant.

* test: add getName test for proxy variants

* test: updates helper function _customGasTokenCodeDiffer to be derived from the Predeploy Registry and removes _interopCodeDiffer

* fix: add skipIfUnoptimized in L2GenesisForkUpgrade test

* fix: fork toml

* docs: add comment on scripts noting that bundle execution needs optimized compilation

* refactor: replace implementations struct by implementations record array

* docs: update natspec

* test: check impl record length and values

* refactor: move helper findImpl into utils lib

* chore: dev note for places to update when adding a new predeploy

* chore: just pr

* chore: restore op-core nuts files to develop state

* chore: add L2CM to code exemptions to deal with bytecode size limits

* chore: adjust gas limits

* chore: just pr

* chore: just pr

* chore: gas adjustments

* test: fix override for _executeCurrentBundle

* chore: just pr

* test: add OPTIMISM_MINTABLE_ERC721_FACTORY to _requireInitialization helper

* test: check impl records have no zero address

* test: add test that enforces Predeploy ordering for isProxied and isDeprecated

* chore: just pr

* revert: restore superchain-registry submodule to develop pointer

* chore: gas limit adjustments

* refactor: expose internal _buildImplementationDeploymentConfigs through harness

* refactor: remove repeated code

* chore: just pr

---------

Co-authored-by: IamFlux <175354924+0xiamflux@users.noreply.github.com>
* docs: update kona prestate build instructions

* remove prestate variant section
…0947)

Deletes the in-process supervisor RPC client and its API interface
definitions. Nothing outside op-supervisor itself dials the supervisor
from Go anymore — the three pieces below formed a closed loop with no
external consumer:

- op-service/sources/SupervisorClient (+ test)
- op-service/dial.DialSupervisorClientWithTimeout (zero call sites)
- op-service/apis.SupervisorAPI/SupervisorAdminAPI/SupervisorQueryAPI
  (used only by the deleted client and by the supervisor's own
  frontend)

The three API interfaces are moved into op-supervisor/supervisor/frontend
alongside their only remaining user. The small isNotFound helper that
op-service/sources/interop_filter_client.go was borrowing from the
deleted file is inlined into interop_filter_client.go.

Reduces the external dependency surface on op-supervisor/supervisor/types
by two callers (op-service/{apis,sources}).
…art (#20945)

The supernode CL's safety.Finalized advances in-memory before the
corresponding forkchoiceUpdated is delivered to and persisted by the EL.
If RestartWithFreshDataDir is called inside that window, op-reth still
reports finalized=L2 genesis to the fresh op-node, which then correctly
resets the pipeline back to L2 genesis. The resulting "L1=0 / L2=genesis"
SafeDB pin makes FirstSafeHeadTimestamp return the genesis time, the
cold-start backfill window collapses to empty, and the assertion
first sealed timestamp < FirstVerifiableTimestamp fires.

Also wait on each EL's finalized label so op-reth has persisted the
advance before we wipe the supernode data dir.

L2ELNode.AdvancedFn now defaults to block+30 polling attempts (was
block+3) and takes a varargs WithTimeout option so callers like this
test can extend the budget to match the CL wait (180 attempts = 360s).
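
A varargs option of this kind is commonly written with the functional-options pattern. The sketch below shows one possible shape under stated assumptions: `waitConfig`, `WaitOption`, and `pollUntil` are hypothetical names, and the real `WithTimeout` signature in the devstack DSL may differ (here it takes an attempt count).

```go
package main

import "fmt"

// defaultAttempts mirrors the 30-attempt default budget described above.
const defaultAttempts = 30

// waitConfig and WaitOption are hypothetical names for this sketch.
type waitConfig struct {
	attempts int
}

type WaitOption func(*waitConfig)

// WithTimeout overrides the polling budget, expressed here as an attempt count.
func WithTimeout(attempts int) WaitOption {
	return func(c *waitConfig) { c.attempts = attempts }
}

// pollUntil calls check until it returns true or the budget is spent.
// It returns the number of attempts used and whether check succeeded.
func pollUntil(check func() bool, opts ...WaitOption) (int, bool) {
	cfg := waitConfig{attempts: defaultAttempts}
	for _, opt := range opts {
		opt(&cfg)
	}
	for i := 1; i <= cfg.attempts; i++ {
		if check() {
			return i, true
		}
		// A real helper would sleep between attempts; omitted to keep the sketch fast.
	}
	return cfg.attempts, false
}

func main() {
	calls := 0
	slow := func() bool { calls++; return calls >= 100 }

	_, ok := pollUntil(slow)
	fmt.Println("default budget reached target:", ok)

	calls = 0
	used, ok := pollUntil(slow, WithTimeout(180))
	fmt.Println("extended budget reached target:", ok, "after", used, "attempts")
}
```

Existing callers that pass no options keep the default, so only tests that need the longer budget have to change.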

Refs: #20944
…unners, verbose go-test streaming (#20965)

* chore(ci): switch gotestsum format to standard-verbose

Print every test as it runs so the streamed CircleCI log captures
test-start events. Today --format=testname only emits a line when a
whole package finishes, so when a runner dies mid-job, the log is
truncated at a buffered package boundary and we lose all signal about
which test was actually running at the time of the kill.

This is the cheapest way to get per-test breadcrumbs without changing
where artifacts are written (artifacts are not produced when the runner
itself dies, so off-box capture isn't an option).

* chore(ci): cap go test package concurrency on heavy CI shards

Adds `-p` to bound how many test packages execute simultaneously:

- `go-tests-{short,full,fraud-proofs}`: `-p=4` on the gotestsum
  invocations. Previously unbounded (defaulted to GOMAXPROCS=16-20),
  so heavy e2e packages on `op-e2e/system/*` could all run at once
  on the same shard.
- `op-acceptance-tests`: drop `DEFAULT_JOBS` from `$CPU_COUNT` to a
  fixed `3`. Each acceptance test package launches a full devstack
  (L1 geth + op-node + op-reth/op-geth + batcher + proposer +
  challenger), so 20 concurrent packages saturated CPU and RAM on
  the runner.

Motivation: #20966 and the wave of
"context deadline exceeded" failures on memory-all-* jobs (e.g.
job 5097460, 24 simultaneous failures) both point at resource
contention from unbounded package-level test parallelism. The
resources tab on those runs showed sustained 100% CPU and 100% RAM.

`-parallel` (intra-package `t.Parallel()` cap) is left unchanged.

* chore(ci): rebalance acceptance -p/-parallel toward package concurrency

Previous cap of -p=3 / -parallel=10 made memory-all-* jobs run ~40m
(roughly 2x the develop baseline of ~21m). Wall time was dominated by
serializing 77 acceptance test packages through only 3 slots.

Only 13 of 177 acceptance tests call t.Parallel(), so -parallel is
essentially a no-op for this workload and the total concurrency budget
is dominated by -p. Bump -p to 8 (still well below the ~20 crash
threshold) and pin -parallel=2 to keep the few opt-in parallel tests
from compounding the concurrent-devstack count.

Expected wall time: ~15 min, in line with or slightly below the
pre-cap baseline.

* chore(ci): collapse acceptance concurrency to a single axis (-p only)

Previous -p=8 -parallel=2 (cap 16) was aimed wrong: I had under-counted
acceptance tests as ~7% parallel, but the actual ratio is ~83% parallel
once devstack's ParallelT wrapper (op-devstack/devtest/testing.go:409)
is counted. Each ParallelT test spins its own devstack inside the test
function, so the real concurrency cost is -p * -parallel, not -p alone.

Set -parallel=1 so intra-package parallelism is disabled and the only
knob is -p, which directly equals concurrent test devstacks. Pick -p=12
to land wall time near the ~21min pre-cap baseline while staying well
under the observed ~30-effective-devstack crash threshold.

* chore(ci): shard memory-all acceptance jobs across 2 runners

Apply `parallelism: 2` to all three memory-all-* variants and run each
shard with `-p=8` (down from `-p=12`). With ~205 test-minutes of total
work and a single ~7-minute longest test, a timing-balanced 2-way split
brings per-shard wall to ~13 min (vs ~22-24 min today) while cutting
per-box concurrent devstacks from 12 to 8.

Restores the test-level splitting recipe from #19832: enumerate tests
with `go test -list`, split via `circleci tests split --split-by=timings`,
feed the assigned subset back as a `-run=^(...)$` regex. Local runs and
single-node CI behave identically to today.

The new `acceptance_test_jobs` job parameter exports
ACCEPTANCE_TEST_JOBS only when non-empty, so the justfile's
DEFAULT_JOBS=12 stays in place for any caller that doesn't override.
* fix(op-service/eth): stabilize FuzzEncodeDecodeBlob

Blob.FromData formatted the entire input payload into the
ErrBlobInputTooLarge error message via %v on eth.Data (hexutil.Bytes).
For a single oversize input that produced a ~260KiB error string. Inside
FuzzEncodeDecodeBlob the oversize input flowed straight into
require.NoError, so each oversize execution allocated and formatted that
260KiB message; under coverage-instrumented minimization that work
multiplies across iterations until the harness reports
"fuzzing process hung or terminated unexpectedly while minimizing: EOF"
and throughput collapses to 0 execs/sec for the remainder of fuzztime.

Two minimal changes:

1. Format len(data) instead of data in the error in FromData. Add a
   regression assertion in TestTooLongDataEncoding that the error
   message stays under 1KiB.
2. In FuzzEncodeDecodeBlob, treat ErrBlobInputTooLarge as documented
   out-of-scope and return early, instead of asserting NoError. This
   removes the formatting amplification path entirely and preserves
   full input-size coverage (no caps).

The flake is intermittent because the fuzz mutator only
sporadically produces inputs above MaxBlobDataSize; runs that never hit
the oversize branch never trigger the stall.

Refs #20935

* fix(op-service/eth): make FuzzDetectNonBijectivity deterministic

The fuzz function captured a shared *rand.Rand outside the f.Fuzz closure
and used it to pick which bit to flip on each iteration. Go's fuzz engine
requires fuzz targets to be deterministic for the same input — it re-runs
inputs to verify reproducibility, to gather baseline coverage, and during
minimization. A shared rand source meant the same input produced different
behavior on re-execution, which causes worker subprocesses to stall during
minimization and ultimately exit with EOF (#20936).

Derive the bit to flip from a SHA-256 of the input instead, so the target
is a pure function of its input. The shared Blob buffer outside the closure
is fine — within one worker subprocess the fuzz callback runs sequentially.
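
A minimal sketch of deriving the flipped bit from the input alone is shown below. `bitToFlip` and `flipBit` are hypothetical helpers, and taking the first eight hash bytes modulo the bit count is an assumption about one reasonable way to map the hash to an index.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// bitToFlip derives a bit index in [0, nbits) purely from the input bytes,
// so re-running the same input always selects the same bit. nbits must be > 0.
func bitToFlip(input []byte, nbits uint64) uint64 {
	sum := sha256.Sum256(input)
	return binary.BigEndian.Uint64(sum[:8]) % nbits
}

// flipBit toggles the given bit in buf in place.
func flipBit(buf []byte, bit uint64) {
	buf[bit/8] ^= 1 << (bit % 8)
}

func main() {
	input := []byte("example fuzz input")
	nbits := uint64(len(input)) * 8

	first := bitToFlip(input, nbits)
	second := bitToFlip(input, nbits)
	fmt.Println("same bit on re-execution:", first == second)

	mutated := append([]byte(nil), input...)
	flipBit(mutated, first)
	fmt.Println("differs from original:", string(mutated) != string(input))
}
```

Because the index depends only on the input, the fuzz engine sees identical behavior when it replays the input for coverage or minimization.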

* ci(op-service): cap fuzz GOMAXPROCS to match the runner's vCPU budget

CircleCI Gen2 docker images expose the full host core count to Go's
runtime, so the fuzz harness was spawning 32 workers on an xlarge.gen2
runner (8 vCPU). The resulting 4x CPU oversubscription collapsed
throughput from ~7000 execs/sec/worker to <10 and broke the Go fuzz
coordinator-worker liveness heartbeats, surfacing as

  fuzzing process hung or terminated unexpectedly while minimizing: EOF

at fuzztime. Pinning GOMAXPROCS=8 restores stable throughput
(~450-550 execs/sec total over 8 workers, full 60s) and clears the
flake.

* Revert "ci(op-service): cap fuzz GOMAXPROCS to match the runner's vCPU budget"

This reverts commit e30866a.

* test(blob): truncate fuzz input and assert decode invariant

Address review feedback on FuzzDetectNonBijectivity: truncate input to
MaxBlobDataSize before hashing so bit selection is derived from the
encoded data, and check NotEqual only on successful decode.
…20963)

The DSL `L2Batcher.Start()` helper retries up to 3 times on error. If
the first `StartBatcher` RPC actually starts the batcher but the client
sees a transient error (e.g. context/RPC hiccup), subsequent retries hit
"batcher is already running" and the helper fails permanently.

Mirror the symmetric tolerance that `Stop()` already has for "batcher is
not running" so transient RPC errors don't fail the test when the
batcher is in the desired state.

Observed as a flake in TestInteropFaultProofs_VariedBlockTimes.
…nterop (#20896)

* refactor(interop): move shared interop error sentinels into op-core/interop

Moves the interop result vocabulary (ErrFuture, ErrConflict, ErrSkipped,
ErrOutOfOrder, ErrDataCorruption, ErrAwaitReplacementBlock, GetErrorCode, ...)
out of op-supervisor/supervisor/types into the top-level op-core/interop
package. These sentinels are used across op-supervisor, op-interop-filter,
op-supernode (activity tracker and raftwallogdb), op-program (FPP
consolidation), so they don't belong in op-supervisor.

op-supervisor/supervisor/types/error.go is deleted; the new content lives
inline in op-core/interop/interop.go. Net result: op-supernode's
raftwallogdb no longer imports any op-supervisor package.

* fix: goimports ordering
Go test coverage adds CPU/memory overhead in CI without being used as
a gating signal. Removes -coverprofile/-coverpkg flags from the Go CI
test recipes and drops the cannon and fraud-proofs Codecov uploads.
Contracts (Solidity) coverage is unchanged.
…0946)

Moves MessageFromLog, DecodeExecutingMessageLog, and LogToLogHash out of
op-supervisor/supervisor/backend/processors and into
op-core/interop/messages. These functions are pure log-decoding helpers
that operate entirely on op-core/interop/messages types; they have no
coupling to supervisor backend state. Several non-supervisor consumers
(op-interop-filter, op-interop-mon, op-supernode, op-program) were
reaching into the supervisor processors package just for these helpers.

After this change there are no external imports of
op-supervisor/supervisor/backend/processors. The processors package
remains as the supervisor's internal log-ingestion pipeline
(chain_processor, log_processor, client) and consumes the helpers via
op-core/interop/messages alongside the rest of its callers.

EventDecoderFn is consolidated into log_processor.go since it is only
used there.

Labels

⤵️ pull merge-conflict Resolve conflicts manually