
evm-rpc: don't crash dump on Stable 0x3f transaction type (stable-testnet writer stall)#503

Open
elina-chertova wants to merge 1 commit into master from alert-fix/FIyTYe-stable-0x3f-txroot


Cause (proven)

dump-stable-testnet-0 (image subsquid/evm-dump:cf85ec9c, which is current master HEAD) is in a crash-loop: 51 restarts in 26h, deterministically stuck at block 57807799. Production stack trace from the crashed container:

Error: Unexpected case: 0x3f
    at unexpectedCase (/squid/util/util-internal/lib/misc.js:56:12)
    at encodeTransaction (/squid/evm/evm-rpc/lib/verification.js:331:50)
    at transactionsRoot (/squid/evm/evm-rpc/lib/verification.js:589:21)
    at async Rpc.mapBlock (/squid/evm/evm-rpc/lib/rpc.js:127:26)
  transactionIndex: 1, blockNumber: 57807799

The dump runs with --verify-tx-root (and --verify-tx-sender). Block 57807799 contains a transaction of type 0x3f (a Stable-chain–specific type). encodeTransaction / serializeTransaction only handle known EIP-2718 types and throw unexpectedCase(tx.type) for 0x3f. That throw is fatal → the process crashes → it re-reads the same block on restart → crash-loop. The raw dump never advances past 57807799, so the parquet writer (ingest-stable-testnet-0) has no new data and sqd_last_block_total goes flat → stable-testnet_Writer_Short_Stall.
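The failure mode can be illustrated with a minimal sketch. The names `encodeTransactionSketch` and `KNOWN_TYPES` are hypothetical stand-ins, not the actual evm-rpc code, and the set of handled types is an assumption; only the throw on an unrecognised type mirrors the stack trace above.

```typescript
// Hypothetical stand-in for the type dispatch inside encodeTransaction.
// The set of handled types is an assumption made for illustration.
const KNOWN_TYPES: ReadonlySet<string> = new Set(['0x0', '0x1', '0x2'])

interface SketchTx {
    type: string
    transactionIndex: number
}

function unexpectedCase(value: unknown): Error {
    return new Error(`Unexpected case: ${value}`)
}

function encodeTransactionSketch(tx: SketchTx): string {
    if (!KNOWN_TYPES.has(tx.type)) {
        // Unknown type: if nothing catches this during verification,
        // the process exits and retries the same block after restart.
        throw unexpectedCase(tx.type)
    }
    return `encoded(${tx.type})` // placeholder for the real RLP payload
}
```

Because the input block is identical on every restart, the throw is deterministic, which matches the observed crash-loop on a single block height.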

Cross-check (rule: anomaly must be confirmed against a different node implementation)

eth_getBlockByNumber(57807799, true) against two independent providers, Stable's own RPC (ar-partners-rpc-testnet.stable.xyz) and Alchemy (stable-testnet.g.alchemy.com), returns the identical block: same hash 0xa4d188…, same transactionsRoot 0x8b9d82ba…, and the same tx at index 1 with type 0x3f from 0xf5a637… with a real signature (r/s non-zero). So 0x3f is a legitimate Stable transaction type, not provider corruption, and the chain's own transactionsRoot (which we cannot reproduce without the 0x3f encoding spec) is consensus data we can trust. This is why swapping the RPC provider (open infra PR #571) does not fix it: Alchemy serves the exact same block.
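A cross-check of this kind could be scripted roughly as follows. This is a sketch only: `compareBlocks` and the `RpcBlock`/`RpcTx` shapes are hypothetical names covering just the fields compared above, and fetching the two responses is left out.

```typescript
// Minimal shapes for the fields being compared; a real
// eth_getBlockByNumber response carries many more.
interface RpcTx {
    hash: string
    type: string
    r: string
    s: string
}

interface RpcBlock {
    hash: string
    transactionsRoot: string
    transactions: RpcTx[]
}

// Hex strings may differ in letter case between providers.
const sameHex = (x: string, y: string): boolean => x.toLowerCase() === y.toLowerCase()

// Returns the names of mismatching fields; an empty list means agreement.
function compareBlocks(a: RpcBlock, b: RpcBlock): string[] {
    const diffs: string[] = []
    if (!sameHex(a.hash, b.hash)) diffs.push('hash')
    if (!sameHex(a.transactionsRoot, b.transactionsRoot)) diffs.push('transactionsRoot')
    if (a.transactions.length !== b.transactions.length) {
        diffs.push('transactions.length')
        return diffs
    }
    for (let i = 0; i < a.transactions.length; i++) {
        const x = a.transactions[i]
        const y = b.transactions[i]
        if (!sameHex(x.hash, y.hash)) diffs.push(`tx[${i}].hash`)
        if (!sameHex(x.type, y.type)) diffs.push(`tx[${i}].type`)
        if (!sameHex(x.r, y.r) || !sameHex(x.s, y.s)) diffs.push(`tx[${i}].signature`)
    }
    return diffs
}
```

An empty result for block 57807799 from both providers is what supports trusting the block's own transactionsRoot; any non-empty result would point to the escalation path instead.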

Fix

Mirror the existing Polygon PIP-74 (0x7f) handling already in calculateTransactionsRoot:

  • calculateTransactionsRoot: for isStable, if a block contains a 0x3f tx, return the block's own transactionsRoot (skip recomputation we can't perform).
  • recoverTxSender: for isStable, skip sender recovery for 0x3f txs (returns undefined; mapBlock already continues on a null sender).

This removes the fatal throw, so a known-good but unencodable tx type can no longer crash-loop the dump, while every other block is still verified. It covers both stable-testnet (2201) and stable-mainnet (988), which share isStable.
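The shape of the two guards described above, as a simplified sketch rather than the actual diff. The function names, the `recompute`/`recover` callbacks, and the representation of the type as the hex string `'0x3f'` are all assumptions made for illustration.

```typescript
// Assumed representation: tx.type as a hex string, as returned over JSON-RPC.
const STABLE_CUSTOM_TX_TYPE = '0x3f'

interface StableTx {
    type: string
}

interface StableBlock {
    transactionsRoot: string
    transactions: StableTx[]
}

// Hypothetical wrapper: trust the block's own root when it cannot be recomputed.
function transactionsRootOrTrusted(
    block: StableBlock,
    isStable: boolean,
    recompute: (txs: StableTx[]) => string
): string {
    if (isStable && block.transactions.some(tx => tx.type === STABLE_CUSTOM_TX_TYPE)) {
        return block.transactionsRoot // skip recomputation for this block only
    }
    return recompute(block.transactions)
}

// Hypothetical wrapper: skip sender recovery for the unencodable type.
function recoverSenderOrSkip(
    tx: StableTx,
    isStable: boolean,
    recover: (tx: StableTx) => string
): string | undefined {
    if (isStable && tx.type === STABLE_CUSTOM_TX_TYPE) return undefined
    return recover(tx)
}
```

Both guards are gated on `isStable`, so a 0x3f type on any other chain still reaches the normal path and fails loudly.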

After merge, the evm-dump image must be rebuilt and the evm-archive dump for stable-testnet bumped to it.

Falsification

If the dump still crash-loops after deploying an image built from this commit — e.g. it crashes on a different unhandled tx type, or in receipts-root/logs-bloom rather than tx-root/tx-sender — then this fix is insufficient and the offending type must be inspected. Also: if a future cross-check showed the two providers disagreeing on transactionsRoot for such a block, trusting the block's root would be wrong and the data would need provider escalation instead.

Stable chain (chainId 988 mainnet / 2201 testnet) emits a custom 0x3f
transaction type that encodeTransaction/serializeTransaction cannot
RLP-encode, so they throw unexpectedCase(0x3f). With --verify-tx-root
(and --verify-tx-sender) enabled the evm-dump process crashes on any
block containing such a tx and crash-loops on it forever, stalling the
writer.

Mirror the existing Polygon PIP-74 (0x7f) handling: for Stable, skip
transactions-root verification for blocks containing a 0x3f tx (trust the
block's own transactionsRoot, which is identical across independent
providers) and skip sender recovery for 0x3f txs.