fix(load-gen): explicit nonce + unique EthFlow sellAmount (COW-1080) by brunota20 · Pull Request #53 · bleu/nullis-shepherd

brunota20 · 2026-06-19T16:00:17Z

Summary

COW-1080 - resolves the 5/270 TWAP + 1/270 EthFlow delivery rate (every other submission reverted) that the COW-1079 baseline 5x5 run surfaced. Two distinct root causes, both contract-side:

1. **Nonce race.** Anvil's `eth_sendTransaction` against an impersonated account auto-assigns a nonce when none is provided, but that assignment races with burst submissions. When load-gen fired 10 txs per block without waiting for receipts, most sat in the mempool with the same nonce; the miner picked one per block and the rest reverted as nonce-too-low. Fix: read the nonce once at boot, increment it per successful submission, and pin `tx.nonce` explicitly.
2. **EthFlow OrderUid dedup.** `CoWSwapEthFlow.createOrder` dedups by the GPv2 `OrderUid` (keccak of buyToken/receiver/sellAmount/buyAmount/appData/feeAmount/validTo/partiallyFillable/kind/sellTokenSource/buyTokenDestination). `quoteId` is not part of that hash. The prior load-gen varied only `quoteId`, so all 270 EthFlow calls produced the same UID and 269/270 reverted with `OrderIsAlreadyOwned`. Fix: vary `sellAmount` by 1 wei per call (and match `msg.value` accordingly).
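The nonce fix can be sketched as a small counter that is seeded once and advanced only when a submission succeeds. This is a minimal illustration under stated assumptions, not the load-gen code: `NonceTracker`, `peek`, and `submit` are hypothetical names, and the `send` closure stands in for whatever call builds the `TransactionRequest` and submits it.

```rust
/// Hypothetical local nonce counter: seeded once from the chain at boot,
/// advanced only after a submission is accepted.
#[derive(Debug)]
pub struct NonceTracker {
    next: u64,
}

impl NonceTracker {
    /// `on_chain_nonce` stands in for the value read once at boot
    /// (for example via eth_getTransactionCount).
    pub fn new(on_chain_nonce: u64) -> Self {
        Self { next: on_chain_nonce }
    }

    /// Nonce that will be pinned on the next transaction request.
    pub fn peek(&self) -> u64 {
        self.next
    }

    /// Calls `send` with the explicit nonce. The counter advances only
    /// when `send` returns Ok, so a failed submission reuses the nonce.
    pub fn submit<T, E>(
        &mut self,
        send: impl FnOnce(u64) -> Result<T, E>,
    ) -> Result<T, E> {
        let out = send(self.next)?;
        self.next += 1;
        Ok(out)
    }
}
```

Because the counter is taken by `&mut self`, two submissions cannot draw the same nonce at once, which is the property the auto-assigned path lacked under bursts.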

Validation

Re-ran scripts/load-run.sh baseline 5x5 after both fixes:

| Metric | First run (PR #52) | This PR |
| --- | --- | --- |
| TWAP events delivered | 5 / 270 | 130 / 130 |
| EthFlow events delivered | 1 / 270 | 130 / 130 |
| `shepherd_cow_api_submit_total{outcome="ok"}` | 1 | 130 |
| TWAP block dispatch p99 | 7 ms | 49 ms |
| EthFlow log dispatch p99 | 10 ms | 11 ms |
| `shepherd_module_errors_total` | 0 | 0 |

(TWAP block p99 grew from 7 ms to 49 ms because the engine now actually has 130 watches to poll per block instead of 5. p99 = 49 ms is still 40x under the < 2 s bar.)

Updated baseline report at docs/operations/load-reports/load-5x5-2026-06-19.md from "conditional pass" to full PASS. Medium 20x20 and saturation 50x50 are unblocked.

Stack

feat/load-gen-calibration-cow-1080 -> feat/load-test-anvil-cow-1079 (PR #52) -> ... -> feat/resolve-app-data-cow-1074 (PR #47).

Follow-ups still surfacing (not in scope)

- scripts/load-bootstrap.sh PID-file truncation - if a previous run leaked the engine, the new bootstrap can't tear it down. Hit this between this calibration smoke and the final re-run; a manual `pkill nexum-engine` cleared it. PID-by-port teardown would be the proper fix.
- Cold-start outlier on the first watch-heavy block (474 ms vs. 34-50 ms steady). Probably redb's first-write barrier + cold eth_call. Confirm/diagnose under the medium scenario.

Linear

COW-1080 ready to move to In Review.

AI assistance disclosure: drafted by Claude (Opus 4.7, 1M context).

COW-1079 baseline's 5/270 TWAP + 1/270 EthFlow delivery rate (every other submission reverted) had two distinct root causes, both contract-side, neither shepherd's fault:

1. **Nonce race in burst submissions.** Anvil's `eth_sendTransaction` against an impersonated account auto-assigns a nonce when none is provided, but the assignment races with the caller's burst submission. When load-gen fired 5 TWAP + 5 EthFlow per block without waiting for individual receipts, most txs landed in the mempool sharing the same nonce, and Anvil's miner included only one per block - the rest reverted as nonce-too-low. Fix: read the EOA's current nonce at boot, increment locally per successful submission, and pin `tx.nonce` explicitly on every `TransactionRequest`. The counter advances in lock-step with the submission loop, so it never gets out of sync across async boundaries.

2. **EthFlow OrderUid dedup on identical GPv2 OrderData.** The CoWSwapEthFlow contract dedups by the GPv2 `OrderUid`, which is keccak over (buyToken, receiver, sellAmount, buyAmount, appData, feeAmount, validTo, partiallyFillable, kind, sellTokenSource, buyTokenDestination). `quoteId` is NOT part of that hash. The prior load-gen varied only `quoteId` per call, so all 270 EthFlow submissions produced the same UID and the contract rejected 269/270 as `OrderIsAlreadyOwned`. Fix: vary `sellAmount` by 1 wei per call (`BASE_SELL_AMOUNT + seq`) and pass that same value as `msg.value` so the contract's `msg.value == order.sellAmount` invariant holds.

Re-ran baseline 5x5 after both fixes: 130/130 TWAP + 130/130 EthFlow delivered, 130 ConditionalOrderCreated + 130 OrderPlacement events on-chain, 130 cow_api submits OK to mock, 130 ethflow markers written, zero shepherd_module_errors_total.

Updated baseline report at docs/operations/load-reports/load-5x5-2026-06-19.md from 'conditional pass' to 'full PASS' with the post-calibration numbers (TWAP block p99 = 49 ms, EthFlow log p99 = 11 ms, 40x margin on the < 2 s bar).

Medium 20x20 and saturation 50x50 are now unblocked per the COW-1079 acceptance roadmap.

AI assistance disclosure: drafted by Claude (Opus 4.7, 1M context).
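Why varying `quoteId` alone collides while `BASE_SELL_AMOUNT + seq` does not can be shown with a toy model. This is a sketch under loud assumptions: `uid_stand_in` uses the standard library hasher in place of the real keccak-based `OrderUid`, `HashedFields` carries only three of the hashed fields, and the value of `BASE_SELL_AMOUNT` is made up for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed value for illustration only (0.01 ETH in wei).
pub const BASE_SELL_AMOUNT: u128 = 10_000_000_000_000_000;

/// Subset of the fields that feed the order hash. `quote_id` lives
/// outside this struct because it does not contribute to the UID.
#[derive(Hash, Clone, Debug, PartialEq)]
pub struct HashedFields {
    pub sell_amount: u128,
    pub buy_amount: u128,
    pub valid_to: u32,
}

/// One EthFlow call as the load-gen would issue it (hypothetical shape).
pub struct EthFlowCall {
    pub hashed: HashedFields,
    pub quote_id: i64,
    pub msg_value: u128,
}

/// Stand-in for the OrderUid: NOT the real digest, only a function of
/// the same kind of inputs, so identical inputs collide the same way.
pub fn uid_stand_in(f: &HashedFields) -> u64 {
    let mut h = DefaultHasher::new();
    f.hash(&mut h);
    h.finish()
}

/// Builds the call for sequence number `seq`: sellAmount moves by 1 wei
/// per call and msg.value is set to the same amount.
pub fn build_call(seq: u64) -> EthFlowCall {
    let sell_amount = BASE_SELL_AMOUNT + seq as u128;
    EthFlowCall {
        hashed: HashedFields { sell_amount, buy_amount: 1, valid_to: u32::MAX },
        quote_id: seq as i64,
        msg_value: sell_amount,
    }
}
```

Two calls built with the same `seq` but different `quote_id` still share a UID in this model, which mirrors the `OrderIsAlreadyOwned` rejections seen before the fix.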

linear-code · 2026-06-19T16:00:21Z

COW-1080

…(COW-1079)

Closes the COW-1079 three-scenario sweep with the COW-1080 calibration in place. All three scenarios pass:

- baseline 5x5: 130/130 each, TWAP block p99 = 49 ms
- medium 20x20: 280/280 each, TWAP block p99 = 67 ms
- saturation 50x50: 300/300 each, TWAP block p99 = 78 ms

Latency growth across the watch-count range (130 -> 280 -> 300) is sub-linear end to end: 49 -> 67 -> 78 ms, i.e. about 2.3x the watches for about 1.6x the p99. The lgahdl PR #9 concern about sequential per-module dispatch saturating under load is NOT surfaced at this scale. Zero shepherd_module_errors_total, zero traps, zero EthFlow submit errors across all three runs.

The unexpected finding from saturation: the engine did not saturate. The bottleneck is load-gen's sequential eth_sendTransaction submission (each tx ~200 ms RTT, so 100 tx/iteration = ~20 s, vs. Anvil's 1 s block time). To genuinely saturate the engine we would need parallel load-gens against different impersonated EOAs, a sub-second block time, or thousands of pre-seeded watches.

EthFlow log p99 stayed flat at ~9 ms across all three scenarios (it is dominated by the cow-api submit roundtrip, not engine state), confirming the submit path scales independently of the watch count.

The cold-start outlier (~500 ms on the first watch-heavy block) appears consistently across runs and is independent of the steady-state watch count - it is a one-shot first-block redb/eth_call warmup cost, NOT a saturation symptom.

What this proves:

- Shepherd M4 supervisor handles >= 300 concurrent watches + >= 138 block dispatch cycles in 2 min with p99 < 80 ms.
- cow-api submit path is steady at ~9 ms p99 regardless of watch count.
- Zero error/trap/poison across all three scenarios.

What it does NOT prove (and is not in scope here):

- Behaviour at 3000+ watches.
- WS reconnect resilience (COW-1031 soak).
- Multi-day memory drift (COW-1031).
- Real-orderbook 4xx variety (COW-1078 backtest).

COW-1079 ready to move to In Review.

AI assistance disclosure: drafted by Claude (Opus 4.7, 1M context).
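The submission-bottleneck arithmetic (100 tx at ~200 ms each against a 1 s block time) can be checked with a short sketch. The function names are hypothetical and the model is deliberately crude: it assumes every transaction waits one full round trip and ignores any pipelining.

```rust
/// Time to submit one iteration when each transaction waits a full
/// round trip before the next one is sent.
pub fn sequential_iteration_ms(tx_per_iteration: u64, rtt_ms: u64) -> u64 {
    tx_per_iteration * rtt_ms
}

/// Number of blocks mined while one iteration is being submitted,
/// rounded up. `block_time_ms` must be non-zero.
pub fn blocks_spanned(iteration_ms: u64, block_time_ms: u64) -> u64 {
    (iteration_ms + block_time_ms - 1) / block_time_ms
}

/// Average number of transactions landing per block at that pace.
/// `blocks` must be non-zero.
pub fn effective_tx_per_block(tx_per_iteration: u64, blocks: u64) -> u64 {
    tx_per_iteration / blocks
}
```

With the figures quoted above, `sequential_iteration_ms(100, 200)` gives 20 000 ms, which spans 20 one-second blocks, so on average only about 5 of the nominal 100 transactions reach each block. That is why a single load-gen could not push the engine to saturation.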

…079)

The single-EOA saturation 50x50 report identified the per-EOA nonce serialisation as the bottleneck before the engine had a chance to saturate. This commit removes that bottleneck.

load-gen:

- New --parallel N flag. Each worker impersonates a synthetic EOA (0x57...01..0a), gets its own WS connection + nonce stream, and runs its own per-block submission loop. Total events per block scales linearly with N.
- Disjoint salt space per worker via 96-bit prefix.
- Disjoint EthFlow sellAmount space via a 10_000-wide per-worker window (the first attempt shifted by 96 bits, blowing past the 1M ETH funded balance with 7.9e28 wei sellAmounts; fixed).

scripts/load-bootstrap.sh + scripts/load-run.sh:

- Accept --block-time (passes to anvil) and --parallel (passes to load-gen). Defaults preserve historic behaviour: --block-time 1, --parallel 1.
- Auto-report filename now includes scenario label (load-NxM-SCENARIO-date.md) so saturation-parallel does not overwrite the baseline 5x5 report.

Saturation-parallel run (10 workers x 5 TWAP + 5 EthFlow per block, --block-time 0.5, 2 min):

- load-gen: 895/895 TWAP + 895/895 EthFlow acks, 0 errors.
- engine saw 381 ConditionalOrderCreated + 343 OrderPlacement events (43% / 38% delivery vs load-gen acks - Anvil + WS dropping under the heavier load).
- shepherd_module_errors_total = 0, zero traps.
- All 343 EthFlow submissions reached the mock orderbook 1:1.
- TWAP block dispatch: histogram p50/p99 = 145 ms, max = 101 593 ms (101 s outlier on one block when 380+ watches polled against a stressed Anvil JSON-RPC).
- Engine-log dispatch_block: n=586, p50=4ms, p95=46ms, p99=74ms, max=101 593 ms - same outlier.

Saturation knee identified: 380+ active watches + 0.5s block-time + 10 concurrent WS subscribers produces a 101-second worst-case dispatch and only 38-43% event delivery (57-62% loss).

Both symptoms point at the surrounding system (Anvil + WS transport), not at shepherd; the engine continues to scale sub-linearly with watch count and never produces a module error, trap, or panic under any tested configuration.

For the 7-day COW-1031 soak: this implies the operator should use a paid Sepolia archive endpoint (Alchemy / drpc / QuickNode), not publicnode, OR accept event drops and rely on supervisor reconnect + eth_getLogs re-indexing. Documented in the new report.

Report at docs/operations/load-reports/load-50x50-parallel-2026-06-19.md.

AI assistance disclosure: drafted by Claude (Opus 4.7, 1M context).
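The difference between the 10_000-wide per-worker window and the abandoned 96-bit shift can be sketched as below. The constants other than the window width and the 1M ETH balance are assumptions, the function names are hypothetical, and returning `None` when `seq` leaves the window is my choice for the sketch, not something the commit describes.

```rust
pub const WEI_PER_ETH: u128 = 1_000_000_000_000_000_000;
/// Funded balance per worker EOA, per the commit message (1M ETH).
pub const FUNDED_BALANCE: u128 = 1_000_000 * WEI_PER_ETH;
/// Width of each worker's sellAmount window, per the commit message.
pub const WINDOW: u128 = 10_000;
/// Assumed base amount for illustration only (0.01 ETH in wei).
pub const BASE_SELL_AMOUNT: u128 = 10_000_000_000_000_000;

/// Windowed scheme: worker `w` owns the half-open range
/// [BASE + w * WINDOW, BASE + (w + 1) * WINDOW).
/// Returns None once `seq` would spill into the next worker's range.
pub fn windowed_sell_amount(worker: u128, seq: u128) -> Option<u128> {
    if seq >= WINDOW {
        return None;
    }
    Some(BASE_SELL_AMOUNT + worker * WINDOW + seq)
}

/// First-attempt scheme: worker index shifted left by 96 bits.
/// Any worker index >= 1 adds at least 2^96 wei (about 7.9e28).
pub fn shifted_sell_amount(worker: u128, seq: u128) -> u128 {
    (worker << 96) + BASE_SELL_AMOUNT + seq
}
```

In this model `shifted_sell_amount(1, 0)` already exceeds `FUNDED_BALANCE` (1e24 wei), while the windowed amounts for ten workers stay within 100 000 wei of the base and never overlap.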

brunota20 added 2 commits June 19, 2026 13:14

brunota20 mentioned this pull request Jun 22, 2026: feat(ops): orderbook EthFlow indexer baseline tool (COW-1084) #57 (Open, 3 tasks)

brunota20 wants to merge 3 commits into feat/load-test-anvil-cow-1079 from feat/load-gen-calibration-cow-1080
