fix(event_loop) + docs(m3): block-only subscriptions + M3 testnet runbook (validated) by brunota20 · Pull Request #32 · bleu/nullis-shepherd

brunota20 · 2026-06-18T12:15:27Z

What does this PR do?

Two coupled commits:

`fix(event_loop)` - engines whose modules only declare `[[subscription]] kind = "block"` (or only `kind = "log"`) no longer bail at boot. `select_all` on an empty Vec yielded `None` immediately, tripping the "stream ended → shut down" arm before any event flowed. Replaced empty input with `stream::pending()` so the arm is never selected.
`docs(m3)` - M3 testnet runbook + `engine.m3.toml` + `just run-m3`, sister doc to PR #31 (the M2 runbook: testnet runbook + `engine.m2.toml` + `just run-m2`, validated boot). All 3 M3 example modules (price-alert, balance-tracker, stop-loss) boot against Sepolia and exercise their full strategy path on the first block dispatch.
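
To make the empty-Vec edge case concrete, here is a minimal std-only model of the behavior. It is a sketch under stated assumptions, not the engine code: `PollSource`, `Merged`, `NeverReady`, `Queue`, and `merge_or_pending` are hypothetical names, and the real fix uses `select_all` and `stream::pending()` from the `futures` crate rather than these types.

```rust
use std::collections::VecDeque;
use std::task::Poll;

/// Hypothetical stand-in for a stream. Each poll yields an item,
/// reports "nothing yet" (Pending), or reports "closed" (Ready(None)).
trait PollSource {
    fn poll_next(&mut self) -> Poll<Option<u64>>;
}

/// Models a merge over zero or more inner sources.
struct Merged {
    inner: Vec<Box<dyn PollSource>>,
}

impl PollSource for Merged {
    fn poll_next(&mut self) -> Poll<Option<u64>> {
        let mut i = 0;
        while i < self.inner.len() {
            match self.inner[i].poll_next() {
                Poll::Ready(Some(v)) => return Poll::Ready(Some(v)),
                // A closed inner source is dropped from the merge.
                Poll::Ready(None) => {
                    self.inner.remove(i);
                }
                Poll::Pending => i += 1,
            }
        }
        if self.inner.is_empty() {
            // No sources left means "closed". A merge built from an
            // empty Vec reaches this branch on its very first poll.
            Poll::Ready(None)
        } else {
            Poll::Pending
        }
    }
}

/// Models a placeholder that never yields and never closes.
struct NeverReady;

impl PollSource for NeverReady {
    fn poll_next(&mut self) -> Poll<Option<u64>> {
        Poll::Pending
    }
}

/// A source backed by a queue; it reports "closed" once drained.
struct Queue(VecDeque<u64>);

impl PollSource for Queue {
    fn poll_next(&mut self) -> Poll<Option<u64>> {
        Poll::Ready(self.0.pop_front())
    }
}

/// The shape of the fix: only merge when there is something to merge.
fn merge_or_pending(sources: Vec<Box<dyn PollSource>>) -> Box<dyn PollSource> {
    if sources.is_empty() {
        Box::new(NeverReady)
    } else {
        Box::new(Merged { inner: sources })
    }
}
```

In this model, polling `Merged { inner: Vec::new() }` reports "closed" at once, which is the signal the shutdown arm reacts to, while `merge_or_pending(Vec::new())` stays pending. A non-empty merge still reports "closed" after its last inner source ends, so the bail-on-close behavior is kept for real streams.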

Why

The M3 milestone was unit + integration tested (145 host tests + 6 doctests + 5 supervisor integration tests) but had never been exercised against a real chain. Wiring up `engine.m3.toml` to do that surfaced the event_loop bug - all 3 M3 modules are block-only, which is a config shape M1 had never exercised.

Validated locally on Sepolia

A single block dispatch (~10s wall clock) drove all 3 strategy paths:

| Module | Observed |
| --- | --- |
| price-alert | `TRIGGERED answer=174553978080 threshold=250000000000 (Below)` (Sepolia ETH/USD Chainlink feed at $1745.54, below the $2500 trigger) |
| balance-tracker | 2 `eth_getBalance` calls (one per configured address), multi-key local-store path |
| stop-loss | `eth_call` oracle → `OrderCreation::from_signed_order_data` with `Signature::PreSign` → `cow-api::submit-order` 561B → orderbook returns typed `TransferSimulationFailed` → `classify_api_error` correctly classifies as `RetryAction::TryNextBlock` → `retry on next block (0)` |
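
The price-alert numbers can be checked by hand. The sketch below assumes the feed answer is a fixed-point integer with 8 decimals (the usual scale for Chainlink USD feeds) and that answers are non-negative. `FEED_DECIMALS`, `Direction`, `is_triggered`, and `format_usd` are hypothetical names for illustration, not the module's API, and `Direction::Above` is an assumed counterpart to the `Below` seen in the log.

```rust
/// Assumed fixed-point scale of the feed answer (8 decimals).
const FEED_DECIMALS: u32 = 8;

/// `Below` appears in the log line; `Above` is a hypothetical counterpart.
#[derive(Debug, PartialEq)]
enum Direction {
    Below,
    Above,
}

/// Compare raw fixed-point values directly; no float conversion needed.
fn is_triggered(answer: i128, threshold: i128, direction: &Direction) -> bool {
    match direction {
        Direction::Below => answer < threshold,
        Direction::Above => answer > threshold,
    }
}

/// Render a non-negative fixed-point answer as dollars, rounded to cents.
fn format_usd(answer: i128) -> String {
    let scale = 10_i128.pow(FEED_DECIMALS);
    // Add half a cent before truncating so the result rounds to nearest.
    let rounded = answer + scale / 200;
    let whole = rounded / scale;
    let cents = (rounded % scale) / (scale / 100);
    format!("${}.{:02}", whole, cents)
}
```

With the logged values, `is_triggered(174_553_978_080, 250_000_000_000, &Direction::Below)` is true, and `format_usd` renders the two raw values as `$1745.54` and `$2500.00`.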

The stop-loss rejection is the SDK retry contract working: the default config's owner does not hold the sell token, so the orderbook simulation fails; the SDK's `classify_api_error` maps that to retriable; the watch is preserved for the next block.
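
The retry contract described above can be pictured roughly as follows. This is an illustrative sketch only: the SDK's real `classify_api_error` signature and error types are not shown in this PR, so `ApiError`, `classify`, `keep_watch`, and the `Unrecognized` and `Drop` variants are hypothetical. Only the `TransferSimulationFailed` to `RetryAction::TryNextBlock` mapping comes from the description.

```rust
/// Hypothetical error type; only `TransferSimulationFailed` is named in the PR.
#[derive(Debug, PartialEq)]
enum ApiError {
    TransferSimulationFailed,
    Unrecognized(String),
}

/// `TryNextBlock` is named in the PR; `Drop` is an assumed non-retriable outcome.
#[derive(Debug, PartialEq)]
enum RetryAction {
    TryNextBlock,
    Drop,
}

/// Stand-in for the SDK's `classify_api_error` (real signature unknown).
fn classify(err: &ApiError) -> RetryAction {
    match err {
        // The owner may hold the sell token by the next block, so retry.
        ApiError::TransferSimulationFailed => RetryAction::TryNextBlock,
        // Assumption: anything unrecognized is treated as non-retriable.
        ApiError::Unrecognized(_) => RetryAction::Drop,
    }
}

/// Decide whether the watch should survive until the next block.
/// Assumption: a successful submit needs no further retry.
fn keep_watch(result: &Result<(), ApiError>) -> bool {
    match result {
        Ok(()) => false,
        Err(err) => classify(err) == RetryAction::TryNextBlock,
    }
}
```

Under these assumptions, `keep_watch(&Err(ApiError::TransferSimulationFailed))` returns true, which corresponds to the "watch is preserved for the next block" outcome observed on Sepolia.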

Changes

| File | Change |
| --- | --- |
| `crates/nexum-engine/src/runtime/event_loop.rs` | Replace `select_all(empty)` with `stream::pending()` for each side; cite the original "bail on WS drop" intent so future readers do not regress it |
| `crates/nexum-engine/src/supervisor/tests.rs` | New `run_does_not_bail_when_both_stream_kinds_are_empty` regression test (47 nexum-engine tests, was 46) |
| `engine.m3.toml` | New |
| `docs/operations/m3-testnet-runbook.md` | New |
| `justfile` | New `build-m3` + `run-m3` recipes |

Breaking changes

None. The fix preserves the bail-on-None semantic for non-empty streams; only the empty-Vec edge case changed.

Testing

`cargo test -p nexum-engine` → 47 passed (was 46).
`cargo test --workspace` → 150 host tests + 6 doctests passing.
`cargo clippy --all-targets --workspace -- -D warnings` clean.
`cargo fmt --all --check` clean.
`just run-m3` boots 3 modules + exercises all 3 strategy paths on first Sepolia block.
0 em-dashes in new files.

AI assistance disclosure

AI Assistance: this fix + docs + description was produced by a Claude Code agent (Claude Opus 4.7 1M context). A human (Bruno) reviewed and is accountable for the result. The Sepolia boot validation was run by the agent.

Linear: no dedicated issue - this is the M3 runbook counterpart to the M2 runbook in #31. The fix is incidental to wiring up the runbook.

Stacks on #31 (M2 runbook) → #30 (COW-1068) → #29 (COW-1067) → #28 (COW-1069) → #27 (COW-1066) → #26 (COW-1063 QA cleanup).

Surfaced wiring up `engine.m3.toml` for the M3 testnet runbook: all 3 M3 example modules (price-alert, balance-tracker, stop-loss) only declare `[[subscription]] kind = "block"`, leaving `log_streams` empty. `select_all` over an empty Vec yields `None` immediately, the `tokio::select!` arm fired, and the loop hit the "log stream ended - shutting down for restart" bail before any block flowed. The engine bailed within ~50 ms of `supervisor ready`.

Fix: replace each empty side with `futures::stream::pending()` so the corresponding select arm is never selected. The bail-on-None semantic still fires when a *non-empty* stream actually closes (real WebSocket drop), which is the original intent.

The bug was symmetric (log-only configs would also bail) but only the block-only path is exercised by an existing module config. M2 was unaffected because both modules subscribe to at least one log.

Regression test in `supervisor::tests::run_does_not_bail_when_both_stream_kinds_are_empty`: invokes `run` with two empty `Vec`s plus a 50 ms shutdown timer; asserts `run` blocks the full 50 ms instead of returning at 0 ms. The pre-fix binary returns in <5 ms.

Verified locally:

- `cargo test -p nexum-engine` -> 47 passed (was 46)
- `just run-m3` -> 3 modules boot; first block dispatch fires all 3 strategy paths against live Sepolia (oracle read, balance polls, cow-api submit + retry classification)
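
The timing assertion in the regression test can be illustrated with a thread-and-channel analogy. This is not the engine's async `run`. It is a std-only sketch in which `Exit` and this `run` signature are hypothetical, `None` plays the role of an empty subscription kind after the fix, and a disconnected channel plays the role of a dropped WebSocket.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical exit reasons for the sketch loop.
#[derive(Debug, PartialEq)]
enum Exit {
    Shutdown,
    StreamEnded,
}

/// Run until the shutdown deadline, or bail if a live stream closes.
/// `None` means "no subscriptions of this kind" and must never cause a bail.
fn run(events: Option<Receiver<u64>>, shutdown_after: Duration) -> Exit {
    let deadline = Instant::now() + shutdown_after;
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            return Exit::Shutdown;
        }
        match &events {
            // Nothing to wait on: idle until the shutdown deadline.
            None => std::thread::sleep(remaining),
            Some(rx) => match rx.recv_timeout(remaining) {
                Ok(_event) => {} // an event would be dispatched here
                Err(RecvTimeoutError::Timeout) => return Exit::Shutdown,
                // A live stream that closes still triggers the bail.
                Err(RecvTimeoutError::Disconnected) => return Exit::StreamEnded,
            },
        }
    }
}
```

The analogous check is that `run(None, Duration::from_millis(50))` returns `Exit::Shutdown` only after at least 50 ms have elapsed, while a receiver whose sender was dropped returns `Exit::StreamEnded`.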

… 3-module E2E)

Sister doc to `docs/operations/m2-testnet-runbook.md`. Same shape, different modules. Closes the gap "M3 is unit + integration tested but has never been exercised against a real chain", same as the M2 runbook closed for M2.

## New files

- `engine.m3.toml` - workspace-root engine config that boots the 3 M3 example modules (price-alert + balance-tracker + stop-loss) against Sepolia public WS. Separate `state_dir = "./data/m3"` so it never collides with M1 / M2 runbook state.
- `docs/operations/m3-testnet-runbook.md` - operator runbook mirroring the M2 one: prerequisites, smoke+active run (M3 is active by default since the example modules trigger on every block), optional pre-signature setup for real stop-loss settlement, state inspection, scope boundaries, troubleshooting, references.
- `justfile` recipes: `build-m3` + `run-m3`.

## Validated locally

A single Sepolia block dispatch (~10 s wall clock) drove all 3 M3 strategy paths through the live testnet:

- **price-alert**: `chain::request eth_call` -> Chainlink AggregatorV3Interface -> ABI decode -> `TRIGGERED answer=174553978080 threshold=250000000000 (Below)` (Sepolia ETH/USD feed reports $1745.54, below the $2500 default threshold).
- **balance-tracker**: 2 `chain::request eth_getBalance` calls (one per configured address) - SDK chain helper + multi-key local-store path.
- **stop-loss**: `eth_call` oracle -> `from_signed_order_data` `OrderCreation` with `Signature::PreSign` -> `cow-api::submit-order` bytes=561 -> orderbook returns typed `TransferSimulationFailed` -> `classify_api_error` tags as retriable -> `retry on next block`. Full submit path confirmed; the orderbook rejection is the typed-retry contract working as designed (the default config's `owner = 0x70997970...` does not hold the sell token on Sepolia, so simulation correctly fails).

This validates everything the SDK BLEU-840 / BLEU-841 / BLEU-851 / -852 / -854 / -855 PR series builds: Host trait surface, chain helpers, cow helpers, MockHost recipe, strategy/lib split. The same code paths that pass 145 unit tests + 6 doctests + 5 supervisor integration tests now also work against live Sepolia.

## What this validates that the M2 runbook does not

M2 only exercises the orderbook submit path indirectly (through the EthFlow watcher reacting to swap.cow.fi traffic, and only when app_data is empty - documented limitation). M3 stop-loss submits proactively on every poll, so the orderbook always sees a real `OrderCreation` body even if it rejects. The typed-retry SDK contract (`classify_api_error` mapping `TransferSimulationFailed` -> `RetryAction::TryNextBlock`) is exercised end-to-end with a real orderbook response, not a fixture.

## Stacks on

- `fix(event_loop)` commit immediately preceding this one - the bug surfaced wiring up `engine.m3.toml` (block-only subscriptions bailed the engine pre-fix).
- PR #31 (M2 runbook) - same operator-doc shape, same conventions.

brunota20 added 2 commits June 18, 2026 09:14

This was referenced Jun 18, 2026

docs(m3): testnet edge-case validation - 5 scenarios run, all pass #33

Open

fix(supervisor): mark module alive=false when init returns Err (COW-1070) #34

Open

brunota20 wants to merge 2 commits into `feat/m2-runbook-and-smoke-config` from `feat/m3-runbook-and-smoke-config`
