Skip to content

feat(sdk + twap-monitor): resolve non-empty app_data via orderbook lookup (COW-1074)#47

Open
brunota20 wants to merge 3 commits into
feat/e2e-run-config-cow-1064from
feat/resolve-app-data-cow-1074
Open

feat(sdk + twap-monitor): resolve non-empty app_data via orderbook lookup (COW-1074)#47
brunota20 wants to merge 3 commits into
feat/e2e-run-config-cow-1064from
feat/resolve-app-data-cow-1074

Conversation

@brunota20

Copy link
Copy Markdown
Collaborator

Summary

Closes the gap surfaced by the COW-1064 dry run (2026-06-18). TWAP orders created through the cow-swap UI sign with a non-empty appData hash pointing at a richer JSON document (partner-id, slippage settings, quote-id). twap-monitor hard-coded EMPTY_APP_DATA_JSON when assembling OrderCreation, so the orderbook rejected every submit with invalid OrderCreation: app_data JSON digest does not match signed app_data hash and the watch sat in retry-loop forever.

Stacks on #46 (COW-1064 E2E run-config). Linear: COW-1074.

Layers touched (single PR, no WIT bump)

The WIT already exposes cow-api::request(method, path, body) as a generic REST passthrough. We surface that on the SDK trait, wrap it in a typed helper, and use the helper from the strategy. No new host imports, no WIT ABI change, no forced rebuild of unrelated modules.

Layer File Change
SDK trait crates/shepherd-sdk/src/host.rs CowApiHost::cow_api_request(chain_id, method, path, body)
SDK helper crates/shepherd-sdk/src/cow/app_data.rs (new) resolve_app_data(host, chain_id, hash) — short-circuits EMPTY_APP_DATA_HASH, otherwise GETs /api/v1/app_data/{hex} + extracts fullAppData
Mock crates/shepherd-sdk-test/src/lib.rs MockCowApi::respond_to_request_for(method, path, ...) + respond_to_request(...) catch-all + request_calls() recorder
Adapters modules/{twap-monitor, ethflow-watcher, examples/price-alert, examples/stop-loss}/src/lib.rs 8-line forwarder from WitBindgenHost::cow_api_request to the existing wit-bindgen cow_api::request binding
Strategy modules/twap-monitor/src/strategy.rs submit_ready calls resolve_app_data before build_order_creation; graceful Warn-and-skip on 404 or other resolve errors

Tests

  • SDK (crates/shepherd-sdk/src/cow/app_data.rs): 5 new tests pinning the short-circuit (no host call), the success path (orderbook envelope parsed correctly), the unexpected-shape error wrapping, the 404 propagation verbatim, and the hex encoder.
  • twap-monitor strategy: 2 new integration tests +
    1 updated test:
    • poll_ready_resolves_non_empty_app_data_then_submits — full happy path
    • poll_ready_skips_submit_when_app_data_hash_not_mirrored — 404 graceful skip
    • build_order_creation_accepts_matching_non_empty_app_data — pinned direct unit test of build_order_creation with a keccak256-matched JSON

Workspace impact

  • cargo test --workspace → 181 tests passing (was 174 pre-PR; +5 SDK +2 twap-monitor).
  • cargo clippy --all-targets --workspace -- -D warnings clean.
  • cargo fmt --all --check clean.
  • All 4 production module wasm artefacts build cleanly against the extended trait.

What this does NOT change

  • No WIT changes. Modules built against the prior SDK trait fail to compile (the new method is required), but the WIT-generated wasm-side surface is bit-identical to pre-PR.
  • No host-impl changes (crates/nexum-engine/src/host/impls/cow_api.rs). The host already implements request for the wit-bindgen binding; the new SDK helper goes through that.
  • No metric surface drift. The orderbook lookup goes through the same shepherd_cow_api_* counters via the existing request path.

Live validation

In the next COW-1064 dry run with this PR included:

  • Submit a TWAP via cow-swap UI (non-empty appData) → orderbook indexes both the order and the JSON document.
  • twap-monitor observes ConditionalOrderCreated → indexes watch: (unchanged).
  • On the next block, poll watch: -> Ready.
  • Strategy resolves the appData via orderbook → builds OrderCreation with the matching JSON.
  • Orderbook accepts → INFO twap-monitor submitted 0x<uid> in the log (vs. twap submit skipped: invalid OrderCreation: app_data JSON digest does not match signed app_data hash pre-PR).

Validation will happen on the COW-1064 re-run after this PR lands.

Follow-ups filed in Linear

Four anomalies were surfaced by the same COW-1064 dry run. This PR closes #1; the other three are tracked:

  • COW-1074 — this PR (app_data resolve)
  • COW-1075classify_api_error should map DuplicatedOrderDrop
  • COW-1076 — EthFlow.createOrder validTo=uint32::MAX rejected by Sepolia orderbook (upstream)
  • COW-1077e2e-onchain.sh TWAP t0=0 makes the order permanently AFTER_TWAP_FINISHED (caller-side bug)

AI assistance disclosure

Claude (Opus 4.7, 1M-context) drafted the full layered change (trait extension, SDK helper, mock surface, module adapters, strategy refactor, tests) during the live COW-1064 debug. I reviewed every file diff before commit; the test suite expansion (5 + 3 new tests) directly pins the new behaviour against MockHost. Live-orderbook validation deferred to the next COW-1064 dry run.

…okup (COW-1074)

Closes the gap surfaced by the COW-1064 dry run (2026-06-18):
TWAP orders created through cow-swap UI sign with a non-empty
`appData` hash pointing at a richer JSON document (partner-id,
slippage settings, quote-id). twap-monitor hard-coded
`EMPTY_APP_DATA_JSON` when assembling `OrderCreation`, so the
orderbook rejected every submit with `invalid OrderCreation:
app_data JSON digest does not match signed app_data hash` and
the watch sat in retry-loop forever.

## Layers touched (single PR, no WIT bump)

The WIT already exposes `cow-api::request(method, path, body)`
as a generic REST passthrough. We surface that capability on
the SDK trait, wrap it in a typed helper, and use the helper
from the strategy. No new host imports, no WIT ABI change, no
forced rebuild of unrelated modules.

### 1. `crates/shepherd-sdk/src/host.rs`

Extended `CowApiHost` with:
```rust
fn cow_api_request(
    &self,
    chain_id: u64,
    method: &str,
    path: &str,
    body: Option<&str>,
) -> Result<String, HostError>;
```

404 responses surface as `HostError { code: 404, kind:
Unavailable }` so callers can distinguish "orderbook does not
have this resource" from genuine upstream failures without
introducing a new `HostErrorKind` variant (the existing enum
is `non_exhaustive`, but adding a variant on the WIT side
would still need an ABI bump).

### 2. `crates/shepherd-sdk/src/cow/app_data.rs` (new file)

`resolve_app_data(host, chain_id, hash) -> Result<String,
HostError>` with:
- Short-circuits `EMPTY_APP_DATA_HASH` (`keccak256("{}")`)
  to `EMPTY_APP_DATA_JSON` (`"{}"`) — no host call needed.
- Otherwise GETs `/api/v1/app_data/{hex_hash}` and pulls
  the `fullAppData` field out of the orderbook's envelope
  shape (`{"fullAppData": "<JSON string>", ...}`).
- 5 unit tests pinning the short-circuit, the success path,
  the unexpected-shape fall-through, the 404 propagation,
  and the hex encoder.

### 3. `crates/shepherd-sdk-test/src/lib.rs`

Extended `MockCowApi` with:
- `respond_to_request_for(method, path, result)`: per-key
  programmable response.
- `respond_to_request(result)`: catch-all default.
- `request_calls()`: records the (chain_id, method, path,
  body) tuple for every invocation.

The existing `respond` / `calls()` / `submit_order` surface
is unchanged.

### 4. `WitBindgenHost` in every module that imports cow-api

`modules/twap-monitor/src/lib.rs`,
`modules/ethflow-watcher/src/lib.rs`,
`modules/examples/price-alert/src/lib.rs`, and
`modules/examples/stop-loss/src/lib.rs` each gained the
trivial 8-line forwarder to the generated
`cow_api::request` binding. Example modules implement
`CowApiHost` purely for the `Host` blanket-impl supertrait
even though some don't actively submit orders — the impl
is symmetrically extended.

### 5. `modules/twap-monitor/src/strategy.rs`

`build_order_creation` now takes the resolved
`app_data_json` as an explicit parameter (was hard-coded to
`EMPTY_APP_DATA_JSON`). The resolution itself is lifted into
the caller `submit_ready`, which calls
`shepherd_sdk::cow::resolve_app_data` before assembling the
`OrderCreation`. Two graceful-fallback branches:
- `err.code == 404` → log Warn "appData hash not mirrored
  on orderbook" + leave the watch in place. Operators can
  re-trigger by pinning the document via a future
  orderbook PUT or by re-creating the order with empty
  appData.
- Any other resolver error → log Warn "appData resolve
  failed" + leave the watch. Future retry on the next
  block re-attempts the lookup.

Two new strategy tests:
- `poll_ready_resolves_non_empty_app_data_then_submits`:
  programs MockHost with a known JSON + its hash on the
  order, asserts the full resolve → submit → `submitted:`
  marker flow.
- `poll_ready_skips_submit_when_app_data_hash_not_mirrored`:
  programs MockHost to 404, asserts no submit attempt, no
  `submitted:` / `dropped:` markers, Warn log line present.

Plus one updated test
(`build_order_creation_accepts_matching_non_empty_app_data`)
that pins the new "matching hash → JSON" success path
directly on `build_order_creation`.

## Workspace impact

- `cargo test --workspace` → 13 + 12 + 16 + 32 + 8 + 8 +
  61 (engine) + 23 (twap-monitor) + 7 doctests + 1
  integration = 181 tests passing (was 174; +5 SDK +2
  twap-monitor).
- `cargo clippy --all-targets --workspace -- -D warnings`
  clean.
- `cargo fmt --all --check` clean.
- All 4 production module .wasm artefacts build cleanly
  with the new SDK trait.

## What this does NOT change

- No WIT changes. Modules built against the prior SDK
  trait will fail to compile (the new method is required),
  but the WIT-generated wasm-side surface is bit-identical.
- No host-impl changes (`crates/nexum-engine/src/host/
  impls/cow_api.rs`). The host already implements `request`
  for the wit-bindgen binding.
- No metric surface drift. The orderbook lookup goes
  through the same `shepherd_cow_api_*` counters via the
  existing `request` path.

Linear: COW-1074. Stacks on the COW-1064 run-config branch
(#46). Validated locally end-to-end via `cargo test
--workspace`; live validation against the running engine
will happen on the next COW-1064 dry run (engine restart
required to pick up the rebuilt modules).
@linear-code

linear-code Bot commented Jun 18, 2026

Copy link
Copy Markdown

COW-1074

…-1074)

Symmetric extension of the twap-monitor fix in this PR. The
ethflow-watcher strategy's `build_eth_flow_creation` hard-coded
`EMPTY_APP_DATA_JSON` exactly like twap-monitor did; any
OrderPlacement event whose embedded `GPv2OrderData.appData`
hash differs from `keccak256("{}")` (i.e. every cow-swap UI
EthFlow swap) would hit "app_data JSON digest does not match
signed app_data hash" and be silently skipped client-side.

The COW-1064 dry run didn't surface this for the EthFlow tx I
fired via `scripts/e2e-onchain.sh` — because that script's
helper sets `appData = EMPTY_APP_DATA_HASH` — but a cow-swap UI
EthFlow swap (which is the realistic production path) would.

## Changes

- `build_eth_flow_creation` now takes `app_data_json: String`
  alongside `chain_id` and `placement`. Docstring updated to
  reference COW-1074.
- `submit_placement` calls `shepherd_sdk::cow::resolve_app_data`
  before `build_eth_flow_creation`; on 404 logs a Warn
  "ethflow submit skipped (sender=...): appData hash not
  mirrored on orderbook" and returns Ok (no marker written, no
  submit attempt).
- 6 test call sites updated to pass
  `cowprotocol::EMPTY_APP_DATA_JSON.to_string()` explicitly,
  preserving the existing assertions verbatim.
- 2 new integration tests:
    `placement_with_non_empty_app_data_resolves_then_submits`
    `placement_skips_submit_when_app_data_hash_not_mirrored`
  mirror the twap-monitor pair, programming MockHost with a
  synthetic appData JSON + hash, asserting the resolve →
  build → submit chain produces a `submitted:{uid}` marker
  and that 404 produces a Warn-only skip.

## Workspace impact

- `cargo test -p ethflow-watcher` → 14 tests passing
  (was 12; +2 from this commit).
- `cargo test --workspace` → 183 tests passing total
  (was 181 after the twap-monitor commit; +2 ethflow-watcher).
- `cargo clippy --all-targets --workspace -- -D warnings`
  clean.
- `cargo fmt --all --check` clean.

Linear: COW-1074 (extended scope — same gap in ethflow-watcher).
Captures the 2026-06-18 COW-1064 dry run + live in-flight
validation of PR #47 (resolve_app_data fix).

## Acceptance summary

5 of 6 rows green; the only [ ] is `block delta ≥ 1500`
(got 415) because the run was intentionally interrupted twice
to validate PR #47 against the same data/e2e local-store
across pre-PR-47 + PR-47-twap-monitor + PR-47-ethflow-watcher
commits.

| Row | Result |
|---|---|
| block delta ≥ 1500 | [ ] (got 415; 3 engine restarts for PR #47 mid-run validation) |
| all 5 modules have a terminal marker | [x] |
| shepherd_module_errors_total{trap} == 0 | [x] |
| no module poisoned | [x] |
| 0 ERROR lines from nexum_engine | [x] |
| TWAP + EthFlow tx submitted | [x] |

## 4 anomalies filed in Linear, fully documented in §6

- COW-1074 — twap-monitor + ethflow-watcher hardcoded
  EMPTY_APP_DATA_JSON. **Fixed in-run via PR #47**;
  live-validated for both modules (§6.5).
- COW-1075 — SDK classify_api_error should map
  `DuplicatedOrder` -> `Drop` (stop-loss retry loop).
- COW-1076 — ethflow on-chain `validTo=uint32::MAX` rejected
  by Sepolia orderbook (`ExcessiveValidTo`; upstream issue).
- COW-1077 — scripts/e2e-onchain.sh TWAP `t0=0` produces
  permanently-finished order (caller-side encoding bug).

## Live PR #47 validation (§6.5 — the key methodology note)

Three engine binaries exercised on the same redb local-store:

1. `5bcd47b` (pre-PR-47): surfaces the digest-mismatch
   client-side skip for both twap-monitor + ethflow-watcher
   on non-empty appData orders.
2. `acc9654` (PR #47 twap-monitor): existing cow-swap UI TWAP
   re-polls to Ready -> resolve_app_data resolves the JSON
   from `/api/v1/app_data/{hash}` -> submit reaches orderbook
   -> DuplicatedOrder (server-side reject only). Client-side
   digest check bypassed.
3. `cd68de0` (PR #47 ethflow-watcher): new cow-swap UI EthFlow
   swap (`0x82da5ced...`) observed -> appData =
   `0xe46e7d0c...` (NON-empty rich JSON: appCode="CoW Swap",
   slippageBips=857, smartSlippage=true) -> resolve_app_data
   calls orderbook -> JSON extracted from `fullAppData` field
   -> build produces matching-digest body -> submit reaches
   orderbook -> ExcessiveValidTo (server-side reject only,
   tracked separately in COW-1076).

The PR #47 fix is therefore live-validated end-to-end against
the real Sepolia orderbook in **both** affected modules.

## What this report unblocks

COW-1031 (7-day soak) is technically unblocked: the engine
+ 5-module dispatch is proven correct under live conditions;
PR #47 closes the only blocking SDK gap for the soak's TWAP
+ EthFlow coverage. The remaining 3 follow-ups
(COW-1075/1076/1077) are quality-of-output rather than
correctness regressions and do not block the soak.

Operator sign-off pending in §8.

Linear: COW-1064 (closes).
brunota20 added a commit that referenced this pull request Jun 23, 2026
…redesign, COW-1076)

The original BLEU-833 design lifted the on-chain GPv2OrderData into an
OrderCreation body and POSTed it to /api/v1/orders with the EthFlow
contract as the EIP-1271 owner. Empirical evidence collected during a
2026-06-22 Sepolia soak shows that path cannot succeed:

  - The 3 OrderPlacement events the watcher observed during the run
    (creationDate 18:43, 18:53, 21:19 UTC, UIDs ending in
    ba3cb449...EAdec|ffffffff) are *fulfilled* on the orderbook with
    validTo = u32::MAX — the same shape the orderbook rejects when
    shepherd POSTs it.
  - The orderbook entries carry onchainUser, onchainOrderData and
    ethflowData.userValidTo (a real unix timestamp), and these fields
    do not exist on the OrderCreation request body in
    cowprotocol-orderbook. They are populated by the orderbook backend
    itself, which indexes the OrderPlacement event server-side and
    creates the order with dual-validTo bookkeeping.

So /api/v1/orders is structurally the wrong endpoint for on-chain
EthFlow placements: the orderbook's native indexer is the one that
creates the order entry, and the public POST endpoint applies the
generic validTo cap and rejects with ExcessiveValidTo every time.
Continuing to submit produces only garbage outcomes per real EthFlow
trade (misleading dropped/backoff markers and a misleading
shepherd_cow_api_submit_total{outcome="err"} increment).

This commit replaces the submit path with the contract the orderbook
actually offers: shepherd decodes the OrderPlacement event, computes
the canonical UID, GETs /api/v1/orders/{uid}, and records the outcome:

  - 200 → write observed:{uid}, log Info ("orderbook indexed").
  - 404 → log Info ("not yet indexed; will recheck on re-delivery")
    and do not write the marker, so a future log re-delivery or a
    block-tick poll (out of scope here) can re-verify.
  - other err → log Warn with the host's code and message.

Idempotency stays event-driven: once observed:{uid} is written, log
re-delivery is a no-op. Block-tick-driven retries for stuck indexer
lag are intentionally out of scope and tracked separately.

Test surface: the existing BLEU-832 decode tests are preserved verbatim
and the GPv2OrderData fixture is reused with the canonical EthFlow
markers (OrderKind::SELL, SellTokenSource::ERC20, BuyTokenDestination::ERC20).
Two new tests cover UID computation invariants: the owner suffix pins
to the EthFlow contract and the validTo suffix pins to the on-chain
order's validTo. A third asserts compute_uid returns None on unsupported
chains so callers fall back to a Warn log rather than panic.

This invalidates the design that PRs #10 (BLEU-833), #47 (COW-1074),
#48 (COW-1075) and #49 (COW-1076) were all patching. After review,
those PRs should be closed or substantially rewritten — none of the
appData resolution / error-envelope forwarding / log-downgrade work is
needed once the broken endpoint is no longer called.

AI assistance disclosure: drafted by Claude (Opus 4.7, 1M context)
following the soak inspection that surfaced the divergence.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant