feat(logging): JSON formatter + structured dispatch fields (COW-1035)#37
Open
brunota20 wants to merge 1 commit into
Open
Conversation
Closes the COW-1035 structured-logging audit. The engine now ships
JSON-formatted logs by default, with consistent field shapes across
every dispatch, host call, and order submission. A single `jq` /
Loki / Grafana stream over the engine output reconstructs the full
timeline of any module event.
## Changes
### Formatter
- New `--pretty-logs` CLI flag (parsed in `cli.rs`). When set, the
engine uses the historical 0.1 human-readable formatter; otherwise
emits JSON with flattened event fields + no current-span noise.
- `main.rs` `tracing_subscriber::fmt` builder picks the formatter from
the flag. `EnvFilter` (`RUST_LOG` / `engine.toml::[engine].log_level`)
applies to both.
- `tracing-subscriber` feature set gains `json`.
### Dispatch logs (supervisor.rs)
Every `dispatch_block` and `dispatch_log` invocation now emits a
structured log line:
- `dispatch ok` (DEBUG) on success with `module`, `chain_id`,
`event_kind` ("block" / "log"), `block_number`, `latency_ms`.
- Existing host-error WARN + trap ERROR paths gain the same fields
for cross-correlation.
Live Sepolia validation:
```json
{"timestamp":"2026-06-18T13:56:48.587125Z","level":"DEBUG","message":"dispatch ok",
"module":"price-alert","chain_id":11155111,"event_kind":"block",
"block_number":11087508,"latency_ms":134,"target":"nexum_engine::supervisor"}
{"timestamp":"2026-06-18T13:56:48.857911Z","level":"DEBUG","message":"dispatch ok",
"module":"balance-tracker","chain_id":11155111,"event_kind":"block",
"block_number":11087508,"latency_ms":270,"target":"nexum_engine::supervisor"}
{"timestamp":"2026-06-18T13:56:50.193531Z","level":"DEBUG","message":"dispatch ok",
"module":"stop-loss","chain_id":11155111,"event_kind":"block",
"block_number":11087508,"latency_ms":1335,"target":"nexum_engine::supervisor"}
```
The latency-per-module distribution is now observable per-block:
price-alert ~135 ms (1 eth_call), balance-tracker ~270 ms (2
eth_getBalance), stop-loss ~1.3 s (oracle read + OrderCreation +
cow-api submit + retry classify).
### Host-call logs (pre-existing, kept as-is)
The M1 host backends already emit structured DEBUG with `chain_id`,
`method`, `bytes`, `latency_ms` on every `chain::request` /
`cow-api::submit-order` / `cow-api::request`. The audit confirmed
they already satisfy the COW-1035 contract; no changes needed.
### Runbook ergonomics
`just run-m2` and `just run-m3` pass `--pretty-logs` so the
runbook output samples (M2 + M3 runbooks) keep matching what the
operator sees locally. Production deploys (`cargo run -p
nexum-engine -- --engine-config engine.toml` directly, e.g. from
a Docker entrypoint) get JSON by default.
## Validation
- `cargo test --workspace` -> 154 host tests + 6 doctests passing.
- `cargo clippy --all-targets --workspace -- -D warnings` clean.
- `cargo fmt --all --check` clean.
- Live Sepolia smoke: JSON output captured; per-module dispatch
latencies plausible against expected work per module.
- Pretty-logs flag verified to opt back into the 0.1 formatter.
## What this still doesn't do
- Per-order timeline aggregation (would need a `uid` field on the
stop-loss / twap-monitor submit logs). Currently the order UID
is logged by each module's `host.log` call (module-side) but
not joined with the supervisor dispatch line. Acceptable today;
the JSON shape supports adding the field later without breaking
consumers.
- Trace IDs / spans across the dispatch -> host-call boundary.
Worth a follow-up once Prometheus (COW-1034) lands and we know
the metric labels we want to correlate against.
Linear: COW-1035. Third M4 issue landed; stacks on #36 (COW-1036).
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
Closes the COW-1035 structured-logging audit. The engine now ships JSON-formatted logs by default with consistent field shapes across every dispatch, host call, and order submission. A single `jq` / Loki / Grafana stream reconstructs the full timeline of any module event.
Third M4 issue landed.
Changes
Live Sepolia validation
{"timestamp":"2026-06-18T13:56:48.587125Z","level":"DEBUG","message":"dispatch ok", "module":"price-alert","chain_id":11155111,"event_kind":"block", "block_number":11087508,"latency_ms":134,"target":"nexum_engine::supervisor"} {"timestamp":"2026-06-18T13:56:48.857911Z","level":"DEBUG","message":"dispatch ok", "module":"balance-tracker","chain_id":11155111,"event_kind":"block", "block_number":11087508,"latency_ms":270,"target":"nexum_engine::supervisor"} {"timestamp":"2026-06-18T13:56:50.193531Z","level":"DEBUG","message":"dispatch ok", "module":"stop-loss","chain_id":11155111,"event_kind":"block", "block_number":11087508,"latency_ms":1335,"target":"nexum_engine::supervisor"}Per-module dispatch latency now observable: price-alert ~135ms (1 eth_call), balance-tracker ~270ms (2 eth_getBalance), stop-loss ~1.3s (oracle read + OrderCreation + cow-api submit + retry classify).
Breaking changes
Default log format changed from human-readable to JSON. Existing operators who parse log output by-line will need to pass `--pretty-logs` to restore the old format, OR pipe through `jq` to extract the same information. The runbooks already opt into `--pretty-logs`; only direct `cargo run -p nexum-engine` invocations need to add the flag.
Out of scope (deferred)
Testing
AI assistance disclosure
AI Assistance: this change + description was produced by a Claude Code agent (Claude Opus 4.7 1M context). The agent audited the existing logging call sites, ran the JSON output live against Sepolia for verification, and wrote the runbook ergonomics tweaks. A human (Bruno) reviewed and is accountable for the result.
Linear: COW-1035. Stacks on #36 (COW-1036 resource-limit tests).