M4.1d — Observability: per-session cost and failure report

## Parent
Spawned from #8 (M4.1 tracking issue).

## Objective
Surface per-session cost, failure reasons, and AI error rates so benchmark health can be monitored.

## Definition of done
- `results/*.json` per-result rows include an optional `error` field when `result` starts with `error:`
- `build_index.py` aggregates error counts into the index entry (`error_count`, `skip_count`)
- `make test-unit` still passes

## Dependencies
Depends on #3, #4.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

M4.1d — Observability: per-session cost and failure report #13

Parent

Objective

Definition of done

Dependencies

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

M4.1d — Observability: per-session cost and failure report #13

Description

Parent

Objective

Definition of done

Dependencies

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions