docs: add benchmark segmentation guidance to prevent agent timeouts

LeeCampbell · claude · LeeCampbell · commit fdea5d3352f4 · 2026-03-23T11:33:13.000+10:00
A full benchmark suite (3 classes × 3 runtimes) can exceed the
30-minute agent iteration timeout. Document segmentation strategies so
agents run benchmarks by class, by runtime, by method filter, or a
combination of these.
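
As a rough illustration of the combined strategy (one class, one runtime
per invocation), the sketch below prints one command per pair rather than
running anything. The class names are the two used as examples in
testing-standards.md; the runtime list and the `segment_commands` helper
are assumptions for illustration, not part of the repository.

```bash
#!/usr/bin/env bash
# Sketch only: emit one benchmark command per (class, runtime) pair so that
# each invocation covers a small slice of the suite.
# ASSUMPTION: the class and runtime lists below are illustrative; adjust them
# to match the benchmark project's actual classes and target frameworks.
classes=(Recording32BitBenchmark LeadingZeroCount64BitBenchmark)
runtimes=(net8.0 net9.0)

# Hypothetical helper: prints the commands, does not execute them.
segment_commands() {
  local class runtime
  for class in "${classes[@]}"; do
    for runtime in "${runtimes[@]}"; do
      printf '%s\n' "dotnet run -c Release --project HdrHistogram.Benchmarking/ -- --filter '*${class}*' --runtimes ${runtime}"
    done
  done
}

# Dry run: list the segmented commands for review.
segment_commands
```

Each printed line can then be run as its own agent iteration, so a slow
segment does not consume the time budget of the others.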

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/spec/README.md b/spec/README.md
@@ -41,7 +41,7 @@ This file provides guidance on where to find standards and specifications for th
 | encoding, LEB128, DEFLATE, compression | [Histogram Encoding](./tech-standards/histogram-encoding.md) |
 | log format, V2, persistence | [Histogram Encoding](./tech-standards/histogram-encoding.md) |
 | xUnit, test, FluentAssertions | [Testing Standards](./tech-standards/testing-standards.md) |
-| benchmark, performance, allocation, BenchmarkDotNet | [Testing Standards](./tech-standards/testing-standards.md) |
+| benchmark, performance, allocation, BenchmarkDotNet, timeout | [Testing Standards](./tech-standards/testing-standards.md) |
 | naming convention, XML docs, style | [Coding Standards](./tech-standards/coding-standards.md) |
 | build, NuGet, AppVeyor, CI/CD | [Build System](./tech-standards/build-system.md) |
 | milestone, issue, PR, GitHub | [GitHub CLI Reference](./tech-standards/github.md) |
diff --git a/spec/tech-standards/testing-standards.md b/spec/tech-standards/testing-standards.md
@@ -224,12 +224,30 @@ Both levels of benchmark are required because:
 
 ### Running Benchmarks
 
+> **Timeout warning:** A full benchmark run (all classes, all runtimes) can take **over 30 minutes**.
+> Automated agents have a 30-minute iteration timeout and **must not** attempt a full suite in a single run.
+> Always segment benchmark runs as described below.
+
+**Segmentation strategies** (pick one or combine):
+
+- **By benchmark class** — run one category at a time (e.g. encoding, recording, leading-zero-count)
+- **By runtime** — restrict to a single target framework per run (e.g. `net8.0` only)
+- **By filter** — use `--filter` to select specific benchmark methods
+
 ```bash
 # Build in Release mode (required)
 dotnet build HdrHistogram.Benchmarking/ -c Release
 
-# Run specific benchmarks
-dotnet run -c Release --project HdrHistogram.Benchmarking/ -- --filter '*ClassName*'
+# Run a SINGLE benchmark class per invocation (recommended for agents)
+dotnet run -c Release --project HdrHistogram.Benchmarking/ -- --filter '*Recording32BitBenchmark*'
+dotnet run -c Release --project HdrHistogram.Benchmarking/ -- --filter '*LeadingZeroCount64BitBenchmark*'
+
+# Run benchmarks for a SINGLE runtime only
+dotnet run -c Release --project HdrHistogram.Benchmarking/ -- --filter '*ClassName*' --runtimes net8.0
+dotnet run -c Release --project HdrHistogram.Benchmarking/ -- --filter '*ClassName*' --runtimes net9.0
+
+# Combine both: one class, one runtime (fastest, safest for agents)
+dotnet run -c Release --project HdrHistogram.Benchmarking/ -- --filter '*Recording32BitBenchmark*' --runtimes net8.0
 
 # Export results as JSON for comparison
 dotnet run -c Release --project HdrHistogram.Benchmarking/ -- --filter '*ClassName*' --exporters json