[e2e] event-log-race-repro: actionable summary, slow≠stuck, fetch retries #2195

Merged
VaguelySerious merged 5 commits into stable from peter/repro-ci-summary-improvements on Jun 1, 2026

@VaguelySerious
Member

Standalone CI-only changes for the event-log-race-repro job, extracted so they can be merged independently of any core fix. They all build on the classification work already on stable (#2194).

Actionable result summary

  • 🚨 Event-Log Regressions table lists every gating run in full (never truncated), each with duration, a synthesised detail line, and a direct dashboard link.
  • Infra (non-gating) section groups harness noise by error code with a plain-language explanation and example run links, instead of flooding one table with thousands of rows.
  • Headline names the regression count and digests the infra noise.
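
A rough illustration of the grouping and headline digest described above. This is a sketch, not the renderer from this PR: the `RunResult` shape, the function names `groupInfra` and `headline`, and the cap of three example links are all assumptions made for the example.

```typescript
// Hypothetical shape of one classified run (field names are assumptions).
interface RunResult {
  runId: string;
  gating: boolean; // true = event-log regression, false = infra noise
  code: string; // e.g. "HOOK_RESUME_FAILED"
  url: string; // dashboard link for the run
}

interface InfraGroup {
  code: string;
  count: number;
  examples: string[]; // a few example links instead of every run
}

// Group non-gating runs by error code, keeping only a few example links each.
function groupInfra(results: RunResult[], maxExamples = 3): InfraGroup[] {
  const groups = new Map<string, InfraGroup>();
  for (const r of results) {
    if (r.gating) continue; // regressions are listed in full elsewhere
    let g = groups.get(r.code);
    if (!g) {
      g = { code: r.code, count: 0, examples: [] };
      groups.set(r.code, g);
    }
    g.count += 1;
    if (g.examples.length < maxExamples) g.examples.push(r.url);
  }
  // Largest groups first, so the headline leads with the dominant noise.
  return Array.from(groups.values()).sort((a, b) => b.count - a.count);
}

// One-line headline: regression count plus a digest of the infra noise.
function headline(results: RunResult[]): string {
  const regressions = results.filter((r) => r.gating).length;
  const noun = regressions === 1 ? "regression" : "regressions";
  const digest = groupInfra(results)
    .map((g) => `${g.count} ${g.code}`)
    .join(", ");
  return digest
    ? `${regressions} ${noun}; infra: ${digest}`
    : `${regressions} ${noun}`;
}
```

The regressions table is deliberately left out of the grouping: every gating run gets its own row, while infra noise collapses to one row per code.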

Slow ≠ stuck

A run flagged at the poll budget can simply be slow on a loaded preview deployment (observed: a run gated as stuck that actually completed shortly after the harness gave up). Added a generous post-budget grace window: a run that reaches a terminal state during grace is classified by its real outcome (completed → non-gating SLOW_COMPLETION, failed → its error class). Only a run still non-terminal after budget + grace is a genuine wedge (gating stuck). Stuck runs also record where they wedged (latest event/step).
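
The classification rule can be sketched as a pure function. This is a minimal sketch under stated assumptions, not the harness code: the `Observation` shape, the `"OK"` and `"UNKNOWN_ERROR"` codes, and the `gatingErrorCodes` lookup are hypothetical. The 150s budget comes from the commit message; the grace length is not stated in this PR, so the sketch only assumes the poller stops at budget + grace.

```typescript
// Poll budget from the commit message (150s).
const POLL_BUDGET_MS = 150000;

// Hypothetical summary of what the poller saw by the time it stopped
// (at a terminal state, or at budget + grace if none was reached).
interface Observation {
  status: "running" | "completed" | "failed";
  terminalAtMs?: number; // elapsed ms when the terminal state was first seen
  errorCode?: string; // error class for failed runs
}

interface Classification {
  code: string;
  gating: boolean;
}

function classifyRun(
  obs: Observation,
  gatingErrorCodes: Set<string>, // assumption: which error classes gate the job
): Classification {
  // Still non-terminal after budget + grace: a genuine wedge.
  if (obs.status === "running") return { code: "stuck", gating: true };

  // Failed runs keep their real error class, even if they failed during grace.
  if (obs.status === "failed") {
    const code = obs.errorCode ?? "UNKNOWN_ERROR";
    return { code, gating: gatingErrorCodes.has(code) };
  }

  // Completed: slow if it only finished after the poll budget, i.e. during grace.
  const slow = (obs.terminalAtMs ?? 0) > POLL_BUDGET_MS;
  return { code: slow ? "SLOW_COMPLETION" : "OK", gating: false };
}
```

For example, a run that completes at 160s is past the budget but inside any grace window longer than 10s, so it comes out as non-gating `SLOW_COMPLETION` rather than `stuck`.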

Retry transient fetch failures

Investigated two HARNESS_ERRORs (fetch failed, Hook not found): both came from harness-side network calls to the deployment, not the SDK. Added withRetry (linear backoff, transient-network detection) around the harness network calls (getWorkflowMetadata, start, resumeHook, run-status poll); the poll no longer aborts on a flaky GET. On final failure the error is prefixed with the call site (e.g. start: fetch failed) so the infra breakdown says where it happened.
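
A minimal sketch of what a `withRetry` helper with linear backoff and call-site prefixing could look like. The signature, the default of 3 attempts, the 500ms base delay, and the list of messages treated as transient are assumptions for illustration; the PR text only fixes the behaviour (linear backoff, transient-network detection, labelled final error).

```typescript
interface RetryOptions {
  attempts?: number; // total tries, including the first (assumed default: 3)
  baseDelayMs?: number; // delay grows linearly: base, 2*base, 3*base, ...
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Assumed heuristic: treat common dropped-connection messages as transient.
function isTransientNetworkError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return /fetch failed|ECONNRESET|ETIMEDOUT|socket hang up/i.test(message);
}

async function withRetry<T>(
  label: string, // call site, e.g. "start" or "poll runs.get"
  fn: () => Promise<T>,
  opts: RetryOptions = {},
): Promise<T> {
  const attempts = opts.attempts ?? 3;
  const baseDelayMs = opts.baseDelayMs ?? 500;
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Give up at once on non-transient errors or when out of attempts.
      if (!isTransientNetworkError(err) || attempt === attempts) break;
      await sleep(baseDelayMs * attempt); // linear backoff
    }
  }
  const message =
    lastError instanceof Error ? lastError.message : String(lastError);
  // Prefix with the call site so the summary shows where the failure happened.
  throw new Error(`${label}: ${message}`);
}
```

With this shape, a call such as `withRetry("start", () => fetch(url))` that keeps failing with `fetch failed` would surface as `start: fetch failed`, matching the labelled errors described above.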

Renderer is unit-tested (node:test, run by the CI Scripts Tests job).

🤖 Generated with Claude Code

VaguelySerious and others added 5 commits June 1, 2026 12:12
…ist + infra breakdown

Rework the PR-comment renderer so a human can immediately see what gates the
job and inspect every failing run:

- 🚨 Event-Log Regressions table lists *every* gating run in full (never
  truncated), each with its duration, a synthesised detail line, and a direct
  dashboard link. Stuck runs render "no terminal state after <ms>".
- Infra (non-gating) section groups harness noise by error code with a
  plain-language explanation and example run links, instead of flooding one
  table with thousands of rows.
- Headline names the regression count and digests the infra noise
  (e.g. "904 HOOK_RESUME_FAILED, 61 NO_WAKE_BRANCH").

Adds unit coverage for the breakdown, message synthesis and the
never-truncate-regressions guarantee.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 8f41186)

On run-poll timeout, fetch the run's event log and record the latest event
(type, step name, elapsed) as the stuck run's errorMessage. The summary's
regression table then shows "stuck after N events; latest step_started (foo)
at +12.3s" with a dashboard link, instead of only a duration — so a human can
see where the run wedged without opening every link. Best-effort; falls back
to the duration-only note if the event fetch fails.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit dee4370)
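
The detail line from the commit above can be illustrated with a small formatter. This is a sketch assuming a hypothetical `RunEvent` shape and function name; the output strings follow the examples quoted in the commit messages, and the unit suffix on the duration-only fallback is a guess.

```typescript
// Hypothetical event shape (field names are assumptions).
interface RunEvent {
  type: string; // e.g. "step_started"
  stepName?: string; // present for step events
  elapsedMs: number; // ms since the run started
}

// Builds the detail line for a stuck run. `events` is null when the
// best-effort event fetch failed, which falls back to the duration-only note.
function describeStuckRun(events: RunEvent[] | null, waitedMs: number): string {
  if (!events || events.length === 0) {
    return `no terminal state after ${waitedMs}ms`;
  }
  const latest = events[events.length - 1];
  const step = latest.stepName ? ` (${latest.stepName})` : "";
  const seconds = (latest.elapsedMs / 1000).toFixed(1);
  return `stuck after ${events.length} events; latest ${latest.type}${step} at +${seconds}s`;
}
```
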
A run flagged at the 150s poll budget can simply be slow on a loaded preview
deployment — observed wrun_…EFDZ9 completed shortly after the harness gave up
and was wrongly gated as `stuck`.

Add a generous post-budget grace window: a run that reaches a terminal state
during grace is classified by its real outcome (completed → non-gating
`SLOW_COMPLETION` infra, surfaced for visibility; failed → its error class).
Only a run still non-terminal after budget + grace is a genuine wedge (gating
`stuck`). Renderer gains notes for SLOW_COMPLETION/CANCELLED and singular/plural
agreement fixes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 31d5b99)

…e they occur

Investigating HARNESS_ERRORs on a repro run: a `fetch failed` and a `Hook not
found`. Both came from harness-side network calls to the deployment, not the
SDK. A single dropped connection should never abort tracking an otherwise
healthy run.

- Add `withRetry` (linear backoff, transient-network detection) and apply it to
  the harness network calls: getWorkflowMetadata, start, resumeHook, and the
  run-status poll. On final failure the error is prefixed with the call site
  (e.g. "start: fetch failed", "poll runs.get: fetch failed"), so the infra
  breakdown says *where* it happened.
- pollTerminalRun no longer aborts on a flaky GET: a transient error just
  retries/continues until the deadline.
- waitForHook labels its surfaced error ("waitForHook: Hook not found") so the
  hook-propagation timeout is identifiable in the summary.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit a9b68c0)
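
The "no longer aborts on a flaky GET" behaviour can be sketched as a poll loop that treats a failed status request as "no news" and keeps going until the deadline. The function name, parameters, and injectable clock are assumptions for illustration; the real `pollTerminalRun` may differ, and this simplified version swallows every error rather than only transient ones.

```typescript
type PollStatus = "running" | "completed" | "failed";

async function pollUntilTerminal(
  getStatus: () => Promise<PollStatus>, // hypothetical status fetcher
  deadlineMs: number, // total time allowed, e.g. budget + grace
  intervalMs: number,
  now: () => number = Date.now, // injectable clock for tests
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<PollStatus> {
  const start = now();
  let last: PollStatus = "running";
  while (now() - start < deadlineMs) {
    try {
      last = await getStatus();
      if (last !== "running") return last; // terminal state reached
    } catch {
      // A failed GET is not evidence that the run is broken: keep polling.
    }
    await sleep(intervalMs);
  }
  // Deadline reached without a terminal state; the caller decides what that means.
  return last;
}
```
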
@vercel
Contributor

vercel Bot commented Jun 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
example-nextjs-workflow-turbopack Ready Preview, Comment Jun 1, 2026 10:17am
example-nextjs-workflow-webpack Ready Preview, Comment Jun 1, 2026 10:17am
example-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workbench-astro-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workbench-express-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workbench-fastify-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workbench-hono-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workbench-nitro-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workbench-nuxt-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workbench-sveltekit-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workbench-tanstack-start-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workbench-vite-workflow Ready Preview, Comment Jun 1, 2026 10:17am
workflow-docs Ready Preview, Comment, Open in v0 Jun 1, 2026 10:17am
workflow-swc-playground Ready Preview, Comment Jun 1, 2026 10:17am
workflow-tarballs Ready Preview, Comment Jun 1, 2026 10:17am
workflow-web Ready Preview, Comment Jun 1, 2026 10:17am

@changeset-bot

changeset-bot Bot commented Jun 1, 2026

🦋 Changeset detected

Latest commit: b726540

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 0 packages

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

🧪 E2E Test Results

Some tests failed

Summary

Passed Failed Skipped Total
✅ ▲ Vercel Production 901 0 67 968
✅ 💻 Local Development 970 0 86 1056
✅ 📦 Local Production 970 0 86 1056
✅ 🐘 Local Postgres 901 0 67 968
✅ 🪟 Windows 88 0 0 88
❌ 🌍 Community Worlds 134 88 0 222
✅ 📋 Other 492 0 36 528
Total 4456 88 342 4886

❌ Failed Tests

🌍 Community Worlds (88 failed)

mongodb (12 failed):

  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KT1AXZWZYGRHV86RHXK1K527
  • webhookWorkflow | wrun_01KT1AY72F9A1W4AYGSMR6657C
  • sleepingWorkflow | wrun_01KT1AYDGJBH8V0RSK9W75Z07R
  • outputStreamWorkflow no startIndex (reads all chunks)
  • outputStreamWorkflow negative startIndex (reads from end)
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns correct index after stream completes
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns -1 before any chunks are written
  • outputStreamWorkflow - getTailIndex and getStreamChunks getStreamChunks returns same content as reading the stream
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions | wrun_01KT1B1QFSNCQBBE7GN3H4EKZC
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KT1B67Y90J0DV6YPWGEKNW78
  • pages router sleepingWorkflow via pages router
  • resilient start: addTenWorkflow completes when run_created returns 500 | wrun_01KT1BCF4ZB5BPQ6P6M8WGGYMY

redis (9 failed):

  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KT1AXZWZYGRHV86RHXK1K527
  • sleepingWorkflow | wrun_01KT1AYDGJBH8V0RSK9W75Z07R
  • outputStreamWorkflow negative startIndex (reads from end)
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns correct index after stream completes
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns -1 before any chunks are written
  • outputStreamWorkflow - getTailIndex and getStreamChunks getStreamChunks returns same content as reading the stream
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KT1B67Y90J0DV6YPWGEKNW78
  • pages router sleepingWorkflow via pages router
  • resilient start: addTenWorkflow completes when run_created returns 500 | wrun_01KT1BCF4ZB5BPQ6P6M8WGGYMY

turso-dev (1 failed):

  • dev e2e should rebuild on imported step dependency change

turso (66 failed):

  • addTenWorkflow | wrun_01KT1AWT51YNJGFZN56XX8B36Z
  • addTenWorkflow | wrun_01KT1AWT51YNJGFZN56XX8B36Z
  • wellKnownAgentWorkflow (.well-known/agent) | wrun_01KT1AZ04ST8YGJ9XGJYW52Q1F
  • should work with react rendering in step
  • promiseAllWorkflow | wrun_01KT1AX29BCSSES15F4Z7R0A7Q
  • promiseRaceWorkflow | wrun_01KT1AX5MC849XVT57H8BPRWYQ
  • promiseAnyWorkflow | wrun_01KT1AX7TDPZG9T68P6D12W12P
  • importedStepOnlyWorkflow | wrun_01KT1AZFNG12191X94VF6AGA8B
  • readableStreamWorkflow | wrun_01KT1AXA0FJRX591CGQGEA7JWY
  • hookWorkflow | wrun_01KT1AXQ5K6BY9QW56JA5M8XMS
  • hookWorkflow is not resumable via public webhook endpoint | wrun_01KT1AXZWZYGRHV86RHXK1K527
  • webhookWorkflow | wrun_01KT1AY72F9A1W4AYGSMR6657C
  • sleepingWorkflow | wrun_01KT1AYDGJBH8V0RSK9W75Z07R
  • parallelSleepWorkflow | wrun_01KT1AYY79G4DZ1DFD1NA15PPS
  • nullByteWorkflow | wrun_01KT1AZ3YNZNBYS8422EJXBVKZ
  • workflowAndStepMetadataWorkflow | wrun_01KT1AZ71KTSX3YS3CRD9A5Z7C
  • outputStreamWorkflow no startIndex (reads all chunks)
  • outputStreamWorkflow positive startIndex (skips first chunk)
  • outputStreamWorkflow negative startIndex (reads from end)
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns correct index after stream completes
  • outputStreamWorkflow - getTailIndex and getStreamChunks getTailIndex returns -1 before any chunks are written
  • outputStreamWorkflow - getTailIndex and getStreamChunks getStreamChunks returns same content as reading the stream
  • outputStreamInsideStepWorkflow - getWritable() called inside step functions | wrun_01KT1B1QFSNCQBBE7GN3H4EKZC
  • fetchWorkflow | wrun_01KT1B27HX1ADK51QT0KEHHG3S
  • promiseRaceStressTestWorkflow | wrun_01KT1B2B3HZWV4FNM3S5SVZ5RH
  • error handling error propagation workflow errors nested function calls preserve message and stack trace
  • error handling error propagation workflow errors cross-file imports preserve message and stack trace
  • error handling error propagation step errors basic step error preserves message and stack trace
  • error handling error propagation step errors cross-file step error preserves message and function names in stack
  • error handling retry behavior regular Error retries until success
  • error handling retry behavior FatalError fails immediately without retries
  • error handling retry behavior RetryableError respects custom retryAfter delay
  • error handling retry behavior maxRetries=0 disables retries
  • error handling catchability FatalError can be caught and detected with FatalError.is()
  • error handling not registered WorkflowNotRegisteredError fails the run when workflow does not exist
  • error handling not registered StepNotRegisteredError fails the step but workflow can catch it
  • error handling not registered StepNotRegisteredError fails the run when not caught in workflow
  • hookCleanupTestWorkflow - hook token reuse after workflow completion | wrun_01KT1B5V6V77H5NJE832HY1NB2
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously | wrun_01KT1B67Y90J0DV6YPWGEKNW78
  • hookDisposeTestWorkflow - hook token reuse after explicit disposal while workflow still running | wrun_01KT1B6PWR05ZAV9NMGQ7EYNXW
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars) | wrun_01KT1B76HJMYN3XQ1X6T9EKMGG
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument | wrun_01KT1B7GG8HVQZZ9YA97QBHSTQ
  • closureVariableWorkflow - nested step functions with closure variables | wrun_01KT1B7PGVGXWSAD6JREW2WPSK
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step | wrun_01KT1B7RST4B3S774KY3N2D30K
  • health check (queue-based) - workflow and step endpoints respond to health check messages
  • health check (CLI) - workflow health command reports healthy endpoints
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly | wrun_01KT1B87WEW8F0C8YRCFY4Q76R
  • Calculator.calculate - static workflow method using static step methods from another class | wrun_01KT1B8DQAN9NY288MQEX8TNDC
  • AllInOneService.processNumber - static workflow method using sibling static step methods | wrun_01KT1B8MM9NQ34X8VCFHKAN7VA
  • ChainableService.processWithThis - static step methods using this to reference the class | wrun_01KT1B8VF7HCMC4ACQS06TN1KK
  • thisSerializationWorkflow - step function invoked with .call() and .apply() | wrun_01KT1B92Q380VFNTQDJ7CV7SQT
  • customSerializationWorkflow - custom class serialization with WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE | wrun_01KT1B9AB7SKYGPWP47Y9B3MVJ
  • instanceMethodStepWorkflow - instance methods with "use step" directive | wrun_01KT1B9HF30Y76H26DVDDWV6PP
  • crossContextSerdeWorkflow - classes defined in step code are deserializable in workflow context | wrun_01KT1B9YGDRABKCMGWT48TS5ZB
  • stepFunctionAsStartArgWorkflow - step function reference passed as start() argument | wrun_01KT1BA7TMXK18A968S1Y2NY09
  • cancelRun - cancelling a running workflow | wrun_01KT1BAF0WJTS36QE9E7R5RHH9
  • cancelRun via CLI - cancelling a running workflow | wrun_01KT1BATFDPMSXDVEAF8TPYSJG
  • pages router addTenWorkflow via pages router
  • pages router promiseAllWorkflow via pages router
  • pages router sleepingWorkflow via pages router
  • hookWithSleepWorkflow - hook payloads delivered correctly with concurrent sleep | wrun_01KT1BB73CKGRPQWQFH307BKHT
  • sleepInLoopWorkflow - sleep inside loop with steps actually delays each iteration | wrun_01KT1BBR12C7E60RPPN6EBWXPY
  • sleepWithSequentialStepsWorkflow - sequential steps work with concurrent sleep (control) | wrun_01KT1BC3PRNEVTNXDJ92DY54HK
  • importMetaUrlWorkflow - import.meta.url is available in step bundles | wrun_01KT1BCAJNRBGQJGT92EAWSQWC
  • metadataFromHelperWorkflow - getWorkflowMetadata/getStepMetadata work from module-level helper (#1577) | wrun_01KT1BCCRGF28ESF56K9HR6SEH
  • resilient start: addTenWorkflow completes when run_created returns 500 | wrun_01KT1BCF4ZB5BPQ6P6M8WGGYMY

Details by Category

✅ ▲ Vercel Production
App Passed Failed Skipped
✅ astro 81 0 7
✅ example 81 0 7
✅ express 81 0 7
✅ fastify 81 0 7
✅ hono 81 0 7
✅ nextjs-turbopack 86 0 2
✅ nextjs-webpack 86 0 2
✅ nitro 81 0 7
✅ nuxt 81 0 7
✅ sveltekit 81 0 7
✅ vite 81 0 7
✅ 💻 Local Development
App Passed Failed Skipped
✅ astro-stable 82 0 6
✅ express-stable 82 0 6
✅ fastify-stable 82 0 6
✅ hono-stable 82 0 6
✅ nextjs-turbopack-canary 69 0 19
✅ nextjs-turbopack-stable 88 0 0
✅ nextjs-webpack-canary 69 0 19
✅ nextjs-webpack-stable 88 0 0
✅ nitro-stable 82 0 6
✅ nuxt-stable 82 0 6
✅ sveltekit-stable 82 0 6
✅ vite-stable 82 0 6
✅ 📦 Local Production
App Passed Failed Skipped
✅ astro-stable 82 0 6
✅ express-stable 82 0 6
✅ fastify-stable 82 0 6
✅ hono-stable 82 0 6
✅ nextjs-turbopack-canary 69 0 19
✅ nextjs-turbopack-stable 88 0 0
✅ nextjs-webpack-canary 69 0 19
✅ nextjs-webpack-stable 88 0 0
✅ nitro-stable 82 0 6
✅ nuxt-stable 82 0 6
✅ sveltekit-stable 82 0 6
✅ vite-stable 82 0 6
✅ 🐘 Local Postgres
App Passed Failed Skipped
✅ astro-stable 82 0 6
✅ express-stable 82 0 6
✅ fastify-stable 82 0 6
✅ hono-stable 82 0 6
✅ nextjs-turbopack-stable 88 0 0
✅ nextjs-webpack-canary 69 0 19
✅ nextjs-webpack-stable 88 0 0
✅ nitro-stable 82 0 6
✅ nuxt-stable 82 0 6
✅ sveltekit-stable 82 0 6
✅ vite-stable 82 0 6
✅ 🪟 Windows
App Passed Failed Skipped
✅ nextjs-turbopack 88 0 0
❌ 🌍 Community Worlds
App Passed Failed Skipped
✅ mongodb-dev 5 0 0
❌ mongodb 57 12 0
✅ redis-dev 5 0 0
❌ redis 60 9 0
❌ turso-dev 4 1 0
❌ turso 3 66 0
✅ 📋 Other
App Passed Failed Skipped
✅ e2e-local-dev-nest-stable 82 0 6
✅ e2e-local-dev-tanstack-start-stable 82 0 6
✅ e2e-local-postgres-nest-stable 82 0 6
✅ e2e-local-postgres-tanstack-start-stable 82 0 6
✅ e2e-local-prod-nest-stable 82 0 6
✅ e2e-local-prod-tanstack-start-stable 82 0 6

📋 View full workflow run


Some E2E test jobs failed:

  • Vercel Prod: success
  • Local Dev: success
  • Local Prod: success
  • Local Postgres: failure
  • Windows: success

Check the workflow run for details.

⚠️ Community world tests failed (non-blocking):

  • Community Worlds: failure

Check the workflow run for details.

@VaguelySerious VaguelySerious marked this pull request as ready for review June 1, 2026 10:28
@VaguelySerious VaguelySerious requested a review from a team as a code owner June 1, 2026 10:28
@VaguelySerious VaguelySerious merged commit bdca8fc into stable Jun 1, 2026
92 of 97 checks passed
@VaguelySerious VaguelySerious deleted the peter/repro-ci-summary-improvements branch June 1, 2026 10:28