Skip to content

fix: skip retries for non-retryable errors via NonRetryable marker interface#1157

Open
qozle wants to merge 1 commit intoanthropics:mainfrom
qozle:fix/non-retryable-errors
Open

fix: skip retries for non-retryable errors via NonRetryable marker interface#1157
qozle wants to merge 1 commit intoanthropics:mainfrom
qozle:fix/non-retryable-errors

Conversation

@qozle
Copy link
Copy Markdown
Contributor

@qozle qozle commented Apr 4, 2026

Summary

Fixes #1081. retryWithBackoff retried all errors unconditionally, including WorkflowValidationSkipError (thrown on 401 workflow-not-on-default-branch failures), wasting ~35 seconds per occurrence across 3 retry attempts.

Approach

This PR introduces a NonRetryable marker interface rather than requiring callers to pass an explicit shouldRetry predicate (the approach in #1082). Errors self-declare their retryability:

// src/utils/errors.ts (new)
export interface NonRetryable {
  readonly retryable: false;
}

WorkflowValidationSkipError now implements NonRetryable:

export class WorkflowValidationSkipError extends Error implements NonRetryable {
  readonly retryable = false as const;
  // ...
}

retryWithBackoff defaults to skipping retry for any NonRetryable error, so no changes are needed at existing call sites. Future non-retryable errors (e.g. an HttpError for 4xx status codes, tracked in #1156) get the same behavior automatically just by implementing the interface.

The shouldRetry option is also added to RetryOptions for callers that need explicit control.

Additional fixes

  • prepare.ts gap: WorkflowValidationSkipError was not caught in prepare.ts, causing the action to setFailed instead of silently skipping. run.ts already handled this correctly — this PR brings prepare.ts in line.
  • Spurious log message: The non-retryable check is now guarded inside attempt < maxAttempts, so "aborting retries" is never printed on the final attempt of a normal retryable error.
  • Misleading comments in github-file-ops-server.ts: Two call sites had comments claiming non-403 errors would "fail immediately without retry" — they did not. Comments updated to reflect actual behavior. The underlying fix for those call sites is tracked in retryWithBackoff: non-retryable intent in github-file-ops-server.ts is unenforced #1156.

Tests

17 new tests in test/retry.test.ts covering:

  • Baseline retry behavior (preserved)
  • NonRetryable default (no explicit shouldRetry needed)
  • Explicit shouldRetry option
  • Edge cases: maxAttempts=0, non-Error thrown values, shouldRetry called-with assertion
  • Integration: real WorkflowValidationSkipError is not retried

All 670 existing tests continue to pass.

…terface

retryWithBackoff retried all errors unconditionally, including deterministic
failures like WorkflowValidationSkipError (401 workflow-not-on-default-branch),
wasting ~35 seconds per occurrence.

Introduces a NonRetryable marker interface in src/utils/errors.ts. Any error
class that sets `retryable = false as const` is automatically skipped by
retryWithBackoff without requiring callers to pass a custom shouldRetry option.
The shouldRetry option is also added to RetryOptions for explicit control.

WorkflowValidationSkipError now implements NonRetryable, fixing the primary
reported case. The shouldRetry check is guarded inside `attempt < maxAttempts`
to prevent spurious "aborting retries" log messages on the final attempt.

Also fixes a gap in prepare.ts: WorkflowValidationSkipError was not caught
there, causing the action to fail instead of skip (run.ts already handled
this correctly). Updates misleading comments in github-file-ops-server.ts
that claimed non-403 errors would not be retried — they were. See anthropics#1156 for
the follow-up fix.

Closes anthropics#1081
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Don't retry non-retryable errors (401, 4xx) in retryWithBackoff

1 participant