feat(core): add retry for transient HTTP and stream-level errors#13
Conversation
Add retry mechanism driven by pi settings.json retry.provider config (timeoutMs, maxRetries, maxRetryDelayMs). HTTP-level retries handle 429/5xx with exponential backoff and jitter, respecting Retry-After headers (seconds and HTTP-date formats). Stream-level retries handle cases where the API returns 200 OK but sends an error event in the stream body. Retries only when no content has been emitted yet. Per-attempt timeout via AbortController with automatic retry. Clean abort propagation through the retry loop.
patlux
left a comment
There was a problem hiding this comment.
Thanks for the retry work and the broad test coverage. I’m requesting changes for a few blockers before merge: CI is currently red, the per-attempt timeout is cleared before stream consumption so it does not actually protect hanging streams, maxRetryDelayMs: 0 behaves opposite to the documented contract, and the default retry count diverges from pi’s provider retry default/visibility model.
|
Thanks for the detailed review — pushed
|
patlux
left a comment
There was a problem hiding this comment.
Thanks for the follow-up. Most of the previous points are addressed in the final tree, but I still see two blockers: CI is still red because Gitleaks scans the PR commit history, and the stream timeout retry path can now retry after content has already been emitted, which mixes partial output from multiple attempts.
Align provider maxRetries default with pi (0), treat maxRetryDelayMs 0 as no cap, keep per-attempt timeout through stream reads, and fix gitleaks test fixtures.
|
Thanks again for the second review — pushed an update on Stream timeout + partial output: Regression test: Re Gitleaks / commit history: noted your “ignore that” — CI is green on our side with the allowlist entry. Let me know if anything else should change before merge. |
patlux
left a comment
There was a problem hiding this comment.
All requested changes addressed. CI is green.
|
Shipped in |
|
Thank you for the contribution! This is included in |
Summary
Add configurable retry mechanism for transient failures in
streamCommandCode, responding totimeoutMs,maxRetries, andmaxRetryDelayMsfrom pisettings.jsonretry provider config.Retry-Afterheader (seconds and HTTP-date formats).AbortController; timed-out attempts retry immediately without backoff delay.finishedflag, clearsoutput.content, and distinguishes timeout from user-initiated abort viaattemptTimedOutflag to prevent "Agent is already processing" errors on retry.Changes
src/core.tssrc/types.tsStreamOptionsfields:timeoutMs,maxRetries,maxRetryDelayMstests/test-retry.tstests/helpers.tsresponseDelaysupport for timeout testingCHANGELOG.mdTest plan
npm test— all 68 tests passnpm run typecheck— no type errorsnpm run format:check— Prettier clean