Skip to content

[DB-1951] Treat writes as idempotent if all events were previously written, even with check-only streams#5547

Merged
timothycoleman merged 1 commit intomasterfrom
timothycoleman/check-only-idempotency
Apr 13, 2026
Merged

[DB-1951] Treat writes as idempotent if all events were previously written, even with check-only streams#5547
timothycoleman merged 1 commit intomasterfrom
timothycoleman/check-only-idempotency

Conversation

@timothycoleman
Copy link
Copy Markdown
Contributor

@timothycoleman timothycoleman commented Mar 12, 2026

An idempotent multi-stream write retry should succeed even if a
check-only stream's expected version no longer matches. The events are
already written and the check-only stream's state may have changed
since, ultimately we resend the same response to the client as they
should have received the first time they sent the request.

Previously, a check-only stream (failing or not) caused the entire retry
to fail with WrongExpectedVersion.

@timothycoleman timothycoleman changed the base branch from master to timothycoleman/check-only-stream-sentinels March 12, 2026 11:19
@timothycoleman timothycoleman force-pushed the timothycoleman/check-only-stream-sentinels branch from 3faeb90 to 10a2369 Compare March 17, 2026 08:21
@timothycoleman timothycoleman force-pushed the timothycoleman/check-only-idempotency branch from a13f6d4 to a32b0cc Compare March 17, 2026 08:22
@timothycoleman timothycoleman changed the title Timothycoleman/check only idempotency [DB-1951] Timothycoleman/check only idempotency Mar 17, 2026
@linear
Copy link
Copy Markdown

linear Bot commented Mar 17, 2026

Base automatically changed from timothycoleman/check-only-stream-sentinels to master March 25, 2026 06:36
@timothycoleman timothycoleman force-pushed the timothycoleman/check-only-idempotency branch from a32b0cc to 66de44f Compare March 25, 2026 07:07
@timothycoleman timothycoleman changed the base branch from master to timothycoleman/msa-idempotency-tests March 30, 2026 06:59
@timothycoleman timothycoleman changed the title [DB-1951] Timothycoleman/check only idempotency [DB-1951] Treat writes as idempotent if all events were previously written, even with check-only streams Mar 30, 2026
Base automatically changed from timothycoleman/msa-idempotency-tests to master April 7, 2026 07:29
An idempotent multi-stream write retry should succeed even if a
check-only stream's expected version no longer matches. The events are
already written and the check-only stream's state may have changed
since, ultimately we resend the same response to the client as they
should have received the first time they sent the request.

Previously, a check-only stream (failing or not) caused the entire retry
to fail with WrongExpectedVersion.
@timothycoleman timothycoleman force-pushed the timothycoleman/check-only-idempotency branch from 66de44f to a5b911e Compare April 7, 2026 07:46
@timothycoleman timothycoleman marked this pull request as ready for review April 7, 2026 07:47
@timothycoleman timothycoleman requested a review from a team as a code owner April 7, 2026 07:47
Copilot AI review requested due to automatic review settings April 7, 2026 07:47
@qodo-code-review
Copy link
Copy Markdown
Contributor

Review Summary by Qodo

Treat idempotent writes as successful with failing check-only streams

🐞 Bug fix

Grey Divider

Walkthroughs

Description
• Add EventCount field to CommitCheckResult to distinguish check-only streams
• Treat idempotent writes as successful even if check-only streams fail version checks
• Update verification logic to only require streams with events to be idempotent
• Remove HasEventsToWrite from WritePreparesStreamInfo struct
Diagram
flowchart LR
  A["Multi-stream write retry"] --> B["Check all streams"]
  B --> C["Streams with events"]
  B --> D["Check-only streams"]
  C --> E["All idempotent?"]
  D --> F["Version mismatch?"]
  E -->|Yes| G["Success: AlreadyCommitted"]
  E -->|No| H["Failure: WrongExpectedVersion"]
  F -->|Yes| G
  F -->|No| H
Loading

Grey Divider

File Changes

1. src/KurrentDB.Core/Services/Storage/ReaderIndex/CommitCheckResult.cs ✨ Enhancement +10/-3

Add EventCount field to distinguish check-only streams

• Add EventCount parameter to constructor to track number of events being written
• Add EventCount property to store the event count
• Add HasEventsToWrite property that returns true when EventCount > 0
• Update constructor parameter order to place EventCount before EventStreamId
• Add XML documentation explaining the purpose of each constructor parameter

src/KurrentDB.Core/Services/Storage/ReaderIndex/CommitCheckResult.cs


2. src/KurrentDB.Core/Services/Storage/ReaderIndex/IndexWriter.cs ✨ Enhancement +18/-18

Update CommitCheckResult instantiations with EventCount

• Update all CommitCheckResult constructor calls to include EventCount parameter
• Pass eventIds.Length as the event count in all check commit methods
• Pass 0 for event count when returning error results
• Maintain consistent parameter ordering across all result instantiations

src/KurrentDB.Core/Services/Storage/ReaderIndex/IndexWriter.cs


3. src/KurrentDB.Core/Services/Storage/StorageWriterService.cs 🐞 Bug fix +36/-17

Refactor idempotency verification for check-only streams

• Remove HasEventsToWrite field from WritePreparesStreamInfo struct
• Use checkResult.HasEventsToWrite property instead of struct field
• Update VerifyCommitChecks to count streams with events separately
• Allow idempotent check-only streams without throwing exception
• Modify idempotency success condition to only require streams with events to be idempotent
• Return EventNumber.CheckOnlyFirst/Last for check-only streams in idempotent responses

src/KurrentDB.Core/Services/Storage/StorageWriterService.cs


View more (2)
4. src/KurrentDB.Core.Tests/Services/Storage/FakeIndexWriter.cs 🧪 Tests +2/-2

Update FakeIndexWriter with EventCount parameter

• Update CheckCommitStartingAt to pass 0 as event count parameter
• Update CheckCommit to pass eventIds.Length as event count parameter
• Align with new CommitCheckResult constructor signature

src/KurrentDB.Core.Tests/Services/Storage/FakeIndexWriter.cs


5. src/KurrentDB.Core.XUnit.Tests/TransactionLog/MultiStreamWrites/MultiStreamWritesTests.cs 🧪 Tests +18/-18

Update idempotency test expectations for check-only streams

• Update test expectations for WithFailedCheck idempotency kind from failure to success
• Update test expectations for WithSuccessfulCheck idempotency kind from failure to success
• Update test comments to reflect new behavior where check-only streams don't cause failures
• Apply changes across all applicable test cases with different expected versions and stream states

src/KurrentDB.Core.XUnit.Tests/TransactionLog/MultiStreamWrites/MultiStreamWritesTests.cs


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown
Contributor

qodo-code-review Bot commented Apr 7, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider

Great, no issues found!

Qodo reviewed your code and found no material issues that require review

Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates multi-stream write idempotency handling so that a retry can still succeed (returning an idempotent/AlreadyCommitted response) even when one or more check-only streams fail their expected-version checks, as long as all streams that actually had events to write were previously written.

Changes:

  • Add EventCount (and derived HasEventsToWrite) to CommitCheckResult so “check-only” vs “write” streams are carried through commit checking.
  • Update StorageWriterService.VerifyCommitChecks to treat retries as idempotent when all event-writing streams are idempotent, ignoring failures on check-only streams in that scenario.
  • Update multi-stream write tests to expect success for idempotent retries that include check-only streams.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/KurrentDB.Core/Services/Storage/StorageWriterService.cs Adjusts commit-check verification and response shaping to ignore check-only failures when all write-streams are idempotent.
src/KurrentDB.Core/Services/Storage/ReaderIndex/IndexWriter.cs Threads event-count through commit-check results so callers can identify check-only streams.
src/KurrentDB.Core/Services/Storage/ReaderIndex/CommitCheckResult.cs Extends commit-check result with EventCount/HasEventsToWrite.
src/KurrentDB.Core.XUnit.Tests/TransactionLog/MultiStreamWrites/MultiStreamWritesTests.cs Updates expectations so idempotent retries with check-only streams succeed.
src/KurrentDB.Core.Tests/Services/Storage/FakeIndexWriter.cs Updates test fake to match the new CommitCheckResult constructor signature.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/KurrentDB.Core/Services/Storage/StorageWriterService.cs
Comment thread src/KurrentDB.Core/Services/Storage/StorageWriterService.cs
@timothycoleman timothycoleman merged commit 9dfd1ae into master Apr 13, 2026
90 checks passed
@timothycoleman timothycoleman deleted the timothycoleman/check-only-idempotency branch April 13, 2026 13:26
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timothycoleman 👉 Created pull request targeting release/v26.0: #5577

alexeyzimarev added a commit that referenced this pull request Apr 16, 2026
* [DB-1964]: Added migration engine for secondary index tables (#5561)

* Initial version of the index DB migration engine

* Fixed regression

* Row must by passed by ref because its instance method mutates the struct fields

* Create schema from script only when fresh setup

* Removed version setup from the script file

* Fixed regression

* Removed redundant migration action

* Fixed occasional NRE

* Align guide with the source code

* Fixed typo

* Review feedback

* Fixed indentation

* Fixed identation

* [DB-1951] Add MSA idempotency tests that correspond to the existing behaviour (#5565)

* Add MSA idempotency tests that correspond to the existing behaviour

* chore: restructure CLAUDE.md with progressive disclosure and add architect review agent (#5572)

Restructure AI documentation for better context efficiency:
- CLAUDE.md reduced from 812 to 150 lines, keeping only rules that affect every coding decision
  (threading model, C# conventions, log levels, parameter design, naming)
- Extracted reference docs to .claude/docs/ with clear "when to fetch" pointers:
  architecture.md, api-v2-patterns.md, testing.md, protocol-v2.md, patterns-and-conventions.md
- New content from Tim Coleman's projections-v2 review (PR #5562):
  - TcsEnvelope<> vs CallbackEnvelope threading rule (was showing anti-pattern as correct)
  - Non-optional parameters / no-fallbacks convention
  - ISystemClient preference over raw IPublisher
  - ClaimsPrincipal threading-through requirement
  - Projections V2 engine architecture and threading model
- Add architect-review agent (.claude/agents/architect-review.md) encoding
  Tim Coleman's review methodology for future agentic code reviews

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Treat writes as idempotent if all events were previously written (#5547)

An idempotent multi-stream write retry should succeed even if a
check-only stream's expected version no longer matches. The events are
already written and the check-only stream's state may have changed
since, ultimately we resend the same response to the client as they
should have received the first time they sent the request.

Previously, a check-only stream (failing or not) caused the entire retry
to fail with WrongExpectedVersion.

* Fix incorrect result stream name for root partition in V2 projections

The double-dash `$projections-{name}--result` was produced when the
partition key was empty. Now uses `ProjectionNamesBuilder.MakeResultStreamName`
— consistent with V1's `MakePartitionResultStreamName`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [DB-1912] Persistent subscriptions: Fix wrong checkpoint when using pinned strategy (#5497)

* Add failing test

Co-Authored-By: Timothy Coleman <timothy.coleman@gmail.com>

* Fixed: Wrong persistent subscription checkpoints can be produced when the consumer strategy skips events

* When a consumer strategy skips an event, don't keep it in the buffer but add it to the retry list instead.
The checkpointing mechanism considers the lowest message in `_retry` - not the lowest message in `_buffer`.
Thus leaving it in the buffer may result in a wrong checkpoint being produced.

* Always update the last known event when a new sequence number has been assigned
A new sequence number being assigned means that we are seeing an event that was not processed before.
In other words, it's the last event we've read from the source stream,
so we update the last known sequence number / last known event position accordingly

* fix: only update the _lastKnown vars when event leaves the buffer

the newSequenceNumberAssigned check isn't enough, because in the NoMoreCapacity case we discard the message that the _lastKnown vars are based on.
then next time we call OutstandingMessage.ForPushedEvent the _lastKnownMessage we pass in will be wrong, potentially leading to an incorrect checkpoint (see test)

* safer and more efficient

- safer because Scan() says we shouldn't add to each linked list while scanning it. before this commit we could add to _retry while scanning through it. now we only add to _retry while scanning _buffer
- more efficient because before this commit adding to _retry involved searching through it, now we only add to _retry when skipping an event in _buffer in which case we know the event needs to go to the back of retry.

---------

Co-authored-by: Timothy Coleman <timothy.coleman@gmail.com>

* Use $ProjectionState event type for V2 state checkpoints instead of Result

because what we are doing here is checkpointing not outputting results.

"Result" event type can be used when we implement outputState() which would emit the state after every event processed (at least, as long as the state changed)

* [DB-1872] Improve pinned persistent subscription performance under burst load (#5576)

* Improve persistent subscription performance under burst load

When events arrive in a burst, each NotifyLiveSubscriptionMessage call
was scanning the entire buffer to push events to clients. Client acks
are queued behind the event flood in the same FIFO queue, so clients
appear full and every scan is fruitless. With N events, this results
in O(N²) total work.

Instead of pushing synchronously on event arrival, we now schedule a
push message on the PS queue. This lets events accumulate in the
buffer cheaply, gives acks a chance to interleave, and batches the
buffer scan.

All PushToClients call sites use the deferred path for consistency,
since the buffer scan cost applies equally to acks, nacks, timeouts,
and client changes.

* update package ref for cve CVE-2026-26171

* [DB-2006]: Migration to major .NEXT version (#5581)

* Migration to major .NEXT version

* Fixed dependency

* Fixed regression

* Fixed regression

* Removed item index

* review: fix regression

* restore previous order

---------

Co-authored-by: Timothy Coleman <timothy.coleman@gmail.com>

* Write V2 state checkpoints to -state streams instead of -result streams

Result streams are for user-visible output (outputState()), not internal
state persistence. V2 was writing $ProjectionState events to -result
streams, conflating checkpointing with output.

State now goes to dedicated -state streams:
- $projections-{name}-state (root partition)
- $projections-{name}-{partition}-state (per partition)

This keeps -result streams clean for future outputState() support and
avoids collisions with the -checkpoint stream used for position.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Rename V2 state event type to $ProjectionState.V2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Rename V2 checkpoint event type to $ProjectionCheckpoint.V2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Remove V1-to-V2 migration test that was silently passing

V2 can't parse V1's checkpoint events (V1 stores state in data,
position in metadata; V2 expects position in data), so it falls
back to starting from the beginning. The test passed because V2
reprocessed everything from scratch, not because migration worked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Rename ProjectionsStateStreamSuffix to ProjectionsResultStreamSuffix

The constant was confusingly named — it holds "-result", not "-state".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* better name...

---------

Co-authored-by: Roman Sakno <roman.sakno@kurrent.io>
Co-authored-by: Alexey Zimarev <alex@zimarev.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Shaan Nobee <sniper111@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants