Skip to content

Include failure context in splice events#4514

Open
jkczyz wants to merge 11 commits intolightningdevkit:mainfrom
jkczyz:2026-03-splice-rbf-fail-event
Open

Include failure context in splice events#4514
jkczyz wants to merge 11 commits intolightningdevkit:mainfrom
jkczyz:2026-03-splice-rbf-fail-event

Conversation

@jkczyz
Copy link
Copy Markdown
Contributor

@jkczyz jkczyz commented Mar 26, 2026

When a splice negotiation round fails, the user needs to know why it failed and what they can do about it. Previously, SpliceFailed gave no indication of the failure cause and didn't return the contribution, making it difficult to retry.

This PR adds failure context to the splice events so users can make informed retry decisions:

  • NegotiationFailureReason indicates what went wrong — peer disconnect, counterparty abort, invalid contribution, insufficient feerate, channel closing, etc. Each variant documents how to resolve it.
  • The FundingContribution from the failed round is returned in the event. Users can feed it back to funding_contributed to retry as-is, or inspect it to understand which inputs, outputs, and feerate were used when constructing a new contribution.
  • SplicePending and SpliceFailed are renamed to SpliceNegotiated and SpliceNegotiationFailed to reflect that each negotiation round (initial or RBF) independently resolves to one of these two outcomes.
  • DiscardFunding is now emitted before SpliceNegotiationFailed so wallet inputs are unlocked before the failure handler runs. Otherwise, a retry during SpliceNegotiationFailed handling would leave inputs locked, and the subsequent DiscardFunding would incorrectly unlock inputs committed to the new attempt.

Additionally, SpliceFundingFailed internals are simplified:

  • Dead funding_txo and channel_type fields are removed.
  • An unreachable was_negotiated check is removed from the contribution pop.
  • Inputs and outputs are derived from FundingContribution via a unified splice_funding_failed_for! macro, replacing the old approach of extracting them from FundingNegotiation variants. Unused conversion methods are removed.

Also fixes a bug in into_unique_contributions where outputs were compared by full TxOut equality instead of script_pubkey. This could fail to filter change outputs that reused the same address but had different values across RBF rounds.

@ldk-reviews-bot
Copy link
Copy Markdown

ldk-reviews-bot commented Mar 26, 2026

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@jkczyz jkczyz self-assigned this Mar 26, 2026
@jkczyz jkczyz force-pushed the 2026-03-splice-rbf-fail-event branch 2 times, most recently from 333980d to c2cad50 Compare March 28, 2026 00:08
@jkczyz jkczyz requested review from TheBlueMatt and wpaulino March 28, 2026 00:10
Comment thread lightning/src/events/mod.rs Outdated
/// sent an invalid message). Retry by calling [`ChannelManager::splice_channel`].
///
/// [`ChannelManager::splice_channel`]: crate::ln::channelmanager::ChannelManager::splice_channel
NegotiationError,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note we could have this wrap AbortReason -- we had an internal NegotiationError struct that already wrapped this along with contributions -- but it would require making AbortReason public. Not sure if we want to expose all its variants.

fn with_negotiation_failure_reason(mut self, reason: NegotiationFailureReason) -> Self {
match self {
QuiescentError::FailSplice(_, ref mut r) => *r = reason,
_ => debug_assert!(false, "Expected FailSplice variant"),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little unfortunate we need to do this, but I couldn't find a better way other than passing NegotiationFailureReason to abandon_quiescent_action and quiescent_action_into_error, which doesn't seem right since they could be for future QuiescentAction variants.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does feel like quiescent_action_into_error could take the NegotiationFailureReason and we could also use NegotiationFailureReason for dual-funding (the DiscardFunding variant) but this seems fine.

Comment on lines -1646 to -1649
/// The outpoint of the channel's splice funding transaction, if one was created.
abandoned_funding_txo: Option<OutPoint>,
/// The features that this channel will operate with, if available.
channel_type: Option<ChannelTypeFeatures>,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't feel like there was much value to these.

  • abandoned_funding_txo would be set only if there's a failure after the transaction was negotiated but before signed.
  • In the future, when channel_type can change, maybe it is just part of FundingContribution? Could also keep it if you prefer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

1 similar comment
@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 2nd Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

1 similar comment
@ldk-reviews-bot
Copy link
Copy Markdown

🔔 2nd Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@jkczyz jkczyz marked this pull request as ready for review April 2, 2026 17:04
Comment on lines +3185 to 3193
impl QuiescentError {
fn with_negotiation_failure_reason(mut self, reason: NegotiationFailureReason) -> Self {
match self {
QuiescentError::FailSplice(_, ref mut r) => *r = reason,
_ => debug_assert!(false, "Expected FailSplice variant"),
}
self
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: The debug_assert!(false) path is only reachable in non-production builds if a caller passes a non-FailSplice variant. Currently the only caller passes through quiescent_action_into_error which always produces FailSplice for production code (only DoNothing in test builds). But debug_assert!(false) will silently return self unchanged in release builds — consider using debug_assert! with a more specific message about which variant was unexpected, or restructuring so the method only accepts FailSplice to make this impossible.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +6943 to +6947
let is_initiator = pending_splice
.funding_negotiation
.take()
.map(|negotiation| negotiation.is_initiator())
.unwrap_or(false);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The is_initiator fallback when funding_negotiation is None defaults to false, which means splice_funding_failed_for! will return None for non-initiators (since the macro has None if !$is_initiator => None). This means if the funding negotiation was already taken before reset_pending_splice_state is called and the node was actually the initiator, the splice failure event would be silently dropped.

Is funding_negotiation ever None when reset_pending_splice_state is called? If so, defaulting to false may suppress failure events for the initiator. If not, this should be a debug_assert or .expect() to catch violations of that invariant.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added debug assertions.

Comment on lines +6989 to +6993
let is_initiator = pending_splice
.funding_negotiation
.as_ref()
.map(|negotiation| negotiation.is_initiator())
.unwrap_or(false);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same concern here: funding_negotiation.as_ref() being None causes is_initiator to default to false, which will cause the splice_funding_failed_for! macro to return None for non-initiators. Since maybe_splice_funding_failed is the read-only version used during serialization, silently returning None means the splice failure events won't be persisted and the user won't learn about the failure after restart.

Consider whether funding_negotiation can actually be None in this path, and if not, using expect to enforce the invariant.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added debug assertions.

Comment thread lightning/src/ln/funding.rs
@ldk-claude-review-bot
Copy link
Copy Markdown
Collaborator

ldk-claude-review-bot commented Apr 2, 2026

I've completed a thorough review of the entire PR diff. All previously flagged issues remain applicable, and I found no additional significant issues.

Review Summary

Previously flagged issues (still applicable):

  1. lightning/src/ln/channel.rs:3188with_negotiation_failure_reason silently no-ops in release for non-FailSplice variants.
  2. lightning/src/ln/channel.rs:7267is_initiator defaults to false when funding_negotiation is None, potentially suppressing initiator failure events.
  3. lightning/src/ln/channel.rs:7317 — Same is_initiator defaulting issue in maybe_splice_funding_failed (serialization path).
  4. lightning/src/ln/channel.rs:14023 — Duplicate quiescent action returns NegotiationFailureReason::Unknown because .with_negotiation_failure_reason() is not chained.
  5. lightning/src/ln/channelmanager.rs:11856 — Unnecessary .clone() on contribution.
  6. lightning/src/ln/funding.rs:714 — Changelog incomplete (missing outputs(), change_output(), value_added() public accessors).
  7. lightning/src/ln/channelmanager.rs:4118 — Inconsistent DiscardFunding emission for empty contributions vs. handle_quiescent_error.

Cross-cutting concern (still applicable):

  • Forward compatibility of NegotiationFailureReason: All 9 variant IDs are even (0, 2, 4, ..., 16). In impl_writeable_tlv_based_enum_upgradable!, unknown even variant IDs return Err(UnknownRequiredFeature). Since reason uses upgradable_option in the reader, MaybeReadable is invoked — but an unknown even variant ID still produces an Err that propagates, failing event 52 deserialization and preventing ChannelManager loading when a newer writer adds a new even variant. Future variants should use odd IDs for graceful degradation to Unknown.

Items verified as correct:

  • Event 52 TLV backward/forward compatibility for existing fields.
  • DiscardFunding -> SpliceNegotiationFailed ordering is consistently maintained at all emission sites via SpliceFundingFailed::into_parts().
  • The adjacency assertion in get_and_clear_pending_events is sound — each pair is emitted atomically under a single lock, and multi-channel interleaving preserves per-channel adjacency.
  • The splice_funding_failed_for! macro correctly filters current-round contributions against prior rounds using into_unique_contributions.
  • The promotion path (splice_locked) correctly derives DiscardFunding from contributions filtered against the promoted transaction.
  • The new can_initiate_rbf check in the stfu handler at line 14155 properly prevents sending tx_init_rbf when RBF is impossible and fails with the correct CannotInitiateRbf reason.
  • The serialization path correctly injects temporary events and truncates after write.
  • The into_unique_contributions bug fix (comparing by script_pubkey instead of full TxOut) correctly handles change outputs that reuse the same address with different values across RBF rounds.
  • NegotiationFailureReason assignment is correct at all sites (ChannelClosing, PeerDisconnected, CounterpartyAborted, NegotiationError, LocallyAbandoned, ContributionInvalid, FeeRateTooLow, CannotInitiateRbf).

No new issues found beyond those previously reported.

@jkczyz jkczyz force-pushed the 2026-03-splice-rbf-fail-event branch from c2cad50 to c469423 Compare April 3, 2026 01:25
@jkczyz
Copy link
Copy Markdown
Contributor Author

jkczyz commented Apr 3, 2026

Rebased

Comment thread lightning/src/ln/channelmanager.rs Outdated
@@ -4128,6 +4118,16 @@ impl<
},
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Unlike handle_quiescent_error (line ~6619), this site unconditionally emits DiscardFunding even when contributed_inputs and contributed_outputs are both empty (which happens for the initiator when all inputs are committed to prior rounds). The same inconsistency exists at lines ~4417, ~4920, ~11853, ~12012, ~12180, ~12329, ~15448, and ~18122.

Consider adding the same guard here:

if !splice_funding_failed.contributed_inputs.is_empty() || !splice_funding_failed.contributed_outputs.is_empty() {
    // emit DiscardFunding
}

Or, equivalently, have the macro return None for the initiator when there are no unique contributions (matching the non-initiator behavior) and emit SpliceNegotiationFailed without DiscardFunding in those cases.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by making SpliceFundingFailed consumable into Option-al parts rather than exposing fields. The into_parts method now checks for empty contributions.

Comment thread lightning/src/ln/channelmanager.rs Outdated
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 3, 2026

Codecov Report

❌ Patch coverage is 73.36245% with 61 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.99%. Comparing base (2adb690) to head (62847a1).
⚠️ Report is 59 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/channelmanager.rs 65.45% 35 Missing and 3 partials ⚠️
lightning/src/events/mod.rs 25.00% 18 Missing ⚠️
lightning/src/ln/channel.rs 93.97% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4514      +/-   ##
==========================================
- Coverage   87.09%   86.99%   -0.11%     
==========================================
  Files         163      162       -1     
  Lines      108856   108969     +113     
  Branches   108856   108969     +113     
==========================================
- Hits        94808    94795      -13     
- Misses      11563    11690     +127     
+ Partials     2485     2484       -1     
Flag Coverage Δ
fuzzing 38.29% <21.14%> (-1.91%) ⬇️
tests 86.08% <73.36%> (-0.12%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jkczyz jkczyz force-pushed the 2026-03-splice-rbf-fail-event branch 2 times, most recently from 4f6b1a6 to 0609565 Compare April 3, 2026 15:33
@jkczyz
Copy link
Copy Markdown
Contributor Author

jkczyz commented Apr 3, 2026

  • Forward compatibility of NegotiationFailureReason: Uses impl_writeable_tlv_based_enum! with all-even variant IDs. A future variant added by a newer LDK version would cause UnknownRequiredFeature on deserialization, blocking ChannelManager loading on older versions. Consider impl_writeable_tlv_based_enum_upgradable! or documenting the constraint.

Changed to use impl_writeable_tlv_based_enum_upgradable and changed the reason from default_value to upgradable_option, defaulting to NegotiationFailureReason::Unknown.

@jkczyz jkczyz force-pushed the 2026-03-splice-rbf-fail-event branch from 0609565 to 44ecc06 Compare April 3, 2026 15:39
@ldk-reviews-bot
Copy link
Copy Markdown

🔔 3rd Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

1 similar comment
@ldk-reviews-bot
Copy link
Copy Markdown

🔔 3rd Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 4th Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@jkczyz jkczyz force-pushed the 2026-03-splice-rbf-fail-event branch from 48f950f to 4c64e31 Compare April 7, 2026 16:54
@jkczyz
Copy link
Copy Markdown
Contributor Author

jkczyz commented Apr 7, 2026

Regarding the earlier fuzz failure in CI, it appears that #4536 fixes it.

@jkczyz jkczyz force-pushed the 2026-03-splice-rbf-fail-event branch from 4c64e31 to 0287ceb Compare April 7, 2026 20:01
@jkczyz jkczyz requested a review from TheBlueMatt April 7, 2026 20:01
@jkczyz
Copy link
Copy Markdown
Contributor Author

jkczyz commented Apr 7, 2026

Latest push looks identical but fixes a rebase screw-up causing tests to not pass on each commit.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 5th Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

jkczyz and others added 7 commits April 8, 2026 14:31
Reverse the event ordering at all emission sites so that
Event::DiscardFunding is emitted before Event::SpliceFailed. If the
user retries the splice when handling SpliceFailed, the contributed
inputs would still be locked. A subsequent DiscardFunding would then
incorrectly unlock inputs that are now committed to the new attempt.
Emitting DiscardFunding first avoids this by ensuring inputs are
unlocked before any retry occurs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename Event::SplicePending to Event::SpliceNegotiated and
Event::SpliceFailed to Event::SpliceNegotiationFailed. These names
better reflect the per-round semantics: each negotiation attempt
resolves to one of these two outcomes, independent of the overall
splice lifecycle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The was_negotiated check is unnecessary because reset_pending_splice_state
only runs when funding_negotiation is present, meaning
on_tx_signatures_exchange hasn't been called yet. Since the feerate is
only recorded in last_funding_feerate_sat_per_1000_weight during
on_tx_signatures_exchange, the current round's feerate can never match
it. So the contribution can always be unconditionally popped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Filter outputs by script_pubkey rather than full TxOut equality. Outputs
reusing the same address as a prior round are still considered committed
even if the value differs (e.g., different change amounts across RBF
rounds with different feerates).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the maybe_create_splice_funding_failed! macro and
splice_funding_failed_for method with a unified splice_funding_failed_for!
macro that derives contributed inputs and outputs from the
FundingContribution rather than extracting them from the negotiation
state.

Callers pass ident parameters for which PendingSplice filtering methods
to use: contributed_inputs/contributed_outputs when the current round's
contribution has been popped or was never pushed, and
prior_contributed_inputs/prior_contributed_outputs for the read-only
persistence path where the contribution is cloned instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hods

Now that splice_funding_failed_for! derives inputs and outputs from
FundingContribution directly, remove the unused NegotiationError struct
and into_negotiation_error methods from the interactive tx types, along
with the into/to_contributed_inputs_and_outputs methods on
ConstructedTransaction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@jkczyz
Copy link
Copy Markdown
Contributor Author

jkczyz commented Apr 9, 2026

I was looking into some of the fuzz failures in #4504, and it seems we attempt to send tx_init_rbf after splice_locked. Here's the scenario:

  • Node 1 initiates and negotiates a splice with Node 0
  • Node 1 initiates an RBF and sends stfu
  • Node 1 sees the splice tx confirm and sends splice_locked
  • Node 0 responds with stfu
  • Node 1 sends tx_init_rbf because pending_splice is still Some given it hasn't received splice_locked from Node 0 yet
  • Node 0 rejects the tx_init_rbf since Node 1 already sent splice_locked

So question is what do we do in this scenario?

We'll at least need to emit a SpliceNegotiationFailed event though we may need another NegotiationFailureReason.

But we are still left in quiescence with nothing to do. We shouldn't WarnAndDisconnect because it's not Node 0's fault. We could:

  • do nothing and let quiescence time out
  • send tx_init_rbf anyway for our counterparty to disconnect us
  • something else?

Fuzz input is 784009b5f5f500a10f1a5cac23f584fff500a10f1a5cac23f584fffd80.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 6th Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 7th Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 2nd Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@TheBlueMatt
Copy link
Copy Markdown
Collaborator

We shouldn't WarnAndDisconnect because it's not Node 0's fault

Discussed offline but imo yes we should. warnings aren't just for error conditions that are the peer's fault, warns are supposed to be ignored and are just human/dev-readable debug info - we can freely send something even if its our fault.

@jkczyz
Copy link
Copy Markdown
Contributor Author

jkczyz commented Apr 13, 2026

Discussed offline but imo yes we should. warnings aren't just for error conditions that are the peer's fault, warns are supposed to be ignored and are just human/dev-readable debug info - we can freely send something even if its our fault.

Done. PTAL.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 8th Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 3rd Reminder

Hey @TheBlueMatt @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

jkczyz and others added 2 commits April 15, 2026 21:07
If splice_locked is sent between our outgoing STFU and the
counterparty's STFU response, the stfu() handler would proceed to
send tx_init_rbf for an already-confirmed splice. Guard against this
by re-checking can_initiate_rbf when entering quiescence. Disconnect
because there is no way to cancel quiescence after both sides have
exchanged STFU.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a splice funding is promoted, produce FundingInfo::Contribution
instead of FundingInfo::Tx for the discarded funding events. Each
contribution is filtered against the promoted funding transaction's
inputs and outputs, so only inputs and outputs unique to the discarded
round are reported.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jkczyz jkczyz force-pushed the 2026-03-splice-rbf-fail-event branch from ee5e7a8 to 62847a1 Compare April 16, 2026 02:20
@jkczyz
Copy link
Copy Markdown
Contributor Author

jkczyz commented Apr 16, 2026

Pushed changes here addressing #4536 (comment) and #4536 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants