Splice fixes for chanmon_consistency fuzz target#4655
Conversation
If the input ends immediately after `tx_signatures`, the corresponding `SpliceNegotiated` event may still be pending, leaving the valid interactive funding transaction in the test broadcaster.
Now that the fuzz target supports canceling splice funding attempts, we may see failed signing attempts due to the cancellation.
LDK and the chanmon_consistency fuzz target have grown in complexity recently and thus require more iterations than previously assumed to fully settle the state of all active channels.
This replicates what a node in production would see, and is necessary to replay certain actions within LDK that happened prior to the reload.
|
👋 Thanks for assigning @TheBlueMatt as a reviewer! |
| for (idx, node) in nodes.iter().enumerate() { | ||
| if node.broadcaster.txn_broadcasted.borrow().is_empty() { | ||
| continue; | ||
| } | ||
|
|
||
| let pending_events = node.get_and_clear_pending_events(); |
There was a problem hiding this comment.
The verification is one-directional: for each broadcast tx, you check a matching SpliceNegotiated event exists, but you don't verify the reverse — that every pending SpliceNegotiated event has a corresponding broadcast tx. A node with no broadcasts (skipped at line 2017) but with dangling SpliceNegotiated events passes silently.
Consider also checking nodes that have no broadcasts but do have pending SpliceNegotiated events, as that would indicate the splice tx was not broadcast despite an event being generated.
| for (idx, node) in nodes.iter().enumerate() { | |
| if node.broadcaster.txn_broadcasted.borrow().is_empty() { | |
| continue; | |
| } | |
| let pending_events = node.get_and_clear_pending_events(); | |
| for (idx, node) in nodes.iter().enumerate() { | |
| let pending_events = node.get_and_clear_pending_events(); | |
| if node.broadcaster.txn_broadcasted.borrow().is_empty() { | |
| assert!( | |
| !pending_events.iter().any(|e| matches!(e, events::Event::SpliceNegotiated { .. })), | |
| "node {} has pending SpliceNegotiated event(s) but no broadcast tx", | |
| idx, | |
| ); | |
| continue; | |
| } |
Review Summary1 inline comment posted:
No other issues found. The remaining changes are correct:
|
|
🔔 1st Reminder Hey @TheBlueMatt! This PR has been waiting for your review. |
|
🔔 2nd Reminder Hey @TheBlueMatt! This PR has been waiting for your review. |
|
🔔 3rd Reminder Hey @TheBlueMatt! This PR has been waiting for your review. |
| continue; | ||
| } | ||
|
|
||
| let pending_events = node.get_and_clear_pending_events(); |
There was a problem hiding this comment.
Is it not a cleaner fix to just handle the events here? We could filter the events by FundingNegotiated if we want to retain previous behavior, though not sure it matters much.
| use std::sync::{Arc, Mutex}; | ||
|
|
||
| const MAX_FEE: u32 = 10_000; | ||
| const MAX_SETTLE_ITERATIONS: usize = 256; |
| assert!(nodes[0].broadcaster.txn_broadcasted.borrow().is_empty()); | ||
| assert!(nodes[1].broadcaster.txn_broadcasted.borrow().is_empty()); | ||
| assert!(nodes[2].broadcaster.txn_broadcasted.borrow().is_empty()); | ||
| // Broadcast transactions are handled explicitly. If the input ends immediately after |
There was a problem hiding this comment.
For context, #4657 is changing chanmon_consistency broadcast handling so broadcast transactions are routed through an explicit modeled mempool, with finish draining remaining broadcasts through relay/mining cleanup.
There was a problem hiding this comment.
Should we fix this failure there, then?
There was a problem hiding this comment.
Something needs to be re-aligned either way, depending on merge order. Not sure what parts of this PR can be omitted with the more realistic mempool.
Prefer to keep #4657 a pure refactor though, and not include splice bug fixes. It's big enough already.
| if !self.nodes[node_idx].deferred { | ||
| self.nodes[node_idx].checkpoint_manager_persistence(); | ||
| } | ||
| let pre_reload_height = self.nodes[node_idx].height; |
There was a problem hiding this comment.
I think this pins the node's replay height to the manager height after reload, but can't this skip part of startup sync if an older monitor was reloaded below that height?
While this target found several issues in our production code, there were also issues with the fuzz target itself, which this PR addresses. It fixes the following payloads from #4363: