fuzz: model chanmon mempool mining#4657

Open
joostjager wants to merge 2 commits into lightningdevkit:main from joostjager:chanmon-mempool-mining

@joostjager
Contributor

This prepares chanmon_consistency for force-close fuzzing by making its chain model closer to the environment LDK sees in normal operation.

Force-close scenarios depend heavily on transaction timing: claims may be broadcast, replaced, confirmed, followed by additional claims, and later become spendable only after more blocks. The previous harness mostly folded transaction confirmation into sync-style actions, which made it harder to express those flows accurately and made future force-close coverage depend on shortcuts in the test harness.

The updated model gives the harness an explicit mempool and block-mining path. Broadcast transactions can be relayed into the modeled mempool, mined into harness blocks, and then replayed to both monitors and managers through chain callbacks. The harness also tracks confirmed UTXOs and wallet change so later splice, anchor, and claim transactions have a realistic view of what can be spent.
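
A minimal sketch of the relay-then-mine flow described above, assuming simplified stand-in types. `ChainModel`, `Tx`, `relay`, and `mine_block` are hypothetical names for illustration and are not the harness's actual code; replacement (RBF), fees, and chain callbacks are left out.

```rust
use std::collections::HashSet;

// Hypothetical, simplified stand-ins; not the harness's real types.
type TxId = u32;
type OutPoint = (TxId, u32); // (txid, output index)

#[derive(Clone, Debug, PartialEq)]
struct Tx {
    id: TxId,
    inputs: Vec<OutPoint>,
    num_outputs: u32,
}

#[derive(Default)]
struct ChainModel {
    utxos: HashSet<OutPoint>, // confirmed, unspent outputs
    mempool: Vec<Tx>,         // relayed but unconfirmed, in arrival order
    blocks: Vec<Vec<Tx>>,     // blocks[i] holds the transactions of block i + 1
}

impl ChainModel {
    /// Relay a broadcast into the mempool. It is accepted only if every input
    /// spends a confirmed UTXO or an output of an earlier mempool transaction,
    /// and no mempool transaction already spends that input.
    fn relay(&mut self, tx: Tx) -> bool {
        let spendable = |op: &OutPoint| {
            self.utxos.contains(op)
                || self.mempool.iter().any(|m| m.id == op.0 && op.1 < m.num_outputs)
        };
        let already_spent =
            |op: &OutPoint| self.mempool.iter().any(|m| m.inputs.contains(op));
        if tx.inputs.iter().all(|op| spendable(op) && !already_spent(op)) {
            self.mempool.push(tx);
            true
        } else {
            false
        }
    }

    /// Mine the whole mempool into one block and update the confirmed UTXO
    /// set. Returns the new chain height.
    fn mine_block(&mut self) -> usize {
        let txs = std::mem::take(&mut self.mempool);
        for tx in &txs {
            for op in &tx.inputs {
                self.utxos.remove(op);
            }
            for vout in 0..tx.num_outputs {
                self.utxos.insert((tx.id, vout));
            }
        }
        self.blocks.push(txs);
        self.blocks.len()
    }
}
```

Because `relay` only admits a child after its parent, arrival order is also a valid confirmation order, which is why `mine_block` can apply the transactions front to back.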

This should make upcoming force-close fuzzing changes easier to review: first establish a more faithful chain environment, then add the force-close-specific scenarios and invariants on top of it.

@ldk-reviews-bot

ldk-reviews-bot commented Jun 2, 2026

👋 Thanks for assigning @jkczyz as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@joostjager
Contributor Author

@wpaulino FYI, this may intersect with the existing splice fuzzing failures you’re looking at.

This PR makes splice transaction mining in chanmon_consistency more realistic: negotiated splice transactions no longer get implicitly confirmed through the old sync-style path. Instead, broadcasts have to pass through the harness’s modeled mempool and are confirmed only when the fuzz input mines blocks.

@joostjager joostjager force-pushed the chanmon-mempool-mining branch from b9a40e5 to bbbd224 Compare June 3, 2026 14:59
Restore cfg(splicing) to the fuzz check-cfg allow list and gate
chanmon consistency splice opcodes on that cfg again. Without the
cfg, those inputs stop before executing splice-specific operations.
@joostjager joostjager force-pushed the chanmon-mempool-mining branch 3 times, most recently from 4e6f262 to d02194d Compare June 4, 2026 05:19
@joostjager
Contributor Author

Fuzz failure is pre-existing

@joostjager joostjager force-pushed the chanmon-mempool-mining branch 2 times, most recently from 0ab5882 to ef0be5b Compare June 4, 2026 09:24
@joostjager joostjager marked this pull request as ready for review June 4, 2026 10:04
@joostjager joostjager removed the request for review from valentinewallace June 4, 2026 10:04
Comment thread fuzz/src/chanmon_consistency.rs
Comment thread fuzz/src/chanmon_consistency.rs Outdated
Comment thread fuzz/src/chanmon_consistency.rs Outdated
Comment on lines +1184 to +1236
// Connects this node from its tracked height to target_height, delivering
// each relevant chain callback to both ChainMonitor and ChannelManager.
fn connect_chain_range(&mut self, chain_state: &ChainState, target_height: u32) {
    // Mining commands can advance the harness chain by more than one block.
    // Transaction blocks must be connected explicitly so LDK learns about
    // on-chain spends, while empty depth blocks still need best-block
    // updates so CSV/CLTV-sensitive logic can run. This is the normal sync
    // path, so both the raw ChainMonitor and the ChannelManager receive the
    // callbacks and the node's tracked height advances to the target.
    let mut height = self.height;
    while height < target_height {
        let mut next_height = height + 1;
        while next_height <= target_height && chain_state.block_at(next_height).1.is_empty() {
            next_height += 1;
        }
        if next_height > target_height {
            // The rest of the range is empty. One best-block update to the
            // final height is enough because LDK's Confirm API explicitly
            // allows best_block_updated to skip intermediary blocks, and
            // empty blocks have no transactions_confirmed calls whose chain
            // order must be preserved.
            height = target_height;
            let (header, _) = chain_state.block_at(height);
            self.monitor.best_block_updated(header, height);
            self.node.best_block_updated(header, height);
            break;
        }
        if next_height > height + 1 {
            // Advance across the empty prefix before the next transaction
            // block. Confirm::best_block_updated may skip intermediary
            // blocks, so this compressed update still lets height-triggered
            // LDK work run at the correct tip before the transaction
            // confirmations are connected.
            height = next_height - 1;
            let (header, _) = chain_state.block_at(height);
            self.monitor.best_block_updated(header, height);
            self.node.best_block_updated(header, height);
        }
        height = next_height;
        let (header, txn) = chain_state.block_at(height);
        let txdata: Vec<_> = txn.iter().enumerate().map(|(i, tx)| (i + 1, tx)).collect();
        if !txdata.is_empty() {
            // Transaction blocks need the explicit confirmation callback
            // before the best-block update so watched spends are delivered
            // in chain order before the node advances to that tip.
            self.monitor.transactions_confirmed(header, &txdata, height);
            self.node.transactions_confirmed(header, &txdata, height);
        }
        self.monitor.best_block_updated(header, height);
        self.node.best_block_updated(header, height);
    }
    self.height = target_height;
}
Collaborator

Good improvement: the old sync_with_chain_state only notified self.node (ChannelManager), never self.monitor (ChainMonitor). This means ChannelMonitors weren't receiving transactions_confirmed / best_block_updated callbacks during normal fuzz-loop sync, which would have hidden any bugs in monitor chain-tracking logic. The new connect_chain_range fixing this is an important correctness improvement.

One behavioral change worth noting: the old code called best_block_updated for every individual block height. The new code batches consecutive empty blocks into a single best_block_updated call at the last empty height before the next tx block (or at the target). This is allowed by the Confirm trait ("May be skipped for intermediary blocks"), but it means height-triggered logic inside ChainMonitor/ChannelManager runs at fewer checkpoints than before. For a fuzzer seeking maximum coverage of height-sensitive paths (e.g., HTLC timeout detection), individual block delivery would exercise more code paths.

Contributor Author

You could also argue that skipping exercises different paths. I think in practice, we'll get a bit from both.
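
The trade-off in this thread can be made concrete with a small sketch that only plans the callback sequence instead of calling into LDK. `Callback` and `plan_callbacks` are hypothetical names, and `has_txs` stands in for checking whether `chain_state.block_at(height)` holds transactions.

```rust
#[derive(Debug, PartialEq)]
enum Callback {
    TransactionsConfirmed(u32), // height of a block that contains transactions
    BestBlockUpdated(u32),      // new tip height
}

/// Plan the callbacks for advancing from `from` to `to`. `has_txs(h)` reports
/// whether block `h` contains transactions. With `batch_empty` set, each run
/// of empty blocks collapses into one best-block update at the end of the run.
fn plan_callbacks(
    from: u32,
    to: u32,
    has_txs: impl Fn(u32) -> bool,
    batch_empty: bool,
) -> Vec<Callback> {
    let mut plan = Vec::new();
    for height in (from + 1)..=to {
        if has_txs(height) {
            // Confirmations are delivered before the tip moves to this block.
            plan.push(Callback::TransactionsConfirmed(height));
            plan.push(Callback::BestBlockUpdated(height));
        } else {
            // When batching, only the last block of an empty run is announced.
            let run_ends_here = height == to || has_txs(height + 1);
            if !batch_empty || run_ends_here {
                plan.push(Callback::BestBlockUpdated(height));
            }
        }
    }
    plan
}
```

For a six-block range with transactions only in block 4, the batched plan has four callbacks (tips 3, 4, and 6, plus the confirmation at 4) while the block-by-block plan has seven, which is the difference in height checkpoints being discussed.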

Comment thread fuzz/src/chanmon_consistency.rs
@ldk-claude-review-bot
Collaborator

ldk-claude-review-bot commented Jun 4, 2026

After a thorough re-examination of every hunk in this diff, I found no new issues beyond those already covered in my prior review pass.

Summary

Prior issues (still applicable):

  1. fuzz/src/chanmon_consistency.rs:344 - Off-by-one in locktime validation: self.tip_height() < lock_time should be self.tip_height() + 1 < lock_time to match real mempool semantics (IsFinalTx evaluates against tip + 1). Will cause false assertion failures for HTLC-timeout transactions when force-close fuzzing is added.

  2. fuzz/src/chanmon_consistency.rs:2706 - finish loop doesn't call process_events between relay/mine rounds. Fine for the current scope but will need addressing for force-close fuzzing (anchor fee-bumping requires event processing to produce broadcast transactions).

  3. fuzz/src/chanmon_consistency.rs:3671 - break 'fuzz_loop on splice commands when !cfg!(splicing) discards the entire remainder of fuzz input, reducing corpus utility when splicing is disabled. continue would preserve coverage of the post-splice portion.

  4. fuzz/src/chanmon_consistency.rs:1236 - Empty-block batching in connect_chain_range reduces height-trigger coverage compared to block-by-block delivery (allowed by Confirm API, but trades fuzz coverage for speed).
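
For reference when weighing item 1, here is a sketch of the height-based finality rule that Bitcoin Core applies to a transaction entering the mempool, where the candidate block height is tip + 1. `final_for_next_block` is a hypothetical helper, not harness code, and it ignores sequence-number finality and time-based locks. Which harness comparison it corresponds to depends on whether the expression at line 344 guards acceptance or rejection, which the excerpt does not show.

```rust
/// Lock times below this value are block heights; at or above it they are
/// UNIX timestamps (Bitcoin's LOCKTIME_THRESHOLD).
const LOCKTIME_THRESHOLD: u32 = 500_000_000;

/// Whether a height-locked transaction may be included in a block built on
/// top of `tip_height`. Sequence-number finality and time-based locks are
/// deliberately ignored in this sketch.
fn final_for_next_block(lock_time: u32, tip_height: u32) -> bool {
    assert!(lock_time < LOCKTIME_THRESHOLD, "sketch covers height-based locks only");
    // The candidate block has height tip + 1, and the rule is
    // lock_time < candidate block height.
    lock_time < tip_height + 1
}
```

Under this rule a transaction with lock time N becomes relayable once the tip reaches N, and not while the tip is still at N - 1.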

Verification notes from re-review:

  • RBF conflict resolution with descendant eviction is correct: topological mempool ordering ensures the cascade propagates properly through the single-pass iteration.
  • UTXO tracking between ChainState and wallet sources is consistent; wallet.remove_utxo is a no-op for unknown outpoints (uses retain).
  • connect_chain_range correctly gates per-component callbacks with independent start heights, enabling proper startup sync after reload where ChainMonitor and ChannelManager can have different persisted best blocks.
  • The is_ldk_commitment_obscured_locktime exemption is correctly scoped to the time-lock assertion only.
  • The safe_mine_block_count clamping and the finish loop's interaction with HTLC expiry deadlines are sound.
  • Wallet seeding (3 wallets x 50 UTXOs) with distinct coinbase txids (different output scripts) is correct.
  • The settle_all addition of sync_with_chain_state at the top is a good improvement ensuring all nodes see the latest chain state before attempting settlement.
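
The first verification note, on single-pass descendant eviction, can be illustrated with a small sketch. It assumes a mempool kept in topological order (parents before children); `Tx` and `evict_conflicts` are hypothetical names and do not mirror the harness's actual data structures.

```rust
use std::collections::HashSet;

// Hypothetical, simplified stand-ins; not the harness's real types.
type TxId = u32;
type OutPoint = (TxId, u32); // (txid, output index)

struct Tx {
    id: TxId,
    inputs: Vec<OutPoint>,
}

/// Remove every mempool transaction that conflicts with `replacement`, plus
/// all of their descendants. Assumes `mempool` is topologically ordered, so a
/// single forward pass records each eviction before reaching its children.
fn evict_conflicts(mempool: Vec<Tx>, replacement: &Tx) -> Vec<Tx> {
    let mut evicted: HashSet<TxId> = HashSet::new();
    let mut kept = Vec::new();
    for tx in mempool {
        // Direct conflict: spends an outpoint the replacement also spends.
        let conflicts = tx.inputs.iter().any(|op| replacement.inputs.contains(op));
        // Descendant: spends an output of a transaction already evicted.
        let orphaned = tx.inputs.iter().any(|(parent, _)| evicted.contains(parent));
        if conflicts || orphaned {
            evicted.insert(tx.id);
        } else {
            kept.push(tx);
        }
    }
    kept
}
```

If the mempool were not topologically ordered, a child could be visited before its parent's eviction was recorded, and a second pass would be needed.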

Comment thread fuzz/src/chanmon_consistency.rs Outdated
@jkczyz jkczyz self-requested a review June 4, 2026 14:06
@joostjager joostjager force-pushed the chanmon-mempool-mining branch from 4036d6a to 57960f1 Compare June 4, 2026 14:42
@joostjager joostjager self-assigned this Jun 4, 2026
@joostjager
Contributor Author

When we are good with the changes, I can tack on one more commit that moves the chain/mempool code into fuzz/src/chanmon_consistency/chain_mempool.rs. With force-close fuzzing, we are moving towards a 5000-line file, so there is no harm in splitting things out now if we are making changes anyway.

Comment thread fuzz/src/chanmon_consistency.rs Outdated
const DEFAULT_TX_CONFIRMATION_BLOCKS: u32 = 6;
// Single fuzz bytes can mine more than one block so a corpus entry does not
// need long runs of identical "mine one block" commands to reach CSV or CLTV
// boundaries. Mining is clipped below if unresolved HTLCs are near expiry.
Contributor

What is meant by "clipped below"?

Contributor Author

That mining stops before anything would expire. Changed it to 'capped'

Comment thread fuzz/src/chanmon_consistency.rs Outdated
// A mined transaction is considered deeply confirmed after this many blocks.
// This confirms the transaction in one block and then mines five empty depth
// blocks.
const DEFAULT_TX_CONFIRMATION_BLOCKS: u32 = 6;
Contributor

Use ANTI_REORG_DELAY?

Contributor Author

Done 👍

Comment thread fuzz/src/chanmon_consistency.rs Outdated
Comment on lines +3110 to +3114
events::Event::PaymentClaimed { payment_hash, .. } => {
    if payments.payment_preimages.contains_key(&payment_hash) {
        payments.claimed_payment_hashes.insert(payment_hash);
    }
},
Contributor

Should this (moved from line 1943) be a separate commit?

Contributor Author

Hm yes, that is already living in #4635. Removed code from this PR.

Comment on lines +3381 to +3387
assert!(
    current_tip < timeout_deadline,
    "pending HTLC with expiry {} and timeout deadline {} is already unsafe at tip {}",
    expiry,
    timeout_deadline,
    current_tip
);
Contributor

This fails with the following input:

printf '\x20\xdf\xdf\xb0\x0e\x10\x18\x10\x18\x10\x18\x30\xd8' | ./target/release/chanmon_consistency_target

Claude's succinct summary:

The persisted view can drift arbitrarily behind chain_state.tip because Harness::mine_blocks never re-checkpoints, so after a deferred reload the node builds an HTLC against a stale view with a deadline already below the tip. Fix it by either soft-capping the past-deadline branch in safe_mine_block_count to return 0, or — better — also capping mining against the lowest persisted-manager view so a future reload-then-send can't produce an immediately-unsafe HTLC.

Contributor Author

Thanks, this was the missing startup-sync piece from the FC branch that I forgot to cherry-pick. Reload now catches ChainMonitor up from the oldest monitor height and ChannelManager from its own best block before returning to the fuzz loop.

The repro passes now
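
The capping idea discussed in this thread can be sketched as follows. The body of the real safe_mine_block_count is not shown in this conversation, so `capped_mine_count` is a hypothetical stand-in that assumes the invariant from the assertion above: the tip must stay strictly below every timeout deadline. It also uses the "soft cap" behavior suggested in the review and returns 0 when a deadline has already been reached.

```rust
/// Largest number of blocks, up to `requested`, that can be mined from `tip`
/// while keeping the new tip strictly below every deadline. Returns 0 when
/// some deadline is already at or below the tip.
fn capped_mine_count(tip: u32, requested: u32, deadlines: &[u32]) -> u32 {
    deadlines
        .iter()
        // tip + n < deadline  <=>  n <= deadline - tip - 1, saturating at 0.
        .map(|&deadline| deadline.saturating_sub(tip).saturating_sub(1))
        .fold(requested, |acc, cap| acc.min(cap))
}
```

A caller that needs confirmations to make progress can treat a return value of 0 as "cannot mine safely", which is the condition the finish-loop assertion further down checks.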

Comment on lines +2676 to +2679
assert!(
    self.mine_blocks(DEFAULT_TX_CONFIRMATION_BLOCKS) > 0,
    "finish cannot mine pending mempool transactions without crossing an unresolved HTLC timeout deadline"
);
Contributor

This can be reached using:

printf '\xc9\xa6\xff\xde\xdf' | ./target/release/chanmon_consistency_target

Contributor Author

@joostjager joostjager Jun 8, 2026

This one and the ones below contain splice opcodes. Splicing is not yet stable, so I'd prefer not to go into debugging those 😬 Without the splicing cfg flag, those inputs pass.

@@ -2818,13 +3128,7 @@
.funding_transaction_signed(&channel_id, &counterparty_node_id, signed_tx)
.unwrap();
Contributor

This is now reachable:

printf '\xa7\xa0\xff\xd5\xd8\xa0\xff' | ./target/release/chanmon_consistency_target

Comment on lines 3162 to 3164
panic!(
    "It may take may iterations to settle the state, but it should not take forever"
);
Contributor

Able to hit this now but looks like it was preexisting, too.

printf '\xa7\xa0\xff\xd5\xd8\xa0\xff' | ./target/release/chanmon_consistency_target

let is_quiescent_msg = msg.data.contains("already sent splice_locked, cannot RBF");
if !msg.data.contains("Disconnecting due to timeout awaiting response") && !is_quiescent_msg
{
panic!("Unexpected disconnect case: {}", msg.data);
Contributor

Looks like this can be hit prior to your change.

printf '\xc7\x3a\xa2\x32\x3a\xff\xff\xa7\x35\x33\x45\xff' | ./target/release/chanmon_consistency_target

We may need to expand the allowed reason.

@joostjager joostjager force-pushed the chanmon-mempool-mining branch from 57960f1 to 771e8a5 Compare June 8, 2026 09:24
Comment thread fuzz/src/chanmon_consistency.rs
@joostjager joostjager force-pushed the chanmon-mempool-mining branch from 771e8a5 to 534bb75 Compare June 8, 2026 09:36
@joostjager
Contributor Author

joostjager commented Jun 8, 2026

Review comments addressed: https://github.com/lightningdevkit/rust-lightning/compare/57960f1..3b8a78e1d9c61e184e4ccb7ea9279b9f832df580

  • Use ANTI_REORG_DELAY for setup confirmations.
  • Derive node height from ChannelManager instead of cached state.
  • Sync restarted monitors and managers from their own persisted heights.
  • Restore claimed-payment tracking for the final PaymentSent invariant.
  • Move relay/mining opcodes to start at 0xd6, leaving 0xd3..0xd5 free for #4660 (fuzz: add chanmon holder signer fuzz ops).

Comment thread fuzz/src/chanmon_consistency.rs
@joostjager joostjager force-pushed the chanmon-mempool-mining branch 2 times, most recently from a2335d0 to 9458dc7 Compare June 8, 2026 10:02
Route chanmon broadcasts through an explicit harness mempool so relay,
mining, wallet updates, and chain delivery share one path. This lets
splice, anchor, and claim transactions enter the mempool before mining.

On restart, sync loaded monitors and managers from their own persisted
best blocks so raw monitors catch up without rewinding ChannelManager
state. Cap modeled mining before unresolved HTLC timeout deadlines
and use the LDK anti-reorg depth for setup confirmations.