Skip to content

wallet: Ensure best block matches wallet scan state#30221

Merged
ryanofsky merged 7 commits intobitcoin:masterfrom
achow101:wallet-no-chainstateflushed
May 19, 2025
Merged

wallet: Ensure best block matches wallet scan state#30221
ryanofsky merged 7 commits intobitcoin:masterfrom
achow101:wallet-no-chainstateflushed

Conversation

@achow101
Copy link
Member

@achow101 achow101 commented Jun 3, 2024

Implements the idea discussed in #29652 (comment)

Currently, m_last_block_processed and m_last_block_processed_height are not guaranteed to match the block locator stored in the wallet, nor do either of those fields actually represent the last block that the wallet is synced up to. This is confusing and unintuitive.

This PR changes those last block fields to be updated whenever the wallet makes a change to the db for new transaction state found in new blocks. Whenever a block is received that contains a transaction relevant to the wallet, the last block locator will now be written to disk. Furthermore, every block disconnection will now write an updated locator.

To ensure that the locator is relatively recent and loading rescans are fairly quick in the event of unplanned shutdown, it is also now written every 144 blocks (~1 day). Additionally it is now written when the wallet is unloaded so that it is accurate when the wallet is loaded again.

Lastly, the chainstateFlushed notification in the wallet is changed to be a no-op. The best block locator record is no longer written when chainstateFlushed is received from the node since it should already be mostly up to date.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 3, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30221.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK rkrux, mzumsande, ryanofsky
Concept ACK fjahr

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #bitcoin-core/gui/872 (gui: Menu action to export a watchonly wallet by achow101)
  • #32523 (wallet: Remove watchonly behavior and isminetypes by achow101)
  • #32489 (wallet: Add exportwatchonlywallet RPC to export a watchonly version of a wallet by achow101)
  • #29680 (wallet: fix unrelated parent conflict doesn't cause child tx to be marked as conflict by Eunovo)
  • #28333 (wallet: Construct ScriptPubKeyMans with all data rather than loaded progressively by achow101)
  • #27865 (wallet: Track no-longer-spendable TXOs separately by achow101)
  • #27286 (wallet: Keep track of the wallet's own transaction outputs in memory by achow101)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

  • completeley -> completely [typo]
  • in case the case -> in case of a [redundant words]

@ryanofsky
Copy link
Contributor

I'm not sure, but it seems like a potentially good thing to me that last_block_processed and BESTBLOCK are two distinct concepts.

The last block processed is the block that CWallet in-memory state has been synced to (particularly CWalletTx::m_state which includes mempool / abandoned information not serialized to disk).

The BESTBLOCK record is the last block whose data has been flushed to disk, that the wallet should begin syncing from next time it is reloaded.

It seems like the first commit ae3b8280a7dbb6cc87814ed97dbea5a122cf8215 could lead to a performance regression if it is writing the flushing the BESTBLOCK record on every single blockConnected event from the node. For example during reindexing it could write and sync the database each time a block is connected, even if the block is before the wallets birthday, or doesn't have any relevant transactions?

I think it probably makes sense to write BESTBLOCK in a smarter way, and probably write it more frequently, but it seems unnecessary to have to write it every block, and it might be bad for performance now or make optimizations harder in the future.

I'm also curious about the idea of saving height with the best block record. I could imagine this being useful, since IIRC parts of wallet loading are complicated by not knowing heights of transactions before the chain is attached, but it doesn't really seem like the height is used for much yet. Was this intended to be used by something else later?

@achow101
Copy link
Member Author

achow101 commented Jun 4, 2024

The last block processed is the block that CWallet in-memory state has been synced to (particularly CWalletTx::m_state which includes mempool / abandoned information not serialized to disk).

But none of that state is relevant to the last block processed. Any state related to blocks (confirmed or conflicted) is written to disk.

The BESTBLOCK record is the last block whose data has been flushed to disk, that the wallet should begin syncing from next time it is reloaded.

The point is that this discrepancy can result in skipping blocks. It doesn't make sense to me that we wouldn't store the block the wallet's state has been synced to, and it doesn't make sense that we should rely on a field that means something else to determine what point to start rescanning from on the next load.

If BESTBLOCK doesn't match the chainstate, that's fine because it's a locator and we will just find the fork and sync from there. I don't think it's necessary to record which block we think the chainstate is synced to.

@ryanofsky
Copy link
Contributor

It doesn't make sense to me that we wouldn't store the block the wallet's state has been synced to

You may be right this is the better approach. But I think the previous approach does make some sense, too. You might not want to write to the wallet database each time a block is connected if the block doesn't contain any relevant transactions, especially during reindexing.

@achow101
Copy link
Member Author

achow101 commented Jun 4, 2024

You might not want to write to the wallet database each time a block is connected if the block doesn't contain any relevant transactions

I think you would since not writing it would result in possibly rescanning that block at the next loading which takes a little bit of time, regardless of whether any relevant transactions are in the block.

especially during reindexing.

Perhaps an easy solution to this is to just write the best block every 1000 blocks (or some other interval) when we are in IBD?

I'm also curious about the idea of saving height with the best block record. I could imagine this being useful, since IIRC parts of wallet loading are complicated by not knowing heights of transactions before the chain is attached, but it doesn't really seem like the height is used for much yet. Was this intended to be used by something else later?

This was mainly done to avoid looking up the height every time we load since that was being problematic and causing a tests to fail. But it definitely could be more useful in the future.

@ryanofsky
Copy link
Contributor

ryanofsky commented Jun 5, 2024

The idea of saving heights is interesting to me because wallet code assumes it know block heights many places, but it doesn't actually store heights anywhere. So for example, if the wallet stored a mapping of block hashes to heights (or other block information) it might be useful for allowing the wallet to work in an offline mode or letting the wallet CLI tool have more capabilities. This is all pretty abstract though, so I'm not suggesting something specific.

Perhaps an easy solution to this is to just write the best block every 1000 blocks (or some other interval) when we are in IBD?

Yes but I think that just adds back the complexity you were trying to get rid of, in a form that seems worse than the status quo. If you implement that, the wallet will be back to tracking the last block processed separately from the best block written to the database. And now, instead of the node determining when data should be flushed to disk and having all wallets flush that data simultaneously, each wallet will have internal logic deciding when to flush. This would be more complicated, and could be worse for power consumption and performance if there are multiple wallets and flushes are happening more frequently at different times.

I think overall this PR is doing some good things, but the goals seem either not clear, or not clearly good, so I wonder if maybe you could take these changes and make more targeted PRs for the things you want to achieve?

Or, if this PR is definitely the direction you want to go, I'm happy to review it. I don't like some things it is doing, but overall I think it is a reasonable change.

Copy link
Contributor

@fjahr fjahr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

I can confirm that best block locator and last_processed_block being different is confusing, see also #30455 (comment)

Currently, this needs a rebase and I'm curious if @achow101 plans to make further changes based on @ryanofsky 's comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking that it might make sense to keep this backup where it is and add second backup created at 199. Then this could test that both cases work as expected, i.e. 199 errors below and 299 doesn't. 299 is an interesting edge case in general (snapshot height == backup height).

@ryanofsky
Copy link
Contributor

I can confirm that best block locator and last_processed_block being different is confusing

I wonder if something else could be done to resolve this confusion other than writing to every wallet file every time a new block is connected, even if the block doesn't contain any relevant transactions. To me, "last block processed" and "last block flushed" seem like different concepts, and we could force them to be the same but only if we:

  • Give up flexibility of not needing to flush each wallet each time a block is connected (which seems inefficient during sync)
  • Give up ability to do coordinated flushes across the chainstate database, index databases and wallet databases time (which seems like it could waste resources and hurt system performance)

@ryanofsky
Copy link
Contributor

ryanofsky commented May 8, 2025

re: #30221 (comment)

🚧 At least one of the CI tasks failed. Task fuzzer,address,undefined,integer, no depends: https://github.com/bitcoin/bitcoin/runs/41829354924 LLM reason (✨ experimental): The CI failure is caused by a dynamic-type-mismatch detected by UndefinedBehaviorSanitizer during fuzz testing.

This seems unrelated to the other CI failures, but it looks like this failure is caused by a bug in the fuzz test:

https://github.com/bitcoin/bitcoin/blob/7eed9f17aec3e3ba93ec180a072b1c2f08dcbb39/src/wallet/test/fuzz/fees.cpp#L17-L25

Where the CWallet object is destroyed after the TestingSetup object so the interfaces::Chain object is already destroyed and this error happens.

Simplest fix would probably be to make the CWallet object a static variable in the function like the TestingSetup variable is, instead of a global variable.

@achow101 achow101 force-pushed the wallet-no-chainstateflushed branch from 7eed9f1 to dde9704 Compare May 8, 2025 23:00
@achow101
Copy link
Member Author

achow101 commented May 8, 2025

Simplest fix would probably be to make the CWallet object a static variable in the function like the TestingSetup variable is, instead of a global variable.

It can't be static because the FUZZ_TARGET below needs to access it. However, the wallet could be created in there instead of in the setup, as the other wallet fuzz tests do.

But, with moving WriteBestBlock back to RemoveWallet, this is not necessary either.

@achow101 achow101 force-pushed the wallet-no-chainstateflushed branch from dde9704 to e2e9c30 Compare May 9, 2025 18:38
Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK e2e9c30
(nits are not important)

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK e2e9c30a4d04bd8173c3386c5b97dbac227e810e just moving WriteBestBlock call from destructor back to RemoveWallet() since last review.

I left a bunch of suggestions below and repeated some of Martin's but they are all nitpicking at this point so feel free to ignore

}

// Update the best block
SetLastBlockProcessed(block.height - 1, *Assert(block.prev_hash));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "wallet: Update best block record after block dis/connect" (f1f254f6c9d3cead617300367442c1bbb449af7c)

re: #30221 (comment)

To deal with unclean shutdowns, during disconnect the reverse order compared to blockConnected would be better: If the locator is updated first, an unclean shutdown will lead to a rescan of the disconnected block (which might do nothing but correct the locator). But if txns are set to Inactive first and have an unclean shutdown after that, the transactions will be missing from the wallet / balance. As mentioned above, doing it all in one batch would resolve this in a more elegant way.

I might not totally understand this comment, but I think it is suggesting calling WriteBestBlock() at the top and SetLastBlockProcessed() at the bottom of blockDisconnected() , which is the opposite of what currently happens in blockConnected() where the last block is updated at the top and best block is updated at the bottom.

This could be reasonable. I also think it could be reasonable to do the opposite and just update the last block and best block at the bottom of both methods.

I think if the concern here is dealing with "unclean shutdowns" there is probably nothing special about the block disconnected method or reason to be particularly concerned about shutdowns here (unless I'm missing something), and it should be if the ok best block is a little out of date in that case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "wallet: Write best block record on unload" (8ecc7653556a1977eae5a902d51a2ececde94913)

Just to be sure, current state is that if WriteBestBlock is called here, rather than being called in the destructor or being called in FlushAndDeleteWallet, the best block locator could be a little out of date and this is maybe not ideal but fine?

re: #30221 (comment)

It can't be static because the FUZZ_TARGET below needs to access it. However, the wallet could be created in there instead of in the setup, as the other wallet fuzz tests do.

Yes, suggestion would be to follow the fuzz test setup pattern and create a static variable and global pointer. But no need to change this if debugging the CI failure is hard and only consequence is the best block could be a little behind.

@DrahtBot DrahtBot requested a review from mzumsande May 13, 2025 16:46
@achow101 achow101 force-pushed the wallet-no-chainstateflushed branch from e2e9c30 to 4e3acb8 Compare May 13, 2025 18:44
Copy link
Contributor

@rkrux rkrux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This partial review is for the previous diff at e2e9c30a4d04bd8173c3386c5b97dbac227e810e. I started looking into it today and need to spend some more time in internalising the consequences of this diff but at the moment I'm inclining towards a concept ack.

Left few questions below.

Comment on lines 3231 to 3224
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is specific to the concept of scanning for wallet transactions and then updating the last block processed both in memory and disk:

Additionally, after rescanning on wallet loading, instead of writing
the locator for the current chain tip, write the locator for the last
block that the rescan had scanned.

There are couple other usages of ScanForWalletTransactions in rescanblockchain RPC and conditionally in importdescriptors. I guess we don't want to update the block locator in disk in these cases because we also sync the block record to disk every 144 blocks now & doing it in these 2 RPCs doesn't seem necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those usages of ScanForWalletTransactions specifically set the option to not write the last block scanned to disk. This is because they are expected to be used during normal operation when the wallet will still be handling incoming blocks and transactions, so having that record possibly be out sync is actually quite bad.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also just mentioning that rescanblockchain can be used with ranges (start_height / stop_height). If stop_height is set to some past block, setting the best block back to that old block would be incorrect and lead to wrong balances etc.

Copy link
Contributor

@rkrux rkrux May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this makes sense to me.

Limiting the writing of best block to disk mainly to cases1 that are actively involved in processing the new blocks seems logical. Since these cases will be updating the best block periodically & conditionally, it's prudent to not interfere by doing the same via other user driven RPCs that work on different criterion as highlighted in the above comments.

This is because they are expected to be used during normal operation when the wallet will still be handling incoming blocks and transactions

I find this corroborated here in the scan function:

bitcoin/src/wallet/wallet.cpp

Lines 1893 to 1896 in c779ee3

// If rescanning was triggered with cs_wallet permanently locked (AttachChain), additional blocks that were connected during the rescan
// aren't processed here but will be processed with the pending blockConnected notifications after the lock is released.
// If rescanning without a permanent cs_wallet lock, additional blocks that were added during the rescan will be re-processed if
// the notification was processed and the last block height was updated.

Footnotes

  1. blockConnected, blockDisconnected notifications

achow101 added 7 commits May 14, 2025 11:03
When a block is connected, if the new block had anything relevant to the
wallet, update the best block record on disk. If not, also sync the best
block record to disk every 144 blocks.

Also reuse the new WriteBestBlock method in BackupWallet.
The only reason to call chainStateFlushed during wallet loading is to
ensure that the best block is written. Do these writes explicitly to
prepare for removing chainStateFlushed, while also ensuring that the
wallet's in memory state tracking is written to disk.

Additionally, after rescanning on wallet loading, instead of writing
the locator for the current chain tip, write the locator for the last
block that the rescan had scanned. This ensures that the stored best
block record matches the wallet's current state.

Any blocks dis/connected during the rescan are processed after the
rescan and the last block processed will be updated accordingly.
Migrating a wallet requires having a bestblock record. This is always
the case in normal operation as such a record is always written on
wallet loading if it didn't already exist. However, within the unit
tests and benchmarks, this is not guaranteed. Since migration requires
the record, WalletMigration needs to also add this record before the
benchmark.
chainStateFlushed is no longer needed since the best block is updated
after a block is scanned. Since the chainstate being flushed does not
necessarily coincide with the wallet having processed said block, it
does not entirely make sense for the wallet to be recording that block
as its best block, and this can cause race conditions where some blocks
are not processed. Thus, remove this notification.
StopWallets, which was being called prior to UnloadWallets, performs an
unnecessary database closing step. This causes issues in UnloadWallets
which does additional database cleanups. Since the database closing step
is unnecessary, StopWallets is removed, and UnloadWallets is now called
from WalletLoaderImpl::stop.
Since CWallet::chainStateFlushed is now no-op, this test no longer tests
the concurrent writes scenario. There are no other cases where multiple
DatabaseBatches are open at the same time.
Copy link
Contributor

@rkrux rkrux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 30a94b1

I like how this PR brings the best/last block processed updates both in memory & disk in unison, making it easier to reason about the wallet code ultimately. The detailed commit messages aids review.

I have verified that the location of such changes makes it possible for the following wallet RPCs to correctly handle best block related memory & disk updates:

createwallet
loadwallet
restorewallet
unloadwallet
backupwallet

I feel that updating the locator when wallet transactions are found in the latest block & periodically every 144 blocks in blockConnected notifications, and also going back to the previous best block hash in the corresponding blockConnected notification makes it more intuitive & easier to reason about. There's a possibility that the best block locator will be updated frequently if the wallet has transactions in contiguous blocks but I don't think that's a big deal as it's relatively less likely to occur.

Even though I had not read the chainStateFlushed code prior to this PR, I found consistent usage of that terminology within wallet instance slightly distracting & thus support its removal.

I prefer the SetLastBlockProcessedInMem naming as it clearly mentions the storage that's being updated. Also in favour of SetLastBlockProcessed function updating in memory & disk in one go.

I realised that there's an opportunity to put m_last_block_processed & m_last_block_processed_height together in a struct as they are more often than not used together, I see a related refactor has been alluded to in this comment: #30221 (comment)

@@ -3220,13 +3224,21 @@ bool CWallet::AttachChain(const std::shared_ptr<CWallet>& walletInstance, interf

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Within AttachChain itself, there is another opportunity to use setLastBlockProcessedInMem here:

bitcoin/src/wallet/wallet.cpp

Lines 3137 to 3143 in c779ee3

if (tip_height) {
walletInstance->m_last_block_processed = chain.getBlockHash(*tip_height);
walletInstance->m_last_block_processed_height = *tip_height;
} else {
walletInstance->m_last_block_processed.SetNull();
walletInstance->m_last_block_processed_height = -1;
}

The cs_wallet is held at the start of this function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do if I need to retouch.

Comment on lines 3231 to 3224
Copy link
Contributor

@rkrux rkrux May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this makes sense to me.

Limiting the writing of best block to disk mainly to cases1 that are actively involved in processing the new blocks seems logical. Since these cases will be updating the best block periodically & conditionally, it's prudent to not interfere by doing the same via other user driven RPCs that work on different criterion as highlighted in the above comments.

This is because they are expected to be used during normal operation when the wallet will still be handling incoming blocks and transactions

I find this corroborated here in the scan function:

bitcoin/src/wallet/wallet.cpp

Lines 1893 to 1896 in c779ee3

// If rescanning was triggered with cs_wallet permanently locked (AttachChain), additional blocks that were connected during the rescan
// aren't processed here but will be processed with the pending blockConnected notifications after the lock is released.
// If rescanning without a permanent cs_wallet lock, additional blocks that were added during the rescan will be re-processed if
// the notification was processed and the last block height was updated.

Footnotes

  1. blockConnected, blockDisconnected notifications

Comment on lines +3239 to +3240
// Also save the best block locator because rescanning only updates it intermittently.
walletInstance->SetLastBlockProcessed(*scan_res.last_scanned_height, scan_res.last_scanned_block);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I verified that ScanForWalletTransactions does update the best block locator on disk intermittently:

bitcoin/src/wallet/wallet.cpp

Lines 1875 to 1883 in c779ee3

if (save_progress && next_interval) {
CBlockLocator loc = m_chain->getActiveChainLocator(block_hash);
if (!loc.IsNull()) {
WalletLogPrintf("Saving scan progress %d.\n", block_height);
WalletBatch batch(GetDatabase());
batch.WriteBestBlock(loc);
}
}

But I don't see it updating m_last_block_processed, instead it reads it. Even if m_last_block_processed comes out to be same as the best block locator at the end in the AttachChain case, I find updating m_last_block_processed and best block locator here after rescan better both in terms of data consistency and readability - gives more confidence.

walletInstance->SetLastBlockProcessed(*scan_res.last_scanned_height, scan_res.last_scanned_block);
}
}
walletInstance->m_attaching_chain = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though these m_attaching_chain updates were done in AttachChain function, it looks quite cleaner now to get rid of this low level boolean from wallet instance, makes it easier to wrap head around AttachChain.

}
walletInstance->m_attaching_chain = false;
// Set and update the best block record
// Set last block scanned as the last block processed as it may be different in case the case of a reorg.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as it may be different in case the case of a reorg.

Redundancy if retouched.

# no longer the case (after the unclean shutdown, the node's chain returned to the pre-invalidation tip).
# This issue blocks any future spending and results in an incorrect balance display.
# After disconnecting the block, the wallet should record the new best block.
# Upon reload after the crash, since the chainstate was not flushed, the tip contains the previously abandoned
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the chainstate was not flushed

I now realise that this refers to the node specific chainstate.
Having seeing the removal of chain state flushed function in the wallet, I initially thought it referred to that (which doesn't exist anymore).

# Due to an existing bug, the wallet incorrectly keeps the transaction in an abandoned state, even though that's
# no longer the case (after the unclean shutdown, the node's chain returned to the pre-invalidation tip).
# This issue blocks any future spending and results in an incorrect balance display.
# After disconnecting the block, the wallet should record the new best block.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the wallet should record the new best block.

I assume there is no way for us to verify this before the crash and after tip invalidation. The wallet info returns value based on the one that's in memory: m_last_block_processed

wallet.getwalletinfo()['lastprocessedblock']['hash']

Copy link
Member Author

@achow101 achow101 May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what would be verified? After invalidateblock and before the crash, we do verify that the coinbase transaction is marked abandoned. This check is to verify that starting after the crash, the wallet has the correct before-invalidate state.

Copy link
Contributor

@rkrux rkrux May 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what would be verified? After invalidateblock and before the crash, we do verify that the coinbase transaction is marked abandoned.

I wanted to check that the best block locator is indeed persisted to disk after invalidation but the getwalletinfo RPC reads lastprocessedblock from memory, which can't be used for verifying this PR's change. The coinbase transaction verification was done before this PR as well, so that also not verifies.

Upon reload after the crash, since the chainstate was not flushed, the tip contains the previously abandoned coinbase. This should be rescanned and now un-abandoned.

While this comment is helpful, but I'd prefer to assert the "rescanning" portion in the test. Previously, I had in mind some functionality that reads from disk while returning the response in RPC but that might be going a bit too far just for tests. The other easier way I found out is to verify using the debug log from AttachChain that confirms the rescan indeed does happen - which tells me that the best block locator was persisted to disk after block invalidation and before node crash.

walletInstance->WalletLogPrintf("Rescanning last %i blocks (from block %i)...\n", *tip_height - rescan_height, rescan_height);

Functional test diff that worked in my system:

@@ -104,6 +104,7 @@ class ReorgsRestoreTest(BitcoinTestFramework):
 
         # Disconnect tip and sync wallet state
         tip = wallet.getbestblockhash()
+        tip_height = wallet.getblockstats(tip)["height"]
         wallet.invalidateblock(tip)
         wallet.syncwithvalidationinterfacequeue()
 
@@ -116,7 +117,8 @@ class ReorgsRestoreTest(BitcoinTestFramework):
         node.kill_process()
 
         # Restart the node and confirm that it has not persisted the last chain state changes to disk
-        self.start_node(0)
+        with self.nodes[0].assert_debug_log(expected_msgs=[f"Rescanning last 1 blocks (from block {tip_height - 1})...\n"]):
+            self.start_node(0)
         assert_equal(node.getbestblockhash(), tip)
 
         # After disconnecting the block, the wallet should record the new best block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I think that's unnecessary and the current test is sufficient to verify the behavior.

# Upon reload after the crash, since the chainstate was not flushed, the tip contains the previously abandoned
# coinbase. This should be rescanned and now un-abandoned.
wallet = node.get_wallet_rpc("reorg_crash")
assert_equal(wallet.getwalletinfo()['immature_balance'], 0) # FIXME: #31824.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- assert_equal(wallet.getwalletinfo()['immature_balance'], 0) # FIXME: #31824.
+ assert_greater_than(wallet.getwalletinfo()['immature_balance'], 0)

Nit: The correct balance is there, can be asserted.

# coinbase. This should be rescanned and now un-abandoned.
wallet = node.get_wallet_rpc("reorg_crash")
assert_equal(wallet.getwalletinfo()['immature_balance'], 0) # FIXME: #31824.
assert_equal(wallet.gettransaction(coinbase_tx_id)['details'][0]['abandoned'], False)
Copy link
Contributor

@rkrux rkrux May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.log.info("Test that wallet doesn't crash due to a duplicate block disconnection event after an unclean shutdown")

IMO the test log should be updated to mention this improvement also.
That the wallet transactions are not missed anymore in case of temporarily invalidated blocks with unclean node shutdown.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do if I need to retouch.

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK 30a94b1

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 30a94b1. Only changes since last review are using WriteBestBlock method more places and updating comments.

It looks like there are some small review comments above that could be followed up, but with 3 acks I think it would be good to merge this shortly and let smaller improvements can be made later. I plan to merge this soon after testing locally.


if (!m_last_block_processed.IsNull()) {
CBlockLocator loc;
chain().findBlock(m_last_block_processed, FoundBlock().locator(loc));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "wallet: Update best block record after block dis/connect" (7bacabb)

IMO would be good to assert !loc.IsNull() as a sanity check to ensure that doesn't happen. (It shouldn't, but node or wallet code could change in the future and invalidate current assumptions.)

@rkrux
Copy link
Contributor

rkrux commented May 21, 2025

It looks like there are some small review comments above that could be followed up, but with 3 acks I think it would be good to merge this shortly and let smaller improvements can be made later. I plan to merge this soon after testing locally.

I did the follow-ups in #32580.

Dorex45

This comment was marked as outdated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

wallet: wrong balance and crash after reorg and unclean shutdown

9 participants