Since commit f08c9fb0c6 from PR
https://github.com/bitcoin/bitcoin/pull/21726, index
`BlockUntilSyncedToCurrentChain` behavior has been less reliable, and there has
also been a race condition in the `coinstatsindex_initial_sync` unit test.
It seems better for `BlockUntilSyncedToCurrentChain` to actually wait for the
last connected block to be fully processed, than to be able to return before
prune locks are set, so this switches the order of `m_best_block_index =
block;` and `UpdatePruneLock` statements in `SetBestBlockIndex` to make it more
reliable.
Also since commit f08c9fb0c6, there has been a
race condition in the `coinstatsindex_initial_sync` test. Before that commit,
the atomic index best block pointer `m_best_block_index` was updated as the
last step of `BaseIndex::BlockConnected`, so `BlockUntilSyncedToCurrentChain`
could safely be used in tests to wait for the last `BlockConnected`
notification to be finished before stopping and destroying the index. But
after that commit, calling `BlockUntilSyncedToCurrentChain` is no longer
sufficient, and there is a race between the test shutdown code which destroys
the index object and the new code introduced in that commit calling
`AllowPrune()` and `GetName()` on the index object. Reproducibility
instructions for this are in
https://github.com/bitcoin/bitcoin/issues/25365#issuecomment-1259744133
This commit fixes the `coinstatsindex_initial_sync` race condition, even though
it will require an additional change to silence TSAN false positives,
https://github.com/bitcoin/bitcoin/pull/26188, after it is fixed. So this
partially addresses but does not resolve the bug reporting TSAN errors
https://github.com/bitcoin/bitcoin/issues/25365.
There is no known race condition outside of test code currently, because the
bitcoind `Shutdown` function calls `FlushBackgroundCallbacks` not
`BlockUntilSyncedToCurrentChain` to safely shut down.
Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
Co-authored-by: MacroFake <falke.marco@gmail.com>
Github-Pull: #26215
Rebased-From: 8891949bdc
00eeb31c76 scripted-diff: rename CChainState -> Chainstate (James O'Beirne)
Pull request description:
Alright alright alright, I know: we hate refactors. We especially hate cosmetic refactors.
Nobody knows better than I that changing broad swaths of code out from under our already-abused collaborators, only to send a cascade of rebase bankruptcies, is annoying at best and sadistic at worst. And for a rename! The indignation!
But just for a second, imagine yourself. Programming `bitcoin/bitcoin`, on a sandy beach beneath a lapis lazuli sky. You go to type the name of what is probably the most commonly used data structure in the codebase, and you *only hit shift once*.
What could you do in such a world? You could do anything. [The only limit is yourself.](https://zombo.com/)
---
So maybe you like the idea of this patch but really don't want to deal with rebasing. You're in luck!
Here're the commands that will bail you out of rebase bankruptcy:
```sh
git rebase -i $(git merge-base HEAD master) \
-x 'sed -i "s/CChainState/Chainstate/g" $(git ls-files | grep -E ".*\.(py|cpp|h)$") && git commit --amend --no-edit'
# <commit changed?>
git add -u && git rebase --continue
```
---
~~Anyway I'm not sure how serious I am about this, but I figured it was worth proposing.~~ I have decided I am very serious about this.
Maybe we can have nice things every once in a while?
ACKs for top commit:
MarcoFalke:
cr ACK 00eeb31c76
hebasto:
ACK 00eeb31c76
glozow:
ACK 00eeb31c76, thanks for being the one to propose this
w0xlt:
ACK 00eeb31c76
Tree-SHA512: b828a99780614a9b74f7a9c347ce0687de6f8d75232840f5ffc26e02bbb25a3b1f5f9deabbe44f82ada01459586ee8452a3ee2da05d1b3c48558c8df6f49e1b1
faa3d38ec6 refactor: Pass reference to LookUpStats (MacroFake)
Pull request description:
I find it confusing to have an interface that accepts nullptr, but immediately crashes the program when someone does pass nullptr.
Fix that.
Also some include fixups.
ACKs for top commit:
aureleoules:
ACK faa3d38ec6
Tree-SHA512: f90b649e9991e137b83a9899258ee73605719c081a6b789ac27fe7fe73eb70fbb41d89479bcd536d5c3ad788a5795de8451bc1b94e5c9267dcf9636d9e4a1109
11780f29e7 doc: BaseIndex sync behavior with empty datadir (James O'Beirne)
Pull request description:
Make a note about a potentially confusing behavior with `BaseIndex::m_synced`;
if the user starts bitcoind with an empty datadir and an index enabled,
BaseIndex will consider itself synced (as a degenerate case). This affects
how indices are built during IBD (relying solely on BlockConnected signals vs.
using ThreadSync()).
ACKs for top commit:
mzumsande:
ACK 11780f29e7
Tree-SHA512: 0b530379e57c62e05d2ddca7bb8e2c936786fa00678f9eaa1bb3742d957c48f141d46f936734b03f6673d964abc7eb72c1769f1784b9d3563d218e96296b7afd
Make a note about a potentially confusing behavior with `BaseIndex::m_synced`;
if the user starts bitcoind with an empty datadir and an index enabled,
BaseIndex will consider itself synced (as a degenerate case). This affects
how indices are built during IBD (relying solely on BlockConnected signals vs.
using ThreadSync()).
facc2fa7b8 Use AutoFile where possible (MacroFake)
6666803c89 streams: Add AutoFile without ser-type and ser-version (MacroFake)
Pull request description:
This was done in the context of https://github.com/bitcoin/bitcoin/pull/25284 , but I think it also makes sense standalone.
The basic idea is that serialization type should not be initialized when it is not needed. Same for the serialization version.
So do this here for `AutoFile`. `CAutoFile` remains in places where it is not yet possible.
ACKs for top commit:
laanwj:
Code review ACK facc2fa7b8
fanquake:
ACK facc2fa7b8
Tree-SHA512: d82d024d55af57565ac53d9d1517afafc12b46964effba0332de62a6c77869356fa77f89e6d4834438fff44c45b64fccdf5a1358bfea03e28dfe55013b3c099d
faf98aecf8 Remove unused includes in rpc/fees.cpp (MacroFake)
1111ddeedf Remove unused includes from dbwrapper.h (MacroFake)
fa77fdd047 Add missing includes (MacroFake)
fa869ce2c2 Add missing includes to node/chainstate (MacroFake)
Pull request description:
Unused includes are confusing, but also cause unrelated compile errors when the unused includes were to be removed.
Fix that by adding the missing includes where they are needed and then remove them where they are not needed. This is also checked by iwyu.
ACKs for top commit:
hebasto:
ACK faf98aecf8, I have reviewed the code and it looks OK, I agree it can be merged.
jarolrod:
Code Review ACK faf98aecf8
Tree-SHA512: 75f3c6e6f6ecf8a98233e1a1463c75ca4e0eb3ec341150d274141072fe95413a3c2ec6386d1c527899cc63d43f63f5eb5991509847412773362808ddfb1bb435
Replace CommitInternal method with CustomCommit and use interfaces::Chain
instead of CChainState to generate block locator.
This commit does not change behavior in any way, except in the
(m_best_block_index == nullptr) case, which was added recently in
https://github.com/bitcoin/bitcoin/pull/24117 as part of an ongoing attempt to
prevent index corruption if bitcoind is interrupted during startup. New
behavior in that case should be slightly better than the old behavior (skipping
the entire custom+base commit now vs only skipping the base commit previously)
and this might avoid more cases of corruption.
Replace Rewind method with CustomRewind and pass block hashes and
heights instead of CBlockIndex* pointers
This commit does not change behavior in any way.
Replace overriden index Init() methods that use the best block
CBlockIndex* pointer with pure CustomInit() callbacks that are passed
the block hash and height.
This gets rid of more CBlockIndex* pointer uses so indexes can work
outside the bitcoin-node process. It also simplifies the initialization
call sequence so index implementations are not responsible for
initializing the base class.
There is a slight change in behavior here since now the best block
pointer is loaded and checked before the custom index init functions are
called instead of while they are called.
e734228d85 Update GCSFilter benchmarks (Calvin Kim)
aee9a8140b Add GCSFilterDecodeSkipCheck benchmark (Patrick Strateman)
299023c1d9 Add GCSFilterDecode and GCSBlockFilterGetHash benchmarks. (Patrick Strateman)
b0a53d50d9 Make sanity check in GCSFilter constructor optional (Patrick Strateman)
Pull request description:
This PR picks up the abandoned #19280
BlockFilterIndex was depending on `GolombRiceDecode()` during the filter decode to sanity check that the filter wasn't corrupt. However, we can check for corruption by ensuring that the encoded blockfilter's hash matches up with the one stored in the index database.
Benchmarks that were added in #19280 showed that checking the hash is much faster.
The benchmarks were changed to nanobench and the relevant benchmarks were like below, showing a clear win for the hash check method.
```
| ns/elem | elem/s | err% | ins/elem | bra/elem | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|---------------:|--------:|----------:|:----------
| 531.40 | 1,881,819.43 | 0.3% | 3,527.01 | 411.00 | 0.2% | 0.01 | `DecodeCheckedGCSFilter`
| 258,220.50 | 3,872.66 | 0.1% | 2,990,092.00 | 586,706.00 | 1.7% | 0.01 | `DecodeGCSFilter`
| 13,036.77 | 76,706.09 | 0.3% | 64,238.24 | 513.04 | 0.2% | 0.01 | `BlockFilterGetHash`
```
ACKs for top commit:
mzumsande:
Code Review ACK e734228d85
theStack:
Code-review ACK e734228d85
stickies-v:
ACK e734228d85
ryanofsky:
Code review ACK e734228d85, with caveat that I mostly paid attention to the main code, not the changes to the benchmark. Only changes since last review were changes to the benchmark code.
Tree-SHA512: 02b86eab7b554e1a57a15b17a4d6d71faa91b556c637b0da29f0c9ee76597a110be8e3b4d0c158d4cab04af0623de18b764837be0ec2a72afcfe1ad9c78a83c6
The indexing codepath logic in node/coinstats.cpp is simple enough to be
moved into CoinStatsIndex::LookUpStats, avoiding an additional layer of
function calls. Callers are modified accordingly.
Also, add 2 missed BOOST_CHECKs to the coinstatsindex_initial_sync unit
test.
7171ebc7cb index: Don't commit a best block before indexing it during sync (Martin Zumsande)
Pull request description:
This changes the periodic commit of the best block during the index sync phase to use the already indexed predecessor of the current block index, instead of committing the current one that will only be indexed (by calling `WriteBlock()`) after committing the best block.
The previous code would leave the index database in an inconsistent state until the block is actually indexed - if an unclean shutdown happened at just this point in time, the index could get corrupted because at next startup, we'd assume that we have already indexed this block.
ACKs for top commit:
ryanofsky:
Code review ACK 7171ebc7cb. Looks great! Just commit message changes since last review
Tree-SHA512: a008de511dd6a1731b7fdf6a90add48d1e53f7f7d6402672adb83e362677fc5b9f5cd021d3111728cb41d73f1b9c2140db79d7e183df0ab359cda8c01b0ef928
Committing a block prior to indexing would leave the index database
in an inconsistent state until it is indexed, which could corrupt the
index in case of a unclean shutdown. Thus commit its predecessor.
Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>
f64aa9c411 Disallow more unsafe string->path conversions allowed by path append operators (Ryan Ofsky)
Pull request description:
Add more `fs::path` `operator/` and `operator+` overloads to prevent unsafe string->path conversions on Windows that would cause strings to be decoded according to the current Windows locale & code page instead of the correct string encoding.
Update application code to deal with loss of implicit string->path conversions by calling `fs::u8path` or `fs::PathFromString` explicitly, or by just changing variable types from `std::string` to `fs::path` to avoid conversions altogether, or make them happen earlier.
In all cases, there's no change in behavior either (1) because strings only contained ASCII characters and would be decoded the same regardless of what encoding was used, or (2) because of the 1:1 mapping between paths and strings using the `PathToString` and `PathFromString` functions.
Motivation for this PR was just that I was experimenting with #24469 and noticed that operations like `fs::path / std::string` were allowed, and I thought it would be better not to allow them.
ACKs for top commit:
hebasto:
ACK f64aa9c411
Tree-SHA512: 944cce49ed51537ee7a35ea4ea7f5feaf0c8fff2fa67ee81ec5adebfd3dcbaf41b73eb35e49973d5f852620367f13506fd12a7a9b5ae3a7a0007414d5c9df50f
Add more fs::path operator/ and operator+ overloads to prevent unsafe
string->path conversions on Windows that would cause strings to be
decoded according to the current Windows locale & code page instead of
the correct string encoding.
Update application code to deal with loss of implicit string->path
conversions by calling fs::u8path or fs::PathFromString explicitly, or
by just changing variable types from std::string to fs::path to avoid
conversions altoghther, or make them happen earlier.
In all cases, there's no change in behavior either (1) because strings
only contained ASCII characters and would be decoded the same regardless
of what encoding was used, or (2) because of the 1:1 mapping between
paths and strings using the PathToString and PathFromString functions.
Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
fa996c58e8 refactor: Avoid integer overflow in ApplyStats when activating snapshot (MarcoFalke)
fac01888d1 Move AdditionOverflow to util, Add CheckedAdd with unit tests (MarcoFalke)
fa526d8fb6 Add dev doc to CCoinsStats::m_hash_type and make it const (MarcoFalke)
faff051560 style: Remove unused whitespace (MarcoFalke)
Pull request description:
A snapshot contains the utxo set, including the out value. To activate the snapshot, the hash needs to be calculated. As a side-effect, the total amount in the snapshot is calculated (as the sum of all out values), but never used. Instead of running into an integer overflow in an unused result, don't calculate the result in the first place.
Other code paths (using the active utxo set) can not run into an integer overflow, since the active utxo set is valid.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=39716
ACKs for top commit:
shaavan:
reACK fa996c58e8
vasild:
ACK fa996c58e8
Tree-SHA512: 4f207f634841f6f634fd02ae1e5907e343fd767524fd0e8149aa99fa9a1834fe50167d14874834d45236e9c325d567925f28129bacb7d80be29cf22277a16a14
fab6d6b2d1 Move pindexBestInvalid to ChainstateManager (MarcoFalke)
facd2137ec Move m_failed_blocks to ChainstateManager (MarcoFalke)
fa47b5c100 Move AcceptBlockHeader to ChainstateManager (MarcoFalke)
fa3d62cf7b Move FindForkInGlobalIndex from BlockManager to CChainState (MarcoFalke)
Pull request description:
Move globals or members of the wrong class to the right class.
ACKs for top commit:
naumenkogs:
ACK fab6d6b2d1
Sjors:
ACK fab6d6b2d1
shaavan:
ACK fab6d6b2d1
Tree-SHA512: 926cbdfa22838517497bacb79ed5f521f64117c2aacf96a0176f62831b4713314a32abc0213df5ee067edf63e4a4300f752a26006d36e5aab415bb91209a271f
The helper was moved in commit b026e318c3,
which also mentioned that it could be moved to CChainState. So do that,
as the functionality is not block-storage related.
This also allows to drop one function argument.