This commit effectively splits the "load block index itself" logic from
"derive Chainstate variables from loaded block index" logic.
This means that BlockManager::LoadBlockIndex{,DB} will only load what's
relevant to the BlockManager.
I strongly recommend reviewing with the following git-diff flags:
--color-moved=dimmed_zebra --color-moved-ws=allow-indentation-change
6c23c41561 refactor: Rewrite AddToBlockIndex with try_emplace (Carl Dong)
c05cf7aa1e style: Modernize range-based loops over m_block_index (Carl Dong)
c2a1655799 style-only: Use using instead of typedef for BlockMap (Carl Dong)
dd79dad175 refactor: Rewrite InsertBlockIndex with try_emplace (Carl Dong)
531dce0347 tests: Remove now-unnecessary manual Unload's (Carl Dong)
bec86ae326 blockstorage: Make m_block_index own CBlockIndex's (Carl Dong)
Pull request description:
Part of: #24303
Split off from: #22564
```
Instead of having CBlockIndex's live on the heap, which requires manual
memory management, have them be owned by m_block_index. This means that
they will live and die with BlockManager.
```
The second commit demonstrates how this makes calls to `Unload()` to satisfy the address sanitizer unnecessary.
ACKs for top commit:
ajtowns:
ACK 6c23c41561
MarcoFalke:
re-ACK 6c23c41561🎨
Tree-SHA512: 81b2b5119be27cc0f8a9457b11da60cc60930315d2a5be36be89fe253d32073ffe622348ff153114b9b3212197bddbc791810913a43811b33cc58e7162bd105b
ae9ceed3e2 validation, refactoring: remove ChainstateManager::Reset() (Jon Atack)
daad0093e3 validation: replace lock with annotation in UnloadBlockIndex() (Jon Atack)
Pull request description:
Thread safety refactoring seen in #24177:
- replace re-acquiring lock cs_main with a thread safety annotation in UnloadBlockIndex()
- remove ChainstateManager::Reset(), as it is currently unused (can be reintroduced in the test utilities if needed for unit testing)
ACKs for top commit:
laanwj:
Code review ACK ae9ceed3e2
vasild:
ACK ae9ceed3e2
klementtan:
crACK ae9ceed3e2
Tree-SHA512: cebb782572997cc2dda01590d6bb6c5e479e8202324d8b6ff459b814ce09e818b996c881736bfebd1b8bf4b6d7a0f79faf3ffea176a4699dd7d7429de2db2d13
fa7991601c Fixup style of VerifyDB (MarcoFalke)
fa462ea787 Avoid implicit-integer-sign-change in VerifyLoadedChainstate (MarcoFalke)
Pull request description:
This happens when checking all blocks (`-1`).
To test:
```
./configure CC=clang CXX=clang++ --with-sanitizers=undefined,integer
make
UBSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/ubsan:print_stacktrace=1:halt_on_error=1:report_error_type=1" ./test/functional/rpc_blockchain.py
ACKs for top commit:
theStack:
Code-review ACK fa7991601c
brunoerg:
crACK fa7991601c
Tree-SHA512: bcbe6becf2fbedd21bbde83a544122e79465937346802039532143b2e4165784905a8852c0ccb088b964874df5e5550931fdde3629cbcee3ae237f2f63c43a8e
Instead of having CBlockIndex's live on the heap, which requires manual
memory management, have them be owned by m_block_index. This means that
they will live and die with BlockManager.
A change to BlockManager::LookupBlockIndex:
- Previously, it was a const member function returning a non-const CBlockIndex*
- Now, there's are const and non-const versions of
BlockManager::LookupBlockIndex returning a CBlockIndex with the same
const-ness as the member function:
(e.g. const CBlockIndex* LookupBlockIndex(...) const)
See next commit for some weirdness that this eliminates.
The range based for-loops are modernize (using auto + destructuring) in
a future commit.
48742693ac Replace "can not" with "cannot" in docs, user messages, and tests (Jon Atack)
e670edd434 User-facing content fixups from transifex translator feedback (Jon Atack)
Pull request description:
Closes#24366.
ACKs for top commit:
laanwj:
Code review re-ACK 48742693ac
hebasto:
re-ACK 48742693ac, only suggested change since my previous [review](https://github.com/bitcoin/bitcoin/pull/24367#pullrequestreview-885938219).
Tree-SHA512: 4dcdcb417251a413e65fab6070515e13a1267c8e0dbcf521386b842511391f24c84a0c2168fe13458c977682034466509bf2a3453719d4d94d3c568fd9f4adb4
799968e8b3 tracing: misc follow-ups to 22902 (0xb10c)
36a6584703 tracing: correctly scope utxocache:flush tracepoint (Arnab Sen)
Pull request description:
This PR is a follow-up to the [#22902](https://github.com/bitcoin/bitcoin/pull/22902).
Previously, the tracepoint `utxocache:flush` was called, even when it was not flushing. So, the tracepoint is now scoped to write only when coins cache to disk.
ACKs for top commit:
0xB10C:
ACK 799968e8b3
Tree-SHA512: ebb096cbf991c551c81e4339821f10d9768c14cf3d8cb14d0ad851acff5980962228a1c746914c6aba3bdb27e8be53b33349c41efe8bab5542f639916e437b5f
eb8b22d517 block_connected: re-use previous GetTimeMicros (William Casarin)
80e1c55687 block_connected: don't serialize block hash twice (William Casarin)
Pull request description:
In the validation:block_connected tracepoint, we call block->GetHash(), which
ends up calling CBlockHeader::GetHash(), executing around 8000 serialization
instructions. We don't need to do this extra work, because block->GetHash() is
already called further up in the function. Let's save that value as a local
variable and re-use it in our tracepoint so there is no unnecessary tracepoint
overhead.
Shave off an extra 100 or so instructions from the validation:block_connected
tracepoint by reusing a nearby GetTimeMicros(). This brings the tracepoint down
to 54 instructions. Still high, but much better than the previous ~154 and
8000 instructions which it was originally.
Signed-off-by: William Casarin <jb55@jb55.com>
ACKs for top commit:
0xB10C:
ACK eb8b22d517
laanwj:
Code review ACK eb8b22d517
theStack:
re-ACK eb8b22d517
Tree-SHA512: 92ae585e487554e0f73042a8abaa239f630502c1d198e010bd7c1de252d882bccb627bbf0e4faec09c1253e782b145bcf153f9fee78cdb8456188044a96f8267
f485a07454 Add missing thread safety lock assertions in validation.h (Jon Atack)
37af8a20cf Add missing thread safety lock assertions in validation.cpp (Jon Atack)
Pull request description:
A number of functions in validation.{h,cpp} have a thread safety lock annotation in the declaration but are missing the corresponding run-time lock assertion in the definition.
ACKs for top commit:
hebasto:
re-ACK f485a07454, only suggested change since my [previous](https://github.com/bitcoin/bitcoin/pull/24177#pullrequestreview-877810465) review.
vasild:
ACK f485a07454
Tree-SHA512: c86c0c0e8fe6ec7ae9ed9890f1dd7d042aa482ecf99feb6679a670aa004f6e9a99f7bc047205a34968fab7f1f841898c59b48c3ed6245c166e3b5abbf0867445
020acea99b refactor: replace RecursiveMutex m_chainstate_mutex with Mutex (w0xlt)
ddeefeef20 refactor: add negative TS annotations for `m_chainstate_mutex` (w0xlt)
1dfd31bc26 scripted-diff: rename m_cs_chainstate -> m_chainstate_mutex (w0xlt)
Pull request description:
This PR is related to #19303 and gets rid of the `RecursiveMutex m_cs_chainstate`.
`m_cs_chainstate` is only held in `ActivateBestChain()` and `InvalidateBlock()`.
So apparently there is no recursion involved, so the `m_cs_chainstate` can be a non-recursive mutex.
ACKs for top commit:
hebasto:
ACK 020acea99b, I have reviewed the code and it looks OK, I agree it can be merged.
theStack:
Code-review ACK 020acea99b🌴
shaavan:
reACK 020acea99b
Tree-SHA512: c7c16e727e326df3410514915ce753a2a5e1da78857ef965ef683e36251e1b73c9cced4cd5231b04dbe2be0ea14084f6731b4d7a4d9a8e086e982b985e37e4b4
fa5d2e678c Remove unused char serialize (MarcoFalke)
fa24493d63 Use spans of std::byte in serialize (MarcoFalke)
fa65bbf217 span: Add BytePtr helper (MarcoFalke)
Pull request description:
This changes the serialize code (`.read()` and `.write()` functions) to take a `Span` instead of a pointer and size. This is a breaking change for the serialize interface, so at no additional cost we can also switch to `std::byte` (instead of using `char`).
The benefits of using `Span`:
* Less verbose and less fragile code when passing an already existing `Span`(-like) object to or from serialization
The benefits of using `std::byte`:
* `std::byte` can't accidentally be mistaken for an integer
The goal here is to only change serialize to use spans of `std::byte`. If needed, `AsBytes`, `MakeUCharSpan`, ... can be used (temporarily) to pass spans of the right type.
Other changes that are included here:
* [#22167](https://github.com/bitcoin/bitcoin/pull/22167) (refactor: Remove char serialize by MarcoFalke)
* [#21906](https://github.com/bitcoin/bitcoin/pull/21906) (Preserve const in cast on CTransactionSignatureSerializer by promag)
ACKs for top commit:
laanwj:
Concept and code review ACK fa5d2e678c
sipa:
re-utACK fa5d2e678c
Tree-SHA512: 08ee9eced5fb777cedae593b11e33660bed9a3e1711a7451a87b835089a96c99ce0632918bb4666a4e859c4d020f88fb50f2dd734216b0c3d1a9a704967ece6f
c5b36b1c1b Mempool Update Cut-Through Optimization (Jeremy Rubin)
c49daf9885 [TESTS] Increase limitancestorcount in tournament RPC test to showcase improved algorithm (Jeremy Rubin)
Pull request description:
Often when we're updating mempool entries we update entries that we ultimately end up removing the updated entries shortly thereafter. This patch makes it so that we filter for such entries a bit earlier in processing, which yields a mild improvement for these cases, and is negligible overhead otherwise.
There's potential for a better -- but more sophisticated -- algorithm that can be used taking advantage of epochs, but I figured it is better to do something that is simple and works first and upgrade it later as the other epoch mempool work proceeds as it makes the patches for the epoch algorithm simpler to understand, so you can consider this as preparatory work. It could either go in now if it is not controversial, or we could wait until the other patch is ready to go.
ACKs for top commit:
instagibbs:
reACK c5b36b1
sipa:
utACK c5b36b1c1b
mzumsande:
Code Review ACK c5b36b1c1b
Tree-SHA512: 78b16864f77a637d8a68a65e23c019a9757d8b2243486728ef601d212ae482f6084cf8e69d810958c356f1803178046e4697207ba40d6d10529ca57de647fae6
No behavior change.
This code was introduced in 5add7a7 before we required C++11, which is
why the struct was needed. As we are now using more modern C++ and this
is the only place where lockpoints are updated for mempool entries, it
is more idiomatic to call `modify` with a lambda.
Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
Behavior change: don't quit right after LimitMempoolSize() when a
package is partially submitted. We should still send
TransactionAddedToMempool notifications for
transactions that were submitted.
Not behavior change: add a new package validation result for mempool logic errors.
The previous interface required callers to guess that the tx had been
swapped and look up the tx again by txid to find a `MEMPOOL_ENTRY`
result. This is a confusing interface.
Instead, explicitly tell the caller that this transaction was
`DIFFERENT_WITNESS` in the result linked to the mempool entry's wtxid.
This gives the caller all the information they need in 1 lookup, and
they can query the mempool for the other transaction if needed.
fac22fd36b log: Remove GetAdjustedTime from IBD header progress estimation (MarcoFalke)
Pull request description:
This is a "refactor" that shouldn't change behaviour, because the two times are most likely equal. A minimum of 5 outbound peers are needed to adjust the time. And if the time is adjusted, it will be by at most 70 minutes (`DEFAULT_MAX_TIME_ADJUSTMENT`). Thus, the progress estimate should differ by at most 7 blocks.
ACKs for top commit:
laanwj:
Code review ACK fac22fd36b
vincenzopalazzo:
ACK fac22fd36b
Tree-SHA512: bf9f5eef66db0110dd268cf6dbfab64b9c11ba776924f5b386ceae3f2d005272cceb87ebcc96e0c8b854c051514854a2a5af39ae43bad008fac685b5aafaabd0
Shave off an extra 100 or so instructions from the
validation:block_connected tracepoint by reusing a nearby
GetTimeMicros(). This brings the tracepoint down to 54 instructions.
Still high, but much better than the previous ~154 and 8000 instructions
which it was originally.
Signed-off-by: William Casarin <jb55@jb55.com>
In the validation:block_connected tracepoint, we call block->GetHash(),
which ends up calling CBlockHeader::GetHash(), executing around 8000
serialization instructions. We don't need to do this extra work, because
block->GetHash() is already called further up in the function. Let's
save that value as a local variable and re-use it in our tracepoint so
there is no unnecessary tracepoint overhead.
Signed-off-by: William Casarin <jb55@jb55.com>
BlockManager is a large data structure, and cs_main is not required to
take its address or access every part of it. Individual BlockManager
fields and methods which do require cs_main like m_block_index and
LookupBlockIndex are already annotated separately, and these other
annotations describe locking requirements more accurately and do a
better job enforcing thread safety.
Since cs_main is not needed to access the address of the m_block object,
this commit drops cs_main LOCK calls which were added pointlessly to
satisfy this annotation in the past.
Co-authored-by: Carl Dong <contact@carldong.me>
This is a refactor and safe to do because:
* UnloadBlockIndex calls ChainstateManager::Unload, which calls
BlockManager::Unload
* Only unit tests call Unload directly
fa7efc915b Fixup style of moved code (MarcoFalke)
fade2a44f4 Move BlockManager to node/blockstorage (MarcoFalke)
Pull request description:
`BlockManager` is responsible for reading and writing block(headers). So move it to the existing `blockstorage` module in `node`. Also, move validation code unrelated to block-storage out from `BlockManager`.
ACKs for top commit:
ryanofsky:
Code review obvious ACK fa7efc915b
Tree-SHA512: 0197943d818e5f59e743b07fbb92e7661bff90081127a41e35e5692ce49d6f6a7872448670b0da282f7714580a45c8d93e571a67177c8b5f785ce9edefe834c5
b4adc5ad67 [bugfix] update lockpoints correctly during reorg (glozow)
b6002b07a3 MOVEONLY: update_lock_points to txmempool.h (glozow)
Pull request description:
I introduced a bug in #22677 (sorry! 😅)
Mempool entries cache `LockPoints`, containing the first height/blockhash/`CBlockIndex*` at which the transaction becomes valid. During a reorg, we re-check timelocks on all mempool entries using `CheckSequenceLocks(useExistingLockPoints=false)` and remove any now-invalid entries. `CheckSequenceLocks()` also mutates the `LockPoints` passed in, and we update valid entries' `LockPoints` using `update_lock_points`. Thus, `update_lock_points(lp)` needs to be called right after `CheckSequenceLocks(lp)`, otherwise we lose the data in `lp`. I incorrectly assumed they could be called in separate loops.
The incorrect behavior introduced is: if we have a reorg in which a timelocked mempool transaction is still valid but becomes valid at a different block, the cached `LockPoints` will be incorrect.
This PR fixes the bug, adds a test, and adds an assertion at the end of `removeForReorg()` to check that all mempool entries' lockpoints are valid. You can reproduce the bug by running the test added in the [test] commit on the code before the [bugfix] commit.
ACKs for top commit:
jnewbery:
ACK b4adc5ad67
vasild:
ACK b4adc5ad67
mzumsande:
Code Review ACK b4adc5ad67
hebasto:
ACK b4adc5ad67
MarcoFalke:
re-ACK b4adc5ad67🏁
Tree-SHA512: 16b59f6ff8140d0229079ca1c6b04f2f4a00a2e49931275150e4f3fe5ac4ec109698b083fa6b223ba9511f328271cc1ab081263669d5da020af7fee83c13e401