282019cd3d refactor: add kernel/cs_main.* (fanquake)
Pull request description:
One place to find / include `cs_main`.
No more:
> // Actually declared in validation.cpp; can't include because of circular dependency.
> extern RecursiveMutex cs_main;
Ultimately, no more need to include `validation.h` (which also includes (heavy/boost filled) `txmempool.h`) everywhere for `cs_main`. See #26087 for another example of why that is useful.
ACKs for top commit:
ajtowns:
ACK 282019cd3d
Tree-SHA512: 142835b794873e7a09c3246d6101843ae81ec0c6295e6873130c98a2abfa5f7282748d0f1a37237a779cc71c3bc0a75d03b20313ef5398c83d4814215cbc8287
This makes the interface more predictable and useful. The caller
understands one or more transactions failed, and can learn what happened
with each transaction. We already have this information, so we might as
well return it.
It doesn't make sense to do this for other PackageValidationResult
values because:
- PCKG_RESULT_UNSET: this means everything succeeded, so the individual
failures are no longer accurate.
- PCKG_MEMPOOL_ERROR: something went wrong with the mempool logic;
transaction failures might not be meaningful.
- PCKG_POLICY: this means something was wrong with the package as a
whole. The caller should use the PackageValidationState to find the
error, rather than looking at individual MempoolAcceptResults.
This value creates an extremely confusing interface as its existence is
dependent upon implementation details (whether something was submitted
on its own, etc). MempoolAcceptResult::m_effective_feerate is much more
helpful, as it always exists for submitted transactions.
Bug: not setting package_state means package_state.IsValid() == true and
the caller does not know that this failed.
We won't be validating this transaction again, so it makes sense to return this
failure to the caller.
Rename package_state to package_state_quit_early to make it more clear
what this variable is used for and what its scope is.
Co-authored-by: Greg Sanders <gsanders87@gmail.com>
47c4b1f52a mempool: log/halt when CalculateMemPoolAncestors fails unexpectedly (stickies-v)
5481f65849 mempool: add AssumeCalculateMemPoolAncestors helper function (stickies-v)
f911bdfff9 mempool: use util::Result for CalculateMemPoolAncestors (stickies-v)
66e028f739 mempool: use util::Result for CalculateAncestorsAndCheckLimits (stickies-v)
Pull request description:
Upon reviewing the documentation for `CTxMemPool::CalculateMemPoolAncestors`, I noticed `setAncestors` was meant to be an `out` parameter but actually is an `in,out` parameter, as can be observed by adding `assert(setAncestors.empty());` as the first line in the function and running `make check`. This PR fixes this unexpected behaviour and introduces refactoring improvements to make intents and effects of the code more clear.
## Unexpected behaviour
This behaviour occurs only in the package acceptance path, currently only triggered by `testmempoolaccept` and `submitpackage` RPCs.
In `MemPoolAccept::AcceptMultipleTransactions()`, we first call `PreChecks()` and then `SubmitPackage()` with the same `Workspace ws` reference. `PreChecks` leaves `ws.m_ancestors` in a potentially non-empty state, before it is passed on to `MemPoolAccept::SubmitPackage`. `SubmitPackage` is the only place where `setAncestors` isn't guaranteed to be empty before calling `CalculateMemPoolAncestors`. The most straightforward fix is to just forcefully clear `setAncestors` at the beginning of CalculateMemPoolAncestors, which is done in the first bugfix commit.
## Improvements
### Return value instead of out-parameters
This PR updates the function signatures for `CTxMemPool::CalculateMemPoolAncestors` and `CTxMemPool::CalculateAncestorsAndCheckLimits` to use a `util::Result` return type and eliminate both the `setAncestors` `in,out`-parameter as well as the error string. It simplifies the code and makes the intent and effects more explicit.
### Observability
There are 7 instances where we currently call `CalculateMemPoolAncestors` without actually checking if the function succeeded because we assume that it can't fail, such as in [miner.cpp](69b10212ea/src/node/miner.cpp (L399)). This PR adds a new wrapper `AssumeCalculateMemPoolAncestors` function that logs such unexpected failures, or in case of debug builds even halts the program. It's not crucial to the objective, more of an observability improvement that seems sensible to add on here.
ACKs for top commit:
achow101:
ACK 47c4b1f52a
w0xlt:
ACK 47c4b1f52a
glozow:
ACK 47c4b1f52a
furszy:
light code review ACK 47c4b1f5
aureleoules:
ACK 47c4b1f52a
Tree-SHA512: d908dad00d1a5645eb865c4877cc0bae74b9cd3332a3641eb4a285431aef119f9fc78172d38b55c592168a73dae83242e6af3348815f7b37cbe2d448a3a58648
b2aa9e8528 Add release note for MIN_STANDARD_TX_NONWITNESS_SIZE relaxation (Greg Sanders)
8c5b3646b5 Relax MIN_STANDARD_TX_NONWITNESS_SIZE to 65 non-witness bytes (Greg Sanders)
Pull request description:
Since the original fix was set to be a "reasonable" transaction to reduce allocations and the true motivation later revealed, it makes sense to relax this check to something more principled.
There are more exotic transaction patterns that could take advantage of a relaxed requirement, such as 1 input, 1 output OP_RETURN to burn a utxo to fees for CPFP purposes when change isn't practical.
Two changes could be accomplished:
1) Anything not 64 bytes could be allowed
2) Anything above 64 bytes could be allowed
In the Great Consensus Cleanup, suggestion (2)
was proposed as a consensus change, and is the simpler of the two suggestions. It would not allow an "empty" OP_RETURN but would reduce the required padding from 22 bytes to 5.
The functional test is also modified to test the actual case
we care about: 64 bytes
Related mailing list discussions here:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-October/020995.html
And a couple years earlier:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2020-May/017883.html
ACKs for top commit:
achow101:
reACK b2aa9e8528
glozow:
reACK b2aa9e8528
pablomartin4btc:
re-ACK b2aa9e8528
jonatack:
ACK b2aa9e8528 with some suggestions
Tree-SHA512: c1ec1af9ddcf31b2272209a4f1ee0c5607399f8172e5a1dfd4604cf98bfb933810dd9369a5917ad122add003327c9fcf6ee26995de3aca41d5c42dba527991ad
Since the original fix was set to be a "reasonable" transaction
to reduce allocations and the true motivation later revealed,
it makes sense to relax this check to something more principled.
There are more exotic transaction patterns that could take advantage
of a relaxed requirement, such as 1 input, 1 output OP_RETURN to burn
a utxo to fees for CPFP purposes when change isn't practical.
Two changes could be accomplished:
1) Anything not 64 bytes could be allowed
2) Anything above 64 bytes could be allowed
In the Great Consensus Cleanup, suggestion (2) was the route taken.
It would not allow an "empty" OP_RETURN
but would reduce the required padding from 22 bytes to 5.
The functional test is also modified to test the actual case
we care about: 64 bytes
07dfbb5bb8 Make static nLastFlush and nLastWrite Chainstate members (Aurèle Oulès)
Pull request description:
Fixes#22189.
The `static std::multimap<uint256, FlatFilePos> mapBlocksUnknownParent; ` referenced in the issue was already fixed by #25571. I don't believe Chainstate references any other static variables.
ACKs for top commit:
jamesob:
ACK 07dfbb5bb8 ([`jamesob/ackr/26513.1.aureleoules.make_static_nlastflush_a`](https://github.com/jamesob/bitcoin/tree/ackr/26513.1.aureleoules.make_static_nlastflush_a))
theStack:
Concept and code-review ACK 07dfbb5bb8
Tree-SHA512: 0f26463c079bbc5e0e62707d4ca4c8c9bbb99edfa3391d48d4915d24e2a1190873ecd4f9f11da25b44527671cdc82c41fd8234d56a4592a246989448d34406b0
38941a703e refactor: Move `txmempool_entry.h` --> `kernel/mempool_entry.h` (Hennadii Stepanov)
Pull request description:
This PR addresses the https://github.com/bitcoin/bitcoin/pull/17786#discussion_r1027818360:
> why not move it to the right place, that is to `kernel/txmempool_entry.h`?
ACKs for top commit:
MarcoFalke:
review ACK 38941a703e📊
Tree-SHA512: 0145974b63b67ca1d9d89af2dd9d4438beca480c16a563f330da05fec49b8394d7ba20ed83cf7d50b2e19454e006978ebed42b0e07887b98d00210f3201ce9ba
d885bb2f6e test: Test exclusion of OP_RETURN from getblockstats (Fabian Jahr)
ba9d288b24 test: Fix getblockstats test data generator (Fabian Jahr)
2ca5a496c2 rpc: Improve getblockstats (Fabian Jahr)
cb94db119f validation, index: Add unspendable coinbase helper functions (Fabian Jahr)
Pull request description:
Fixes#19885
The genesis block does not have undo data saved to disk so the RPC errored because of that.
ACKs for top commit:
achow101:
ACK d885bb2f6e
aureleoules:
ACK d885bb2f6e
stickies-v:
ACK d885bb2f6
Tree-SHA512: f37bda736ed605b7a41a81eeb4bfbb5d2b8518f847819e5d6a090548a61caf1455623e15165d72589ab3f4478252b00e7b624f9313ad6708cac06dd5edb62e9a
c8dc0e3eaa refactor: Inline `CTxMemPoolEntry` class's functions (Hennadii Stepanov)
75bbe594e5 refactor: Move `CTxMemPoolEntry` class to its own module (Hennadii Stepanov)
Pull request description:
This PR:
- gets rid of the `policy/fees` -> `txmempool` -> `policy/fees` circular dependency
- is an alternative to #13949, which nukes only one circular dependency
ACKs for top commit:
ryanofsky:
Code review ACK c8dc0e3eaa. Just include and whitespace changes since last review, and there's a moveonly commit now so it's very easy to review
theStack:
Code-review ACK c8dc0e3eaa
glozow:
utACK c8dc0e3eaa, agree these changes are an improvement.
Tree-SHA512: 36ece824e6ed3ab1a1e198b30a906c8ac12de24545f840eb046958a17315ac9260c7de26e11e2fbab7208adc3d74918db7a7e389444130f8810548ca2e81af41
db929893ef Faster -reindex by initially deserializing only headers (Larry Ruane)
c72de9990a util: add CBufferedFile::SkipTo() to move ahead in the stream (Larry Ruane)
48a68908ba Add LoadExternalBlockFile() benchmark (Larry Ruane)
Pull request description:
### Background
During the first part of reindexing, `LoadExternalBlockFile()` sequentially reads raw blocks from the `blocks/blk00nnn.dat` files (rather than receiving them from peers, as with initial block download) and eventually adds all of them to the block index. When an individual block is initially read, it can't be immediately added unless all its ancestors have been added, which is rare (only about 8% of the time), because the blocks are not sorted by height. When the block can't be immediately added to the block index, its disk location is saved in a map so it can be added later. When its parent is later added to the block index, `LoadExternalBlockFile()` reads and deserializes the block from disk a second time and adds it to the block index. Most blocks (92%) get deserialized twice.
### This PR
During the initial read, it's rarely useful to deserialize the entire block; only the header is needed to determine if the block can be added to the block index immediately. This change to `LoadExternalBlockFile()` initially deserializes only a block's header, then deserializes the entire block only if it can be added immediately. This reduces reindex time on mainnet by 7 hours on a Raspberry Pi, which translates to around a 25% reduction in the first part of reindexing (adding blocks to the index), and about a 6% reduction in overall reindex time.
Summary: The performance gain is the result of deserializing each block only once, except its header which is deserialized twice, but the header is only 80 bytes.
ACKs for top commit:
andrewtoth:
ACK db929893ef
achow101:
ACK db929893ef
aureleoules:
ACK db929893ef - minor changes and new benchmark since last review
theStack:
re-ACK db929893ef
stickies-v:
re-ACK db929893e
Tree-SHA512: 5a5377192c11edb5b662e18f511c9beb8f250bc88aeadf2f404c92c3232a7617bade50477ebf16c0602b9bd3b68306d3ee7615de58acfd8cae664d28bb7b0136
Since faf44876db, the maxtipage comparison
in IsInitialBlockDownload() has been broken, since the NodeClock::now()
time_point is in the system's native denomination (micrcoseconds).
Without this patch, specifying the maximum allowable -maxtipage
(9223372036854775807) results in a SIGABRT crash.
Co-authored-by: MacroFake <falke.marco@gmail.com>
aaaa7bd0ba iwyu: Add missing includes (MacroFake)
fa9ebec096 Remove g_parallel_script_checks (MacroFake)
fa7c834b9f Move ::fCheckBlockIndex into ChainstateManager (MacroFake)
fa43188d86 Move ::fCheckpointsEnabled into ChainstateManager (MacroFake)
cccca83099 Move ::nMinimumChainWork into ChainstateManager (MacroFake)
fa29d0b57c Move ::hashAssumeValid into ChainstateManager (MacroFake)
faf44876db Move ::nMaxTipAge into ChainstateManager (MacroFake)
Pull request description:
It seems preferable to assign globals to a class (in this case `ChainstateManager`), than to leave them dangling. This should clarify scope for code-readers, as well as clarifying unit test behaviour.
ACKs for top commit:
dergoegge:
Code review ACK aaaa7bd0ba
ryanofsky:
Code review ACK aaaa7bd0ba. No changes since last review, other than rebase
aureleoules:
reACK aaaa7bd0ba
Tree-SHA512: 83ec3ba0fb4f1dad95810d4bd4e578454e0718dc1bdd3a794cc4e48aa819b6f5dad4ac4edab3719bdfd5f89cbe23c2740a50fd56c1ff81c99e521c5f6d4e898d
When a block is initially read from a blk*.dat file during reindexing,
it can be added to the block index only if all of its ancestor blocks
have been added, which is rare. If the block's ancestors have not been
added, the block must be re-read from disk later when it can be added.
This commit: During the initial block read, deserialize only its header,
rather than the entire block, since this is sufficient to determine
if its parent (and thus all its ancestors) has been added. This is a
performance improvement.
Making the checks to identify BIP30 available outside of validation.cpp is needed for reporting and tracking statistics on specific blocks and the UTXO set correctly.
5d3f98d278 refactor: Replace m_params with chainman.GetParams() (Aurèle Oulès)
Pull request description:
Fixes a TODO introduced in #24595.
Removes `m_params` from `CChainState` class and replaces it with `m_chainman.GetParams()`.
ACKs for top commit:
MarcoFalke:
review ACK 5d3f98d278🌎
Tree-SHA512: de0fe31450d281cc7307c0d820495e86c93c7998e77a148db2c703da66cff1059e6560c041f1864913c42075aa24d259c2623d45e929ca0a8056ed330a9f9978
This changes the flag for the bitcoin-chainstate executable. Previously
it was false, now it is the chain's default value (still false for the
main chain).
This changes the minimum chain work for the bitcoin-chainstate
executable. Previously it was uint256{}, now it is the chain's default
minimum chain work.
e899d4ca6f init: limit bip30 exceptions to coinbase txs (Chris Geihsler)
511eb7fdea Ignore problematic blocks in DisconnectBlock (Chris Geihsler)
Pull request description:
Fixes https://github.com/bitcoin/bitcoin/issues/22596
When using checklevel=4, block verification fails because of duplicate coinbase transactions involving blocks 91812 and 91722. There was already a check in place within `ConnectBlock` to ignore the problematic blocks, but `DisconnectBlock` did not contain a similar check to ignore these blocks when called from `VerifyDB`.
By ignoring these two blocks in `DisconnectBlock`, the block verification process succeeds at checklevel=4.
(Note to reviewers: this is my first contribution to Bitcoin Core, so any feedback is most welcome. Thanks in advance for reviewing!)
## Steps to reproduce:
Use the following bitcoin.conf file and start bitcoind. I only used block data through block ~100000 so that the verification process was much faster.
```
assumevalid=0
checkblocks=0
checklevel=4
```
Without this change, you will see the following error when the blocks are verified:
```
2022-04-14T02:56:44Z init message: Verifying blocks…
2022-04-14T02:56:44Z Verifying last 101881 blocks at level 4
2022-04-14T02:56:44Z [0%]...[10%]...[20%]...[30%]...[40%]...ERROR: VerifyDB(): *** coin database inconsistencies found (last 10160 blocks, 142571 good transactions before that)
2022-04-14T02:57:01Z : Corrupted block database detected.
Please restart with -reindex or -reindex-chainstate to recover.
: Corrupted block database detected.
Please restart with -reindex or -reindex-chainstate to recover.
```
With this change, you will see this instead:
```
2022-04-14T02:32:29Z init message: Verifying blocks…
2022-04-14T02:32:29Z Verifying last 101746 blocks at level 4
2022-04-14T02:32:29Z [0%]...[10%]...[20%]...[30%]...[40%]...[50%]...[60%]...[70%]...[80%]...[90%]...[DONE].
2022-04-14T02:32:48Z No coin database inconsistencies in last 101746 blocks (226126 transactions)
```
ACKs for top commit:
laanwj:
Code review ACK e899d4ca6f
achow101:
ACK e899d4ca6f
jamesob:
(Biased) ACK e899d4ca6f ([`jamesob/ackr/24851.2.seejee.init_ignore_bip_30_verif`](https://github.com/jamesob/bitcoin/tree/ackr/24851.2.seejee.init_ignore_bip_30_verif))
Tree-SHA512: d2f6d25e9619aee32c1a73fe846b1b587698eaa5a4994fa6424f1038f45654f9fd52b74a69843cc84d90168d74827130ccf8e9201502f5d52281acdb20429291
bf95976061 doc: add note about snapshot chainstate init (James O'Beirne)
e4d7995286 test: add testcases for snapshot initialization (James O'Beirne)
cced4e7336 test: move-only-ish: factor out LoadVerifyActivateChainstate() (James O'Beirne)
51fc9241c0 test: allow on-disk coins and block tree dbs in tests (James O'Beirne)
3c361391b8 test: add reset_chainstate parameter for snapshot unittests (James O'Beirne)
00b357c215 validation: add ResetChainstates() (James O'Beirne)
3a29dfbfb2 move-only: test: make snapshot chainstate setup reusable (James O'Beirne)
8153bd9247 blockmanager: avoid undefined behavior during FlushBlockFile (James O'Beirne)
ad67ff377c validation: remove snapshot datadirs upon validation failure (James O'Beirne)
34d1590331 add utilities for deleting on-disk leveldb data (James O'Beirne)
252abd1e8b init: add utxo snapshot detection (James O'Beirne)
f9f1735f13 validation: rename snapshot chainstate dir (James O'Beirne)
d14bebf100 db: add StoragePath to CDBWrapper/CCoinsViewDB (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: https://github.com/bitcoin/bitcoin/pull/15606)
---
Half of the replacement for #24232. The original PR grew larger than expected throughout the review process.
This change adds the ability to initialize a snapshot-based chainstate during init if one is detected on disk. This is of course unused as of now (aside from in unittests) given that we haven't yet enabled actually loading snapshots.
Don't be scared! There are some big move-only commits in here.
Accompanying changes include:
- moving the snapshot coinsdb directory from being called `chainstate_[base blockhash]` to `chainstate_snapshot`, since we only support one snapshot in use at a time. This simplifies some logic, but it necessitates writing that base blockhash out to a file within the coinsdb dir. See [discussion here](https://github.com/bitcoin/bitcoin/pull/24232#discussion_r832762880).
- adding a simple fix in `FlushBlockFile()` that avoids a crash when attemping to flush to disk before `LoadBlockIndexDB()` is called, which happens when calling `MaybeRebalanceCaches()` during multiple chainstate init.
- improving the unittest to allow testing with on-disk chainstates - necessary to test a simulated restart and re-initialization.
ACKs for top commit:
naumenkogs:
utACK bf95976061
ariard:
Code Review ACK bf9597606
ryanofsky:
Code review ACK bf95976061. Changes since last review: rebasing, switching from CAutoFile to AutoFile, adding comments, switching from BOOST_CHECK to Assert in test util, using chainman.GetMutex() in tests, destroying one ChainstateManager before creating a new one in tests
fjahr:
utACK bf95976061
aureleoules:
ACK bf95976061
Tree-SHA512: 15ae75caf19f8d12a12d2647c52897904d27b265a7af6b4ae7b858592eeadb8f9da6c2394b6baebec90adc28742c053e3eb506119577dae7c1e722ebb3b7bcc0
bcb0cacac2 reindex, log, test: fixes#21379 (mruddy)
Pull request description:
Fixes#21379.
The blocks/blk?????.dat files are mutated and become increasingly malformed, or corrupt, as a result of running the re-indexing process.
The mutations occur after the re-indexing process has finished, as new blocks are appended, but are a result of a re-indexing process miscalculation that lingers in the block manager's `m_blockfile_info` `nSize` data until node restart.
These additions to the blk files are non-fatal, but also not desirable.
That is, this is a form of data corruption that the reading code is lenient enough to process (it skips the extra bytes), but it adds some scary looking log messages as it encounters them.
The summary of the problem is that the re-index process double counts the size of the serialization header (magic message start bytes [4 bytes] + length [4 bytes] = 8 bytes) while calculating the blk data file size (both values already account for the serialization header's size, hence why it is over accounted).
This bug manifests itself in a few different ways, after re-indexing, when a new block from a peer is processed:
1. If the new block will not fit into the last blk file processed while re-indexing, while remaining under the 128MiB limit, then the blk file is flushed to disk and truncated to a size that is 8 greater than it should be. The truncation adds zero bytes (see `FlatFileSeq::Flush` and `TruncateFile`).
1. If the last blk file processed while re-indexing has logical space for the new block under the 128 MiB limit:
1. If the blk file was not already large enough to hold the new block, then the zeros are, in effect, added by `fseek` when the file is opened for writing. Eight zero bytes are added to the end of the last blk file just before the new block is written. This happens because the write offset is 8 too great due to the miscalculation. The result is 8 zero bytes between the end of the last block and the beginning of the next block's magic + length + block.
1. If the blk file was already large enough to hold the new block, then the current existing file contents remain in the 8 byte gap between the end of the last block and the beginning of the next block's magic + length + block. Commonly, when this occcurs, it is due to the blk file containing blocks that are not connected to the block tree during reindex and are thus left behind by the reindex process and later overwritten when new blocks are added. The orphaned blocks can be valid blocks, but due to the nature of concurrent block download, the parent may not have been retrieved and written by the time the node was previously shutdown.
ACKs for top commit:
LarryRuane:
tested code-review ACK bcb0cacac2
ryanofsky:
Code review ACK bcb0cacac2. This is a disturbing bug with an easy fix which seems well-worth merging.
mzumsande:
ACK bcb0cacac2 (reviewed code and did some testing, I agree that it fixes the bug).
w0xlt:
tACK bcb0cacac2
Tree-SHA512: acc97927ea712916506772550451136b0f1e5404e92df24cc05e405bb09eb6fe7c3011af3dd34a7723c3db17fda657ae85fa314387e43833791e9169c0febe51
fabf1cdb20 Use steady clock for bench logging (MacroFake)
faed342a23 scripted-diff: Rename time symbols (MacroFake)
Pull request description:
Instead of using `0.001` and similar constants to "convert" an int64_t to milliseconds, use the type-safe `Ticks<>` helper. Also, use steady clock instead of system clock, since the durations are used for benchmarking.
ACKs for top commit:
fanquake:
ACK fabf1cdb20 - validation bench output still looks sane.
Tree-SHA512: e6525b5fdad6045ca500c56014897d7428ad288aaf375933d3b5939feddf257f6910d562eb66ebcde9186bef9a604ee8d763a318253838318d59df2a285be7c2
Simplifies function signatures by removing repetition of all the
ancestor/descendant limits, and increases readability by being
more verbose by naming the limits, while still reducing the LoC.
fa521c9603 Use steady clock for all millis bench logging (MacroFake)
Pull request description:
Currently `GetTimeMillis` is used for bench logging in milliseconds integral precision. Replace it to use a steady clock that is type-safe and steady.
Microsecond or float precision can be done in a follow-up.
ACKs for top commit:
fanquake:
ACK fa521c9603 - started making the same change.
Tree-SHA512: 86a810e496fc663f815acb8771a6c770331593715cde85370226685bc50c13e8e987e3c5efd0b4e48b36ebd2372255357b709204bac750d41e94a9f7d9897fa6
If a UTXO snapshot fails to validate, don't leave the resulting datadir
on disk as this will confuse initialization on next startup and we'll
get an assertion error.
Used in later commits to remove leveldb directories for
- invalid snapshot chainstates, and
- background-vaildation chainstates that have finished serving their
purpose.