Commit graph

44520 commits

Author SHA1 Message Date
Lőrinc
b5dc42874d test: assert CScript allocation characteristics
Verifies that script types are correctly allocated using prevector's direct (stack) or indirect (heap) storage based on their size:

Direct (stack) allocated script types (size ≤ 28 bytes):
* OP_RETURN (small)
* P2WPKH
* P2SH
* P2PKH

Indirect (heap) allocated script types (size > 28 bytes):
* P2WSH
* P2TR
* P2PK
* MULTISIG (small)

This test provides a baseline for verifying changes to prevector's inline capacity.
2025-04-17 11:31:23 +02:00
Lőrinc
f07c79f034 optimization: look for NULL prevouts in the sorted values
For the 2 input case we simply check them both, like we did with equality.

For the general case, we take advantage of sorting, making invalid value detection constant time instead of linear in the worst case.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='CheckBlockBench|DuplicateInputs|ProcessTransactionBench' -min-time=10000

> C++ compiler .......................... AppleClang 16.0.0.16000026

|            ns/block |             block/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          179,971.00 |            5,556.45 |    0.3% |     11.02 | `CheckBlockBench`

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          963,177.98 |            1,038.23 |    1.7% |     10.92 | `DuplicateInputs`
|            9,410.90 |          106,259.75 |    0.3% |     11.01 | `ProcessTransactionBench`

> C++ compiler .......................... GNU 13.3.0

|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|          834,855.94 |            1,197.81 |    0.0% |    6,518,548.86 |    2,656,039.78 |  2.454 |     919,160.84 |    1.5% |     10.78 | `CheckBlockBench`

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        4,261,492.75 |              234.66 |    0.0% |   17,379,823.40 |   13,559,793.33 |  1.282 |   4,265,714.28 |    3.4% |     11.00 | `DuplicateInputs`
|           55,819.53 |           17,914.88 |    0.1% |      227,828.15 |      177,520.09 |  1.283 |      15,184.36 |    0.4% |     10.91 | `ProcessTransactionBench`
2025-04-17 11:30:33 +02:00
Lőrinc
cb8c012b87 optimization: replace tree with sorted vector
A pre-sized vector retains locality (enabling SIMD operations), speeding up sorting and equality checks.
It's also simpler (therefore more reliable) than a sorted set. It also causes less memory fragmentation.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='CheckBlockBench|DuplicateInputs|ProcessTransactionBench' -min-time=10000

> C++ compiler .......................... AppleClang 16.0.0.16000026

|            ns/block |             block/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          181,922.54 |            5,496.85 |    0.2% |     10.98 | `CheckBlockBench`

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          997,739.30 |            1,002.27 |    1.0% |     10.94 | `DuplicateInputs`
|            9,449.28 |          105,828.15 |    0.3% |     10.99 | `ProcessTransactionBench`

Co-authored-by: Pieter Wuille <pieter@wuille.net>
2025-04-17 11:30:33 +02:00
Lőrinc
765b71b90b optimization: simplify duplicate checks for trivial inputs
No need to create a set for checking duplicates for two-input-transactions.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='CheckBlockBench|DuplicateInputs|ProcessTransactionBench' -min-time=10000

> C++ compiler .......................... AppleClang 16.0.0.16000026

|            ns/block |             block/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          314,137.30 |            3,183.32 |    1.2% |     11.04 | `CheckBlockBench`

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        3,220,592.73 |              310.50 |    1.3% |     10.92 | `DuplicateInputs`
|            9,425.98 |          106,089.77 |    0.3% |     11.00 | `ProcessTransactionBench`
2025-04-17 11:30:33 +02:00
Lőrinc
7b576a440d optimization: move duplicate checks outside of coinbase branch
`IsCoinBase` means single input with NULL prevout, so it makes sense to restrict duplicate check to non-coinbase transactions only.
The behavior is the same as before, except that single-input-transactions aren't checked for duplicates anymore (~70-90% of the cases, see https://transactionfee.info/charts/transactions-1in).
I've added braces to the conditions and loops to simplify review of followup commits.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='CheckBlockBench|DuplicateInputs|ProcessTransactionBench' -min-time=10000

> C++ compiler .......................... AppleClang 16.0.0.16000026

|            ns/block |             block/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          335,917.12 |            2,976.92 |    1.3% |     11.01 | `CheckBlockBench`

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        3,286,337.42 |              304.29 |    1.1% |     10.90 | `DuplicateInputs`
|            9,561.02 |          104,591.35 |    0.2% |     11.02 | `ProcessTransactionBench`
2025-04-17 11:30:33 +02:00
Lőrinc
a300325c5b bench: add ProcessTransactionBench to measure CheckBlock in context
The newly introduced `ProcessTransactionBench` incorporates multiple steps in the validation pipeline, offering a more comprehensive view of `CheckBlock` performance within a realistic transaction validation context.

Previous microbenchmarks, such as DeserializeAndCheckBlockTest and DuplicateInputs, focused on isolated aspects of transaction and block validation. While these tests provided valuable insights for targeted profiling, they lacked context regarding the broader validation process, where interactions between components play a critical role.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='ProcessTransactionBench' -min-time=10000

> C++ compiler .......................... AppleClang 16.0.0.16000026

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|            9,585.10 |          104,328.55 |    0.1% |     11.03 | `ProcessTransactionBench`

> C++ compiler .......................... GNU 13.3.0

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|           56,199.57 |           17,793.73 |    0.1% |      229,263.01 |      178,766.31 |  1.282 |      15,509.97 |    0.5% |     10.91 | `ProcessTransactionBench`
2025-04-17 11:30:33 +02:00
Lőrinc
bf580d7b4e bench: measure CheckBlock speed separately from serialization
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='CheckBlockBench|DuplicateInputs' -min-time=10000

> C++ compiler .......................... AppleClang 16.0.0.16000026

|            ns/block |             block/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          372,743.63 |            2,682.81 |    1.1% |     10.99 | `CheckBlockBench`

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        3,304,694.54 |              302.60 |    0.5% |     11.05 | `DuplicateInputs`

> C++ compiler .......................... GNU 13.3.0

|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        1,096,261.84 |              912.19 |    0.1% |    7,963,390.88 |    3,487,375.26 |  2.283 |   1,266,941.00 |    1.8% |     11.03 | `CheckBlockBench`

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        8,366,309.48 |              119.53 |    0.0% |   23,865,177.67 |   26,620,160.23 |  0.897 |   5,972,887.41 |    4.0% |     10.78 | `DuplicateInputs`
2025-04-17 11:30:33 +02:00
Lőrinc
c15d130752 test: validate duplicate detection in CheckTransaction
The `CheckTransaction` validation function in https://github.com/bitcoin/bitcoin/blob/master/src/consensus/tx_check.cpp#L41-L45 relies on a correct ordering relation for detecting duplicate transaction inputs.

This update to the tests ensures that:
* Accurate detection of duplicates: Beyond trivial cases (e.g., two identical inputs), duplicates are detected correctly in more complex scenarios.
* Consistency across methods: Both sorted sets and hash-based sets behave identically when detecting duplicates for `COutPoint` and related values.
* Robust ordering and equality relations: The function maintains expected behavior for ordering and equality checks.

Using randomized testing with shuffled inputs (to avoid any remaining bias introduced), the enhanced test validates that `CheckTransaction` remains robust and reliable across various input configurations. It confirms identical behavior to a hashing-based duplicate detection mechanism, ensuring consistency and correctness.

To make sure the new branches in the follow-up commits will be covered, `basic_transaction_tests` was extended a randomized test one comparing against the old implementation (and also an alternative duplicate). The iterations and ranges were chosen such that every new branch is expected to be hit once.
2025-04-17 11:30:33 +02:00
Lőrinc
6269067bc3 optimization: add single byte writes
Single byte writes are used very often (used for every (u)int8_t or std::byte or bool and for every VarInt's first byte which is also needed for every (pre)Vector).
It makes sense to avoid the generalized serialization infrastructure that isn't needed:
* AutoFile write doesn't need to allocate 4k buffer for a single byte now;
* `VectorWriter` and `DataStream` avoids memcpy/insert calls.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='SizeComputerBlock|SerializeBlock' --min-time=10000

> C compiler ............................ AppleClang 16.0.0.16000026

|            ns/block |             block/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          174,569.19 |            5,728.39 |    0.6% |     10.89 | `SerializeBlock`
|           10,241.16 |           97,645.21 |    0.0% |     11.00 | `SizeComputerBlock`

> C++ compiler .......................... GNU 13.3.0

|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|          615,000.56 |            1,626.01 |    0.0% |    8,015,883.64 |    2,208,340.88 |  3.630 |   1,517,035.62 |    0.5% |     10.56 | `SerializeBlock`
|           25,676.76 |           38,945.72 |    0.0% |      159,390.03 |       92,202.10 |  1.729 |      42,131.03 |    0.9% |     11.00 | `SizeComputerBlock`
2025-04-17 11:30:33 +02:00
Lőrinc
2bf6c56cab optimization: merge SizeComputer specializations + add new ones
Endianness doesn't affect the final size, we can skip it for `SizeComputer`.
We can `if constexpr` previous calls into existing method, short-circuiting existing logic when we only need their serialized sizes.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='SizeComputerBlock|SerializeBlock' --min-time=10000

> C compiler ............................ AppleClang 16.0.0.16000026

|            ns/block |             block/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          191,652.29 |            5,217.78 |    0.4% |     10.96 | `SerializeBlock`
|           10,323.55 |           96,865.92 |    0.2% |     11.01 | `SizeComputerBlock`

> C++ compiler .......................... GNU 13.3.0

|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|          614,847.32 |            1,626.42 |    0.0% |    8,015,883.64 |    2,207,628.07 |  3.631 |   1,517,035.62 |    0.5% |     10.56 | `SerializeBlock`
|           26,020.31 |           38,431.52 |    0.0% |      159,390.03 |       93,438.33 |  1.706 |      42,131.03 |    0.9% |     11.00 | `SizeComputerBlock`
2025-04-17 11:30:33 +02:00
Lőrinc
8f71de5f8f refactor: add explicit static extent to spans 2025-04-17 11:30:33 +02:00
Lőrinc
5559eb68a9 refactor: reduce template bloat in primitive serialization
Merged multiple template methods into single constexpr-delimited implementation to reduce template bloat (i.e. related functionality is grouped into a single method, but can be optimized because of C++20 constexpr conditions).
This unifies related methods that were only bound before by similar signatures - and enables `SizeComputer` optimizations later
2025-04-17 11:30:33 +02:00
Lőrinc
9f15d4da35 cleanup: remove unused ser_writedata16be and ser_readdata16be 2025-04-17 11:30:33 +02:00
Lőrinc
3d7c8ae9fb bench: measure block (size)serialization speed
The SizeComputer is a special serializer which returns what the exact final size will be of the serialized content.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='SizeComputerBlock|SerializeBlock' --min-time=10000

> C compiler ............................ AppleClang 16.0.0.16000026

|            ns/block |             block/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          195,610.62 |            5,112.20 |    0.3% |     11.00 | `SerializeBlock`
|           12,061.83 |           82,906.19 |    0.1% |     11.01 | `SizeComputerBlock`

> C++ compiler .......................... GNU 13.3.0

|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|          867,857.55 |            1,152.26 |    0.0% |    8,015,883.90 |    3,116,099.08 |  2.572 |   1,517,035.87 |    0.5% |     10.81 | `SerializeBlock`
|           30,928.27 |           32,332.88 |    0.0% |      221,683.03 |      111,055.84 |  1.996 |      53,037.03 |    0.8% |     11.03 | `SizeComputerBlock`
2025-04-17 11:30:33 +02:00
Lőrinc
73cfebb08b optimization: refactor: Introduce Uint256ExtraSipHasher to cache SipHash constant state
Previously, only k0 and k1 were stored, causing the constant xor operations to be recomputed in every call to `SipHashUint256Extra`.
This commit adds a dedicated `Uint256ExtraSipHasher` class that caches the initial state (v0-v3) and to perform the `SipHash` computation on a `uint256` (with an extra parameter), hiding the constant computation details from higher-level code and improving efficiency.
This basically brings the precalculations in the `CSipHasher` constructor to the `uint256` specialized SipHash implementation.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='SaltedOutpointHasherBench.*' -min-time=10000

> C++ compiler .......................... AppleClang 16.0.0.16000026

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               57.27 |       17,462,299.19 |    0.1% |     11.02 | `SaltedOutpointHasherBench_create_set`
|               11.24 |       88,997,888.48 |    0.3% |     11.04 | `SaltedOutpointHasherBench_hash`
|               13.91 |       71,902,014.20 |    0.2% |     11.01 | `SaltedOutpointHasherBench_match`
|               13.29 |       75,230,390.31 |    0.1% |     11.00 | `SaltedOutpointHasherBench_mismatch`

compared to master:
create_set - 17,462,299.19/17,065,922.04 - 2.3% faster
hash       - 88,997,888.48/83,576,684.83 - 6.4% faster
match      - 71,902,014.20/68,985,850.12 - 4.2% faster
mismatch   - 75,230,390.31/71,942,033.47 - 4.5% faster

> C++ compiler .......................... GNU 13.3.0

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|              135.38 |        7,386,349.49 |    0.0% |        1,078.19 |          486.16 |  2.218 |         119.56 |    1.1% |     11.00 | `SaltedOutpointHasherBench_create_set`
|               23.67 |       42,254,558.08 |    0.0% |          247.01 |           85.01 |  2.906 |           4.00 |    0.0% |     11.00 | `SaltedOutpointHasherBench_hash`
|               58.95 |       16,962,220.14 |    0.1% |          446.55 |          211.74 |  2.109 |          20.86 |    1.4% |     11.01 | `SaltedOutpointHasherBench_match`
|               76.98 |       12,991,047.69 |    0.1% |          548.93 |          276.50 |  1.985 |          20.25 |    2.3% |     10.72 | `SaltedOutpointHasherBench_mismatch`

compared to master:
create_set -  7,386,349.49/7,312,133.16  - 1% faster
hash       - 42,254,558.08/41,978,882.62 - 0.6% faster
match      - 16,962,220.14/16,549,695.42 - 2.4% faster
mismatch   - 12,991,047.69/12,713,595.35 - 2% faster

Co-authored-by: sipa <pieter@wuille.net>
2025-04-17 11:30:33 +02:00
Lőrinc
155ba7c349 refactor: Extract C0-C3 Siphash constants 2025-04-17 11:30:33 +02:00
Lőrinc
ae87260d29 test: Rename k1/k2 to k0/k1 for consistency 2025-04-17 11:30:33 +02:00
Lőrinc
c497ca6e91 bench: Add COutPoint and SaltedOutpointHasher benchmarks
This commit introduces new benchmarks to measure the performance of various operations using
SaltedOutpointHasher, including hash computation, set operations, and set creation.

These benchmarks are intended to provide insights about coin caching performance (e.g. during IBD).

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='SaltedOutpointHasherBench.*' -min-time=10000

> C++ compiler .......................... AppleClang 16.0.0.16000026

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               58.60 |       17,065,922.04 |    0.3% |     11.02 | `SaltedOutpointHasherBench_create_set`
|               11.97 |       83,576,684.83 |    0.1% |     11.01 | `SaltedOutpointHasherBench_hash`
|               14.50 |       68,985,850.12 |    0.3% |     10.96 | `SaltedOutpointHasherBench_match`
|               13.90 |       71,942,033.47 |    0.4% |     11.03 | `SaltedOutpointHasherBench_mismatch`

> C++ compiler .......................... GNU 13.3.0

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|              136.76 |        7,312,133.16 |    0.0% |        1,086.67 |          491.12 |  2.213 |         119.54 |    1.1% |     11.01 | `SaltedOutpointHasherBench_create_set`
|               23.82 |       41,978,882.62 |    0.0% |          252.01 |           85.57 |  2.945 |           4.00 |    0.0% |     11.00 | `SaltedOutpointHasherBench_hash`
|               60.42 |       16,549,695.42 |    0.1% |          460.51 |          217.04 |  2.122 |          21.00 |    1.4% |     10.99 | `SaltedOutpointHasherBench_match`
|               78.66 |       12,713,595.35 |    0.1% |          555.59 |          282.52 |  1.967 |          20.19 |    2.2% |     10.74 | `SaltedOutpointHasherBench_mismatch`
2025-04-17 11:30:33 +02:00
Lőrinc
c5e866b190 optimization: Migrate fixed-size obfuscation end-to-end from std::vector<std::byte> to uint64_t
Since `util::Xor` accepts `uint64_t` values, we're eliminating any repeated vector-to-uint64_t conversions going back to the loading/saving of these values (we're still serializing them as vectors, but converting as soon as possible to `uint64_t`). This is the reason the tests still generate vector values and convert to `uint64_t` later instead of generating it directly.

We're also short-circuit `Xor` calls with 0 key values early to avoid unnecessary calculations (e.g. `MakeWritableByteSpan`) - even assuming that XOR is never called for 0.

>  cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release \
&& cmake --build build -j$(nproc) \
&& build/bin/bench_bitcoin -filter='XorObfuscationBench' -min-time=10000

C++ compiler .......................... AppleClang 17.0.0.17000013

|              ns/MiB |               MiB/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|           14,730.40 |           67,886.80 |    0.1% |     11.01 | `XorObfuscationBench`

C++ compiler .......................... GNU 13.3.0

|              ns/MiB |               MiB/s |    err% |         ins/MiB |         cyc/MiB |    IPC |        bra/MiB |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|           51,187.17 |           19,536.15 |    0.0% |      327,683.95 |      183,747.58 |  1.783 |      65,536.55 |    0.0% |     11.00 | `XorObfuscationBench`

----

A few other benchmarks that seem to have improved as well (tested with Clang only):
Before:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        2,202,618.49 |              454.01 |    0.2% |     11.01 | `ReadBlockBench`
|          734,444.92 |            1,361.57 |    0.3% |     10.66 | `ReadRawBlockBench`

After:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        1,912,308.06 |              522.93 |    0.4% |     10.98 | `ReadBlockBench`
|           49,092.93 |           20,369.53 |    0.2% |     10.99 | `ReadRawBlockBench`
2025-04-17 11:30:31 +02:00
Lőrinc
e50732d25f refactor: prepare mempool_persist for obfuscation key change 2025-04-17 11:30:13 +02:00
Lőrinc
8e6e0acd36 refactor: prepare dbwrapper for obfuscation key change
Since `CDBWrapper::Read` will still work with vectors, we won't be able to use the obfuscation key field to read into it directly.
This commit cleans up this part of the code, obviating that writing `obfuscate_key` is needed since following methods will actually use it implicitly, simplifying the `if (!key_exists` condition to extract the negation into the name of the boolean and inline the single-use `CreateObfuscateKey` which will just complicate the transition.
2025-04-17 11:30:13 +02:00
Lőrinc
5d5f3d06dd bench: Make XorObfuscationBench more representative
Since another PR solves the tiny byte xors during serialization, we're only concentrating on big continuous chunks now.

>  cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release \
&& cmake --build build -j$(nproc) \
&& build/bin/bench_bitcoin -filter='XorObfuscationBench' -min-time=10000

C++ compiler .......................... AppleClang 17.0.0.17000013

|              ns/MiB |               MiB/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          731,927.62 |            1,366.26 |    0.2% |     10.67 | `XorObfuscationBench`

C++ compiler .......................... GNU 13.3.0

|              ns/MiB |               MiB/s |    err% |         ins/MiB |         cyc/MiB |    IPC |        bra/MiB |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|          941,015.26 |            1,062.68 |    0.0% |    9,437,186.97 |    3,378,911.52 |  2.793 |   1,048,577.15 |    0.0% |     10.99 | `XorObfuscationBench`
2025-04-17 11:30:13 +02:00
Lőrinc
3d203c2acf test: Compare util::Xor with randomized inputs against simple impl
Since production code only uses keys of length 8, we're not testing with other values anymore
2025-04-17 11:30:13 +02:00
merge-script
7a3afe6787
Merge bitcoin/bitcoin#32281: bench: Fix WalletMigration benchmark
7912cd4125 bench: Fix WalletMigration benchmark (pablomartin4btc)

Pull request description:

  The keys and scripts created for the Legacy Wallet needed to be persisted in order for the migration to work properly.

  Fixes #32277.

ACKs for top commit:
  achow101:
    ACK 7912cd4125
  davidgumberg:
    Tested ACK 7912cd4125
  furszy:
    utACK 7912cd4125

Tree-SHA512: fe7b8e0a80d4d030ad3fd6446717ee09a260ab2bd6140bc817bdca52d233e3af8a8fed2d754743ca2ba022f7d2c8615a36b5070991d12942c13835e8f72e359f
2025-04-17 09:43:42 +01:00
Ava Chow
e66e30c9e5
Merge bitcoin/bitcoin#31862: doc: Fix and clarify description of ZMQ message format
Some checks are pending
CI / test each commit (push) Waiting to run
CI / macOS 14 native, arm64, no depends, sqlite only, gui (push) Waiting to run
CI / macOS 14 native, arm64, fuzz (push) Waiting to run
CI / Windows native, VS 2022 (push) Waiting to run
CI / Windows native, fuzz, VS 2022 (push) Waiting to run
CI / Linux->Windows cross, no tests (push) Waiting to run
CI / Windows, test cross-built (push) Blocked by required conditions
CI / ASan + LSan + UBSan + integer, no depends, USDT (push) Waiting to run
7a93544cdc doc: Fix and clarify description of ZMQ message format (Jiri Jakes)

Pull request description:

  This change stresses that all ZMQ messages share the same structure and that they differ only in the format of the bodies. Previously this was not clear.

  Further it removes the notion of endianness of 32-byte hashes, as it was misleading, and replaces it with the term 'reversed byte order' (as opposed to natural or normal byte order produced by hashing functions).

  Additionally, it states that ZMQ 32-byte hashes are in the same format as in RPC. Previously it incorrectly stated that the two were in different formats.

  [Rendered](https://github.com/jirijakes/bitcoin/blob/zmq-doc/doc/zmq.md).

  Fixes #31856.

ACKs for top commit:
  w0xlt:
    Code review ACK 7a93544cdc
  achow101:
    ACK 7a93544cdc
  ryanofsky:
    Code review ACK 7a93544cdc. Nice changes. Documentation seems less repetitive and easier to understand now

Tree-SHA512: 8c5ab047c5fd9b5b6910d691b725886d7743dfd01510735b46e43d01c2d0d25ec52d79d71ec75dbeb142e96a88ad503d69ee14b971e3cdaeb8fd85e5292a8c21
2025-04-16 16:26:18 -07:00
Ava Chow
b6282dbd45
Merge bitcoin/bitcoin#32079: test: Add test coverage for rpcwhitelistdefault when unset
2929da1dd5 test: Add coverage for rpcwhitelistdefault when unset (naiyoma)
535b874707 test: Combine rpcwhitelistdefault functions (naiyoma)
2b6ce9254d test: Update permissions and string formatting (naiyoma)

Pull request description:

  This is a follow-up PR to address review feedback from [https://github.com/bitcoin/bitcoin/pull/29858](https://github.com/bitcoin/bitcoin/pull/29858)

  - [x]  add  case where rpcwhitelistdefault setting is [unset](https://github.com/bitcoin/bitcoin/pull/29858#pullrequestreview-2532726241)
  - [x] Code [cleanup](https://github.com/bitcoin/bitcoin/pull/29858#discussion_r1927238617) , change password and f-string formatting
  - [x] [Combine](https://github.com/bitcoin/bitcoin/pull/29858#discussion_r1930137601) rpcwhitelistdefault tests into `test_rpcwhitelistdefault_permissions`
  I am not sure if my approach of adding` test_rpcwhitelistdefault_unset` is better or if I should just include the assertions in the existing `test_rpcwhitelistdefault_permissions`

ACKs for top commit:
  w0xlt:
    Code review ACK 2929da1dd5
  achow101:
    ACK 2929da1dd5
  ryanofsky:
    Code review ACK 2929da1dd5. Only change since last review was simplifying the last commit as suggested

Tree-SHA512: 6750dd3e6abaca3a09ad1fd5d07c64767bc59188ff953cbc26aa7796071774cb92745ac82cf91e479632d682fd450bc00d53032454b65b22654a3e770ec68e89
2025-04-16 16:24:01 -07:00
Ava Chow
eb6b1003c1
Merge bitcoin/bitcoin#32286: test: Handle empty string returned by CLI as None in RPC tests
a4041c77f0 test: Handle empty string returned by CLI as None in RPC tests (Brandon Odiwuor)

Pull request description:

  Partially Fixes https://github.com/bitcoin/bitcoin/issues/32264

  Some tests are failing when `bitcoin-cli` returns an empty string. This change treats an empty response as `None`. See https://github.com/bitcoin/bitcoin/issues/32264#issuecomment-2807616694

  This fixes the error for:
  - feature_bip68_sequence.py
  - feature_nulldummy.py
  - feature_signet.py
  - mining_mainnet.py
  - rpc_scanblocks.py
  - rpc_scantxoutset.py
  - wallet_descriptor.py --descriptors

ACKs for top commit:
  maflcko:
    lgtm ACK a4041c77f0
  achow101:
    ACK a4041c77f0
  pablomartin4btc:
    ACK a4041c77f0
  mzumsande:
    ACK a4041c77f0

Tree-SHA512: 2f1a416a18e0b3eebdb014c2e2e8dadf1d46b15c231cb61f577d47f5e551994ab0e2aeb7c179c01be7c1f07ebc03476236d29cf2d04c358ffb1fae985aa385c9
2025-04-16 16:17:35 -07:00
Ava Chow
33df4aebae
Merge bitcoin/bitcoin#31551: [IBD] batch block reads/writes during AutoFile serialization
8d801e3efb optimization: bulk serialization writes in `WriteBlockUndo` and `WriteBlock` (Lőrinc)
520965e293 optimization: bulk serialization reads in `UndoRead`, `ReadBlock` (Lőrinc)
056cb3c0d2 refactor: clear up blockstorage/streams in preparation for optimization (Lőrinc)
67fcc64802 log: unify error messages for (read/write)[undo]block (Lőrinc)
a4de160492 scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant (Lőrinc)
6640dd52c9 Narrow scope of undofile write to avoid possible resource management issue (Lőrinc)
3197155f91 refactor: collect block read operations into try block (Lőrinc)
c77e3107b8 refactor: rename leftover WriteBlockBench (Lőrinc)

Pull request description:

  This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043)

  ### Summary
  We can serialize the blocks and undos to any `Stream` which implements the appropriate read/write methods.
  `AutoFile` is one of these, writing the results "directly" to disk (through the OS file cache). Batching these in memory first and reading/writing these to disk is measurably faster (likely because of fewer native fread calls or less locking, as [observed](https://github.com/bitcoin/bitcoin/pull/28226#issuecomment-1666842501) by Martinus in a similar change).

  ### Unlocking new optimization opportunities

  Buffered writes will also enable batched obfuscation calculations (implemented in https://github.com/bitcoin/bitcoin/pull/31144) - especially since currently we need to copy the write input's std::span to do the obfuscation on it, and batching enables doing the operations on the internal buffer directly.

  ### Measurements (micro benchmarks, full IBDs and reindexes)

  Microbenchmarks for `[Read|Write]BlockBench` show a ~**30%**/**168%** speedup with `macOS/Clang`, and ~**19%**/**24%** with `Linux/GCC` (the follow-up XOR batching improves these further):

  <details>
  <summary>macOS Sequoia - Clang 19.1.7</summary>

  > Before:

  |               ns/op |                op/s |    err% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------:|:----------
  |        2,271,441.67 |              440.25 |    0.1% |     11.00 | `ReadBlockBench`
  |        5,149,564.31 |              194.19 |    0.8% |     10.95 | `WriteBlockBench`

  > After:

  |               ns/op |                op/s |    err% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------:|:----------
  |        1,738,683.04 |              575.15 |    0.2% |     11.04 | `ReadBlockBench`
  |        3,052,658.88 |              327.58 |    1.0% |     10.91 | `WriteBlockBench`

  </details>

  <details>
  <summary>Ubuntu 24 - GNU 13.3.0</summary>

  > Before:

  |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |        6,895,987.11 |              145.01 |    0.0% |   71,055,269.86 |   23,977,374.37 |  2.963 |   5,074,828.78 |    0.4% |     22.00 | `ReadBlockBench`
  |        5,152,973.58 |              194.06 |    2.2% |   19,350,886.41 |    8,784,539.75 |  2.203 |   3,079,335.21 |    0.4% |     23.18 | `WriteBlockBench`

  > After:

  |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |        5,771,882.71 |              173.25 |    0.0% |   65,741,889.82 |   20,453,232.33 |  3.214 |   3,971,321.75 |    0.3% |     22.01 | `ReadBlockBench`
  |        4,145,681.13 |              241.21 |    4.0% |   15,337,596.85 |    5,732,186.47 |  2.676 |   2,239,662.64 |    0.1% |     23.94 | `WriteBlockBench`

  </details>

  2 full IBD runs against master (compiled with GCC where the gains seem more modest) for **888888** blocks (seeded from real nodes) indicates a ~**7%** total speedup.

  <details>
  <summary>Details</summary>

  ```bash
  COMMITS="d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a 652b4e3de5c5e09fb812abe265f4a8946fa96b54"; \
  STOP_HEIGHT=888888; DBCACHE=1000; \
  C_COMPILER=gcc; CXX_COMPILER=g++; \
  BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
  (for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
  hyperfine \
    --sort 'command' \
    --runs 2 \
    --export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$C_COMPILER.json" \
    --parameter-list COMMIT ${COMMITS// /,} \
    --prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
      cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF -DCMAKE_C_COMPILER=$C_COMPILER -DCMAKE_CXX_COMPILER=$CXX_COMPILER && \
      cmake --build build -j$(nproc) --target bitcoind && \
      ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
    --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
    "COMPILER=$C_COMPILER COMMIT=${COMMIT:0:10} ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
  d2b72b1369 refactor: rename leftover WriteBlockBench
  652b4e3de5 optimization: Bulk serialization writes in `WriteBlockUndo` and `WriteBlock`
  Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b1369)
    Time (mean ± σ):     41528.104 s ± 354.003 s    [User: 44324.407 s, System: 3074.829 s]
    Range (min … max):   41277.786 s … 41778.421 s    2 runs

  Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5)
    Time (mean ± σ):     38771.457 s ± 441.941 s    [User: 41930.651 s, System: 3222.664 s]
    Range (min … max):   38458.957 s … 39083.957 s    2 runs

  Relative speed comparison
          1.07 ±  0.02  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b1369)
          1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5)
  ```

  </details>

ACKs for top commit:
  maflcko:
    re-ACK 8d801e3efb 🐦
  achow101:
    ACK 8d801e3efb
  ryanofsky:
    Code review ACK 8d801e3efb. Most notable change is switching from BufferedReader to ReadRawBlock for block reads, which makes sense, and there are also various cleanups in blockstorage and test code.
  hodlinator:
    re-ACK 8d801e3efb

Tree-SHA512: 24e1dee653b927b760c0ba3c69d1aba15fa5d9c4536ad11cfc2d70196ae16b9228ecc3056eef70923364257d72dc929882e73e69c6c426e28139d31299d08adc
2025-04-16 15:16:22 -07:00
Ava Chow
679bb2aac2
Merge bitcoin/bitcoin#31958: rpc: add cli examples, update docs
Some checks are pending
CI / test each commit (push) Waiting to run
CI / macOS 14 native, arm64, no depends, sqlite only, gui (push) Waiting to run
CI / macOS 14 native, arm64, fuzz (push) Waiting to run
CI / Windows native, VS 2022 (push) Waiting to run
CI / Windows native, fuzz, VS 2022 (push) Waiting to run
CI / Linux->Windows cross, no tests (push) Waiting to run
CI / Windows, test cross-built (push) Blocked by required conditions
CI / ASan + LSan + UBSan + integer, no depends, USDT (push) Waiting to run
32dcec269b rpc: update RPC help of `createpsbt` (rkrux)
931117a46f rpc: update the doc for `data` field in `outputs` argument (rkrux)
8134a6b5d4 rpc: add cli example for `walletcreatefundedpsbt` RPC (rkrux)

Pull request description:

  ### add cli example for `walletcreatefundedpsbt` and `createpsbt` RPCs
  The only example present earlier was one that creates an OP_RETURN output. This
      lack of examples has discouraged me earlier to use this RPC. Adding an example
      that creates PSBT sending bitcoin to address, a scenario that is much more common.

  ### rpc: update the doc for `data` field in `outputs` argument
  It was not evident to me that this field creates an `OP_RETURN` output until
      I read the code and tried it out. Thus, making the doc explicitly mention it.
  This affects docs of the following RPCs:
  `bumpfee`, `psbtbumpfee`, `send`, `walletcreatefundedpsbt`, `createpsbt`,
  and `createrawtransaction`

ACKs for top commit:
  sipa:
    utACK 32dcec269b
  1440000bytes:
    utACK 32dcec269b
  achow101:
    ACK 32dcec269b
  ryanofsky:
    Concept ACK 32dcec269b. These seem like helpful clarifications, but I did not look into the details

Tree-SHA512: f994488ba7d52d00960fc52064bb419cf548e29822fe23d6ee0452fdf514dd93f089145eddb32b8086a7918cf8cf33a4c3f16bfcb7948f3c9d5afd95e8d3a1cb
2025-04-16 13:13:20 -07:00
Ava Chow
dfa2813e31
Merge bitcoin/bitcoin#32248: Remove support for RNDR/RNDRRS for aarch64
7749d929a0 Remove support for RNDR/RNDRRS for aarch64 on Linux (laanwj)

Pull request description:

  This hardware feature is

  - Rarely supported on SoCs (and broken on like half of the chips that support it in the first place) (#31817). It is not clear if, or how, the brokenness will be worked around in the kernel, but working around it in user space seems the wrong thing to do, this is not the place to maintain special workarounds for specific hardware (which despite that, was attempted in #31826, but had to be reverted in #31908 due to other problems).
  - Apparently not compiled into the release binary anymore (https://github.com/bitcoin/bitcoin/issues/31817#issuecomment-2795885962). Did check this at the time, but a build system change must have caused this, and went undetected.
  - Hard to test in CI (as well as manually), due to unavailability of hardware.

  Better to remove it.

  This reverts commit aee5404e02 from #26839.

  Closes #31817.

ACKs for top commit:
  sipa:
    utACK 7749d929a0
  davidgumberg:
    utACK 7749d929a0
  achow101:
    ACK 7749d929a0
  w0xlt:
    utACK 7749d929a0

Tree-SHA512: d243ad7f745fb46f711f24b6983d9ea1d94e5d8ee60959229bafdba5caa210a60801a1c2cb5b558a0e72f365371b32285aee9a8d0cd24a60589adc7b03dd6a44
2025-04-16 11:57:11 -07:00
pablomartin4btc
7912cd4125 bench: Fix WalletMigration benchmark
The keys and scripts created for the Legacy Wallet
needed to be persisted in order for the migration to work
properly.
2025-04-16 12:33:46 -03:00
Brandon Odiwuor
a4041c77f0 test: Handle empty string returned by CLI as None in RPC tests 2025-04-16 14:18:48 +03:00
merge-script
cdc32994fe
Merge bitcoin/bitcoin#32272: [doc] archive 29.0 release notes
Some checks failed
CI / test each commit (push) Has been cancelled
CI / macOS 14 native, arm64, no depends, sqlite only, gui (push) Has been cancelled
CI / macOS 14 native, arm64, fuzz (push) Has been cancelled
CI / Windows native, VS 2022 (push) Has been cancelled
CI / Windows native, fuzz, VS 2022 (push) Has been cancelled
CI / Linux->Windows cross, no tests (push) Has been cancelled
CI / ASan + LSan + UBSan + integer, no depends, USDT (push) Has been cancelled
CI / Windows, test cross-built (push) Has been cancelled
12dc507c4a [doc] archive 29.0 release notes (glozow)

Pull request description:

ACKs for top commit:
  achow101:
    ACK 12dc507c4a
  janb84:
    ACK [12dc507](12dc507c4a)

Tree-SHA512: 61aa6c1892d70c53ad4600d17dd8bc23bd181cff48aaa432272ce5dfb4f9b24e6f558629513ba731060b2748877f50b2731c3dc7f3fc437a76f9922a0d26c7cc
2025-04-15 11:07:52 +01:00
glozow
12dc507c4a [doc] archive 29.0 release notes 2025-04-14 21:02:18 -04:00
Ava Chow
99a4ddf5ab
Merge bitcoin/bitcoin#31785: Have createNewBlock() wait for tip, make rpc handle shutdown during long poll and wait methods
Some checks are pending
CI / test each commit (push) Waiting to run
CI / macOS 14 native, arm64, no depends, sqlite only, gui (push) Waiting to run
CI / macOS 14 native, arm64, fuzz (push) Waiting to run
CI / Windows native, VS 2022 (push) Waiting to run
CI / Windows native, fuzz, VS 2022 (push) Waiting to run
CI / Linux->Windows cross, no tests (push) Waiting to run
CI / Windows, test cross-built (push) Blocked by required conditions
CI / ASan + LSan + UBSan + integer, no depends, USDT (push) Waiting to run
05117e6e17 rpc: clarify longpoll behavior (Sjors Provoost)
5315278e7c Have createNewBlock() wait for a tip (Sjors Provoost)
64a2795fd4 rpc: handle shutdown during long poll and wait methods (Sjors Provoost)
a3bf43343f rpc: drop unneeded IsRPCRunning() guards (Sjors Provoost)
f9cf8bd0ab Handle negative timeout for waitTipChanged() (Sjors Provoost)

Pull request description:

  This PR prevents Mining interface methods from sometimes crashing when called during startup before a tip is connected. It also makes other improvements like making more RPC methods usable from the GUI. Specifically this PR:

  - Adds an `Assume` check to disallow passing negative timeout values to `Mining::waitTipChanged`
  - Makes `waitfornewblock`, `waitforblock` and `waitforblockheight` RPC methods usable from the GUI when `-server=1` is not set.
  - Changes `Mining::waitTipChanged` to return `optional<BlockRef>` instead of `BlockRef` and return `nullopt` instead of crashing if there is a timeout or if the node is shut down before a tip is connected.
  - Changes `Mining::waitTipChanged` to not time out before a tip is connected, so it is convenient and safe to call during startup, and only returns `nullopt` on early shutdowns.
  - Changes `Mining::createNewBlock` to block and wait for a tip to be connected if it is called on startup instead of crashing. Also documents that it will return null on early shutdowns.

  This allows `waitNext()` (added in https://github.com/bitcoin/bitcoin/pull/31283) to safely assume `TipBlock()` isn't `null`, not even during a scenario of early shutdown.

  Finally this PR clarifies long poll behaviour, mostly by adding code comments, but also through an early `break`.

ACKs for top commit:
  achow101:
    ACK 05117e6e17
  ryanofsky:
    Code review ACK 05117e6e17, just updated a commit message since last review
  TheCharlatan:
    ACK 05117e6e17
  vasild:
    ACK 05117e6e17

Tree-SHA512: 277c285a6e73dfff88fd379298190b264254996f98b93c91c062986ab35c2aa5e1fbfec4cd71d7b29dc2d68e33f252b5cfc501345f54939d6bd78599b71fec04
2025-04-14 14:39:57 -07:00
Ava Chow
22770ce8cb
Merge bitcoin/bitcoin#31282: refactor: Make node_id a const& in RemoveBlockRequest
fa21f83d29 ci: Use G++ in valgrind tasks (MarcoFalke)
fabd05bf65 refactor: Fix net_processing iwyu includes (MarcoFalke)
fa1622db20 refactor: Make node_id a const& in RemoveBlockRequest (MarcoFalke)

Pull request description:

  Currently, `valgrind` is not usable on a default build with GCC. Specifically, `p2p_compactblocks.py --valgrind` gives a false-positive in `RemoveBlockRequest` when comparing `node_id` with `from_peer`. According to the upstream bug report, this happens because both symbols are on the stack and the compiler can more aggressively optimize the compare (order). See https://bugs.kde.org/show_bug.cgi?id=472329#c7

  It is possible to work around this bug by pulling at least one value from the stack. For example, by making `from_peer` a `const` reference. Alternatively, by replacing `auto [node_id, list_it]` with `const auto& [node_id, list_it]`, which is done here.

  I think this workaround is acceptable, because it does not look like valgrind can trivially fix this. The alternative would be to add a (temporary?) suppression.

  Fixes https://github.com/bitcoin/bitcoin/issues/27741

  Also, fix iwyu includes, while touching this module.

  Also, switch the CI valgrind scripts to use G++.

ACKs for top commit:
  achow101:
    ACK fa21f83d29
  TheCharlatan:
    ACK fa21f83d29
  darosior:
    utACK fa21f83d29
  ryanofsky:
    Code review ACK fa21f83d29. Code changes all look good but I'm a little confused about purpose of the third commit, so left a question about that

Tree-SHA512: 7b92cdafd525a5ac53ae2c1a7a92e599bc9b5fd5d315a694b493cd5079ac323d884393b57aa18581b7789247a588c9a27d47698de25b340bc76fc9f1dd1850b4
2025-04-14 14:22:56 -07:00
Lőrinc
8d801e3efb optimization: bulk serialization writes in WriteBlockUndo and WriteBlock
Similarly to the serialization reads optimization, buffered writes will enable batched XOR calculations.
This is especially beneficial since the current implementation requires copying the write input's `std::span` to perform obfuscation.
Batching allows us to apply XOR operations on the internal buffer instead, reducing unnecessary data copying and improving performance.

------

> macOS Sequoia 15.3.1
> C++ compiler .......................... Clang 19.1.7
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=10000

Before:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        5,149,564.31 |              194.19 |    0.8% |     10.95 | `WriteBlockBench`

After:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        2,990,564.63 |              334.39 |    1.5% |     11.27 | `WriteBlockBench`

------

> Ubuntu 24.04.2 LTS
> C++ compiler .......................... GNU 13.3.0
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=20000

Before:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        5,152,973.58 |              194.06 |    2.2% |   19,350,886.41 |    8,784,539.75 |  2.203 |   3,079,335.21 |    0.4% |     23.18 | `WriteBlockBench`

After:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        4,145,681.13 |              241.21 |    4.0% |   15,337,596.85 |    5,732,186.47 |  2.676 |   2,239,662.64 |    0.1% |     23.94 | `WriteBlockBench`

Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
2025-04-14 12:04:06 +02:00
Lőrinc
520965e293 optimization: bulk serialization reads in UndoRead, ReadBlock
The obfuscation (XOR) operations are currently done byte-by-byte during serialization. Buffering the reads will enable batching the obfuscation operations later.

Different operating systems handle file caching differently, so reading larger batches (and processing them from memory) is measurably faster, likely because of fewer native fread calls and reduced lock contention.

Note that `ReadRawBlock` doesn't need buffering since it already reads the whole block directly.
Unlike `ReadBlockUndo`, the new `ReadBlock` implementation delegates to `ReadRawBlock`, which uses more memory than a buffered alternative but results in slightly simpler code and a small performance increase (~0.4%). This approach also clearly documents that `ReadRawBlock` is a logical subset of `ReadBlock` functionality.

The current implementation, which iterates over a fixed-size buffer, provides a more general alternative to Cory Fields' solution of reading the entire block size in advance.

Buffer sizes were selected based on benchmarking to ensure the buffered reader produces performance similar to reading the whole block into memory. Smaller buffers were slower, while larger ones showed diminishing returns.

------

> macOS Sequoia 15.3.1
> C++ compiler .......................... Clang 19.1.7
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=10000

Before:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        2,271,441.67 |              440.25 |    0.1% |     11.00 | `ReadBlockBench`

After:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        1,738,971.29 |              575.05 |    0.2% |     10.97 | `ReadBlockBench`

------

> Ubuntu 24.04.2 LTS
> C++ compiler .......................... GNU 13.3.0
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=20000

Before:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        6,895,987.11 |              145.01 |    0.0% |   71,055,269.86 |   23,977,374.37 |  2.963 |   5,074,828.78 |    0.4% |     22.00 | `ReadBlockBench`

After:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        5,771,882.71 |              173.25 |    0.0% |   65,741,889.82 |   20,453,232.33 |  3.214 |   3,971,321.75 |    0.3% |     22.01 | `ReadBlockBench`

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
2025-04-14 12:04:06 +02:00
merge-script
5116655980
Merge bitcoin/bitcoin#32255: miniscript: Correct off-by-one assert guards (#31727 follow-up)
Some checks are pending
CI / test each commit (push) Waiting to run
CI / macOS 14 native, arm64, no depends, sqlite only, gui (push) Waiting to run
CI / macOS 14 native, arm64, fuzz (push) Waiting to run
CI / Windows native, VS 2022 (push) Waiting to run
CI / Windows native, fuzz, VS 2022 (push) Waiting to run
CI / Linux->Windows cross, no tests (push) Waiting to run
CI / Windows, test cross-built (push) Blocked by required conditions
CI / ASan + LSan + UBSan + integer, no depends, USDT (push) Waiting to run
3693e4d6ee miniscript: Correct off-by-one assert guards (Hodlinator)

Pull request description:

  First instances discovered by darosior in https://github.com/bitcoin/bitcoin/pull/31727#issuecomment-2619342125.

ACKs for top commit:
  maflcko:
    lgtm ACK 3693e4d6ee
  sipa:
    ACK 3693e4d6ee
  l0rinc:
    Tested ACK 3693e4d6ee

Tree-SHA512: a41302bb9349d5ad2daf89431c67fdbe80f91690e1759d01529acd1b2fa5d99bad3383da684ee62c00584f7696cd3f87efff084c74edf52eb7cd88d60ee99829
2025-04-14 10:59:07 +01:00
Lőrinc
056cb3c0d2 refactor: clear up blockstorage/streams in preparation for optimization
Made every OpenBlockFile#fReadOnly value explicit.

Replaced hard-coded values in ReadRawBlock with STORAGE_HEADER_BYTES.
Changed `STORAGE_HEADER_BYTES` and `UNDO_DATA_DISK_OVERHEAD` to `uint32_t` to avoid casts.

Also added `LIFETIMEBOUND` to the `AutoFile` parameter of `BufferedFile`, which stores a reference to the underlying `AutoFile`, allowing Clang to emit warnings if the referenced `AutoFile` might be destroyed while `BufferedFile` still exists.
Without this attribute, code with lifetime violations wouldn't trigger compiler warnings.

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
2025-04-14 11:57:14 +02:00
Lőrinc
67fcc64802 log: unify error messages for (read/write)[undo]block
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
2025-04-13 23:44:46 +02:00
Lőrinc
a4de160492 scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant
Renames the constant to be less verbose and better reflect its purpose:
it represents the size of the storage header that precedes serialized block data on disk,
not to be confused with a block's own header.

-BEGIN VERIFY SCRIPT-
git grep -q "STORAGE_HEADER_BYTES" $(git ls-files) && echo "Error: Target name STORAGE_HEADER_BYTES already exists in the codebase" && exit 1
sed -i 's/BLOCK_SERIALIZATION_HEADER_SIZE/STORAGE_HEADER_BYTES/g' $(git grep -l 'BLOCK_SERIALIZATION_HEADER_SIZE')
-END VERIFY SCRIPT-
2025-04-13 23:44:46 +02:00
Lőrinc
6640dd52c9 Narrow scope of undofile write to avoid possible resource management issue
`AutoFile{OpenUndoFile(pos)}` was still in scope when `FlushUndoFile(pos.nFile)` was called, which could lead to file handle conflicts or other unexpected behavior.

Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
2025-04-13 23:44:46 +02:00
Lőrinc
3197155f91 refactor: collect block read operations into try block
Reorganized error handling in block-related operations by grouping related operations together within the same scope.

In `ReadBlockUndo()` and `ReadBlock()`, moved all deserialization operations, comments and checksum verification inside a single try/catch block for cleaner error handling.
In `WriteBlockUndo()`, consolidated hash calculation and data writing operations within a common block to better express their logical relationship.
2025-04-13 23:44:44 +02:00
Hodlinator
3693e4d6ee
miniscript: Correct off-by-one assert guards
Co-authored-by: Antoine Poinsot <darosior@protonmail.com>
2025-04-12 09:46:56 +02:00
merge-script
817edfb21e
Merge bitcoin/bitcoin#32245: doc: Updates how to reproduce fuzz CI failure locally
Some checks failed
CI / test each commit (push) Has been cancelled
CI / macOS 14 native, arm64, no depends, sqlite only, gui (push) Has been cancelled
CI / macOS 14 native, arm64, fuzz (push) Has been cancelled
CI / Windows native, VS 2022 (push) Has been cancelled
CI / Windows native, fuzz, VS 2022 (push) Has been cancelled
CI / Linux->Windows cross, no tests (push) Has been cancelled
CI / ASan + LSan + UBSan + integer, no depends, USDT (push) Has been cancelled
CI / Windows, test cross-built (push) Has been cancelled
8fe001d597 doc: Updates how to reproduce fuzz CI failure locally (Sergi Delgado Segura)

Pull request description:

  The current version of the doc does not explain how to reproduce a recent fuzzing CI failure (not yet part of the corpora). Add instructions on how to manually create a crash file based on a report.

ACKs for top commit:
  maflcko:
    lgtm ACK 8fe001d597
  glozow:
    ACK 8fe001d597

Tree-SHA512: 7436d71a30bbbffc34770027f1deeacca2de528d8d1b333431d6070c2ba779ecfcdaf25dc791d2154ba4dd37824d06aed2695a8412d7ca1f29e5bd1796d42aeb
2025-04-11 16:34:41 -04:00
merge-script
b2bb27f40c
Merge bitcoin/bitcoin#31741: multiprocess: Add libmultiprocess git subtree
Some checks are pending
CI / test each commit (push) Waiting to run
CI / macOS 14 native, arm64, no depends, sqlite only, gui (push) Waiting to run
CI / macOS 14 native, arm64, fuzz (push) Waiting to run
CI / Windows native, VS 2022 (push) Waiting to run
CI / Windows native, fuzz, VS 2022 (push) Waiting to run
CI / Linux->Windows cross, no tests (push) Waiting to run
CI / Windows, test cross-built (push) Blocked by required conditions
CI / ASan + LSan + UBSan + integer, no depends, USDT (push) Waiting to run
babb9f5db6 depends: remove non-native libmultiprocess build (Cory Fields)
5d105fb8c3 depends: Switch libmultiprocess packages to use local git subtree (Ryan Ofsky)
9b35518d2f depends, moveonly: split up int_get_build_id function (Ryan Ofsky)
2d373e2707 lint: Add exclusions for libmultiprocess subtree (Ryan Ofsky)
e88ab394c1 doc: Update documentation to explain libmultiprocess subtree (Ryan Ofsky)
d4bc563982 cmake: Fix clang-tidy "no input files" errors (Ryan Ofsky)
abdf3cb645 cmake: Fix warnings from boost headers (Ryan Ofsky)
8532fcb1c3 cmake: Fix ctest mptest "Unable to find executable" errors (Ryan Ofsky)
d597ab1dee cmake: Support building with libmultiprocess subtree (Ryan Ofsky)
69f0d4adb7 scripted-diff: s/WITH_MULTIPROCESS/ENABLE_IPC/ in cmake (Ryan Ofsky)
a2f28e4be9 Squashed 'src/ipc/libmultiprocess/' content from commit 35944ffd23fa (Ryan Ofsky)
d6244f85c5 depends: Update libmultiprocess library to simplify cmake subtree build (Ryan Ofsky)

Pull request description:

  This adds the [libmultiprocess](https://github.com/chaincodelabs/libmultiprocess) library and code generator as a subtree in `src/ipc/libmultiprocess` and allows it to be built with the cmake `-DENABLE_IPC` option, which is disabled by default.

  This PR does not entirely remove the depends system [libmultiprocess package](https://github.com/bitcoin/bitcoin/blob/master/depends/packages/native_libmultiprocess.mk) because the package is useful when cross compiling. (A cross-compiling cmake build cannot easily build and run a native code generation tool.) However, it does update the depends package to build from the new git subtree, instead of being downloaded separately from github, so the same sources are used to build both the runtime library and the code generator.

  This PR includes the following manual changes (not created automatically with `git subtree add`) which just update the build system and documentation:

  - [`d6244f85c509` depends: Update libmultiprocess library to simplify cmake subtree build](d6244f85c5)
  - [`69f0d4adb72c` scripted-diff: s/WITH_MULTIPROCESS/ENABLE_IPC/ in cmake](69f0d4adb7)
  - [`d597ab1dee6b` cmake: Support building with libmultiprocess subtree](d597ab1dee)
  - [`8532fcb1c30d` cmake: Fix ctest mptest "Unable to find executable" errors](8532fcb1c3)
  - [`abdf3cb6456f` cmake: Fix warnings from boost headers](abdf3cb645)
  - [`d4bc5639829f` cmake: Fix clang-tidy "no input files" errors](d4bc563982)
  - [`e88ab394c163` doc: Update documentation to explain libmultiprocess subtree](e88ab394c1)
  - [`2d373e27071f` lint: Add exclusions for libmultiprocess subtree](2d373e2707)
  - [`9b35518d2f3f` depends, moveonly: split up int_get_build_id function](9b35518d2f)
  - [`5d105fb8c3ff` depends: Switch libmultiprocess packages to use local git subtree](5d105fb8c3)
  - [`babb9f5db641` depends: remove non-native libmultiprocess build](babb9f5db6)

  ---

  Previous minisketch subtree PR #23114 may be useful for comparison

  Instructions for subtree verification can be found:

  - https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#subtrees
  - https://github.com/bitcoin/bitcoin/tree/master/test/lint#git-subtree-checksh

  TL&DR:

  ```sh
  git remote add --fetch libmultiprocess https://github.com/chaincodelabs/libmultiprocess.git
  test/lint/git-subtree-check.sh -r src/ipc/libmultiprocess
  ```

  ---

  This PR is part of the [process separation project](https://github.com/bitcoin/bitcoin/issues/28722).

ACKs for top commit:
  Sjors:
    re-ACK babb9f5db6
  TheCharlatan:
    tACK babb9f5db6
  vasild:
    ACK babb9f5db6

Tree-SHA512: 43d4eecca5aab63e55c613de935965666eaced327f9fe859a0e9c9b85f7685dc16c5c8d6e03e09ca998628c5d468633f4f743529930b037049abe8e0101e0143
2025-04-11 13:40:31 +01:00
laanwj
7749d929a0 Remove support for RNDR/RNDRRS for aarch64 on Linux
This hardware feature is

- rarely supported on SoCs (and broken on like half of the chips that support it in the first place) (#31817)
- apparently not compiled into the release binary (https://github.com/bitcoin/bitcoin/issues/31817#issuecomment-2795885962)
- hard to test in CI, due to unavailable of hardware

Better to remove it.

This reverts commit aee5404e02.

Closes #31817.
2025-04-11 08:11:41 +02:00
merge-script
a4fd565191
Merge bitcoin/bitcoin#31727: miniscript: convert non-critical asserts to CHECK_NONFATAL
Some checks are pending
CI / test each commit (push) Waiting to run
CI / macOS 14 native, arm64, no depends, sqlite only, gui (push) Waiting to run
CI / macOS 14 native, arm64, fuzz (push) Waiting to run
CI / Windows native, VS 2022 (push) Waiting to run
CI / Windows native, fuzz, VS 2022 (push) Waiting to run
CI / Linux->Windows cross, no tests (push) Waiting to run
CI / Windows, test cross-built (push) Blocked by required conditions
CI / ASan + LSan + UBSan + integer, no depends, USDT (push) Waiting to run
ff0194a7ce miniscript: convert non-critical asserts to CHECK_NONFATAL (Antoine Poinsot)

Pull request description:

  The Miniscript code contains assertions to prevent ending up in an insane state or prevent UB, but also to enforce logical invariants. For the latter it is not necessary to crash the program if they are broken. Raising an exception suffices, especially as this code is often called through the RPC interface which can in turn handle the exception and the user can report it to developers.

  This revives #28678 from Pieter Wuille.

ACKs for top commit:
  hodlinator:
    ACK ff0194a7ce
  TheCharlatan:
    ACK ff0194a7ce
  brunoerg:
    code review ACK ff0194a7ce

Tree-SHA512: 8ed8f7b494e46ecf7cdebe75120cd0ffe543b6bc289bf882dac631fe2ec2cae590d5f7bc2316e52db085791694b136dffbc71c40c1e16886fa53ab00bd8cabd0
2025-04-10 14:06:21 -04:00
merge-script
e364e6b509
Merge bitcoin/bitcoin#32176: net: Prevent accidental circuit sharing when using Tor stream isolation
Some checks are pending
CI / Windows, test cross-built (push) Blocked by required conditions
CI / ASan + LSan + UBSan + integer, no depends, USDT (push) Waiting to run
CI / test each commit (push) Waiting to run
CI / macOS 14 native, arm64, no depends, sqlite only, gui (push) Waiting to run
CI / macOS 14 native, arm64, fuzz (push) Waiting to run
CI / Windows native, VS 2022 (push) Waiting to run
CI / Windows native, fuzz, VS 2022 (push) Waiting to run
CI / Linux->Windows cross, no tests (push) Waiting to run
ec81a72b36 net: Add randomized prefix to Tor stream isolation credentials (laanwj)
c47f81e8ac net: Rename `_randomize_credentials` Proxy parameter to `tor_stream_isolation` (laanwj)

Pull request description:

  Add a class TorsStreamIsolationCredentialsGenerator that generates unique credentials based on a randomly generated session prefix and an atomic counter. Use this in `ConnectThroughProxy` instead of a simple atomic int counter.

  This makes sure that different launches of the application won't share the same credentials, and thus circuits, even in edge cases.

  Example with `-debug=proxy`:
  ```
  2025-03-31T16:30:27Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-0:0afb2da441f5c105-0
  2025-03-31T16:30:31Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-1:0afb2da441f5c105-1
  ```

  Thanks to hodlinator in https://github.com/bitcoin/bitcoin/pull/32166#discussion_r2020973352 for the idea.

ACKs for top commit:
  hodlinator:
    re-ACK ec81a72b36
  jonatack:
    ACK ec81a72b36
  danielabrozzoni:
    tACK ec81a72b36

Tree-SHA512: 195f5885fade77545977b91bdc41394234ae575679cb61631341df443fd8482cd74650104e323c7dbfff7826b10ad61692cca1284d6810f84500a3488f46597a
2025-04-10 12:42:34 -04:00