0487f91a20 test: Fix intermittent failure in rpc_net.py --v2transport (stratospher)
Pull request description:
Fixes#29508.
Make sure that v2 handshake is complete before comparing getpeerinfo outputs so that `transport_protocol_type` isn't stuck at 'detecting'.
This is done by adding a wait_until statement till `transport_protocol_type = v2` so that bitcoind waits until the v2 handshake is complete. (on the python side, this is ensured by default since `wait_for_handshake = True` inside `add_p2p_connection()`)
ACKs for top commit:
Sjors:
ACK 0487f91a20
mzumsande:
Code Review ACK 0487f91a20
achow101:
ACK 0487f91a20
vasild:
ACK 0487f91a20
Tree-SHA512: 44dd646a61cd38da243f527df7321e22d1821c2b090be43673027746098caf450c6671708ed731ba257952df6b5886e64c9c2f9686a82f6ef0f25780b7a87d3d
Make sure that v2 handshake is complete before comparing getpeerinfo
outputs so that `transport_protocol_type` isn't stuck at 'detecting'.
- on the python side, this is ensured by default
`wait_for_handshake = True` inside `add_p2p_connection()`.
- on the c++ side, add a wait_until statement till
`transport_protocol_type = v2` so that v2 handshake is complete.
Co-Authored-By: Martin Zumsande <mzumsande@gmail.com>
d8087adc7e [test] IsBlockMutated unit tests (dergoegge)
1ed2c98297 Add transaction_identifier::size to allow Span conversion (dergoegge)
1ec6bbeb8d [validation] Cache merkle root and witness commitment checks (dergoegge)
5bf4f5ba32 [test] Add regression test for #27608 (dergoegge)
49257c0304 [net processing] Don't process mutated blocks (dergoegge)
2d8495e080 [validation] Merkle root malleation should be caught by IsBlockMutated (dergoegge)
66abce1d98 [validation] Introduce IsBlockMutated (dergoegge)
e7669e1343 [refactor] Cleanup merkle root checks (dergoegge)
95bddb930a [validation] Isolate merkle root checks (dergoegge)
Pull request description:
This PR proposes to check for mutated blocks early as a defense-in-depth mitigation against attacks leveraging mutated blocks.
We introduce `IsBlockMutated` which catches all known forms of block malleation and use it to do an early mutation check whenever we receive a `block` message.
We have observed attacks that abused mutated blocks in the past, which could have been prevented by simply not processing mutated blocks (e.g. https://github.com/bitcoin/bitcoin/pull/27608 for which a regression test is included in this PR).
ACKs for top commit:
achow101:
ACK d8087adc7e
maflcko:
ACK d8087adc7e🏄
fjahr:
Code review ACK d8087adc7e
sr-gi:
Code review ACK d8087adc7e
Tree-SHA512: 618ff4ea7f168e10f07504d3651290efbb1bb2ab3b838ffff3527c028caf6c52dedad18d04d3dbc627977479710930e200f2dfae18a08f627efe7e64a57e535f
bf5662c678 test: enable v2 for python p2p depending on global --v2transport flag (Martin Zumsande)
6e9e39da43 test: Don't use v2transport when it's too slow. (Martin Zumsande)
87549c8f89 test: enable p2p_invalid_messages.py with v2transport (Martin Zumsande)
5fc9db504b test: enable p2p_sendtxrcncl.py with v2transport (Martin Zumsande)
Pull request description:
#24748 added v2 transport to the python `P2PConnection`, but so far each test that wants to make use of it needs to enable it on an individual basis.
This PR changes it so that if the test suite is run with `--v2transport` option, v2 is used in each test by default, not only for connections between two bitcoind instances as before, but also wherever `P2PConnection` is used. Individual tests can override this global option.
To do that, a few tests need to be adjusted.
In addition, I added a commit to always use v1 in a few select subtests that send a large number of large messages (e.g. large reorgs). These tests don't have a fundamental problem with v2 but become very slow due to the unoptimised python ChaCha20 implementation (~30 minutes on my computer, so probably not suitable to be run in the CI).
As a result, `python3 test_runner.py --v2transport` should succeed and use `v2` everywhere (unless v1 is chosen explicitly).
[Edit]: To make the "test each commit" CI pass, several test fixes were squashed into the last commit, which actually enables v2 p2p for `P2PConnection`. I have an unsquashed version at https://github.com/mzumsande/bitcoin/tree/202401_bip324_alltests_unsquashed, in case that helps with review.
ACKs for top commit:
fjahr:
tACK bf5662c678
vasild:
ACK bf5662c678
stratospher:
reACK bf5662c6.
theStack:
Tested ACK bf5662c678
Tree-SHA512: 4f5a08248ba8a755f7d0f48deb2b79bef03292345cacb7deef01be955481093800e4e56ff218ea56734eef5de1fb3ab0f04657447ea27d393bb536539d7b289d
fa3a4102ef fuzz: Set -rss_limit_mb=8000 for generate as well (MarcoFalke)
fa4e396e1d fuzz: Generate with random libFuzzer settings (MarcoFalke)
Pull request description:
Sometimes a libFuzzer setting like `-use_value_profile=1` helps [0], sometimes it hurts [1].
[0] https://github.com/bitcoin/bitcoin/pull/20789#issuecomment-752961937
[1] https://github.com/bitcoin/bitcoin/pull/27888#issuecomment-1645976254
By picking a random value, it is ensured that at least some of the runs will have the beneficial configuration set.
Also, set `-max_total_time` to prevent slow fuzz targets from getting a larger time share, or possibly peg to a single core for a long time and block the python script from exiting for a long time. This can be improved in the future. For example, the python script can exit after some time (https://github.com/bitcoin/bitcoin/pull/20752#discussion_r549248791). Alternatively, it can measure if coverage progress was made and run for less time if no progress has been made recently anyway, so that more time can be spent on targets that are new or still make progress.
ACKs for top commit:
murchandamus:
utACK fa3a4102ef
dergoegge:
utACK fa3a4102ef
brunoerg:
light ACK fa3a4102ef
Tree-SHA512: bfd04a76ca09aec612397bae5f3f263a608faa7087697169bd4c506c8195c4d2dd84ddc7fcd3ebbc75771eab618fad840af819114968ca3668fc730092376768
5f240ab2e8 test: Add option to skip unit tests for the test runner (Martin Zumsande)
Pull request description:
In the python `test_runner`, it's possible to disable specific functional tests (or just enable a few specific ones), but the unit tests for the python test framework cannot be skipped.
Add this option (`--skipunit` or `-u`), it would save some time for devs not interested in running those every time.
ACKs for top commit:
fjahr:
re-ACK 5f240ab2e8
tdb3:
Code review and tested ACK 5f240ab2e8
stratospher:
tested ACK 5f240ab.
Tree-SHA512: f7c9cfefc18a6510e24ca4601309b40fdf4180a4c5fe592be9cf7607be6541784b283c46c8d6e60740ff3eba83025dd5d0db36e55bf8bad1404b38120859e113
fa58ae74ea refactor: Add missing include for USE_BDB, USE_SQLITE to bench/wallet_ismine.cpp (MarcoFalke)
fa31908ea8 lint: Check for missing or redundant bitcoin-config.h includes (MarcoFalke)
fa63b0e351 lint: Make lint error easier to spot in output (MarcoFalke)
fa770fd368 doc: Add missing RUST_BACKTRACE=1 (MarcoFalke)
fa10051267 lint: Add get_subtrees() helper (MarcoFalke)
Pull request description:
Missing `bitcoin-config.h` includes are problematic, because the build could silently pass, but produce an unintended result. For example, a slower fallback algorithm could be picked, even though `bitcoin-config.h` indicates that a faster feature is available and should be used.
As the build succeeds silently, this problem is not possible to detect with iwyu.
Thus, fix this by using a linter based on grepping the source code.
ACKs for top commit:
theuni:
Weak ACK fa58ae74ea.
TheCharlatan:
ACK fa58ae74ea
hebasto:
ACK fa58ae74ea, tested on Ubuntu 23.10 -- it catches bugs properly. I didn't review rust code changes.
Tree-SHA512: cf4346f81ea5b8c215da6004cb2403d1aaf569589613c305d8ba00329b82b3841da94fe1a69815ce15f2edecbef9b031758ec9b6433564976190e3cf91ec8181
8d20602e55 test, assumeutxo: Add test to ensure failure when mempool not empty (Hernan Marino)
Pull request description:
Add a test to ensure that loadtxoutset fails when the node's mempool is not empty, as suggested by maflcko here: https://github.com/bitcoin/bitcoin/pull/27596#discussion_r1344713537
ACKs for top commit:
maflcko:
re-ACK 8d20602e55
BrandonOdiwuor:
ACK 8d20602e55
Tree-SHA512: 97c9668c0f38897934bf0d326515d898d4e682ff219deba9d751b35125b5cf33d51c9df116a74117ecf0394f28995a3d0cae1266b1e5acb4365ff4f309ce3f6c
44d11532f8 test: fix intermittent failure in wallet_reorgrestore.py (Martin Zumsande)
Pull request description:
By adding a missing `sync_blocks` call.
There was a race at `node2` between connecting the block produced by `node0`, and using `-generate` to create new blocks itself. In the failed run, block generation started before connecting the block, resulting in a final block height that was smaller by 1 than expected.
See https://github.com/bitcoin/bitcoin/issues/29392#issuecomment-1939541603 for a more detailed analysis of the failed run.
Can be reproduced by adding a sleep to [this spot](6ff0aa089c/src/validation.cpp (L4217)) in `ChainstateManager::ProcessNewBlock()`:
```
if (util::ThreadGetInternalName() == "msghand") {
std::this_thread::sleep_for(0.2s);
}
```
which fails for me on master and succeeds with the fix.
Fixes#29392
ACKs for top commit:
maflcko:
lgtm ACK 44d11532f8
Tree-SHA512: c08699e5ae348d4c0626022b519449d052f511d3f44601bcd8dac836a130a3f67fca149532e1e3690367ebfdcbcdd32e527170d039209c1f599ce861136ae29f
...by adding a missing sync_blocks call.
There was a race at node2 between connecting the block
produced by node 0, and using -generate to create new blocks
itself. In the failed run, the latter happened first,
resulting in a final block height that was smaller by 1 than
expected.
9a3c5c8697 scripted-diff: rename ZapSelectTx to RemoveTxs (furszy)
83b762845f wallet: batch and simplify ZapSelectTx process (furszy)
595d50a103 wallet: migration, remove extra NotifyTransactionChanged call (furszy)
a2b071f992 wallet: ZapSelectTx, remove db rewrite code (furszy)
Pull request description:
Work decoupled from #28574. Brother of #28894.
Includes two different, yet interconnected, performance and code improvements to the zap wallet transactions process.
1) As the goal of the `ZapSelectTx` function is to erase tx records that match any of the inputted hashes. There is no need to traverse the whole database record by record. We could just check if the tx exist, and remove it directly by calling `EraseTx()`.
2) Instead of performing single write operations per removed tx record, this PR batches them all within a single atomic db txn.
Moreover, these changes will enable us to consolidate all individual write operations that take place during the wallet migration process into a single db txn in the future.
ACKs for top commit:
achow101:
ACK 9a3c5c8697
josibake:
ACK 9a3c5c8697
Tree-SHA512: fb2ecc48224c400ab3b1fbb32e174b5b13bf03794717727f80f01f55fb183883b067a68c0a127b2de8885564da15425d021a96541953bf38a72becc2e9929ccf
This changes the default behavior, individual tests can overwrite this option.
As a result, it is possible to run the entire test suite with
--v2transport, and all connections to the python p2p will then use it.
Also adjust several tests that are already running with --v2transport in the
test runner (although they actually made v1 connection before this change).
This is done in the same commit so that there isn't an
intermediate commit in which the CI fails.
fa0ceae970 test: Fix utxo set hash serialisation signedness (MarcoFalke)
Pull request description:
It is unsigned in Bitcoin Core, so the tests should match it:
5b8990a1f3/src/kernel/coinstats.cpp (L54)
Large positive values for the block height are too difficult to hit in tests, but it still seems fine to fix this.
The bug was introduced when the code was written in 6ccc8fc067.
(Lowercase `i` means signed, see https://docs.python.org/3/library/struct.html#format-characters)
ACKs for top commit:
epiccurious:
Tested ACK fa0ceae970.
fjahr:
utACK fa0ceae970
Tree-SHA512: ab4405c74fb191fff8520b456d3a800cd084d616bb9ddca27d56b8e5c8969bd537490f6e204c1870dbb09a3e130b03b22a27b6644252a024059c200bbd9004e7
29029df5c7 [doc] v3 signaling in mempool-replacements.md (glozow)
e643ea795e [fuzz] v3 transactions and sigop-adjusted vsize (glozow)
1fd16b5c62 [functional test] v3 transaction submission (glozow)
27c8786ba9 test framework: Add and use option for tx-version in MiniWallet methods (MarcoFalke)
9a1fea55b2 [policy/validation] allow v3 transactions with certain restrictions (glozow)
eb8d5a2e7d [policy] add v3 policy rules (glozow)
9a29d470fb [rpc] return full string for package_msg and package-error (glozow)
158623b8e0 [refactor] change Workspace::m_conflicts and adjacent funcs/structs to use Txid (glozow)
Pull request description:
See #27463 for overall package relay tracking.
Delving Bitcoin discussion thread: https://delvingbitcoin.org/t/v3-transaction-policy-for-anti-pinning/340
Delving Bitcoin discussion for LN usage: https://delvingbitcoin.org/t/lightning-transactions-with-v3-and-ephemeral-anchors/418
Rationale:
- There are various pinning problems with RBF and our general ancestor/descendant limits. These policies help mitigate many pinning attacks and make package RBF feasible (see #28984 which implements package RBF on top of this). I would focus the most here on Rule 3 pinning. [1][2]
- Switching to a cluster-based mempool (see #27677 and #28676) requires the removal of CPFP carve out, which applications depend on. V3 + package RBF + ephemeral anchors + 1-parent-1-child package relay provides an intermediate solution.
V3 policy is for "Priority Transactions." [3][4] It allows users to opt in to more restrictive topological limits for shared transactions, in exchange for the more robust fee-bumping abilities that offers. Even though we don't have cluster limits, we are able to treat these transactions as having as having a maximum cluster size of 2.
Immediate benefits:
- You can presign a transaction with 0 fees (not just 1sat/vB!) and add a fee-bump later.
- Rule 3 pinning is reduced by a significant amount, since the attacker can only attach a maximum of 1000vB to your shared transaction.
This also enables some other cool things (again see #27463 for overall roadmap):
- Ephemeral Anchors
- Package RBF for these 1-parent-1-child packages. That means e.g. a commitment tx + child can replace another commitment tx using the child's fees.
- We can transition to a "single anchor" universe without worrying about package limit pinning. So current users of CPFP carve out would have something else to use.
- We can switch to a cluster-based mempool [5] (#27677#28676), which removes CPFP carve out [6].
[1]: Original mailing list post and discussion about RBF pinning problems https://gist.github.com/glozow/25d9662c52453bd08b4b4b1d3783b9ff, https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019817.html
[2]: A FAQ is "we need this for cluster mempool, but is this still necessary afterwards?" There are some pinning issues that are fixed here and not fully fixed in cluster mempool, so we will still want this or something similar afterward.
[3]: Mailing list post for v3 https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-September/020937.html
[4]: Original PR #25038 also contains a lot of the discussion
[5]: https://delvingbitcoin.org/t/an-overview-of-the-cluster-mempool-proposal/393/7
[6]: https://delvingbitcoin.org/t/an-overview-of-the-cluster-mempool-proposal/393#the-cpfp-carveout-rule-can-no-longer-be-supported-12
ACKs for top commit:
sdaftuar:
ACK 29029df5c7
achow101:
ACK 29029df5c7
instagibbs:
ACK 29029df5c7 modulo that
Tree-SHA512: 9664b078890cfdca2a146439f8835c9d9ab483f43b30af8c7cd6962f09aa557fb1ce7689d5e130a2ec142235dbc8f21213881baa75241c5881660f9008d68450
The goal of the function is to erase the wallet transactions that
match the inputted hashes. There is no need to traverse the database,
reading record by record, to then perform single entry removals for
each of them.
To ensure consistency and improve performance, this change-set removes
all tx records within a single atomic db batch operation, as well as
it cleans up code, improves error handling and simplifies the
transactions removal process entirely.
This optimizes the removal of watch-only transactions during the wallet
migration process and the 'removeprunedfunds' RPC command.
b58f009d95 test: check that mempool msgs lead to disconnect if uploadtarget is reached (Sebastian Falbesoner)
dd5cf38818 test: check for specific disconnect reasons in feature_maxuploadtarget.py (Sebastian Falbesoner)
73d7372115 test: verify `-maxuploadtarget` limit state via `getnettotals` RPC result (Sebastian Falbesoner)
Pull request description:
This PR improves existing and adds new test coverage for the `-maxuploadtarget` mechanism (feature_maxuploadtarget.py) in the following ways, one commit each:
* verify the uploadtarget state via the `getnettotals` RPC (`uploadtarget` result field):
160d23677a/src/rpc/net.cpp (L581-L582)
Note that reaching the total limit (`target_reached` == True) always implies that the historical blocks serving limits is also reached (`serve_historical_blocks` == False), i.e. it's impossible that both flags are set to True.
* check for peer's specific disconnect reason (in this case, `"historical block serving limit reached, disconnect peer"`):
160d23677a/src/net_processing.cpp (L2272-L2280)
* add a test for a peer disconnect if the uploadtarget is reached and a `mempool` message is received (if bloom filters are enabled):
160d23677a/src/net_processing.cpp (L4755-L4763)
Note that another reason for disconnect after receiving a MEMPOOL msg of a peer is if bloom filters are disabled on the node. This case is already covered in the functional test `p2p_nobloomfilter_messages.py`.
ACKs for top commit:
maflcko:
lgtm ACK b58f009d95
achow101:
ACK b58f009d95
sr-gi:
tACK [b58f009](b58f009d95)
Tree-SHA512: 7439134129695c9c3a7ddc5e39f2ed700f91e7c91f0b7a9e0a783f275c6aa2f9918529cbfd38bb37f9139184e05e0f0354ef3c3df56da310177ec1d6b48b43d0
cc87ee4c39 test: fix intermittent failure in rpc_setban.py --v2transport (Martin Zumsande)
Pull request description:
This test failed for me on master locally:
The reason is that when initiating a v2 connection and being immediately disconnected, a node cannot know if the disconnect happens because the peer only supports v1, or because it has banned you, so it schedules to reconnect with v1. If the test doesn't wait for that, the reconnect can happen at a bad time, resulting in failure in a later `connect_nodes` call.
Also add the test with `--v2transport` to the test runner because banning with v2 seems like a useful thing to have test coverage for.
ACKs for top commit:
delta1:
tested ACK cc87ee4c39
epiccurious:
Concept ACK cc87ee4c39.
achow101:
ACK cc87ee4c39
stratospher:
tested ACK cc87ee4. nice find!
Tree-SHA512: ae234d9b771d9c9c11501ddd93c99cf93257c999de3da62280d4d51806cd246b289c10a5f41fa7d5651b2fb4fdaee753f5b2d6939a99f89d71aa012af4a4d231
cfcb9b1ecf test: wallet, coverage for concurrent db transactions (furszy)
548ecd1155 tests: Test for concurrent writes with db tx (Ava Chow)
395bcd2454 sqlite: Ensure that only one SQLiteBatch is writing to db at a time (Ava Chow)
Pull request description:
The way that we have configured SQLite to run means that only one database transaction can be open at a time. Typically, each individual read and write operation will be its own transaction that is opened and committed automatically by SQLite. However, sometimes we want these operations to be batched into a multi-statement transaction, so `SQLiteBatch::TxnBegin`, `SQLiteBatch::TxnCommit`, and `SQLiteBatch::TxnAbort` are used to manage the transaction of the database.
However, once a db transaction is begun with one `SQLiteBatch`, any operations performed by another `SQLiteBatch` will also occur within the same transaction. Furthermore, those other `SQLiteBatch`s will not be expecting a transaction to be active, and will abort it once the `SQLiteBatch` is destructed. This is problematic as it will prevent some data from being written, and also cause the `SQLiteBatch` that opened the transaction in the first place to be in an unexpected state and throw an error.
To avoid this situation, we need to prevent the multiple batches from writing at the same time. To do so, I've implemented added a `CSemaphore` within `SQLiteDatabase` which will be used by any `SQLiteBatch` trying to do a write operation. `wait()` is called by `TxnBegin`, and at the beginning of `WriteKey`, `EraseKey`, and `ErasePrefix`. `post()` is called in `TxnCommit`, `TxnAbort` and at the end of `WriteKey`, `EraseKey`, and `ErasePrefix`. To avoid deadlocking on ` TxnBegin()` followed by a `WriteKey()`, `SQLiteBatch will now also track whether a transaction is in progress so that it knows whether to use the semaphore.
This issue is not a problem for BDB wallets since BDB uses WAL and provides transaction objects that must be used if an operation is to occur within a transaction. Specifically, we either pass a transaction pointer, or a nullptr, to all BDB operations, and this allows for concurrent transactions so it doesn't have this problem.
Fixes#29110
ACKs for top commit:
josibake:
ACK cfcb9b1ecf
furszy:
ACK cfcb9b1ecf
ryanofsky:
Code review ACK cfcb9b1ecf. This looks great and I think it is ready for merge. Just holding off because josibake seemed ready to review https://github.com/bitcoin/bitcoin/pull/29112#issuecomment-1930372190 and might have more feedback.
Tree-SHA512: 2dd5a8e76df52451a40e0b8a87c7139d68a0d8e1bf2ebc79168cc313e192dab87cfa4270ff17fea4f7b370060d3bc9b5d294d50f7e07994d9b5a69b40397c927
facafa90f7 test: Fix CPartialMerkleTree.nTransactions signedness (MarcoFalke)
Pull request description:
It is unsigned in Bitcoin Core, so the tests should match it:
aa9231fafe/src/merkleblock.h (L59)
Large positive values, or "negative" values, are rejected anyway, but it still seems fine to fix this.
The bug was introduced when the code was written in d280617bf5.
(Lowercase `i` means signed, see https://docs.python.org/3/library/struct.html#format-characters)
ACKs for top commit:
theStack:
LGTM ACK facafa90f7
Empact:
ACK facafa90f7
Tree-SHA512: 35ac11bb5382dffe132bfae6097efc343ef6c06b1b4b1545130ca27b228ca6894679004862fee921b095172abaddbef5972c24d9bc195ce970f35643bd4a0f09
Sending multiple large messages is rather slow with the non-optimized python
implementation of ChaCha20.
Apart from the slowness, these tests would also run successfully with v2.
By adding to the test framework a wait until the v2 handshake
is completed, so that p2p_sendtxrcncl.py (which doesn't need
to be changed itself) doesnt't send out any other messages before that.
There are occasions where a multi-statement tx is begun in one batch,
and a second batch is created which does a normal write (without a
multi-statement tx). These should not conflict with each other and all
of the data should end up being written to disk.
e7fd70f4b6 [test] make v2transport arg in addconnection mandatory and few cleanups (stratospher)
Pull request description:
- make `v2transport` argument in `addconnection` regression-testing only RPC mandatory. https://github.com/bitcoin/bitcoin/pull/24748#discussion_r1470738750
- previously it was an optional arg with default `false` value.
- only place this RPC is used is in the [functional tests](11b436a66a/test/functional/test_framework/test_node.py (L742)) where we always pass the appropriate `v2transport` option to the RPC anyways. (and that too just for python dummy peer(`P2PInterface`) and bitcoind(`TestNode`) interactions)
- rename `v2_handshake()` to `_on_data_v2_handshake()` https://github.com/bitcoin/bitcoin/pull/24748#discussion_r1466958424
- more compact return statement in `wait_for_reconnect()` https://github.com/bitcoin/bitcoin/pull/24748#discussion_r1466979708
- assertion to check that empty version packets are received from `TestNode`.
ACKs for top commit:
glozow:
ACK e7fd70f4b6
theStack:
Code-review ACK e7fd70f4b6
mzumsande:
Code Review ACK e7fd70f4b6
Tree-SHA512: e66e29baccd91e1e4398b91f7d45c5fc7c2841d77d8a6178734586017bf2be63496721649da91848dec71da605ee31664352407d5bb896e624cc693767c61a1f
c340503b67 test: p2p: adhere to typical VERSION message protocol flow (Sebastian Falbesoner)
7ddfc28309 test: p2p: process post-v2-handshake data immediately (Sebastian Falbesoner)
b198b9c2ce test: p2p: introduce helper for sending prepared VERSION message (Sebastian Falbesoner)
Pull request description:
This PR addresses a quirk in the test framework's p2p implementation regarding the version handshake protocol:
Currently, the VERSION message is sent immediately after an inbound connection (i.e. TestNode outbound connection) is made. This doesn't follow the usual protocol flow where the initiator sends a version first, the responder processes that and only then responds with its own version message. Change that accordingly by only sending immediate VERSION message for outbound connections (or after v2 handshake for v2 connections, respectively), and sending out VERSION message as response for incoming VERSION messages (i.e. in the function `on_version`) for inbound connections.
I first stumbled upon this issue through reading comment https://mirror.b10c.me/bitcoin-bitcoin/24748/#discussion_r1465420112 (see last paragraph) and recently again in the course of working on a v2-followup for #29279, where this causes issues for TestNode outbound connections that disconnect *before* sending out their own version message.
Note that these changes lead to slightly more code in some functional tests that override the `on_version` method, as the version reply has to be sent explicitly now, but I think is less confusing and reflects better what is actually happening.
ACKs for top commit:
epiccurious:
utACK c340503b67
stratospher:
tested ACK c340503b67. very useful to have since we'd want real node behaviour!
mzumsande:
ACK c340503b67
sr-gi:
tACK c340503b67
Tree-SHA512: 63eac287d3e1c87a01852bfd9f0530363354bbb642280298673b9c8817056356373adf348955c4e92af95c7c6efa8cc515cee2892e9f077bfbe1bce8e97ad082
Note that another reason for disconnect after receiving a MEMPOOL msg of a peer
is if bloom filters are disabled on the node. This case is covered in the
functional test `p2p_nobloomfilter_messages.py`.
This ensures that the disconnect happens for the expected reason and
also makes it easier to navigate between implementation and test code,
i.e. both the questions "do we have test coverage for this disconnect?"
(from an implementation reader's perspective) and "where is the code
handling this disconnect?" (from a test reader's perspective) can be
answered simply by grep-ping the corresponding debug message.