0f83ab407e test: display abrupt shutdown errors in console output (furszy)
Pull request description:
Making it easier to debug errors in the CI environment,
particularly in scenarios where it's not immediately clear
what happened nor which node crashed (or shutdown abruptly).
A bit of context:
Currently, the test framework redirects each node's stderr output
stream to a different temporary file inside each node's data directory.
While this is sufficient for storing the error, it isn't very helpful for
understanding what happened just by reading the CI console output.
Most of the time, reading the stderr file in the CI environment is not
possible, because people don't have access to it.
Testing Note:
The displayed error difference can be observed by cherry-picking this
commit 9cc5393c0f on top of this branch and running any
functional test.
ACKs for top commit:
maflcko:
lgtm ACK 0f83ab407e
theStack:
ACK 0f83ab407e
Tree-SHA512: 83ce4d21d5316e8cb16a17d3fbe77b8649fced9e09410861d9674c233f6e9c34bcf573504e387e4f439c2841b2ee9855d0d35607fa13aa89eafe0080c45ee82d
Making it easier to debug errors in the CI environment,
particularly in scenarios where it's not immediately clear
what happened nor which node crashed (or shutdown abruptly).
9eac5a0529 [functional test] transaction orphan handling (glozow)
61e77bb901 [test framework] make it easier to fast-forward setmocktime (glozow)
Pull request description:
I was doing some mutation testing (through reckless refactoring) locally and found some specific behaviors in orphan handling that weren't picked up by tests. Adding some of these test cases now can maybe help with reviewing refactors like #28031.
- Parent requests aren't sent immediately. A delay is added and the requests are filtered by AlreadyHaveTx before they are sent, which means you can't use fake orphans to probe precise arrival timing of a tx.
- Parent requests include all that are not AlreadyHaveTx. This means old confirmed parents may be requested.
- The node does not give up on orphans if the peer responds to a parent request with notfound. This means that if a parent is an old confirmed transaction (in which notfound is expected), the orphan should still be resolved.
- Rejected parents can cause an orphan to be dropped, but it depends on the reason and only based on txid.
- Rejected parents can cause an orphan to be rejected too, by both wtxid and txid.
- Requests for orphan parents should be de-duplicated with "regular" txrequest. If a missing parent has the same hash as an in-flight request, it shouldn't be requested.
- Multiple orphans with overlapping parents should not cause duplicated parent requests.
ACKs for top commit:
instagibbs:
reACK 9eac5a0529
dergoegge:
reACK 9eac5a0529
achow101:
ACK 9eac5a0529
fjahr:
Code review ACK 9eac5a0529
Tree-SHA512: 85488dc6a3f62cf0c38e7dfe7839c01215b44b172d1755b18164d41d01038f3a749451241e4eba8b857fd344a445740b21d6382c45977234b21460e3f53b1b2a
Have each TestNode keep track of the last timestamp it called
setmocktime with, and add a bumpmocktime() function to bump by a
number of seconds. Makes it easy to fast forward n seconds without
keeping track of what the last timestamp was.
8a20f765cc test: drop duplicate getaddrs from p2p_getaddr_caching (Martin Zumsande)
feb0096139 test: fix intermittent failure in p2p_getaddr_caching (Martin Zumsande)
Pull request description:
Fixes#28133
In the consistency check, it's not enough to check that our address/port is unique, only the combination of source and target must be unique. Otherwise, the OS may reuse ports for connections to different `-addrbind`, which was happening in the failed runs.
While at it, the second commit cleans up duplicate `getaddr` messages in `p2p_getaddr_caching.py` that do nothing but generate `Ignoring repeated "getaddr"` log messages (and cleans up some whitespace the python linter complains about).
ACKs for top commit:
vasild:
ACK 8a20f765cc
Tree-SHA512: eabe4727d7887f729074076f6333a918bba8cb34b8e3baaa83f167b441b0daa24f7c4824abcf03a9538a2ef14b2d826ff19aeffcb93a6c20735253a9678aac9c
fafe43cb6c scripted-diff: Use blocks_path where possible (MarcoFalke)
fa060c15fb test: Add blocks_path property to TestNode (MarcoFalke)
faba4fc325 test: Drop 22.x node from TxindexCompatibilityTest (MarcoFalke)
fa7f65b0f8 test: Use clean chain in MempoolCompatibilityTest (MarcoFalke)
Pull request description:
The node in this test was never really needed, because the compatibility tests shouldn't be used to test previous releases. (The test suite of the previous release itself should be used for that). So remove it.
Also, other test changes. (See individual commits)
ACKs for top commit:
theStack:
Code-review ACK fafe43cb6c
Tree-SHA512: 289f54695bf5310663ab38ebf1aa457f53d0aafae56e6657be0e75bf96b303165bad417dc7eaf4c40f0639aa92ce139e5bacb318a2eabab1f8e23a811cabe0cc
Only the combined addr:port of source and destination
must be unique. If the destination is different, the same addr:port
for the source may be used by the OS.
read() fails in text mode when the unicode hasn't been fully written
yet. Fixes: "wallet_importdescriptors.py: can't decode bytes in position
228861-228863: unexpected end of data"
(https://github.com/bitcoin/bitcoin/issues/28030)
20b49460b3 test: remove race in the user-agent reception check (Vasil Dimov)
Pull request description:
In `add_p2p_connection()` we connect to `bitcoind` from the Python client and check that it has received our version string.
This check looked up the last/newest entry from `getpeerinfo` RPC, assuming that it must be the connection we have just opened. But this will not be the case if a new inbound or outbound connection is made to/from `bitcoind` in the meantime.
Instead of the last entry in `getpeerinfo`, check all and find the one which corresponds to our connection using our outgoing address:port tuple which is unique.
ACKs for top commit:
jonatack:
re-ACK 20b49460b3
MarcoFalke:
Concept ACK 20b49460b3
Tree-SHA512: 61fd3359ef11ea830021ede0e745497a7b60690c32d21c47cd12ff79f78907bb45e922c9d01e5d7ff614a8cd5e4560d39a3fc86b01b200429773a23ace3917e4
faf902858d test: Check expected_stderr after stop (MarcoFalke)
Pull request description:
This fixes a bug where stderr wasn't checked for the shutdown sequence.
Fix that by waiting for the shutdown to finish and then check stderr.
ACKs for top commit:
theStack:
ACK faf902858d
Tree-SHA512: a70cd1e6cda84d542782e41e8b59741dbcd472c0d0575bcef5cbfd1418473ce94efe921481d557bae3fbbdd78f1c49c09c48872883c052d87c5c9a9a51492692
The thread does not only load blocks, it loads the mempool and,
in a future commit, will start the indexes as well.
Also, renamed the 'ThreadImport' function to 'ImportBlocks'
And the 'm_load_block' class member to 'm_thread_load'.
-BEGIN VERIFY SCRIPT-
sed -i "s/ThreadImport/ImportBlocks/g" $(git grep -l ThreadImport -- ':!/doc/')
sed -i "s/loadblk/initload/g" $(git grep -l loadblk -- ':!/doc/release-notes/')
sed -i "s/m_load_block/m_thread_load/g" $(git grep -l m_load_block)
-END VERIFY SCRIPT-
In `add_p2p_connection()` we connect to `bitcoind` from the Python
client and check that it has received our version string.
This check looked up the last/newest entry from `getpeerinfo` RPC,
assuming that it must be the connection we have just opened. But this
will not be the case if a new inbound or outbound connection is made
to/from `bitcoind` in the meantime.
Instead of the last entry in `getpeerinfo`, check all and find the one
which corresponds to our connection using our outgoing address:port
tuple which is unique.
Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
Co-authored-by: Jon Atack <jon@atack.com>
Seems odd to hardcode all parent directory names in the path for no good
reason.
Also, add wallet_path property to TestNode.
Also, rework wallet_backup.py test for scripted-diff in the next commit.
It seems odd to return `EXIT_SUCCESS` when the node aborted
execution due a fatal internal error or any post-init problem
that triggers an unrequested shutdown.
e.g. blocks or coins db I/O errors, disconnect block failure,
failure during thread import (external blocks loading process
error), among others.
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Show an error on startup if a bitcoin datadir that is being used contains a
`bitcoin.conf` file that is ignored. There are two cases where this could
happen:
- One case reported in
https://github.com/bitcoin/bitcoin/issues/27246#issuecomment-1470006043
happens when a bitcoin.conf file in the default datadir (e.g.
$HOME/.bitcoin/bitcoin.conf) has a "datadir=/path" line that sets different
datadir containing a second bitcoin.conf file. Currently the second
bitcoin.conf file is ignored with no warning.
- Another way this could happen is if a -conf= command line argument points
to a configuration file with a "datadir=/path" line and that specified path
contains a bitcoin.conf file, which is currently ignored.
This change only adds an error message and doesn't change anything about way
settings are applied. It also doesn't trigger errors if there are redundant
-datadir or -conf settings pointing at the same configuration file, only if
they are pointing at different files and one file is being ignored.
61bb4e783b lint: enable E722 do not use bare except (Leonardo Lazzaro)
Pull request description:
Improve test code and enable E722 lint check.
If you want to catch all exceptions that signal program errors, use except Exception: (bare except is equivalent to except BaseException:).
Reference: https://peps.python.org/pep-0008/#programming-recommendations
ACKs for top commit:
MarcoFalke:
lgtm ACK 61bb4e783b
Tree-SHA512: c7497769d5745fa02c78a20f4a0e555d8d3996d64af6faf1ce28e22ac1d8be415b98e967294679007b7bda2a9fd04031a9d140b24201e00257ceadeb5c5d7665
d8b12a75db rpc: Allow named and positional arguments to be used together (Ryan Ofsky)
Pull request description:
It's nice to be able to use named options and positional arguments together.
Most shell tools accept both, and python functions combine options and arguments allowing them to be passed with even more flexibility. This change adds support for python's approach so as a motivating example:
```sh
bitcoin-cli -named createwallet wallet_name=mywallet load_on_startup=1
```
Can be shortened to:
```sh
bitcoin-cli -named createwallet mywallet load_on_startup=1
```
JSON-RPC standard doesn't have a convention for passing named and positional parameters together, so this implementation makes one up and interprets any unused `"args"` named parameter as a positional parameter array.
This change is backwards compatible. It doesn't change the interpretation of any previously valid calls, just treats some previously invalid calls as valid.
Another use case even if you only occasionally use named arguments is that you can define an alias:
```
alias bcli='bitcoin-cli -named'
```
And now use both named named and unnamed arguments from the same alias without having to manually add `-named` option for named arguments or see annoying error "No '=' in named argument... this needs to be present for every argument (even if it is empty)`" for unnamed arguments
ACKs for top commit:
achow101:
ACK d8b12a75db
stickies-v:
re-ACK d8b12a75d
aureleoules:
re-ACK d8b12a75db
Tree-SHA512: 0cff8b50f584bcbbd376624adccf40536566ed8d1bcd6c88ad565dbc208f19d5e7a48c994efd6329d42b560149340d330397278f08a2912af5f3418d8c8837a9
fa10f193b5 test: Set default in add_wallet_options if only one type can be chosen (MacroFake)
555519d082 test: Remove wallet option from non-wallet tests (MacroFake)
fac8d59d31 test: Set -disablewallet when no wallet has been compiled (MacroFake)
fa68937b89 test: Make requires_wallet private (MacroFake)
Pull request description:
The tests have several issues:
* Some tests that are wallet-type specific offer the option to run the test with the incompatible type
For example, `wallet_dump.py` offers `--descriptors` and on current master fails with `JSONRPCException: Invalid public key`. After the changes here, it fails with a clear error: `unrecognized arguments: --descriptors`.
* Tests that don't use the wallet at all offer the option to run it with a wallet type. This is confusing and wastes developers time if they are "tricked" into running the test for both wallet types, even though no wallet code is executed at all.
For example, `feature_addrman.py` will happily accept and run with `--descriptors` or `--legacy-wallet`. After the changes here, it no longer silently ignores the flag, but reports a clear error: `unrecognized arguments`.
ACKs for top commit:
achow101:
ACK fa10f193b5
Tree-SHA512: a5784da7305f4ec58c0013f433289000d94fc3d434b00fc329ffa37b812e2cd1da0071e34c3462bf79d904808564f2ae6d3d582f6b86b26215f9b07391b58460
It's nice to be able to use named options and positional arguments together.
Most shell tools accept both, and python functions combine options and
arguments allowing them to be passed with even more flexibility. This change
adds support for python's approach so as a motivating example:
bitcoin-cli -named createwallet wallet_name=mywallet load_on_startup=1
Can be shortened to:
bitcoin-cli -named createwallet mywallet load_on_startup=1
JSON-RPC standard doesn't have a convention for passing named and positional
parameters together, so this implementation makes one up and interprets any
unused "args" named parameter as a positional parameter array.
Using disconnect_p2ps instead of peer_disconnect makes
the node wait for the disconnect to complete. As a result,
we can reuse p2p_idx=0 in the add_outbound_p2p_connection calls.
Currently, debug.log is spammed with messages from random.cpp
when functional tests are run. These logs are not useful for
debugging, and decrease the signal to noise ratio of the logs.
for verbose log messages for development or debugging only, as bitcoind may run
more slowly, that are more granular/frequent than the Debug log level, i.e. for
very high-frequency, low-level messages to be logged distinctly from
higher-level, less-frequent debug logging that could still be usable in production.
An example would be to log higher-level peer events (connection, disconnection,
misbehavior, eviction) as Debug, versus Trace for low-level, high-volume p2p
messages in the BCLog::NET category. This will enable the user to log only the
former without the latter, in order to focus on high-level peer management events.
With respect to the name, "trace" is suggested as the most granular level
in resources like the following:
- https://sematext.com/blog/logging-levels
- https://howtodoinjava.com/log4j2/logging-levels
Update the test framework and add test coverage.
c4d76c6faa tests: Tests for inactive HD chains (Andrew Chow)
8077862c5e wallet: Refactor TopUp to be able to top up inactive chains too (Andrew Chow)
70134eb34f wallet: Properly set hd chain counters when loading (Andrew Chow)
961b9e4e40 wallet: Parse hdKeypath if key_origin is not available (Andrew Chow)
0652ee73ec Add size check on meta.key_origin.path (Rob Fielding)
Pull request description:
Currently inactive HD chains are only derived from at the time a key in that chain is found to have been used. However, at that time, the wallet may not be able to derive keys (e.g. it is locked). Currently we would just move on and not derive any new keys, however this could result in missing funds.
This PR resolves this problem by adding memory only variables to `CHDChain` which track the highest known index. `TopUp` is modified to always try to top up the inactive HD chains, and this process will use the new variables to determine how much to top up. In this way, after an encrypted wallet is unlocked, the inactive HD chains will be topped up and hopefully funds will not be missed.
Note that because these variables are not persisted to disk (because `CHDChain`s for inactive HD chains are not written to disk), if an encrypted wallet is not unlocked in the same session as a key from an inactive chain is found to be used, then it will not be topped up later unless more keys are found.
Additionally, wallets which do not have upgraded key metadata will not derive any keys from inactive HD chains. This is resolved by using the derivation path string in `CKeyMetadata.hdKeypath` to determine what indexes to derive.
ACKs for top commit:
laanwj:
Code review ACK c4d76c6faa
Tree-SHA512: b2b572ad7f1b1b2847edece09f7583543d63997e18ae32764e5a27ad608dd64b9bdb2d84ea27137894e986a8e82f047a3dba9c8015b74f5f179961911f0c4095
eaf6be0114 [net processing] Do not request transaction relay from feeler connections (John Newbery)
0220b834b1 [test] Add testing for outbound feeler connections (John Newbery)
Pull request description:
Feelers are short-lived connections used to test the viability of peers. The bitcoind node will periodically open feeler connections to addresses in its addrman, wait for a `version` message from the peer, and then close the connection.
Currently, we set `fRelay` to `1` in the `version` message for feeler connections, indicating that we want the peer to relay transactions to us. However, we close the connection immediately on receipt of the `version` message, and so never process any incoming transaction announcements. This PR changes that behaviour to instead set `fRelay` to `0` indicating that we do not wish to receive transaction announcements from the peer.
This PR also extends the `addconnection` RPC to allow creating outbound feeler connections from the node to the test framework, and a test to verify that the node sets `fRelay` to `0` in the `version` message to feeler connections.
ACKs for top commit:
naumenkogs:
ACK eaf6be0114
MarcoFalke:
review ACK eaf6be0114🏃
Tree-SHA512: 1c56837dbd0a396fe404a5e39f7459864d15f666664d6b35ad109628b13158e077e417e586bf48946a23bd5cbe63716cb4bf22cdf8781b74dfce6047b87b465a
d9803f7a0a test: add stress tests for initialization (James O'Beirne)
23f85616a8 test: add node.chain_path and node.debug_log_path (James O'Beirne)
Pull request description:
In the course of coming up with a test plan for #23280, I thought it would be neat to include a Python snippet showing how I tested the initialization process. I quickly realized I was reinventing the functional test framework... so here's a new test.
This change bangs init around like the Fonz hitting a jukebox. It adds some interesting (read: lazy and random) coverage for the initialization process by
- interrupting init with SIGTERM after certain log statements,
- interrupting init at random points, and
- starting init with some essential data missing (block files, block indices, etc.) to test init error paths.
As far as I can tell, some of these code paths are uncovered otherwise (namely the startup errors).
---
Incidentally, I think I may have uncovered some kind of a bug or race condition with indexing initialization based on an intermittent failure in this testcase. This test sometimes fails after shutting down immediately after `loadblk` thread start:
```
2021-10-15T21:14:51.295000Z TestFramework (INFO): Starting node and will exit after line 'loadblk thread start'
36 │ 2021-10-15T21:14:51.296000Z TestFramework.node0 (DEBUG): bitcoind started, waiting for RPC to come up
37 │ 2021-10-15T21:14:51.493000Z TestFramework (INFO): terminating node after 110 log lines seen
38 │ 2021-10-15T21:14:51.625000Z TestFramework (INFO): Starting node and will exit after line 'txindex thread start'
39 │ 2021-10-15T21:14:51.625000Z TestFramework.node0 (DEBUG): bitcoind started, waiting for RPC to come up
------> [[ FAILURE HERE ]] 2021-10-15T21:15:21.626000Z TestFramework (WARNING): missed line {bail_line}; bailing now after {num_lines} lines
```
and then fails to start up afterwards. Combined logs showing `Error: txindex best block of the index goes beyond pruned data`, when the node under test is not pruned:
```
node0 2021-10-15T21:16:51.848439Z [shutoff] [validationinterface.cpp:244] [ChainStateFlushed] Enqueuing ChainStateFlushed: block hash=1014bc4ff4917602ae53d10e9dfe230af4b7d52a6cdaa8a47798b9c288180907
node0 2021-10-15T21:16:51.848954Z [shutoff] [init.cpp:302] [Shutdown] Shutdown: done
test 2021-10-15T21:16:51.882000Z TestFramework (ERROR): Unexpected exception caught during testing
Traceback (most recent call last):
File "/home/james/src/bitcoin/test/functional/test_framework/test_framework.py", line 132, in main
self.run_test()
File "/home/james/src/bitcoin/./test/functional/stress_init.py", line 87, in run_test
check_clean_start()
File "/home/james/src/bitcoin/./test/functional/stress_init.py", line 60, in check_clean_start
node.wait_for_rpc_connection()
File "/home/james/src/bitcoin/test/functional/test_framework/test_node.py", line 224, in wait_for_rpc_connection
raise FailedToStartError(self._node_msg(
test_framework.test_node.FailedToStartError: [node 0] bitcoind exited with status 1 during initialization
test 2021-10-15T21:16:51.882000Z TestFramework (DEBUG): Closing down network thread
test 2021-10-15T21:16:51.933000Z TestFramework (INFO): Stopping nodes
test 2021-10-15T21:16:51.933000Z TestFramework.node0 (DEBUG): Stopping node
node0 stderr Error: txindex best block of the index goes beyond pruned data. Please disable the index or reindex (which will download the whole blockchain again)
node0 stderr Error: txindex best block of the index goes beyond pruned data. Please disable the index or reindex (which will download the whole blockchain again)
node0 stderr Error: txindex best block of the index goes beyond pruned data. Please disable the index or reindex (which will download the whole blockchain again)
node0 stderr Error: txindex best block of the index goes beyond pruned data. Please disable the index or reindex (which will download the whole blockchain again)
node0 stderr Error: txindex best block of the index goes beyond pruned data. Please disable the index or reindex (which will download the whole blockchain again)
```
ACKs for top commit:
laanwj:
Code review ACK d9803f7a0a
Tree-SHA512: 4d80dc399daf199a6222e81e47d12d830dc7af07355eddbb7f52479a676a645b8d3d45093ff54a9295f01a163b2f4fe0e038e83fc269969e03d4cfda69eaf111