There is a keyword that allows us to break out of loops. Use it.
There's a small change in behaviour here: if we process multiple orphans
that are still orphans, then we'll only call mempool.check() once at the
end, instead of after processing each tx.
bb6a32ce99 [net processing] Move Misbehaving() to PeerManager (John Newbery)
aa114b1c9b [net_processing] Move SendBlockTransactions into PeerManager (John Newbery)
3115e00f75 [net processing] Move MaybePunishPeerForTx to PeerManager (John Newbery)
e662e2d42a [net processing] Move ProcessOrphanTx to PeerManager (John Newbery)
b70cd890e3 [net processing] Move MaybePunishNodeForBlock into PeerManager (John Newbery)
d7778351bf [net processing] Move ProcessHeadersMessage to PeerManager (John Newbery)
64f6162651 [whitespace] tidy up indentation after scripted diff (John Newbery)
58bd369b0d scripted-diff: [net processing] Rename PeerLogicValidation to PeerManager (John Newbery)
2297b26b3c [net_processing] Pass chainparams to PeerLogicValidation constructor (John Newbery)
824bbd1ffb [move only] Collect all private members of PeerLogicValidation together (John Newbery)
Pull request description:
Continues the work of moving net_processing logic into PeerLogicValidation. See https://github.com/bitcoin/bitcoin/pull/19704 and https://github.com/bitcoin/bitcoin/pull/19607#discussion_r462032894 for motivation.
This PR also renames `PeerLogicValidation` to `PeerManager` as suggested in https://github.com/bitcoin/bitcoin/pull/10756#pullrequestreview-53892618.
ACKs for top commit:
MarcoFalke:
re-ACK bb6a32ce99 only change is rebase due to conflict in struct NodeContext and variable rename 🤸
hebasto:
re-ACK bb6a32ce99, only rebased, and added renaming `s/peer_logic/peerman/` into scripted-diff since my [previous](https://github.com/bitcoin/bitcoin/pull/19791#pullrequestreview-483118079) review (verified with `git range-diff`).
Tree-SHA512: a2de4a521688fd25125b401e5575402c52b328a0fa27b3010567008d4f596b960aabbd02b2d81f42658f88f4365443fadb1008150a62fbcea123fb42d85a2c21
Block-relay-only peers were introduced by #15759. According to its
author, it was intented to make them only immune to outbound peer
rotation-based eviction and not from all eviction as modified comment
leans to think of.
Clearly indicate that outbound block-relay peers aren't protected
from eviction by the bad/lagging chain logic.
-BEGIN VERIFY SCRIPT-
sed -i 's/PeerLogicValidation/PeerManager/g' $(git grep -l PeerLogicValidation ./src ./test)
sed -i 's/peer_logic/peerman/g' $(git grep -l peer_logic ./src ./test)
-END VERIFY SCRIPT-
PeerLogicValidation was originally net_processing's implementation to
the validation interface. It has since grown to contain much of
net_processing's logic. Therefore rename it to reflect its
responsibilities.
Suggested in
https://github.com/bitcoin/bitcoin/pull/10756#pullrequestreview-53892618.
Keep a references to chainparams, rather than calling the global
Params() function every time it's needed. This is fine, since
globalChainParams does not get updated once it's been set, and it's
available at the point of constructing the PeerLogicValidation object.
We don't have a project style for ordering class members, but it always
makes sense to have no more than one of each public/protected/private
specifier.
Also move documentation for MaybeDiscourageAndDisconnect to the header.
296be8f58e Get rid of unused functions CTxMemPool::GetMemPoolChildren, CTxMemPool::GetMemPoolParents (Jeremy Rubin)
46d955d196 Remove mapLinks in favor of entry inlined structs with iterator type erasure (Jeremy Rubin)
Pull request description:
Currently we have a peculiar data structure in the mempool called maplinks. Maplinks job is to track the in-pool children and parents of each transaction. This PR can be primarily understood and reviewed as a simple refactoring to remove this extra data structure, although it comes with a nice memory and performance improvement for free.
Maplinks is particularly peculiar because removing it is not as simple as just moving it's inner structure to the owning CTxMempoolEntry. Because TxLinks (the class storing the setEntries for parents and children) store txiters to each entry in the mempool corresponding to the parent or child, it means that the TxLinks type is "aware" of the boost multiindex (mapTx) it's coming from, which is in turn, aware of the entry type stored in mapTx. Thus we used maplinks to store this entry associated data we in an entirely separate data structure just to avoid a circular type reference caused by storing a txiter inside a CTxMempoolEntry.
It turns out, we can kill this circular reference by making use of iterator_to multiindex function and std::reference_wrapper. This allows us to get rid of the maplinks data structure and move the ownership of the parents/child sets to the entries themselves.
The benefit of this good all around, for any of the reasons given below the change would be acceptable, and it doesn't make the code harder to reason about or worse in any respect (as far as I can tell, there's no tradeoff).
### Simpler ownership model
No longer having to consistency check that mapLinks did have records for our CTxMempoolEntry, impossible to have a mapLinks entry outlive or incorrectly die before a CTxMempoolEntry.
### Memory Usage
We get rid of a O(Transactions) sized map in the mempool, which is a long lived data structure.
### Performance
If you have a CTxMemPoolEntry, you immediately know the address of it's children/parents, rather than having to do a O(log(Transactions)) lookup via maplinks (which we do very often). We do it in *so many* places that a true benchmark has to look at a full running node, but it is easy enough to show an improvement in this case.
The ComplexMemPool shows a good coherence check that we see the expected result of it being 12.5% faster / 1.14x faster.
```
Before:
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 1.40462, 0.277222, 0.285339, 0.279793
After:
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 1.22586, 0.243831, 0.247076, 0.244596
```
The ComplexMemPool benchmark only checks doing addUnchecked and TrimToSize for 800 transactions. While this bench does a good job of hammering the relevant types of function, it doesn't test everything.
Subbing in 5000 transactions shows a that the advantage isn't completely wiped out by other asymptotic factors (this isn't the only bottleneck in growing the mempool), but it's only a bit proportionally slower (10.8%, 1.12x), which adds evidence that this will be a good change for performance minded users.
```
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 59.1321, 11.5919, 12.235, 11.7068
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 52.1307, 10.2641, 10.5206, 10.4306
```
I don't think it's possible to come up with an example of where a maplinks based design would have better performance, but it's something for reviewers to consider.
# Discussion
## Why maplinks in the first place?
I spoke with the author of mapLinks (sdaftuar) a while back, and my recollection from our conversation was that it was implemented because he did not know how to resolve the circular dependency at the time, and there was no other reason for making it a separate map.
## Is iterator_to weird?
iterator_to is expressly for this purpose, see https://www.boost.org/doc/libs/1_51_0/libs/multi_index/doc/tutorial/indices.html#iterator_to
> iterator_to provides a way to retrieve an iterator to an element from a pointer to the element, thus making iterators and pointers interchangeable for the purposes of element pointing (not so for traversal) in many situations. This notwithstanding, it is not the aim of iterator_to to promote the usage of pointers as substitutes for real iterators: the latter are specifically designed for handling the elements of a container, and not only benefit from the iterator orientation of container interfaces, but are also capable of exposing many more programming bugs than raw pointers, both at compile and run time. iterator_to is thus meant to be used in scenarios where access via iterators is not suitable or desireable:
>
> - Interoperability with preexisting APIs based on pointers or references.
> - Publication of pointer-based interfaces (for instance, when designing a C-compatible library).
> - The exposure of pointers in place of iterators can act as a type erasure barrier effectively decoupling the user of the code from the implementation detail of which particular container is being used. Similar techniques, like the famous Pimpl idiom, are used in large projects to reduce dependencies and build times.
> - Self-referencing contexts where an element acts upon its owner container and no iterator to itself is available.
In other words, iterator_to is the perfect tool for the job by the last reason given. Under the hood it should just be a simple pointer cast and have no major runtime overhead (depending on if the function call is inlined).
Edit by laanwj: removed at sign from the description
ACKs for top commit:
jonatack:
re-ACK 296be8f per `git range-diff ab338a19 3ba1665 296be8f`, sanity check gcc 10.2 debug build is clean.
hebasto:
re-ACK 296be8f58e, only rebased since my [previous](https://github.com/bitcoin/bitcoin/pull/19478#pullrequestreview-482400727) review (verified with `git range-diff`).
Tree-SHA512: f5c30a4936fcde6ae32a02823c303b3568a747c2681d11f87df88a149f984a6d3b4c81f391859afbeb68864ef7f6a3d8779f74a58e3de701b3d51f78e498682e
zmqconfig.h is currently not really needed anywhere, except that
it declares zmqError (which is then defined in
zmqnotificationinterface.cpp). Note in particular that there is
no need to conditionally include zmq.h only if ZMQ is enabled, because
the place in the core code where the ZMQ library itself is included
(init.cpp) is conditional already on that.
This commit removes zmqconfig.h and replaces it by a much simpler
zmqutil.h library for zmqError. The definition of the function is
moved to the matching (newly created) zmqutil.cpp.
Instead of returning a raw pointer from CZMQNotifierFactory and
implicitly requiring the caller to know that it has to take ownership,
return a std::unique_ptr to make this explicit.
This also changes the typedef for CZMQNotifierFactory to use the new
C++11 using syntax, which makes it (a little) less cryptic.
This factors out the common logic to run over all ZMQ notifiers, call a
function on them, and remove them from the list if the function fails is
extracted to a helper method.
Note that this also fixes a potential memory leak: When a notifier was
removed previously after its callback returned false, it would just be
removed from the list without destructing the object. This is now done
correctly by std::unique_ptr behind the scenes.
This is a pure refactoring of zmqnotificationinterface to make the
code easier to read and maintain. It replaces explicit iterators
with C++11 for-each loops where appropriate and uses std::unique_ptr
to make memory ownership more explicit.
fafb381af8 Remove mempool global (MarcoFalke)
fa0359c5b3 Remove mempool global from p2p (MarcoFalke)
eeee1104d7 Remove mempool global from init (MarcoFalke)
Pull request description:
This refactor unlocks some nice potential features, such as, but not limited to:
* Removing the fee estimates global (would avoid slightly fragile workarounds such as #18766)
* Making the mempool optional for a "blocksonly" operation mode
Even absent those features, the new code without the global should be easier to maintain, read and write tests for.
ACKs for top commit:
jnewbery:
utACK fafb381af8
hebasto:
ACK fafb381af8, I have reviewed the code and it looks OK, I agree it can be merged.
darosior:
ACK fafb381af8
Tree-SHA512: a2e696dc377e2e81eaf9c389e6d13dde4a48d81f3538df88f4da502d3012dd61078495140ab5a5854f360a06249fe0e1f6a094c4e006d8b5cc2552a946becf26
7bf6dfbb48 wallet: Remove path checking code from bitcoin-wallet tool (Russell Yanofsky)
77d5bb72b8 wallet: Remove path checking code from createwallet RPC (Russell Yanofsky)
a987438e9d wallet: Remove path checking code from loadwallet RPC (Russell Yanofsky)
8b5e7297c0 refactor: Pass wallet database into CWallet::Create (Russell Yanofsky)
3c815cfe54 wallet: Remove Verify and IsLoaded methods (Russell Yanofsky)
0d94e60625 refactor: Use DatabaseStatus and DatabaseOptions types (Russell Yanofsky)
b5b414151a wallet: Add MakeDatabase function (Russell Yanofsky)
288b4ffb6b Remove WalletLocation class (Russell Yanofsky)
Pull request description:
Get rid of file path handling in wallet application code and move it down to database layer.
There is no change in behavior except for some changed error messages.
Motivation for this change is to make code more understandable, but also to prepare for adding SQLite support in #19077 so SQLite implementation can be contained at the database layer and wallet loading code does not need to become more complicated.
ACKs for top commit:
achow101:
ACK 7bf6dfbb48
meshcollider:
Code re-review and functional test run ACK 7bf6dfbb48
Tree-SHA512: 23ad18324c9e8947f0cf88a3734c2e9fb25536b2cb4d552cf5d1a4ade320fbffb73bb2d1b3a99585c11630aa7092e0fcfc2dd4fe65b91e3a54161433a5cd13cb
637d8bce74 Change FILE_CHAR_BLOCKLIST to FILE_CHARS_DISALLOWED (Benoit Verret)
Pull request description:
Blocklist is ambiguous. It could mean a list of blocks.
Example: "blocknotify" in the same file refers to Bitcoin blocks.
ACKs for top commit:
MarcoFalke:
ACK 637d8bce74
laanwj:
ACK 637d8bce74 — this is a clear variable name improvement
theStack:
ACK 637d8bce74
jonatack:
ACK 637d8bce74
eriknylund:
ACK 637d8bce74
promag:
ACK 637d8bce74.
Tree-SHA512: 028e7102eeaf61105736c55c119a7f5c05411f2b6715a7939c41cb9e8f13afb757bbb6e7a302b3aae21722e69dab91f6eff8099e5884d248299905b4c7687c02
2f79e9d002 refactor: remove unused header <arpa/inet.h> in protocol.cpp (Sebastian Falbesoner)
Pull request description:
There is no code using types or functions related to "internet operations" anymore in protocol.cpp (since #735, more than 8 years ago!), hence the header include can be removed.
ACKs for top commit:
practicalswift:
ACK 2f79e9d002 -- patch looks correct and CI is happy
epson121:
Code review ACK 2f79e9d002
laanwj:
ACK 2f79e9d002
promag:
Code review ACK 2f79e9d002.
Tree-SHA512: b3f75fa080125a34ce224f11eb13ec7b914cd9930e3bbed24f550031ce92a7e0830e8ff20159d737ffe487dfd28c24c273ad5e89c6932c8c6960d7fadb6c5e54
56b018ca7f test: Fix flaky wallet_basic test (Fabian Jahr)
Pull request description:
Fixes#19853
I investigated the issue in #19876 and I still intend to fix the underlying issue of a race when using wallet RPCs right after starting a node in that PR. However, since that is a bit more complicated than I initially thought it makes sense to merge the fix of the test so the intermittent test failures stop. This fix in the test is going to be needed, either way, #19876 will only provide an error where before it was reporting a false balance.
Top commit has no ACKs.
Tree-SHA512: 52bb2388a3e77aa20d26ab0fd45796bc1781483b1cffe49cbb44e2488a72e76998edfb1198495373f9c6fd2ec26064d4176bd1a64dd59806622d5e50a4f4e870
fa8e148714 ci: Double tsan CPU and Memory to avoid global timeout (MarcoFalke)
Pull request description:
Fix#19864
ACKs for top commit:
practicalswift:
ACK fa8e148714 -- patch looks correct
hebasto:
ACK fa8e148714, according to https://cirrus-ci.org/guide/linux/ the limits are:
Tree-SHA512: b6d522290bfe80ed7453387b811628bf42c7657aa6a84d2f5984c8bb16f9857a71eabc6b8a4d63b84227d59b41a8ed7dd85d86cae5628dc9cf6b85bd365248d7
fa9ee52556 doc: Add doxygen comment to IsRBFOptIn (MarcoFalke)
faef4fc9b4 Remove mempool global from interfaces (MarcoFalke)
fa831684e5 refactor: Add IsRBFOptInEmptyMempool (MarcoFalke)
Pull request description:
The chain interface has an `m_node` member, which has a pointer to the mempool global. Use the pointer instead of the global to prepare the removal of the mempool global. See #19556
ACKs for top commit:
jnewbery:
utACK fa9ee52556
darosior:
ACK fa9ee52
hebasto:
re-ACK fa9ee52556, since my [previous](https://github.com/bitcoin/bitcoin/pull/19848#pullrequestreview-482403942) review:
Tree-SHA512: 11b4c1446f0860a743fdaa67f95c52bf0262d0a4f888be0eaf07ee497448965d32be414111bf016bd568f2989cde923430e3a3889e224057b73c499f06de7199
86d4cf42d9 Increase the ip address relay branching factor for unreachable networks (Pieter Wuille)
Pull request description:
Onion addresses propagate very badly among the IPv4/IPv6 network, resulting
in difficulty for those to find each other.
The branching factor 1 is probably so low that propagations die out before
they reach another onion peer. Increase it to 1.5 on average.
ACKs for top commit:
practicalswift:
ACK 86d4cf42d9 -- patch looks correct
naumenkogs:
ACK 86d4cf4
jonatack:
ACK 86d4cf42d9. Code review, built and running with some sanity check logging. `RelayAddress()` is called by `ProcessMessage() ADDR` msg handling, from within the loop while processing each new address to relay it to a limited number of other nodes. According to git blame, the line setting `nRelayNodes` hasn't been touched since 2016 in e736772c56 *Move network-msg-processing code out of main to its own file*, which moved the line but otherwise did not change it. Running a mixed clearnet/onion node with this patch and the logging below, I'm only seeing values of `fReachable 1, nRelayNodes 2`. IIUC, I need to use the settings in `init.cpp` that call `SetReachable(*, false)`. *Edit:* with `onlynet=onion` am now seeing entries of `fReachable 0` with `nRelayNodes` values of 1 and 2.
vasild:
ACK 86d4cf42d
Tree-SHA512: 22391e16d60bcfdec9a9336728da39d68a24a183b3d1b0e8fbc038d265ca6ddf71d16db018f3678745fd9f3e9281049e42197fa0a29124833c50a9170ed6f793