ab654c7d58 Unroll Keccak-f implementation (Pieter Wuille)
3f01ddb01b Add SHA3 benchmark (Pieter Wuille)
2ac8bf9583 Implement keccak-f[1600] and SHA3-256 (Pieter Wuille)
Pull request description:
Add a simple (and initially unoptimized) Keccak/SHA3 implementation based on https://github.com/mjosaarinen/tiny_sha3/blob/master/sha3.c, as one will be needed for TORv3 support (the conversion from BIP155 encoding to .onion notation uses a SHA3-based checksum). In follow-up commits, a benchmark is added, and the Keccakf function is unrolled for a (for me) 4.9x speedup.
Test vectors are taken from https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/secure-hashing#sha3vsha3vss.
ACKs for top commit:
practicalswift:
ACK ab654c7d58 -- patch looks correct and no sanitizer complaints when doing some basic fuzz testing of the added code (remember: **don't trust: fuzz!**) :)
laanwj:
re-ACK ab654c7d58
vasild:
ACK ab654c7
Tree-SHA512: 8a91b18c46e8fb178b7ff82046cff626180362337e515b92fbbd771876e795da2ed4e3995eb4849773040287f6e687237f469a90474ac53f521fc12e0f5031d9
Use object returned from add_p2p_connection to refer to
p2ps. Add a test class attribute if it needs to be used across
many methods. Don't use the p2p property.
The field m_protect is used to protect from eviction both by bad/lagging
chain and extra outbound peers logics. Outbound block-relay peers are
always excluded from this protection.
Instead of printing "<n> of the last 100 blocks have unexpected version"
as a warning appended to UpdateTip, it is now printed in the validation
log category.
Although loadwallet() in rpcwallet.cpp assumes LoadWallet() always
assign some value to the 'status', but LoadWallet() does not do so
in some situation.
This fixes above and prevends loadwallet() returns ambiguous error code.
Adds two new features to MiniWallet:
* The fee rate is irrelevant sometimes, so just set an arbitrary default
* The utxo to spend needs to be selected manually sometimes
fa188c9c59 test: Use MiniWalet in p2p_feefilter (MarcoFalke)
fa39c62eb7 test: inline hashToHex (MarcoFalke)
Pull request description:
This introduces a minimalistic test wallet, which can be used as a drop in replacement for the Bitcoin Core wallet to create dummy transactions with a given fee rate.
ACKs for top commit:
jnewbery:
utACK fa188c9c59
Tree-SHA512: 0aad9cb14eea4f0055bd6a47cc8c8f82a16941b152598c3bf1e083aae84cca4ffa23f0b854a362a68be1b917deba1b5ec7c0207b63b0805d747ba9a7d1d82efe
No longer create a default wallet. The default wallet will still be
loaded if it exists and not other wallets were specified (anywhere,
including settings.json, bitcoin.conf, and command line).
Tests are updated to be started with -wallet= if they need the default
wallet.
Added test to wallet_startup.py testing that no default wallet is
created and that it is loaded if it exists and no other wallets were
specified.
fa7e407b50 Do not pass chain params to CheckForStaleTipAndEvictPeers twice (MarcoFalke)
Pull request description:
`PeerManager` already keeps a reference to the chain params as a member variable. No need to pass it in once again as a function parameter.
ACKs for top commit:
naumenkogs:
utACK fa7e407b50
jnewbery:
code review ACK fa7e407b50
epson121:
Code review ACK fa7e407b50
Tree-SHA512: 640c2d8adf9f1d54d0bfbdf81989064be2f5ba4b534d07d42258b372dc130f7b9c3fd087c7d28f0439678d124127f5d6f82f3139b1766f59f5ed661e7ac2a923
This commit ignores the qa-assets/ folder in case a user is running
the cov_fuzz target without DIR_FUZZ_SEED_CORPUS set. In this case,
the qa-assets folder is assumed to live in the bitcoin directory and
should be ignored by git.
This commit allows the user to specify the location of the
fuzz_seed_corpus directory on their machine when running the cov_fuzz
target. If DIR_FUZZ_SEED_CORPUS is specified, then this will be used.
Otherwise, qa-assets/fuzz_seed_corpus is assumed to be in the bitcoin
directory.
There is a keyword that allows us to break out of loops. Use it.
There's a small change in behaviour here: if we process multiple orphans
that are still orphans, then we'll only call mempool.check() once at the
end, instead of after processing each tx.
bb6a32ce99 [net processing] Move Misbehaving() to PeerManager (John Newbery)
aa114b1c9b [net_processing] Move SendBlockTransactions into PeerManager (John Newbery)
3115e00f75 [net processing] Move MaybePunishPeerForTx to PeerManager (John Newbery)
e662e2d42a [net processing] Move ProcessOrphanTx to PeerManager (John Newbery)
b70cd890e3 [net processing] Move MaybePunishNodeForBlock into PeerManager (John Newbery)
d7778351bf [net processing] Move ProcessHeadersMessage to PeerManager (John Newbery)
64f6162651 [whitespace] tidy up indentation after scripted diff (John Newbery)
58bd369b0d scripted-diff: [net processing] Rename PeerLogicValidation to PeerManager (John Newbery)
2297b26b3c [net_processing] Pass chainparams to PeerLogicValidation constructor (John Newbery)
824bbd1ffb [move only] Collect all private members of PeerLogicValidation together (John Newbery)
Pull request description:
Continues the work of moving net_processing logic into PeerLogicValidation. See https://github.com/bitcoin/bitcoin/pull/19704 and https://github.com/bitcoin/bitcoin/pull/19607#discussion_r462032894 for motivation.
This PR also renames `PeerLogicValidation` to `PeerManager` as suggested in https://github.com/bitcoin/bitcoin/pull/10756#pullrequestreview-53892618.
ACKs for top commit:
MarcoFalke:
re-ACK bb6a32ce99 only change is rebase due to conflict in struct NodeContext and variable rename 🤸
hebasto:
re-ACK bb6a32ce99, only rebased, and added renaming `s/peer_logic/peerman/` into scripted-diff since my [previous](https://github.com/bitcoin/bitcoin/pull/19791#pullrequestreview-483118079) review (verified with `git range-diff`).
Tree-SHA512: a2de4a521688fd25125b401e5575402c52b328a0fa27b3010567008d4f596b960aabbd02b2d81f42658f88f4365443fadb1008150a62fbcea123fb42d85a2c21
Block-relay-only peers were introduced by #15759. According to its
author, it was intented to make them only immune to outbound peer
rotation-based eviction and not from all eviction as modified comment
leans to think of.
Clearly indicate that outbound block-relay peers aren't protected
from eviction by the bad/lagging chain logic.
-BEGIN VERIFY SCRIPT-
sed -i 's/PeerLogicValidation/PeerManager/g' $(git grep -l PeerLogicValidation ./src ./test)
sed -i 's/peer_logic/peerman/g' $(git grep -l peer_logic ./src ./test)
-END VERIFY SCRIPT-
PeerLogicValidation was originally net_processing's implementation to
the validation interface. It has since grown to contain much of
net_processing's logic. Therefore rename it to reflect its
responsibilities.
Suggested in
https://github.com/bitcoin/bitcoin/pull/10756#pullrequestreview-53892618.
Keep a references to chainparams, rather than calling the global
Params() function every time it's needed. This is fine, since
globalChainParams does not get updated once it's been set, and it's
available at the point of constructing the PeerLogicValidation object.
We don't have a project style for ordering class members, but it always
makes sense to have no more than one of each public/protected/private
specifier.
Also move documentation for MaybeDiscourageAndDisconnect to the header.
296be8f58e Get rid of unused functions CTxMemPool::GetMemPoolChildren, CTxMemPool::GetMemPoolParents (Jeremy Rubin)
46d955d196 Remove mapLinks in favor of entry inlined structs with iterator type erasure (Jeremy Rubin)
Pull request description:
Currently we have a peculiar data structure in the mempool called maplinks. Maplinks job is to track the in-pool children and parents of each transaction. This PR can be primarily understood and reviewed as a simple refactoring to remove this extra data structure, although it comes with a nice memory and performance improvement for free.
Maplinks is particularly peculiar because removing it is not as simple as just moving it's inner structure to the owning CTxMempoolEntry. Because TxLinks (the class storing the setEntries for parents and children) store txiters to each entry in the mempool corresponding to the parent or child, it means that the TxLinks type is "aware" of the boost multiindex (mapTx) it's coming from, which is in turn, aware of the entry type stored in mapTx. Thus we used maplinks to store this entry associated data we in an entirely separate data structure just to avoid a circular type reference caused by storing a txiter inside a CTxMempoolEntry.
It turns out, we can kill this circular reference by making use of iterator_to multiindex function and std::reference_wrapper. This allows us to get rid of the maplinks data structure and move the ownership of the parents/child sets to the entries themselves.
The benefit of this good all around, for any of the reasons given below the change would be acceptable, and it doesn't make the code harder to reason about or worse in any respect (as far as I can tell, there's no tradeoff).
### Simpler ownership model
No longer having to consistency check that mapLinks did have records for our CTxMempoolEntry, impossible to have a mapLinks entry outlive or incorrectly die before a CTxMempoolEntry.
### Memory Usage
We get rid of a O(Transactions) sized map in the mempool, which is a long lived data structure.
### Performance
If you have a CTxMemPoolEntry, you immediately know the address of it's children/parents, rather than having to do a O(log(Transactions)) lookup via maplinks (which we do very often). We do it in *so many* places that a true benchmark has to look at a full running node, but it is easy enough to show an improvement in this case.
The ComplexMemPool shows a good coherence check that we see the expected result of it being 12.5% faster / 1.14x faster.
```
Before:
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 1.40462, 0.277222, 0.285339, 0.279793
After:
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 1.22586, 0.243831, 0.247076, 0.244596
```
The ComplexMemPool benchmark only checks doing addUnchecked and TrimToSize for 800 transactions. While this bench does a good job of hammering the relevant types of function, it doesn't test everything.
Subbing in 5000 transactions shows a that the advantage isn't completely wiped out by other asymptotic factors (this isn't the only bottleneck in growing the mempool), but it's only a bit proportionally slower (10.8%, 1.12x), which adds evidence that this will be a good change for performance minded users.
```
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 59.1321, 11.5919, 12.235, 11.7068
# Benchmark, evals, iterations, total, min, max, median
ComplexMemPool, 5, 1, 52.1307, 10.2641, 10.5206, 10.4306
```
I don't think it's possible to come up with an example of where a maplinks based design would have better performance, but it's something for reviewers to consider.
# Discussion
## Why maplinks in the first place?
I spoke with the author of mapLinks (sdaftuar) a while back, and my recollection from our conversation was that it was implemented because he did not know how to resolve the circular dependency at the time, and there was no other reason for making it a separate map.
## Is iterator_to weird?
iterator_to is expressly for this purpose, see https://www.boost.org/doc/libs/1_51_0/libs/multi_index/doc/tutorial/indices.html#iterator_to
> iterator_to provides a way to retrieve an iterator to an element from a pointer to the element, thus making iterators and pointers interchangeable for the purposes of element pointing (not so for traversal) in many situations. This notwithstanding, it is not the aim of iterator_to to promote the usage of pointers as substitutes for real iterators: the latter are specifically designed for handling the elements of a container, and not only benefit from the iterator orientation of container interfaces, but are also capable of exposing many more programming bugs than raw pointers, both at compile and run time. iterator_to is thus meant to be used in scenarios where access via iterators is not suitable or desireable:
>
> - Interoperability with preexisting APIs based on pointers or references.
> - Publication of pointer-based interfaces (for instance, when designing a C-compatible library).
> - The exposure of pointers in place of iterators can act as a type erasure barrier effectively decoupling the user of the code from the implementation detail of which particular container is being used. Similar techniques, like the famous Pimpl idiom, are used in large projects to reduce dependencies and build times.
> - Self-referencing contexts where an element acts upon its owner container and no iterator to itself is available.
In other words, iterator_to is the perfect tool for the job by the last reason given. Under the hood it should just be a simple pointer cast and have no major runtime overhead (depending on if the function call is inlined).
Edit by laanwj: removed at sign from the description
ACKs for top commit:
jonatack:
re-ACK 296be8f per `git range-diff ab338a19 3ba1665 296be8f`, sanity check gcc 10.2 debug build is clean.
hebasto:
re-ACK 296be8f58e, only rebased since my [previous](https://github.com/bitcoin/bitcoin/pull/19478#pullrequestreview-482400727) review (verified with `git range-diff`).
Tree-SHA512: f5c30a4936fcde6ae32a02823c303b3568a747c2681d11f87df88a149f984a6d3b4c81f391859afbeb68864ef7f6a3d8779f74a58e3de701b3d51f78e498682e