51e9393c1f refactor: s/command/msg_type/ in CNetMsgMaker and CSerializedNetMsg (Sebastian Falbesoner)
Pull request description:
Follow-up PR for #18533 -- another small step towards getting rid of the confusing "command" terminology. Also see PR #18610 which tackled the functional tests.
ACKs for top commit:
MarcoFalke:
ACK 51e9393c1f
Tree-SHA512: bb6f05a7be6823d5c4eab1d05b31fee944e700946827ad9425d59a3957fd879776c88c606319cbe9832d9451b275baedf913b71429ea3e01e4e82bf2d419e819
96954d1794 DNS seeds: don't query DNS while network is inactive (Anthony Towns)
fa5894f7f5 DNS seeds: wait for 5m instead of 11s if 1000+ peers are known (Anthony Towns)
Pull request description:
Changes the logic for querying DNS seeds: after this PR, if there's less than 1000 entries in addrman, it will still usually query DNS seeds after 11s (unless the first few peers tried mostly succeed), but if there's more than 1000 entries it won't try DNS seeds until 5 minutes have passed without getting multiple outbound peers. (If there's 0 entries in addrman, it will still immediately query the DNS seeds). Additionally, delays querying DNS seeds while the p2p network is not active.
Fixes#15434
ACKs for top commit:
fanquake:
ACK 96954d1794 - Ran some tests of different scenarios. More documentation is being added in #19084.
ariard:
Tested ACK 96954d1, on Debian 9.1. Both MANY_PEERS/FEW_PEERS cases work.
Sjors:
tACK 96954d1 (rebased on master) on macOS 10.15.4. It found it useful to run with `-debug=addrman` and change `DNSSEEDS_DELAY_MANY_PEERS` to something lower to test the behaviour, as well as renaming `peers.dat` to test the peer threshold.
naumenkogs:
utACK 96954d1794
Tree-SHA512: 73693db3da73bf8e76c3df9e9c82f0a7fb08049187356eac2575c4ffa455f76548dd1c86a11fc6beea8a3baf0ba020e047bebe927883c731383ec72442356005
5478d6c099 logging: thread safety annotations (Anthony Towns)
e685ca1992 util/system.cpp: add thread safety annotations for dir_locks (Anthony Towns)
a788789948 test/checkqueue_tests: thread safety annotations (Anthony Towns)
479c5846f7 rpc/blockchain.cpp: thread safety annotations for latestblock (Anthony Towns)
8b5af3d4c1 net: fMsgProcWake use LOCK instead of lock_guard (Anthony Towns)
de7c5f41ab wallet/wallet.h: Remove mutexScanning which was only protecting a single atomic bool (Anthony Towns)
c3cf2f5501 rpc/blockchain.cpp: Remove g_utxosetscan mutex that is only protecting a single atomic variable (Anthony Towns)
Pull request description:
In a few cases we need to use `std::mutex` rather than the sync.h primitives. But `std::lock_guard<std::mutex>` doesn't include the clang thread safety annotations unless you also use clang's C library, which means you can't indicate when variables should be guarded by `std::mutex` mutexes.
This adds an annotated version of `std::lock_guard<std::mutex>` to threadsafety.h to fix that, and modifies places where `std::mutex` is used to take advantage of the annotations.
It's based on top of #16112, and turns the thread safety comments included there into annotations.
It also changes the RAII classes in wallet/wallet.h and rpc/blockchain.cpp to just use the atomic<bool> flag for synchronisation rather than having a mutex that doesn't actually guard anything as well.
ACKs for top commit:
MarcoFalke:
ACK 5478d6c099🗾
hebasto:
re-ACK 5478d6c099, only renamed s/`MutexGuard`/`LockGuard`/, and dropped the commit "test/util_threadnames_tests: add thread safety annotations" since the [previous](https://github.com/bitcoin/bitcoin/pull/16127#pullrequestreview-414184113) review.
ryanofsky:
Code review ACK 5478d6c099. Thanks for taking suggestions! Only changes since last review are dropping thread rename test commit d53072ec730d8eec5a5b72f7e65a54b141e62b19 and renaming mutex guard to lock guard
Tree-SHA512: 7b00d31f6f2b5a222ec69431eb810a74abf0542db3a65d1bbad54e354c40df2857ec89c00b4a5e466c81ba223267ca95f3f98d5fbc1a1d052a2c3a7d2209790a
static constexpr CMessageHeader::HEADER_SIZE is already used in this file,
src/net.cpp, in 2 instances. This commit replaces the remaining 2 integer
values with it and adds the explicit include header.
Co-authored by: Gleb Naumenko <naumenko.gs@gmail.com>
If 1000 potential peers are known, wait for 5m before querying DNS seeds
for more peers, since eventually the addresses we already know should
get us connected. Also check every 11s whether we've got enough active
outbounds that DNS seeds aren't worth querying, and exit the dnsseed
thread early if so.
16d6113f4f Refactor message transport packaging (Jonas Schnelli)
Pull request description:
This PR factors out transport packaging logic from `CConnman::PushMessage()`.
It's similar to #16202 (where we refactor deserialization).
This allows implementing a new message transport protocol like BIP324.
ACKs for top commit:
dongcarl:
ACK 16d6113f4f FWIW
ariard:
Code review ACK 16d6113
elichai:
semiACK 16d6113f4f ran functional+unit tests.
MarcoFalke:
ACK 16d6113f4f🙎
Tree-SHA512: 8c2f8ab9f52e9b94327973ae15019a08109d5d9f9247492703a842827c5b5d634fc0411759e0bb316d824c586614b0220c2006410851933613bc143e58f7e6c1
3c1bc40205 Add extra logging of asmap use and bucketing (Gleb Naumenko)
e4658aa8ea Return mapped AS in RPC call getpeerinfo (Gleb Naumenko)
ec45646de9 Integrate ASN bucketing in Addrman and add tests (Gleb Naumenko)
8feb4e4b66 Add asmap utility which queries a mapping (Gleb Naumenko)
Pull request description:
This PR attempts to solve the problem explained in #16599.
A particular attack which encouraged us to work on this issue is explained here [[Erebus Attack against Bitcoin Peer-to-Peer Network](https://erebus-attack.comp.nus.edu.sg/)] (by @muoitranduc)
Instead of relying on /16 prefix to diversify the connections every node creates, we would instead rely on the (ip -> ASN) mapping, if this mapping is provided.
A .map file can be created by every user independently based on a router dump, or provided along with the Bitcoin release. Currently we use the python scripts written by @sipa to create a .map file, which is no larger than 2MB (awesome!).
Here I suggest adding a field to peers.dat which would represent a hash of asmap file used while serializing addrman (or 0 for /16 prefix legacy approach).
In this case, every time the file is updated (or grouping method changed), all buckets will be re-computed.
I believe that alternative selective re-bucketing for only updated ranges would require substantial changes.
TODO:
- ~~more unit tests~~
- ~~find a way to test the code without including >1 MB mapping file in the repo.~~
- find a way to check that mapping file is not corrupted (checksum?)
- comments and separate tests for asmap.cpp
- make python code for .map generation public
- figure out asmap distribution (?)
~Interesting corner case: I’m using std::hash to compute a fingerprint of asmap, and std::hash returns size_t. I guess if a user updates the OS to 64-bit, then the hash of asap will change? Does it even matter?~
ACKs for top commit:
laanwj:
re-ACK 3c1bc40205
jamesob:
ACK 3c1bc40205 ([`jamesob/ackr/16702.3.naumenkogs.p2p_supplying_and_using`](https://github.com/jamesob/bitcoin/tree/ackr/16702.3.naumenkogs.p2p_supplying_and_using))
jonatack:
ACK 3c1bc40205
Tree-SHA512: e2dc6171188d5cdc2ab2c022fa49ed73a14a0acb8ae4c5ffa970172a0365942a249ad3d57e5fb134bc156a3492662c983f74bd21e78d316629dcadf71576800c
-BEGIN VERIFY SCRIPT-
# Delete outdated alias for RecursiveMutex
sed -i -e '/CCriticalSection/d' ./src/sync.h
# Replace use of outdated alias with RecursiveMutex
sed -i -e 's/CCriticalSection/RecursiveMutex/g' $(git grep -l CCriticalSection)
-END VERIFY SCRIPT-
Instead of using /16 netgroups to bucket nodes in Addrman for connection
diversification, ASN, which better represents an actor in terms
of network-layer infrastructure, is used.
For testing, asmap.raw is used. It represents a minimal
asmap needed for testing purposes.
b6d2183858 Minor refactoring to remove implied m_addr_relay_peer. (User)
a552e8477c added asserts to check m_addr_known when it's used (User)
090b75c14b p2p: Avoid allocating memory for addrKnown where we don't need it (User)
Pull request description:
We should allocate memory for addrKnown filter only for those peers which are expected to participate in address relay.
Currently, we do it for all peers (including SPV and block-relay-only), which results in extra RAM where it's not needed.
Upd:
In future, we would still allow SPVs to ask for addrs, so allocation still will be done by default.
However, they will be able to opt-out via [this proposal](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-October/017428.html) and then we could save some more memory.
This PR still saves memory for block-relay-only peers immediately after merging.
Top commit has no ACKs.
Tree-SHA512: e84d93b2615556d466f5ca0e543580fde763911a3bfea3127c493ddfaba8f05c8605cb94ff795d165af542b594400995a2c51338185c298581408687e7812463
c72906dcc1 refactor: Remove redundant c_str() calls in formatting (Wladimir J. van der Laan)
Pull request description:
Our formatter, tinyformat, *never* needs `c_str()` for strings. Still, many places call it redundantly, resulting in longer code and a slight overhead.
Remove redundant `c_str()` calls for:
- `strprintf`
- `LogPrintf`
- `tfm::format`
(also, combined with #17095, I think this improves logging in case of unexpected embedded NULL characters)
ACKs for top commit:
ryanofsky:
Code review ACK c72906dcc1. Easy to review with `git log -p -n1 --word-diff-regex=. -U0 c72906dcc11a73fa06a0adf97557fa756b551bee`
Tree-SHA512: 9e21e7bed8aaff59b8b8aa11571396ddc265fb29608c2545b1fcdbbb36d65b37eb361db6688dd36035eab0c110f8de255375cfda50df3d9d7708bc092f67fefc
ed2dc5e48a Add override/final modifiers to V1TransportDeserializer (Pieter Wuille)
f342a5e61a Make resetting implicit in TransportDeserializer::Read() (Pieter Wuille)
6a91499496 Remove oversized message detection from log and interface (Pieter Wuille)
b0e10ff4df Force CNetMessage::m_recv to use std::move (Jonas Schnelli)
efecb74677 Use adapter pattern for the network deserializer (Jonas Schnelli)
1a5c656c31 Remove transport protocol knowhow from CNetMessage / net processing (Jonas Schnelli)
6294ecdb8b Refactor: split network transport deserializing from message container (Jonas Schnelli)
Pull request description:
**This refactors the network message deserialization.**
* It transforms the `CNetMessage` into a transport protocol agnostic message container.
* A new class `TransportDeserializer` (unique pointer of `CNode`) is introduced, handling the network buffer reading and the decomposing to a `CNetMessage`
* **No behavioral changes** (in terms of disconnecting, punishing)
* Moves the checksum finalizing into the `SocketHandler` thread (finalizing was in `ProcessMessages` before)
The **optional last commit** makes the `TransportDeserializer` following an adapter pattern (polymorphic interface) to make it easier to later add a V2 transport protocol deserializer.
Intentionally not touching the sending part.
Pre-Requirement for BIP324 (v2 message transport protocol).
Replacement for #14046 and inspired by a [comment](https://github.com/bitcoin/bitcoin/pull/14046#issuecomment-431528330) from sipa
ACKs for top commit:
promag:
Code review ACK ed2dc5e48a.
marcinja:
Code review ACK ed2dc5e48a
ryanofsky:
Code review ACK ed2dc5e48a. 4 cleanup commits added since last review. Unaddressed comments:
ariard:
Code review and tested ACK ed2dc5e.
Tree-SHA512: bab8d87464e2e8742529e488ddcdc8650f0c2025c9130913df00a0b17ecdb9a525061cbbbd0de0251b76bf75a8edb72e3ad0dbf5b79e26f2ad05d61b4e4ded6d
6170ec5d3a Do not query all DNS seed at once (Pieter Wuille)
Pull request description:
Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them.
Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on.
This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them.
ACKs for top commit:
Sjors:
ACK 6170ec5d3
sdaftuar:
ACK 6170ec5d3a
jonasschnelli:
utACK 6170ec5d3a - I think the risk of a single seeder codebase is orthogonal to this PR. Such risks could also be interpreted differently (diversity could also increase the risk based on the threat model).
fanquake:
ACK 6170ec5d3a - Agree with the reasoning behind the change. Did some testing with and without `-forcednsseed` and/or a `peers.dat` and monitored the DNS activity.
Tree-SHA512: 33f6be5f924a85d312303ce272aa8f8d5e04cb616b4b492be98832e3ff37558d13d2b16ede68644ad399aff2bf5ff0ad33844e55eb40b7f8e3fddf9ae43add57
0ba08020c9 Disconnect peers violating blocks-only mode (Suhas Daftuar)
937eba91e1 doc: improve comments relating to block-relay-only peers (Suhas Daftuar)
430f489027 Don't relay addr messages to block-relay-only peers (Suhas Daftuar)
3a5e885306 Add 2 outbound block-relay-only connections (Suhas Daftuar)
b83f51a4bb Add comment explaining intended use of m_tx_relay (Suhas Daftuar)
e75c39cd42 Check that tx_relay is initialized before access (Suhas Daftuar)
c4aa2ba822 [refactor] Change tx_relay structure to be unique_ptr (Suhas Daftuar)
4de0dbac9b [refactor] Move tx relay state to separate structure (Suhas Daftuar)
26a93bce29 Remove unused variable (Suhas Daftuar)
Pull request description:
Transaction relay is optimized for a combination of redundancy/robustness as well as bandwidth minimization -- as a result transaction relay leaks information that adversaries can use to infer the network topology.
Network topology is better kept private for (at least) two reasons:
(a) Knowledge of the network graph can make it easier to find the source IP of a given transaction.
(b) Knowledge of the network graph could be used to split a target node or nodes from the honest network (eg by knowing which peers to attack in order to achieve a network split).
We can eliminate the risks of (b) by separating block relay from transaction relay; inferring network connectivity from the relay of blocks/block headers is much more expensive for an adversary.
After this commit, bitcoind will make 2 additional outbound connections that are only used for block relay. (In the future, we might consider rotating our transaction-relay peers to help limit the effects of (a).)
ACKs for top commit:
sipa:
ACK 0ba08020c9
ajtowns:
ACK 0ba08020c9 -- code review, ran tests. ran it on mainnet for a couple of days with MAX_BLOCKS_ONLY_CONNECTIONS upped from 2 to 16 and didn't observe any unexpected behaviour: it disconnected a couple of peers that tried sending inv's, and it successfully did compact block relay with some block relay peers.
TheBlueMatt:
re-utACK 0ba08020c9. Pointed out that stats.fRelayTxes was sometimes uninitialized for blocksonly peers (though its not a big deal and only effects RPC), which has since been fixed here. Otherwise changes are pretty trivial so looks good.
jnewbery:
utACK 0ba08020c9
jamesob:
ACK 0ba08020c9
Tree-SHA512: 4c3629434472c7dd4125253417b1be41967a508c3cfec8af5a34cad685464fbebbb6558f0f8f5c0d4463e3ffa4fa3aabd58247692cb9ab8395f4993078b9bcdf
Transaction relay is primarily optimized for balancing redundancy/robustness
with bandwidth minimization -- as a result transaction relay leaks information
that adversaries can use to infer the network topology.
Network topology is better kept private for (at least) two reasons:
(a) Knowledge of the network graph can make it easier to find the source IP of
a given transaction.
(b) Knowledge of the network graph could be used to split a target node or
nodes from the honest network (eg by knowing which peers to attack in order to
achieve a network split).
We can eliminate the risks of (b) by separating block relay from transaction
relay; inferring network connectivity from the relay of blocks/block headers is
much more expensive for an adversary.
After this commit, bitcoind will make 2 additional outbound connections that
are only used for block relay. (In the future, we might consider rotating our
transaction-relay peers to help limit the effects of (a).)