dcf0cb4776 tor: make a TORv3 hidden service instead of TORv2 (Vasil Dimov)
353a3fdaad net: advertise support for ADDRv2 via new message (Vasil Dimov)
201a4596d9 net: CAddress & CAddrMan: (un)serialize as ADDRv2 (Vasil Dimov)
1d3ec2a1fd Support bypassing range check in ReadCompactSize (Pieter Wuille)
Pull request description:
This PR contains the two remaining commits from #19031 to complete the [BIP155](https://github.com/bitcoin/bips/blob/master/bip-0155.mediawiki) implementation:
`net: CAddress & CAddrMan: (un)serialize as ADDRv2`
`net: advertise support for ADDRv2 via new message`
plus one more commit:
`tor: make a TORv3 hidden service instead of TORv2`
ACKs for top commit:
jonatack:
re-ACK dcf0cb4776 per `git diff 9b56a68 dcf0cb4` only change since last review is an update to the release notes which partially picked up the suggested text. Running a node on this branch and addnode-ing to 6 other Tor v3 nodes, I see "addrv2" and "sendaddrv2" messages in getpeerinfo in both the "bytesrecv_per_msg" and "bytessent_per_msg" JSON objects.
sipa:
ACK dcf0cb4776
hebasto:
re-ACK dcf0cb4776, the node works flawlessly in all of the modes: Tor-only, clearnet-only, mixed.
laanwj:
Edit: I have to retract this ACK for now, I'm having some problems with this PR on a FreeBSD node. It drops all outgoing connections with this dcf0cb4776 merged on master (12a1c3ad1a).
ariard:
Code Review ACK dcf0cb4
Tree-SHA512: 28d4d0d817b8664d2f4b18c0e0f31579b2f0f2d23310ed213f1f436a4242afea14dfbf99e07e15889bc5c5c71ad50056797e9307ff8a90e96704f588a6171308
Introduce a new message `sendaddrv2` to signal support for ADDRv2.
Send the new message immediately after sending the `VERACK` message.
Add support for receiving and parsing ADDRv2 messages.
Send ADDRv2 messages (instead of ADDR) to a peer if he has
advertised support for it.
Co-authored-by: Carl Dong <contact@carldong.me>
Recognizing addresses from those networks allows us to accept and gossip
them, even though we don't know how to connect to them (yet).
Co-authored-by: eriknylund <erik@daychanged.com>
Before this change `CNetAddr::ip` was a fixed-size array of 16 bytes,
not being able to store larger addresses (e.g. TORv3) and encoded
smaller ones as 16-byte IPv6 addresses.
Change its type to `prevector`, so that it can hold larger addresses and
do not disguise non-IPv6 addresses as IPv6. So the IPv4 address
`1.2.3.4` is now encoded as `01020304` instead of
`00000000000000000000FFFF01020304`.
Rename `CNetAddr::ip` to `CNetAddr::m_addr` because it is not an "IP" or
"IP address" (TOR addresses are not IP addresses).
In order to preserve backward compatibility with serialization (where
e.g. `1.2.3.4` is serialized as `00000000000000000000FFFF01020304`)
introduce `CNetAddr` dedicated legacy serialize/unserialize methods.
Adjust `CSubNet` accordingly. Still use `CSubNet::netmask[]` of fixed 16
bytes, but use the first 4 for IPv4 (not the last 4). Only allow
subnetting for IPv4 and IPv6.
Co-authored-by: Carl Dong <contact@carldong.me>
Before this change, we would analyze the contents of `CNetAddr::ip[16]`
in order to tell which type is an address. Change this by introducing a
new member `CNetAddr::m_net` that explicitly tells the type of the
address.
This is necessary because in BIP155 we will not be able to tell the
address type by just looking at its raw representation (e.g. both TORv3
and I2P are "seemingly random" 32 bytes).
As a side effect of this change we no longer need to store IPv4
addresses encoded as IPv6 addresses - we can store them in proper 4
bytes (will be done in a separate commit). Also the code gets
somewhat simplified - instead of
`memcmp(ip, pchIPv4, sizeof(pchIPv4)) == 0` we can use
`m_net == NET_IPV4`.
Co-authored-by: Carl Dong <contact@carldong.me>
Do not access `CNetAddr::ip` directly from `CService` methods.
This improvement will help later when we change the type of
`CNetAddr::ip` (in the BIP155 implementation).
Co-authored-by: Carl Dong <contact@carldong.me>
1cabbddbca refactor: Use uint16_t instead of unsigned short (Aaron Hook)
Pull request description:
I wanted to see if the `up for grabs` label works and looked at PR #17822 originally opend by ahook I saw it had many acks for example by jonatack and practicalswift but needed rebasing.
So I checked out the remote branch rebased it resolved three conflicts and continued the rebase.
Hope everything is as expected (:
ACKs for top commit:
sipsorcery:
ACK 1cabbddbca.
practicalswift:
ACK 1cabbddbca -- patch looks correct :)
laanwj:
ACK 1cabbddbca
hebasto:
ACK 1cabbddbca, I have reviewed the code and it looks OK, I agree it can be merged.
Tree-SHA512: 0e6bf64f274aae5dacb188358b4d5f65ccb207d4f70922f039bc4ed7934709418ddad19f8bfb7462517427837c3d2bb3f86ef284bb40e87119aad2a1e148d9d6
This patch improves performance and resource usage around IP
addresses that are banned for misbehavior. They're already not
actually banned, as connections from them are still allowed,
but they are preferred for eviction if the inbound connection
slots are full.
Stop treating these like manually banned IP ranges, and instead
just keep them in a rolling Bloom filter of misbehaving nodes,
which isn't persisted to disk or exposed through the ban
framework. The effect remains the same: preferred for eviction,
avoided for outgoing connections, and not relayed to other peers.
Also change the name of this mechanism to better reflect reality;
they're not banned, just discouraged.
Contains release notes and several interface improvements by
John Newbery.
f9ee0f37c2 Add comments to CustomUintFormatter (Pieter Wuille)
4eb5643e35 Convert everything except wallet/qt to new serialization (Pieter Wuille)
2b1f85e8c5 Convert blockencodings_tests to new serialization (Pieter Wuille)
73747afbbe Convert merkleblock to new serialization (Pieter Wuille)
d06fedd1bc Add SER_READ and SER_WRITE for read/write-dependent statements (Russell Yanofsky)
6f9a1e5ad0 Extend CustomUintFormatter to support enums (Russell Yanofsky)
769ee5fa00 Merge BigEndian functionality into CustomUintFormatter (Pieter Wuille)
Pull request description:
The next step of changes from #10785.
This:
* Adds support for enum serialization to `CustomUintFormatter`, used in `CAddress` for service flags.
* Merges `BigEndian` into `CustomUintFormatter`, used in `CNetAddr` for port numbers.
* Converts everything (except wallet and gui) to use the new serialization framework.
ACKs for top commit:
MarcoFalke:
re-ACK f9ee0f37c2, only change is new documentation commit for CustomUintFormatter 📂
ryanofsky:
Code review ACK f9ee0f37c2. Just new commit adding comment since last review
jonatack:
Code review re-ACK f9ee0f37c2 only change since last review is an additional commit adding Doxygen documentation for `CustomUintFormatter`.
Tree-SHA512: e7a0a36afae592d5a4ff8c81ae04d858ac409388e361f2bc197d9a78abca45134218497ab2dfd6d031e0cce0ca586cf857077b7c6ce17fccf67e2d367c1b6cd4
748977690e Add asmap_direct fuzzer that tests Interpreter directly (Pieter Wuille)
7cf97fda15 Make asmap Interpreter errors fatal and fuzz test it (Pieter Wuille)
c81aefc537 Add additional effiency checks to sanity checker (Pieter Wuille)
fffd8dca2d Add asmap sanity checker (Pieter Wuille)
5feefbe6e7 Improve asmap Interpret checks and document failures (Pieter Wuille)
2b3dbfa5a6 Deal with decoding failures explicitly in asmap Interpret (Pieter Wuille)
1479007a33 Introduce Instruction enum in asmap (Pieter Wuille)
Pull request description:
This improves/documents the failure cases inside the asmap interpreter. None of the changes are bug fixes (they only change behavior for corrupted asmap files), but they may make things easier to follow.
In a second step, a sanity checker is added that effectively executes every potential code path through the asmap file, checking the same failure cases as the interpreter, and more. It takes around 30 ms to run for me for a 1.2 MB asmap file.
I've verified that this accepts asmap files constructed by https://github.com/sipa/asmap/blob/master/buildmap.py with a large dataset, and no longer accepts it with 1 bit changed in it.
ACKs for top commit:
practicalswift:
ACK 748977690e modulo feedback below.
jonatack:
ACK 748977690e code review, regular build/tests/ran bitcoin with -asmap, fuzz build/ran both fuzzers overnight.
fjahr:
ACK 748977690e
Tree-SHA512: d876df3859735795c857c83e7155ba6851ce839bdfa10c18ce2698022cc493ce024b5578c1828e2a94bcdf2552c2f46c392a251ed086691b41959e62a6970821
c86bc14408 Make asmap Interpret tolerant of malicious map data (Pieter Wuille)
38c2395d7a Use ASNs for mapped IPv4 addresses correctly (Pieter Wuille)
6f8c937312 Mark asmap const in statistics code (Pieter Wuille)
d58bcdc4b5 Avoid asmap copies in initialization (Pieter Wuille)
Pull request description:
Here are a few things to improve in the asmap implementation. The first two commits are just code improvements. The last one is a bugfix (the exsting code wouldn't correctly apply ASN lookups to mapped/embedded IPv4 addresses).
ACKs for top commit:
practicalswift:
ACK c86bc14408 -- patch looks correct
naumenkogs:
utACK c86bc14
laanwj:
ACK c86bc14408
jonatack:
ACK c86bc14408 code looks correct, built/ran tests, bitcoind with -asmap pointed to asmap/demo.map
Tree-SHA512: 1036f43152754d621bfbecfd3b7c7276e4670598fcaed42a3d275e51fa2cf3653e2c9e9cfa714f6c7719362541510e92171e076ac4169b55a0cc8908b2d514c0
3c1bc40205 Add extra logging of asmap use and bucketing (Gleb Naumenko)
e4658aa8ea Return mapped AS in RPC call getpeerinfo (Gleb Naumenko)
ec45646de9 Integrate ASN bucketing in Addrman and add tests (Gleb Naumenko)
8feb4e4b66 Add asmap utility which queries a mapping (Gleb Naumenko)
Pull request description:
This PR attempts to solve the problem explained in #16599.
A particular attack which encouraged us to work on this issue is explained here [[Erebus Attack against Bitcoin Peer-to-Peer Network](https://erebus-attack.comp.nus.edu.sg/)] (by @muoitranduc)
Instead of relying on /16 prefix to diversify the connections every node creates, we would instead rely on the (ip -> ASN) mapping, if this mapping is provided.
A .map file can be created by every user independently based on a router dump, or provided along with the Bitcoin release. Currently we use the python scripts written by @sipa to create a .map file, which is no larger than 2MB (awesome!).
Here I suggest adding a field to peers.dat which would represent a hash of asmap file used while serializing addrman (or 0 for /16 prefix legacy approach).
In this case, every time the file is updated (or grouping method changed), all buckets will be re-computed.
I believe that alternative selective re-bucketing for only updated ranges would require substantial changes.
TODO:
- ~~more unit tests~~
- ~~find a way to test the code without including >1 MB mapping file in the repo.~~
- find a way to check that mapping file is not corrupted (checksum?)
- comments and separate tests for asmap.cpp
- make python code for .map generation public
- figure out asmap distribution (?)
~Interesting corner case: I’m using std::hash to compute a fingerprint of asmap, and std::hash returns size_t. I guess if a user updates the OS to 64-bit, then the hash of asap will change? Does it even matter?~
ACKs for top commit:
laanwj:
re-ACK 3c1bc40205
jamesob:
ACK 3c1bc40205 ([`jamesob/ackr/16702.3.naumenkogs.p2p_supplying_and_using`](https://github.com/jamesob/bitcoin/tree/ackr/16702.3.naumenkogs.p2p_supplying_and_using))
jonatack:
ACK 3c1bc40205
Tree-SHA512: e2dc6171188d5cdc2ab2c022fa49ed73a14a0acb8ae4c5ffa970172a0365942a249ad3d57e5fb134bc156a3492662c983f74bd21e78d316629dcadf71576800c
Instead of using /16 netgroups to bucket nodes in Addrman for connection
diversification, ASN, which better represents an actor in terms
of network-layer infrastructure, is used.
For testing, asmap.raw is used. It represents a minimal
asmap needed for testing purposes.
The original ORCHID prefix was deprecated as of 2014-03, the new
ORCHIDv2 prefix was allocated by RFC7343 as of 2014-07. We did not
consider the original ORCHID prefix routable, and I don't see any reason
to consider the new one to be either.
b7b36decaf fix uninitialized read when stringifying an addrLocal (Kaz Wesley)
8ebbef0169 add test demonstrating addrLocal UB (Kaz Wesley)
Pull request description:
Reachable from either place where SetIP is used when all of:
- our best-guess addrLocal for a peer is IPv4
- the peer tells us it's reaching us at an IPv6 address
- NET logging is enabled
In that case, SetIP turns an IPv4 address into an IPv6 address without
setting the scopeId, which is subsequently read in GetSockAddr during
CNetAddr::ToStringIP and passed to getnameinfo. Fix by ensuring every
constructor initializes the scopeId field with something.
Tree-SHA512: 8f0159750995e08b985335ccf60a273ebd09003990bcf2c3838b550ed8dc2659552ac7611650e6dd8e29d786fe52ed57674f5880f2e18dc594a7a863134739e3
Reachable from either place where SetIP is used when our best-guess
addrLocal for a peer is IPv4, but the peer tells us it's reaching us at
an IPv6 address.
In that case, SetIP turns an IPv4 address into an IPv6 address without
setting the scopeId, which is subsequently read in GetSockAddr during
CNetAddr::ToStringIP and passed to getnameinfo. Fix by ensuring every
constructor initializes the scopeId field with something.
-BEGIN VERIFY SCRIPT-
sed --in-place'' --expression='s/NET_TOR/NET_ONION/g' $(git grep -I --files-with-matches 'NET_TOR')
-END VERIFY SCRIPT-
The --in-place'' hack is required for sed on macOS to edit files in-place without passing a backup extension.
Support is added to serialize arrays of type char or unsigned char directly,
without any wrappers. All invocations of the FLATDATA wrappers that are
obsoleted by this are removed.
This includes a patch by Russell Yanofsky to make char casting type safe.
The serialization of CSubNet is changed to serialize a bool directly rather
than though FLATDATA. This makes the serialization independent of the size
of the bool type (and will use 1 byte everywhere).
Identified with `cppcheck --enable=unusedFunction .`.
- GetSendBufferSize()'s last use removed in
991955ee81
- SetPort()'s last use removed in
7e195e8459
- GetfLargeWorkInvalidChainFound() was introduced in
e3ba0ef956 and never used
We currently do two resolves for dns seeds: one for the results, and one to
serve in addrman as the source for those addresses.
There's no requirement that the source hostname resolves to the stored
identifier, only that the mapping is unique. So rather than incurring the
second lookup, combine a private subnet with a hash of the hostname.
The resulting v6 ip is guaranteed not to be publicy routable, and has only a
negligible chance of colliding with a user's internal network (which would be
of no consequence anyway).
Remove the nType and nVersion as parameters to all serialization methods
and functions. There is only one place where it's read and has an impact
(in CAddress), and even there it does not impact any of the recursively
invoked serializers.
Instead, the few places that need nType or nVersion are changed to read
it directly from the stream object, through GetType() and GetVersion()
methods which are added to all stream classes.