9220a0fdd0 tests: Add one specialized ProcessMessage(...) fuzzing binary per message type for optimal results when using coverage-guided fuzzing (practicalswift)
fd1dae10b4 tests: Add fuzzing harness for ProcessMessage(...) (practicalswift)
Pull request description:
Add fuzzing harness for `ProcessMessage(...)`. Enables high-level fuzzing of the P2P layer.
All code paths reachable from this fuzzer can be assumed to be reachable for an untrusted peer.
Seeded from thin air (an empty corpus) this fuzzer reaches roughly 20 000 lines of code.
To test this PR:
```
$ make distclean
$ ./autogen.sh
$ CC=clang CXX=clang++ ./configure --enable-fuzz \
--with-sanitizers=address,fuzzer,undefined
$ make
$ src/test/fuzz/process_message
…
```
Worth noting about this fuzzing harness:
* To achieve a reasonable number of executions per seconds the state of the fuzzer is unfortunately not entirely reset between `test_one_input` calls. The set-up (`FuzzingSetup` ctor) and tear-down (`~FuzzingSetup`) work is simply too costly to be run on every iteration. There is a trade-off to handle here between a.) achieving high executions/second and b.) giving the fuzzer a totally blank slate for each call. Please let me know if you have any suggestion on how to improve this situation while maintaining >1000 executions/second.
* To achieve optimal results when using coverage-guided fuzzing I've chosen to create one specialised fuzzing binary per message type (`process_message_addr`, `process_message_block`, `process_message_blocktxn `, etc.) and one general fuzzing binary (`process_message`) which handles all messages types. The latter general fuzzer can be seeded with inputs generated by the former specialised fuzzers.
Happy fuzzing friends!
ACKs for top commit:
MarcoFalke:
ACK 9220a0fdd0🏊
Tree-SHA512: c314ef12b0db17b53cbf3abfb9ecc10ce420fb45b17c1db0b34cabe7c30e453947b3ae462020b0c9f30e2c67a7ef1df68826238687dc2479cd816f0addb530e5
9ff41f6419 tests: Add float to FUZZERS_MISSING_CORPORA (temporarily) (practicalswift)
8f6fb0a85a tests: Add serialization/deserialization fuzzing for integral types (practicalswift)
3c82b92d2e tests: Add fuzzing harness for functions taking floating-point types as input (practicalswift)
c2bd588860 Add missing includes (practicalswift)
Pull request description:
Add simple fuzzing harness for functions with floating-point parameters (such as `ser_double_to_uint64(double)`, etc.).
Add serialization/deserialization fuzzing for integral types.
Add missing includes.
To test this PR:
```
$ make distclean
$ ./autogen.sh
$ CC=clang CXX=clang++ ./configure --enable-fuzz \
--with-sanitizers=address,fuzzer,undefined
$ make
$ src/test/fuzz/float
…
```
Top commit has no ACKs.
Tree-SHA512: 9b5a0c4838ad18d715c7398e557d2a6d0fcc03aa842f76d7a8ed716170a28f17f249eaede4256998aa3417afe2935e0ffdfaa883727d71ae2d2d18a41ced24b5
fac52dafa0 test: Set catch_system_errors=no on boost unit tests (MarcoFalke)
Pull request description:
Closes#16700
Can be tested by adding an `assert(0)` and then running either `make check` or `./src/test/test_bitcoin -t bla_tests --catch_system_errors=no/yes`
ACKs for top commit:
practicalswift:
ACK fac52dafa0
Empact:
Tested ACK fac52dafa0
Tree-SHA512: ec00636951b2c1137aaf43610739d78d16f823f7da76a726d47f93b8b089766fb66b21504b3c5413bcf8b6b5c3db0ad74027d677db24a44487d6d79a6bdee2e0
Make LegacyScriptPubKeyMan::CanProvide method able to recognize p2sh scripts
when the redeem script is present in the mapScripts map without the p2sh script
also having to be added to the mapScripts map. This restores behavior prior to
https://github.com/bitcoin/bitcoin/pull/17261, which I think broke backwards
compatibility with old wallet files by no longer treating addresses created by
`addmultisigaddress` calls before #17261 as solvable.
The reason why tests didn't fail with the CanProvide implementation in #17261
is because of a workaround added in 4a7e43e846
"Store p2sh scripts in AddAndGetDestinationForScript", which masked the problem
for new `addmultisigaddress` RPC calls without fixing it for multisig addresses
already created in old wallet files.
This change adds a lot of comments and allows reverting commit
4a7e43e846 "Store p2sh scripts in
AddAndGetDestinationForScript", so the AddAndGetDestinationForScript() function,
CanProvide() method, and mapScripts map should all be more comprehensible
cc668d06fb tests: Add fuzzing harness for strprintf(...) (practicalswift)
ccc3c76e2b tests: Add fuzzer strprintf to FUZZERS_MISSING_CORPORA (temporarily) (practicalswift)
6ef04912af tests: Update FuzzedDataProvider.h from upstream (LLVM) (practicalswift)
Pull request description:
Add fuzzing harness for `strprintf(…)`.
Update `FuzzedDataProvider.h`.
Avoid hitting some issues in tinyformat (reported upstreams in https://github.com/c42f/tinyformat/issues/70).
---
Found issues in tinyformat:
**Issue 1.** The following causes a signed integer overflow followed by an allocation of 9 GB of RAM (or an OOM in memory constrained environments):
```
strprintf("%.777777700000000$", 1.0);
```
**Issue 2.** The following causes a stack overflow:
```
strprintf("%987654321000000:", 1);
```
**Issue 3.** The following causes a stack overflow:
```
strprintf("%1$*1$*", -11111111);
```
**Issue 4.** The following causes a `NULL` pointer dereference:
```
strprintf("%.1s", (char *)nullptr);
```
**Issue 5.** The following causes a float cast overflow:
```
strprintf("%c", -1000.0);
```
**Issue 6.** The following causes a float cast overflow followed by an invalid integer negation:
```
strprintf("%*", std::numeric_limits<double>::lowest());
```
Top commit has no ACKs.
Tree-SHA512: 9b765559281470f4983eb5aeca94bab1b15ec9837c0ee01a20f4348e9335e4ee4e4fecbd7a1a5a8ac96aabe0f9eeb597b8fc9a2c8faf1bab386e8225d5cdbc18
3c1bc40205 Add extra logging of asmap use and bucketing (Gleb Naumenko)
e4658aa8ea Return mapped AS in RPC call getpeerinfo (Gleb Naumenko)
ec45646de9 Integrate ASN bucketing in Addrman and add tests (Gleb Naumenko)
8feb4e4b66 Add asmap utility which queries a mapping (Gleb Naumenko)
Pull request description:
This PR attempts to solve the problem explained in #16599.
A particular attack which encouraged us to work on this issue is explained here [[Erebus Attack against Bitcoin Peer-to-Peer Network](https://erebus-attack.comp.nus.edu.sg/)] (by @muoitranduc)
Instead of relying on /16 prefix to diversify the connections every node creates, we would instead rely on the (ip -> ASN) mapping, if this mapping is provided.
A .map file can be created by every user independently based on a router dump, or provided along with the Bitcoin release. Currently we use the python scripts written by @sipa to create a .map file, which is no larger than 2MB (awesome!).
Here I suggest adding a field to peers.dat which would represent a hash of asmap file used while serializing addrman (or 0 for /16 prefix legacy approach).
In this case, every time the file is updated (or grouping method changed), all buckets will be re-computed.
I believe that alternative selective re-bucketing for only updated ranges would require substantial changes.
TODO:
- ~~more unit tests~~
- ~~find a way to test the code without including >1 MB mapping file in the repo.~~
- find a way to check that mapping file is not corrupted (checksum?)
- comments and separate tests for asmap.cpp
- make python code for .map generation public
- figure out asmap distribution (?)
~Interesting corner case: I’m using std::hash to compute a fingerprint of asmap, and std::hash returns size_t. I guess if a user updates the OS to 64-bit, then the hash of asap will change? Does it even matter?~
ACKs for top commit:
laanwj:
re-ACK 3c1bc40205
jamesob:
ACK 3c1bc40205 ([`jamesob/ackr/16702.3.naumenkogs.p2p_supplying_and_using`](https://github.com/jamesob/bitcoin/tree/ackr/16702.3.naumenkogs.p2p_supplying_and_using))
jonatack:
ACK 3c1bc40205
Tree-SHA512: e2dc6171188d5cdc2ab2c022fa49ed73a14a0acb8ae4c5ffa970172a0365942a249ad3d57e5fb134bc156a3492662c983f74bd21e78d316629dcadf71576800c
02b9511d6b tests: add tests for GetCoinsCacheSizeState (James O'Beirne)
b17e91d842 refactoring: introduce CChainState::GetCoinsCacheSizeState (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11):
Parent PR: #15606
Issue: #15605
Specification: https://github.com/jamesob/assumeutxo-docs/tree/master/proposal
---
This pulls out the routine for detection of how full the coins cache is from
FlushStateToDisk. We use this logic independently when deciding when to flush
the coins cache during UTXO snapshot activation ([see here](231fb5f17e (diff-24efdb00bfbe56b140fb006b562cc70bR5275))).
ACKs for top commit:
ariard:
Code review ACK 02b9511.
ryanofsky:
Code review ACK 02b9511d6b. Just rebase, new COIN_SIZE comment, and new test message since last review
Tree-SHA512: 8bdd78bf68a4a5d33a776e73fcc2857f050d6d102caa4997ed19ca25468c1358e6e728199d61b423033c02e6bc8f00a1d9da52cf17a2d37d70860fca9237ea7c
Instead of using /16 netgroups to bucket nodes in Addrman for connection
diversification, ASN, which better represents an actor in terms
of network-layer infrastructure, is used.
For testing, asmap.raw is used. It represents a minimal
asmap needed for testing purposes.