189cf35f3e Add simple bech32 benchmarks (Karl-Johan Alm)
Pull request description:
This PR adds benchmarks to `Encode()`/`Decode()`.
The benchmark commit is duplicated in #13632.
Tree-SHA512: 102a193e4af58c9cb23c66d3dc7e174aa6328edab0ed74f92deb7804db5c3d0601807b3e25a5472b5c72d6113cde0dbc9976315644671a8f14ecf349967dbaaa
1fc605a8ae fix bench/prevector.cpp (Akio Nakamura)
Pull request description:
This patch intends to fix some incorrect action of bench/prevector.cpp.
1. PrevectorClear()
2nd call of ```clear()``` should to operate t1 instead of t0.
This patch changes t0 to t1.
2. PREVECTOR_TEST()
PREVECTOR_TEST macro should to call both
```PrevectorXX<nontrivial_t>(state)``` and ```PrevectorXX<trivial_t>(state)```
by specific ```"name"``` which given by parameter instead of calling
```PrevectorResize<>()``` regardless of ```"name"```.
This patch changes ```"PrevectorResize<"``` of this macro to
```"Prevector ## name<"```.
Tree-SHA512: d0498c6d627d7e96fc8ccfb329ca0be2641535b1ce1923d9b1fc720825f9bf4d7281dc8d5ae929038e37b3e625189af9807cb62e6d20933d73832a6dff4b5596
b81560029 Remove CombineSignatures and replace tests (Andrew Chow)
ed94c8b55 Replace CombineSignatures with ProduceSignature (Andrew Chow)
0422beb9b Make SignatureData able to store signatures and scripts (Andrew Chow)
b6edb4f5e Inline Sign1 and SignN (Andrew Chow)
Pull request description:
Currently CombineSignatures is used to create the final scriptSig or an input. However ProduceSignature is capable of doing this itself. Using both CombineSignatures and ProduceSignature results in code duplication which is unnecessary.
To move the scriptSig construction to ProduceSignatures, the SignatureData class contains two maps to hold pubkeys mapped to signatures, and script ids mapped to scripts. DataFromTransaction is extended to be able to extract signatures, their public keys, and scripts from existing ScriptSigs.
The SignaureData are then passed down to SignStep which can use the aforementioned maps to get the signatures, pubkeys, and scripts that it needs, falling back to the actual SigningProvider and SignatureCreator if the data are not available in the SignatureData.
Additionally, Sign1 and SignN have been removed and their functionality inlined into SignStep since Sign1 is really just a wrapper around CreateSig.
Since ProduceSignature can produce the final scriptSig or scriptWitness by using SignatureData which has extracted data from the transaction, CombineSignatures is unnecessary as ProduceSignature is able to replicate all of CombineSignatures' functionality.
This also furthers BIP 174 support and begins moving towards a BIP 174 style backend.
The tests have also been updated to use the new combining methodology.
Tree-SHA512: 78cd58a4ebe37f79229bd5eee2958a0bb45cd7f36d0e993eee13ff685b3665dd76ef2dfd5f47d34678995bb587f5594100ee5f6c09b1c69ee96d3684d470d01e
1. PrevectorClear()
2nd call of clear() should to operate t1 instead of t0.
This patch changes t0 to t1.
2. PREVECTOR_TEST()
PREVECTOR_TEST macro should to call both
PrevectorXX<nontrivial_t>(state) and PrevectorXX<trivial_t>(state)
by specific "name" which given by parameter instead of calling
PrevectorResize<>() regardless of "name".
This patch changes "PrevectorResize<" of this macro to
"Prevector ## name<".
In addition to having the scriptSig and scriptWitness, have SignatureData
also be able to store just the signatures (pubkeys mapped to sigs) and
scripts (script ids mapped to scripts).
Also have DataFromTransaction be able to extract signatures and scripts
from the scriptSig and scriptWitness of an input to put them in SignatureData.
Adds a new SignatureChecker which takes a SignatureData and puts pubkeys
and signatures into it when it successfully verifies a signature.
Adds a new field in SignatureData which stores whether the SignatureData
was complete. This allows us to also update the scriptSig and
scriptWitness to the final one when updating a SignatureData with another
one.
Fix a build error introduced in #13219.
```
.../bitcoin/src/bench/block_assemble.cpp:42:13:error: use of undeclared identifier 'CheckProofOfWork'
while (!CheckProofOfWork(block->GetHash(), block->nBits, Params().GetConsensus())) {
```
e56771365b Do not use uppercase characters in source code filenames (practicalswift)
419a1983ca docs: Add a note about the source code filename naming convention (practicalswift)
Pull request description:
Add a note about the source code filename naming convention.
Tree-SHA512: 8d329bd9e19bcd26e74b0862fb0bc2369b46095dbd3e69d34859908632763abd7c3d00ccc44ee059772ad4bae4460c2bcc1c0e22fd9d8876d57e5fcd346cea4b
4defdfab94 [MOVEONLY] Move unused Merkle branch code to tests (Pieter Wuille)
4437d6e1f3 8-way AVX2 implementation for double SHA256 on 64-byte inputs (Pieter Wuille)
230294bf5f 4-way SSE4.1 implementation for double SHA256 on 64-byte inputs (Pieter Wuille)
1f0e7ca09c Use SHA256D64 in Merkle root computation (Pieter Wuille)
d0c9632883 Specialized double sha256 for 64 byte inputs (Pieter Wuille)
57f34630fb Refactor SHA256 code (Pieter Wuille)
0df017889b Benchmark Merkle root computation (Pieter Wuille)
Pull request description:
This introduces a framework for specialized double-SHA256 with 64 byte inputs. 4 different implementations are provided:
* Generic C++ (reusing the normal SHA256 code)
* Specialized C++ for 64-byte inputs, but no special instructions
* 4-way using SSE4.1 intrinsics
* 8-way using AVX2 intrinsics
On my own system (AVX2 capable), I get these benchmarks for computing the Merkle root of 9001 leaves (supported lengths / special instructions / parallellism):
* 7.2 ms with varsize/naive/1way (master, non-SSE4 hardware)
* 5.8 ms with size64/naive/1way (this PR, non-SSE4 capable systems)
* 4.8 ms with varsize/SSE4/1way (master, SSE4 hardware)
* 2.9 ms with size64/SSE4/4way (this PR, SSE4 hardware)
* 1.1 ms with size64/AVX2/8way (this PR, AVX2 hardware)
Tree-SHA512: efa32d48b32820d9ce788ead4eb583949265be8c2e5f538c94bc914e92d131a57f8c1ee26c6f998e81fb0e30675d4e2eddc3360bcf632676249036018cff343e
Return `EXIT_SUCCESS` from `main()` on error, not the bool `false`
(introduced in #13112). This is the correct value to return on error,
and also shuts up a clang warning.
Also add a final return for clarity.
If an unknown option is given via either the command line args or
the conf file, throw an error and exit
Update tests for ArgsManager knowing args
Ignore unknown options in the config file for bitcoin-cli
Fix tests and bitcoin-cli to match actual options used
Many options are extremely technical, and refer internals, making it
difficult to translate usefully. This came up in discussion of e.g.
#10949. If a message is not understood by translators (which are
typically end-users, not developers) they'll either translate it
literally, making it harder to understand instead of easier, with the
added drawback of the user no longer being able to google it.
Also the translation was only working for bitcoin-qt as with
the console programs, there is no translation backend. So it was
injecting never-used translation messages for bitcoin-cli, -tx.
For these reasons, stop translating options help completely. This should
not affect the output **in any way** except for bitcoin-qt when a
non-English language is configured in the locale.
This implements #10962.
This trivial change adds the "override" keyword to some methods of
subclasses meant to override interface methods. This ensures that any
future change to the interface' method signatures which are not correctly
mirrored in the subclass will break at compile time with a clear error message,
rather than fail at runtime (which is harder to debug).
fa3bb183ad bench: Amend mempool_eviction test for witness txs (MarcoFalke)
962d223e5c bench: Move constructors out of mempool_eviction hot loop (MarcoFalke)
Pull request description:
Tree-SHA512: 997a07e067623bc2c0904a21bd490d164045cf51393af260fc79882ed010636dce82c9ebe35aae8fa5db5e73c9f3ecb6232353a0939c295034f9be574f1fcff2
1bf3f33b46 node: Removed unused wallet-related methods from the Node interface. (Thomas Snider)
b38200459f benchmark: Removed bench/perf.cpp (Thomas Snider)
Pull request description:
Not sure if these should be separate PRs.
First is removal of a platform abstraction for getting cycle counters where possible. Since the benchmarking switch to counting number of iterations over a fixed window instead of counting cycles per iteration, these are unused.
Second is removal of a few methods from the Node interface that seem vestigial from when the concepts of wallet/node were not as clearly separated.
Tree-SHA512: de1460a7d4473ca19db4e2ca845185c63c765d12462c2685044a1f27dedab266cd908bc52235a881a7ad98bc251a4abf4eae523e5f599c169e3511e489f19a0d
1f45e21 scripted-diff: Convert 11 enums into scoped enums (C++11) (practicalswift)
Pull request description:
Rationale (from Bjarne Stroustrup's ["C++11 FAQ"](http://www.stroustrup.com/C++11FAQ.html#enum)):
>
> The enum classes ("new enums", "strong enums") address three problems with traditional C++ enumerations:
>
> * conventional enums implicitly convert to int, causing errors when someone does not want an enumeration to act as an integer.
> * conventional enums export their enumerators to the surrounding scope, causing name clashes.
> * the underlying type of an enum cannot be specified, causing confusion, compatibility problems, and makes forward declaration impossible.
>
> The new enums are "enum class" because they combine aspects of traditional enumerations (names values) with aspects of classes (scoped members and absence of conversions).
Tree-SHA512: 9656e1cf4c3cabd4378c7a38d0c2eaf79e4a54d204a3c5762330840e55ee7e141e188a3efb2b4daf0ef3110bbaff80d8b9253abf2a9b015cdc4d60b49ac2b914
5fbf7c4 fix nits: variable naming, typos (Martin Ankerl)
1e0ee90 Use best-fit strategy in Arena, now O(log(n)) instead O(n) (Martin Ankerl)
Pull request description:
This replaces the first-fit algorithm used in the Arena with a best-fit. According to "Dynamic Storage Allocation: A Survey and Critical Review", Wilson et. al. 1995, http://www.scs.stanford.edu/14wi-cs140/sched/readings/wilson.pdf, both startegies work well in practice.
The advantage of using best-fit is that we can switch the O(n) allocation to O(log(n)). Additionally, some previously O(log(n)) operations are now O(1) operations by using hash maps. The end effect is that the benchmark runs about 2.5 times faster on my machine:
# Benchmark, evals, iterations, total, min, max, median
old: BenchLockedPool, 5, 530, 5.25749, 0.00196938, 0.00199755, 0.00198172
new: BenchLockedPool, 5, 1300, 5.11313, 0.000781493, 0.000793314, 0.00078606
I've run all unit tests and benchmarks, and increased the number of iterations so that BenchLockedPool takes about 5 seconds again.
Tree-SHA512: 6551e384671f93f10c60df530a29a1954bd265cc305411f665a8756525e5afe2873a8032c797d00b6e8c07e16d9827465d0b662875433147381474a44119ccce
Allows SelectCoinsMinConf and SelectCoins be able to switch between
using BnB or Knapsack for choosing coins.
Has SelectCoinsMinConf do the preprocessing necessary to support either
BnB or Knapsack. This includes calculating the filtering the effective
values for each input.
Uses BnB in CreateTransaction to find an exact match for the output.
If BnB fails, it will fallback to the Knapsack solver.
Remove requirement that two wallet files can only be opened at the same time if
they are contained in the same directory.
This change mostly consists of updates to function signatures (updating
functions to take fs::path arguments, instead of combinations of strings,
fs::path, and CDBEnv / CWalletDBWrapper arguments).
Log whether the starting instance of bitcoin core is a debug or release
build (--enable-debug).
Also warn when running the benchmarks with a debug build, to prevent
mistakes comparing debug to non-debug results.
This replaces the first-fit algorithm used in the Arena with a best-fit. According to "Dynamic Storage Allocation: A Survey and Critical Review", Wilson et. al. 1995, http://www.scs.stanford.edu/14wi-cs140/sched/readings/wilson.pdf, both startegies work well in practice.
The advantage of using best-fit is that we can switch the slow O(n) algorithm to O(log(n)) operations. Additionally, some previously O(log(n)) operations are now replaced with O(1) operations by using a hash map. The end effect is that the benchmark runs about 2.5 times faster on my machine:
old: BenchLockedPool, 5, 530, 5.25749, 0.00196938, 0.00199755, 0.00198172
new: BenchLockedPool, 5, 1300, 5.11313, 0.000781493, 0.000793314, 0.00078606
I've run all unit tests and benchmarks.
* inline performance critical code
* Average runtime is specified and used to calculate iterations.
* Console: show median of multiple runs
* plot: show box plot
* filter benchmarks
* specify scaling factor
* ignore src/test and src/bench in command line check script
* number of iterations instead of time
* Replaced runtime in BENCHMARK makro number of iterations.
* Added -? to bench_bitcoin
* Benchmark plotly.js URL, width, height can be customized
* Fixed incorrect precision warning
fbf327b Minimal code changes to allow msvc compilation. (Aaron Clauson)
Pull request description:
These changes are required to allow the Bitcoin source to build with Microsoft's C++ compiler (#11562 is also required).
I looked around for a better place for the typedef of ssize_t which is in random.h. The best candidate looks like src/compat.h but I figured including that header in random.h is a bigger change than the typedef. Note that the same typedef is in at least two other places including the OpenSSL and Berkeley DB headers so some of the Bitcoin code already picks it up.
Tree-SHA512: aa6cc6283015e08ab074641f9abdc116c4dc58574dc90f75e7a5af4cc82946d3052370e5cbe855fb6180c00f8dc66997d3724ff0412e4b7417e51b6602154825
069215e Initialize recently introduced non-static class member lastCycles to zero in constructor (practicalswift)
Pull request description:
Initialize recently introduced non-static class member `lastCycles` to zero in constructor.
`lastCycles` was introduced in 3532818746 which was merged into master yesterday.
Friendly ping @laanwj :-)
Tree-SHA512: cb93b6a8f6e2e3b06cd05a635da95c84f3df64c21fc23fe82f98306ea571badc32040315b563e46ddb5203128226bc334269acd497beead5a5777c434060fd85
std::chrono removes portability issues.
Rather than storing doubles, store the untouched time_points. Then
convert to nanoseconds for display. This allows for maximum precision, while
keeping results comparable between differing hardware/operating systems.
Also, display full nanosecond counts rather than sub-second floats.
We were saving a div by caching the inverse as a float, but this
ended up requiring a int -> float -> int conversion, which takes
almost as much time as the difference between float mul and div.
There are lots of other more pressing issues with the bench
framework which probably require simply removing the adaptive
iteration count stuff anyway.
90d4d89 scripted-diff: Use the C++11 keyword nullptr to denote the pointer literal instead of the macro NULL (practicalswift)
Pull request description:
Since C++11 the macro `NULL` may be:
* an integer literal with value zero, or
* a prvalue of type `std::nullptr_t`
By using the C++11 keyword `nullptr` we are guaranteed a prvalue of type `std::nullptr_t`.
For a more thorough discussion, see "A name for the null pointer: nullptr" (Sutter &
Stroustrup), http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2431.pdf
With this patch applied there are no `NULL` macro usages left in the repo:
```
$ git grep NULL -- "*.cpp" "*.h" | egrep -v '(/univalue/|/secp256k1/|/leveldb/|_NULL|NULLDUMMY|torcontrol.*NULL|NULL cert)' | wc -l
0
```
The road towards `nullptr` (C++11) is split into two PRs:
* `NULL` → `nullptr` is handled in PR #10483 (scripted, this PR)
* `0` → `nullptr` is handled in PR #10645 (manual)
Tree-SHA512: 3c395d66f2ad724a8e6fed74b93634de8bfc0c0eafac94e64e5194c939499fefd6e68f047de3083ad0b4eff37df9a8a3a76349aa17d55eabbd8e0412f140a297
Avoid static analyzer warnings regarding "Function call argument
is a pointer to uninitialized value" in cases where we are
intentionally using such arguments.
This is achieved by using ...
`f(b.begin(), b.end())` (`std::array<char, N>`)
... instead of ...
`f(b, b + N)` (`char b[N]`)
Rationale:
* Reduce false positives by guiding static analyzers regarding our
intentions.
Before this commit:
```
$ clang-tidy-3.5 -checks=* src/bench/base58.cpp
bench/base58.cpp:23:9: warning: Function call argument is a pointer to uninitialized value [clang-analyzer-core.CallAndMessage]
EncodeBase58(b, b + 32);
^
$ clang-tidy-3.5 -checks=* src/bench/verify_script.cpp
bench/verify_script.cpp:59:5: warning: Function call argument is a pointer to uninitialized value [clang-analyzer-core.CallAndMessage]
key.Set(vchKey, vchKey + 32, false);
^
$
```
After this commit:
```
$ clang-tidy-3.5 -checks=* src/bench/base58.cpp
$ clang-tidy-3.5 -checks=* src/bench/verify_script.cpp
$
```
5a9b508 [trivial] Add end of namespace comments (practicalswift)
Tree-SHA512: 92b0fcae4d1d3f4da9e97569ae84ef2d6e09625a5815cd0e5f0eb6dd2ecba9852fa85c184c5ae9de5117050330ce995e9867b451fa8cd5512169025990541a2b
c37e32a [Wallet] Prevent CInputCoin to be in a null state (NicolasDorier)
f597dcb [Wallet] Simplify code using CInputCoin (NicolasDorier)
e78bc45 [Wallet] Decouple CInputCoin from CWalletTx (NicolasDorier)
fd44ac1 [Wallet] Rename std::pair<const CWalletTx*, unsigned int> to CInputCoin (NicolasDorier)
Tree-SHA512: d24361fc514a0566bce1c3953d766dfe4fece79c549cb4db2600695a4ce08e85caa61b7717812618e523a2f2a1669877dad2752ed079e2ed2d27249f9bc8590e
218d915 [bench] Avoid function call arguments which are pointers to uninitialized values (practicalswift)
Tree-SHA512: 68d62e9442094f171433291b7f13dba20fc7ead5fd7f2292e1eb97ae51aa2345d40224c4a65c2e5d3552802b3cd0f675a82b6181cf5b77e964355650b25089f0
45a5aaf Only call clear on prevector if it isn't trivially destructible and don't loop in clear (Jeremy Rubin)
aaa02e7 Add prevector destructor benchmark (Jeremy Rubin)
Tree-SHA512: 52bc8163b65b71310252f2d578349d0ddc364a6c23795c5e06e101f5449f04c96cbdca41c0cffb1974b984b8e33006471137d92b8dd4a81a98e922610a94132a
db07f91 Assert that what might look like a possible division by zero is actually unreachable (practicalswift)
Tree-SHA512: f1652eb37196a5b72f356503a1fbb44fb98aa8a94954ad1765f86d81ebf41a2337d4eb58c4f19937fda3752f5d2d642756e44afdbd438015b87ac20801246bff
ad1ae7a Check and enable -Wshadow by default. (Pavel Janík)
9de90bb Do not shadow variables (gcc set) (Pavel Janík)
Tree-SHA512: 9517feb423dc8ddd63896016b25324673bfbe0bffa97f22996f59d7a3fcbdc2ebf2e43ac02bc067546f54e293e9b2f2514be145f867321e9031f895c063d9fb8
The initialization order of global data structures in different
implementation units is undefined. Making use of this is essentially
gambling on what the linker does, the so-called [Static initialization
order fiasco](https://isocpp.org/wiki/faq/ctors#static-init-order).
In this case it apparently worked on Linux but failed on OpenBSD and
FreeBSD.
To create it on first use, make the registration structure local to
a function.
Fixes#8910.
5113474 wallet: Use CDataStream.data() (Wladimir J. van der Laan)
e2300ff bench: Use CDataStream.data() (Wladimir J. van der Laan)
adff950 dbwrapper: Use new .data() method of CDataStream (Wladimir J. van der Laan)
a2141e4 streams: Remove special cases for ancient MSVC (Wladimir J. van der Laan)
af4c44c streams: Add data() method to CDataStream (Wladimir J. van der Laan)
Fee estimation can just check its own mapMemPoolTxs to determine the same information. Note that now fee estimation for block processing must happen before those transactions are removed, but this shoudl be a speedup.
cee1612 reduce number of lookups in TransactionWithinChainLimit (Gregory Sanders)
af9bedb Test for fix of txn chaining in wallet (Gregory Sanders)
5882c09 CreateTransaction: Don't return success with too-many-ancestor txn (Gregory Sanders)
0b2294a SelectCoinsMinConf: Prefer coins with fewer ancestors (Gregory Sanders)
Make sure that the count is a zero modulo the new mask before
scaling, otherwise the next time until a measure triggers
will take only 1/2 as long as accounted for. This caused
the 'min time' to be potentially off by as much as 100%.
Three categories of modifications:
1)
1 instance of 'The Bitcoin Core developers \n',
1 instance of 'the Bitcoin Core developers\n',
3 instances of 'Bitcoin Core Developers\n', and
12 instances of 'The Bitcoin developers\n'
are made uniform with the 443 instances of 'The Bitcoin Core developers\n'
2)
3 instances of 'BitPay, Inc\.\n' are made uniform with the other 6
instances of 'BitPay Inc\.\n'
3)
4 instances where there was no '(c)' between the 'Copyright' and the year
where it deviates from the style of the local directory.
The new benchmarks exercise script validation, CCoinsDBView caching,
mempool eviction, and wallet coin selection code.
All of the benchmarks added here are extremely simple and don't
necessarily mirror common real world conditions or interesting
performance edge cases. Details about how specific benchmarks can be
improved are noted in comments.
Github-Issue: #7883
Previously the benchmark code used an integer division (%) with
a non-constant in the inner-loop. This is quite slow on many
processors, especially ones like ARM that lack a hardware divide.
Even on fairly recent x86_64 like haswell an integer division can
take something like 100 cycles-- making it comparable to the
runtime of siphash.
This change avoids the division by using bitmasking instead. This
was especially easy since the count was only increased by doubling.
This change also restarts the timing when the execution time was
very low this avoids mintimes of zero in cases where one execution
ends up below the timer resolution. It also reduces the impact of
the overhead on the final result.
The formatting of the prints is changed to not use scientific
notation make it more machine readable (in particular, gnuplot
croaks on the non-fixedpoint, and it doesn't sort correctly).
This also hoists out all the floating point divisions out of the
semi-hot path because it was easy to do so.
It might be prudent to break out the critical test into a macro
just to guarantee that it gets inlined. It might also make sense
to just save out the intermediate counts and times and get the
floating point completely out of the timing loop (because e.g.
on hardware without a fast hardware FPU like some ARM it will
still be slow enough to distort the results). I haven't done
either of these in this commit.
Avoid calling gettimeofday every time through the benchmarking loop, by keeping
track of how long each loop takes and doubling the number of iterations done
between time checks when they take less than 1/16'th of the total elapsed time.
Benchmarking framework, loosely based on google's micro-benchmarking
library (https://github.com/google/benchmark)
Wny not use the Google Benchmark framework? Because adding Even More Dependencies
isn't worth it. If we get a dozen or three benchmarks and need nanosecond-accurate
timings of threaded code then switching to the full-blown Google Benchmark library
should be considered.
The benchmark framework is hard-coded to run each benchmark for one wall-clock second,
and then spits out .csv-format timing information to stdout. It is left as an
exercise for later (or maybe never) to add command-line arguments to specify which
benchmark(s) to run, how long to run them for, how to format results, etc etc etc.
Again, see the Google Benchmark framework for where that might end up.
See src/bench/MilliSleep.cpp for a sanity-test benchmark that just benchmarks
'sleep 100 milliseconds.'
To compile and run benchmarks:
cd src; make bench
Sample output:
Benchmark,count,min,max,average
Sleep100ms,10,0.101854,0.105059,0.103881