Using swap() was rather wasteful because it had to copy the whole direct
memory data twice. Also, due to the swap() in move assignment the moved-from
object might hold on to unused memory for longer than necessary.
Move operations already are `noexcept`, so add the keyword to the methods.
This makes the `PrevectorFillVectorIndirect...` benchmarks about twice
as fast on my machine, because otherwise `std::vector` has to use a copy
when the vector resizes.
11e7908484 prevector: only allow trivially copyable types (Martin Leitner-Ankerl)
Pull request description:
prevector uses `memmove` to move around data, that means it can only be used with types that are trivially copyable. That implies that the types are trivially destructible, thus the checks for `is_trivially_destructible` are not needed.
ACKs for top commit:
w0xlt:
ACK 11e7908484
MarcoFalke:
review ACK 11e7908484🏯
ajtowns:
ACK 11e7908484 -- code review only
Tree-SHA512: cbb4d8bfa095100677874b552d92c324c7d6354fcf7adab2ed52f57bd1793762871798b5288064ed1af2d2903a0ec9dbfec48d99955fc428f18cc28d6840dccc
The prevector implementation currently can't be used with types that are
not trivially copyable, due to the use of memmove. Trivially copyable
implies that it is trivially destructible, see
https://eel.is/c++draft/class.prop#1.3
That means that the checks for std::is_trivially_destructible are not
necessary, and in fact where used it wouldn't be enough. E.g. in
`erase(iterator, iterator)` the elements in range first-last are destructed,
but it does not destruct elements left after `memmove`.
This commit removes the checks for `std::is_trivially_destructible`
and instead adds a `static_assert(std::is_trivially_copyable_v<T>);` to
make sure `prevector` is only used with supported types.
5f26855f10 test: Remove ubsan alignment suppressions (Wladimir J. van der Laan)
9d933ef919 prevector: avoid misaligned member accesses (Anthony Towns)
Pull request description:
Ensure prevector data is appropriately aligned. Earlier discussion in #17530.
**Edit laanwj**: In contrast to #17530, it does this without increase in size of any of the coin cache data structures (x86_64, clang)
| Struct | (size,align) before | (size,align) after |
| ------------- | ------------- | ------- |
| Coin | 48, 8 | 48, 8 |
| CCoinsCacheEntry | 56, 8 | 56, 8 |
| CScript | 32, 1 | 32, 8 |
ACKs for top commit:
laanwj:
ACK 5f26855f10
practicalswift:
ACK 5f26855f10
jonatack:
ACK 5f26855f10
Tree-SHA512: 98d112d6856f683d5b212410b73f3071d2994f1efb046a2418a35890aa1cf1aa7c96a960fc2e963fa15241e861093c1ea41951cf5b4b5431f88345eb1dd0a98a
`#pragma pack(1)` prevents aligning the struct and its members to their
required alignment. This can result in code that performs non-aligned
reads and writes to integers and pointers, which is problematic on some
architectures.
It also triggers UBsan — see
https://github.com/bitcoin/bitcoin/pull/17156#issuecomment-543123631
and #17510.
86b47fa741 speed up Unserialize_impl for prevector (Akio Nakamura)
Pull request description:
The unserializer for prevector uses `resize()` for reserve the area, but it's prefer to use `reserve()` because `resize()` have overhead to call its constructor many times.
However, `reserve()` does not change the value of `_size` (a private member of prevector).
This PR make the logic of read from stream to callback function, and prevector handles initilizing new values with that call-back and ajust the value of `_size`.
The changes are as follows:
1. prevector.h
Add a public member function named 'append'.
This function has 2 params, number of elemenst to append and call-back function that initilizing new appended values.
2. serialize.h
In the following two function:
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)`
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)`
Make a callback function from each original logic of reading values from stream, and call prevector's `append()`.
3. test/prevector_tests.cpp
Add a test for `append()`.
## A benchmark result is following:
[Machine]
MacBook Pro (macOS 10.13.3/i7 2.2GHz/mem 16GB/SSD)
[result]
DeserializeAndCheckBlockTest => 22% faster
DeserializeBlockTest => 29% faster
[before PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 94.4901, 0.0094644, 0.0104715, 0.0098339
DeserializeBlockTest, 60, 130, 65.0964, 0.00800362, 0.00895134, 0.00824187
[After PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 77.1597, 0.00767013, 0.00858959, 0.00805757
DeserializeBlockTest, 60, 130, 49.9443, 0.00613926, 0.00691187, 0.00635527
ACKs for top commit:
laanwj:
utACK 86b47fa741
Tree-SHA512: 62ea121ccd45a306fefc67485a1b03a853435af762607dae2426a87b15a3033d802c8556e1923727ddd1023a1837d0e5f6720c2c77b38196907e750e15fbb902
d2eee87928 Lift prevector default vals to the member declaration (Ben Woosley)
Pull request description:
I overlooked this possibility in #14028
ACKs for commit d2eee8:
promag:
utACK d2eee87, change looks good because members are always initialized.
251Labs:
utACK d2eee87 nice one.
ken2812221:
utACK d2eee87928
practicalswift:
utACK d2eee87928
scravy:
utACK d2eee87928
Tree-SHA512: f2726bae1cf892fd680cf8571027bcdc2e42ba567eaa901fb5fb5423b4d11b29e745e0163d82cb513d8c81399cc85933a16ed66d4a30829382d4721ffc41dc97
The unserializer for prevector uses resize() for reserve the area,
but it's prefer to use reserve() because resize() have overhead
to call its constructor many times.
However, reserve() does not change the value of "_size"
(a private member of prevector).
This PR introduce resize_uninitialized() to prevector that similar to
resize() but does not call constructor, and added elements are
explicitly initialized in Unserialize_imple().
The changes are as follows:
1. prevector.h
Add a public member function named 'resize_uninitialized'.
This function processes like as resize() but does not call constructors.
So added elemensts needs explicitly initialized after this returns.
2. serialize.h
In the following two function:
Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)
Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)
Calls resize_uninitialized() instead of resize()
3. test/prevector_tests.cpp
Add a test for resize_uninitialized().
Problem:
- IS_TRIVIALLY_CONSTRUCTIBLE macro does not work correctly resulting
in `memset()` usage to set a non-trivial type to 0 when
`nontrivial_t` is passed in from the tests.
- Warning reported by GCC when compiling with `--enable-werror`.
Solution:
- Use the standard algorithm `std::fill_n()` and let the compiler
determine the optimal way of looping or using `memset()`.
Further optimize prevector::resize() (which is called by a number of
other prevector methods) to use memset to initialize memory when the
prevector contains trivial types.
In prevector.h, the code which like item_ptr(size()) apears in the loop.
Both item_ptr() and size() judge whether values are held directly or
indirectly, but in most cases it is sufficient to make that judgement
once outside the loop.
This PR adds 2 private function fill() which has the loop to initialize
by specified value (or iterator of the other prevector's element),
but don't call item_ptr() in their loop.
Other functions(assign(), constructor, operator=(), insert())
that has similar loop, call fill() instead of original loop.
Also, resize() was changed like fill(), but it calls the default
constructor for that element each time.
Warning from gcc 7.1 is ./prevector.h:450:25: warning:
'*((void*)(&<anonymous>)+8).prevector<28, unsigned char>::_union.prevector<28, unsigned char>::direct_or_indirect::<anonymous>.prevector<28, unsigned char>::direct_or_indirect::<unnamed struct>::indirect'
may be used uninitialized in this function [-Wmaybe-uninitialized]
45a5aaf Only call clear on prevector if it isn't trivially destructible and don't loop in clear (Jeremy Rubin)
aaa02e7 Add prevector destructor benchmark (Jeremy Rubin)
Tree-SHA512: 52bc8163b65b71310252f2d578349d0ddc364a6c23795c5e06e101f5449f04c96cbdca41c0cffb1974b984b8e33006471137d92b8dd4a81a98e922610a94132a
Implement `begin_ptr` and `end_ptr` in terms of C++11 code,
and add a comment that they are deprecated.
Follow-up to developer notes update in 654a211622.
swap was using an incorrect condition to determine when to apply an optimization
(not swapping the full direct[] when swapping two indirect prevectors).
Rather than correct the optimization I'm removing it for simplicity. Removing
this optimization minutely improves performance in the typical (currently only)
usage of member swap(), which is swapping with a freshly value-initialized
object.
Fixes a bug in which pop_back did not call the deleted item's destructor.
Using the most general erase() implementation to implement all the others
prevents similar bugs because the coupling between deallocation and destructor
invocation only needs to be maintained in one place.
Also reduces duplication of complex memmove logic.