86b47fa741 speed up Unserialize_impl for prevector (Akio Nakamura)
Pull request description:
The unserializer for prevector uses `resize()` for reserve the area, but it's prefer to use `reserve()` because `resize()` have overhead to call its constructor many times.
However, `reserve()` does not change the value of `_size` (a private member of prevector).
This PR make the logic of read from stream to callback function, and prevector handles initilizing new values with that call-back and ajust the value of `_size`.
The changes are as follows:
1. prevector.h
Add a public member function named 'append'.
This function has 2 params, number of elemenst to append and call-back function that initilizing new appended values.
2. serialize.h
In the following two function:
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)`
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)`
Make a callback function from each original logic of reading values from stream, and call prevector's `append()`.
3. test/prevector_tests.cpp
Add a test for `append()`.
## A benchmark result is following:
[Machine]
MacBook Pro (macOS 10.13.3/i7 2.2GHz/mem 16GB/SSD)
[result]
DeserializeAndCheckBlockTest => 22% faster
DeserializeBlockTest => 29% faster
[before PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 94.4901, 0.0094644, 0.0104715, 0.0098339
DeserializeBlockTest, 60, 130, 65.0964, 0.00800362, 0.00895134, 0.00824187
[After PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 77.1597, 0.00767013, 0.00858959, 0.00805757
DeserializeBlockTest, 60, 130, 49.9443, 0.00613926, 0.00691187, 0.00635527
ACKs for top commit:
laanwj:
utACK 86b47fa741
Tree-SHA512: 62ea121ccd45a306fefc67485a1b03a853435af762607dae2426a87b15a3033d802c8556e1923727ddd1023a1837d0e5f6720c2c77b38196907e750e15fbb902
d2eee87928 Lift prevector default vals to the member declaration (Ben Woosley)
Pull request description:
I overlooked this possibility in #14028
ACKs for commit d2eee8:
promag:
utACK d2eee87, change looks good because members are always initialized.
251Labs:
utACK d2eee87 nice one.
ken2812221:
utACK d2eee87928
practicalswift:
utACK d2eee87928
scravy:
utACK d2eee87928
Tree-SHA512: f2726bae1cf892fd680cf8571027bcdc2e42ba567eaa901fb5fb5423b4d11b29e745e0163d82cb513d8c81399cc85933a16ed66d4a30829382d4721ffc41dc97
The unserializer for prevector uses resize() for reserve the area,
but it's prefer to use reserve() because resize() have overhead
to call its constructor many times.
However, reserve() does not change the value of "_size"
(a private member of prevector).
This PR introduce resize_uninitialized() to prevector that similar to
resize() but does not call constructor, and added elements are
explicitly initialized in Unserialize_imple().
The changes are as follows:
1. prevector.h
Add a public member function named 'resize_uninitialized'.
This function processes like as resize() but does not call constructors.
So added elemensts needs explicitly initialized after this returns.
2. serialize.h
In the following two function:
Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)
Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)
Calls resize_uninitialized() instead of resize()
3. test/prevector_tests.cpp
Add a test for resize_uninitialized().
Problem:
- IS_TRIVIALLY_CONSTRUCTIBLE macro does not work correctly resulting
in `memset()` usage to set a non-trivial type to 0 when
`nontrivial_t` is passed in from the tests.
- Warning reported by GCC when compiling with `--enable-werror`.
Solution:
- Use the standard algorithm `std::fill_n()` and let the compiler
determine the optimal way of looping or using `memset()`.
Further optimize prevector::resize() (which is called by a number of
other prevector methods) to use memset to initialize memory when the
prevector contains trivial types.
In prevector.h, the code which like item_ptr(size()) apears in the loop.
Both item_ptr() and size() judge whether values are held directly or
indirectly, but in most cases it is sufficient to make that judgement
once outside the loop.
This PR adds 2 private function fill() which has the loop to initialize
by specified value (or iterator of the other prevector's element),
but don't call item_ptr() in their loop.
Other functions(assign(), constructor, operator=(), insert())
that has similar loop, call fill() instead of original loop.
Also, resize() was changed like fill(), but it calls the default
constructor for that element each time.
Warning from gcc 7.1 is ./prevector.h:450:25: warning:
'*((void*)(&<anonymous>)+8).prevector<28, unsigned char>::_union.prevector<28, unsigned char>::direct_or_indirect::<anonymous>.prevector<28, unsigned char>::direct_or_indirect::<unnamed struct>::indirect'
may be used uninitialized in this function [-Wmaybe-uninitialized]
45a5aaf Only call clear on prevector if it isn't trivially destructible and don't loop in clear (Jeremy Rubin)
aaa02e7 Add prevector destructor benchmark (Jeremy Rubin)
Tree-SHA512: 52bc8163b65b71310252f2d578349d0ddc364a6c23795c5e06e101f5449f04c96cbdca41c0cffb1974b984b8e33006471137d92b8dd4a81a98e922610a94132a
Implement `begin_ptr` and `end_ptr` in terms of C++11 code,
and add a comment that they are deprecated.
Follow-up to developer notes update in 654a211622.
swap was using an incorrect condition to determine when to apply an optimization
(not swapping the full direct[] when swapping two indirect prevectors).
Rather than correct the optimization I'm removing it for simplicity. Removing
this optimization minutely improves performance in the typical (currently only)
usage of member swap(), which is swapping with a freshly value-initialized
object.
Fixes a bug in which pop_back did not call the deleted item's destructor.
Using the most general erase() implementation to implement all the others
prevents similar bugs because the coupling between deallocation and destructor
invocation only needs to be maintained in one place.
Also reduces duplication of complex memmove logic.