To get the advantages of faster GetSerializeSize() implementations
back that were removed in "Make GetSerializeSize a wrapper on top of
CSizeComputer", reintroduce them in the few places in the form of a
specialized Serialize() implementation. This actually gets us in a
better state than before, as these even get used when they're invoked
indirectly in the serialization of another object.
The CSerAction's ForRead() method does not depend on any runtime
data, so guarantee that requests to it can be optimized out by
making it constexpr.
Suggested by Cory Fields.
Remove the nType and nVersion as parameters to all serialization methods
and functions. There is only one place where it's read and has an impact
(in CAddress), and even there it does not impact any of the recursively
invoked serializers.
Instead, the few places that need nType or nVersion are changed to read
it directly from the stream object, through GetType() and GetVersion()
methods which are added to all stream classes.
Given that in default GetSerializeSize implementations created by
ADD_SERIALIZE_METHODS we're already using CSizeComputer(), get rid
of the specialized GetSerializeSize methods everywhere, and just use
CSizeComputer. This removes a lot of code which isn't actually used
anywhere.
For CCompactSize and CVarInt this actually removes a more efficient
size computing algorithm, which is brought back in a later commit.
The stream implementations had two cascading layers (the upper one
with operator<< and operator>>, and a lower one with read and write).
The lower layer's functions are never cascaded (nor should they, as
they should only be used from the higher layer), so make them return
void instead.
Implement `begin_ptr` and `end_ptr` in terms of C++11 code,
and add a comment that they are deprecated.
Follow-up to developer notes update in 654a211622.
There's only one case where a vector containing a fundamental type is
serialized all-at-once, unsigned char. Anything else would lead to
strange results.
Use a dummy argument to overload in that case.
- it now takes over the passed file descriptor and closes it in the
destructor
- this fixes a leak in LoadExternalBlockFile(), where an exception could
cause the file to not getting closed
- disallow copies (like recently added for CAutoFile)
- make nType and nVersion private
One might assume that CAutoFile would be ref-counted so that a copied object
would delay closing the underlying file until all copies have gone out of
scope. Since that's not the case with CAutoFile, explicitly disable copying.
Thanks to Pieter Wuille for most of the work on this commit.
I did not fixup the overhaul commit, because a rebase conflicted
with "remove fields of ser_streamplaceholder".
I prefer not to risk making a mistake while resolving it.
The nType and nVersion fields of stream objects are never accessed
from outside the class (or perhaps from the inside too, I haven't checked).
Thus no need to have them in a placeholder, whose only purpose is to
fill the "Stream" template parameter in serialization implementation.
The implementation of each class' serialization/deserialization is no longer
passed within a macro. The implementation now lies within a template of form:
template <typename T, typename Stream, typename Operation>
inline static size_t SerializationOp(T thisPtr, Stream& s, Operation ser_action, int nType, int nVersion) {
size_t nSerSize = 0;
/* CODE */
return nSerSize;
}
In cases when codepath should depend on whether or not we are just deserializing
(old fGetSize, fWrite, fRead flags) an additional clause can be used:
bool fRead = boost::is_same<Operation, CSerActionUnserialize>();
The IMPLEMENT_SERIALIZE macro will now be a freestanding clause added within
class' body (similiar to Qt's Q_OBJECT) to implement GetSerializeSize,
Serialize and Unserialize. These are now wrappers around
the "SerializationOp" template.
- ensures a consistent usage in header files
- also add a blank line after the copyright header where missing
- also remove orphan new-lines at the end of some files
Thus the read(...) and write(...) methods of all stream classes now have identical parameter lists.
This will bring these classes one step closer to a common interface.
Remove the 'state' and 'exceptmask' from serialize.h's stream implementations,
as well as related methods.
As exceptmask always included 'failbit', and setstate was always called with
bits = failbit, all it did was immediately raise an exception. Get rid of
those variables, and replace the setstate with direct exception throwing
(which also removes some dead code).
As a result, good() is never reached after a failure (there are only 2
calls, one of which is in tests), and can just be replaced by !eof().
fail(), clear(n) and exceptions() are just never called. Delete them.
`&vch[vch.size()]` and even `&vch[0]` on vectors can cause assertion
errors with VC in debug mode. This is the problem mentioned in #4239.
The deeper problem with this is that we rely on undefined behavior.
- Add `begin_ptr` and `end_ptr` functions that get the beginning and end
pointer of vector in a reliable way that copes with empty vectors and
doesn't reference outside the vector
(see https://stackoverflow.com/questions/1339470/how-to-get-the-address-of-the-stdvector-buffer-start-most-elegantly/1339767#1339767).
- Add a convenience constructor to CFlatData that wraps a vector.
I added `begin_ptr` and `end_ptr` as separate functions as I imagine
they will be useful in more places.
Use misc methods of avoiding unnecesary header includes.
Replace int typedefs with int##_t from stdint.h.
Replace PRI64[xdu] with PRI[xdu]64 from inttypes.h.
Normalize QT_VERSION ifs where possible.
Resolve some indirect dependencies as direct ones.
Remove extern declarations from .cpp files.
Changed CDataStream::GetAndClear() to use the most obvious
get get and clear instead of a tricky swap().
Added a unit test for CDataStream insert/erase/GetAndClear.
Note: GetAndClear() is not performance critical, it is used only
by the send-a-message-to-the-network code. Bug was not noticed
before now because the send-a-message code never erased from the
stream.
This seems to cause problems on recent clang, and looks totally
redundant and unused.
The const_iterator version is identical to the vector::const_iterator
one (which is a typedef thereof). Marking it private (instead of
removing) compiles fine, so this version is effectively unused even.
The length of vectors, maps, sets, etc are serialized using
Write/ReadCompactSize -- which, unfortunately, do not use a
unique encoding.
So deserializing and then re-serializing a transaction (for example)
can give you different bits than you started with. That doesn't
cause any problems that we are aware of, but it is exactly the type
of subtle mismatch that can lead to exploits.
With this pull, reading a non-canonical CompactSize throws an
exception, which means nodes will ignore 'tx' or 'block' or
other messages that are not properly encoded.
Please check my logic... but this change is safe with respect to
causing a network split. Old clients that receive
non-canonically-encoded transactions or blocks deserialize
them into CTransaction/CBlock structures in memory, and then
re-serialize them before relaying them to peers.
And please check my logic with respect to causing a blockchain
split: there are no CompactSize fields in the block header, so
the block hash is always canonical. The merkle root in the block
header is computed on a vector<CTransaction>, so
any non-canonical encoding of the transactions in 'tx' or 'block'
messages is erased as they are read into memory by old clients,
and does not affect the block hash. And, as noted above, old
clients re-serialize (with canonical encoding) 'tx' and 'block'
messages before relaying to peers.
Variable-length integers: bytes are a MSB base-128 encoding of the number.
The high bit in each byte signifies whether another digit follows. To make
the encoding is one-to-one, one is subtracted from all but the last digit.
Thus, the byte sequence a[] with length len, where all but the last byte
has bit 128 set, encodes the number:
(a[len-1] & 0x7F) + sum(i=1..len-1, 128^i*((a[len-i-1] & 0x7F)+1))
Properties:
* Very small (0-127: 1 byte, 128-16511: 2 bytes, 16512-2113663: 3 bytes)
* Every integer has exactly one encoding
* Encoding does not depend on size of original integer type
We're in a wholly different world now, C++-compiler-wise.
Current std::stringstream implementations don't have the stated problem anymore,
and are just as fast as CDataStream.
The #ifdef'd block does not even compile anymore; CDataStream constructor changed,
and missing some std::. Also timing in whole seconds is also way too granular
to say anything sensible in such microbenchmarks. Just remove it,
it can always be found again in git history.
This commit removes the dependency of serialize.h on PROTOCOL_VERSION,
and makes this parameter required instead of implicit. This is much saner,
as it makes the places where changing a version number can have an
influence obvious.
* move PROTOCOL_VERSION to version.h
* move CLIENT_VERSION* to version.h, make available past cpp stage
* clearly separate client, network version portions of version.h
This turns on most gcc warnings, and removes some unused variables and other code that triggers warnings.
Exceptions are:
-Wno-sign-compare : triggered by lots of comparisons of signed integer to foo.size(), which is unsigned.
-Wno-char-subscripts : triggered by the convert-to-hex functions (I may fix this in a future commit).
Replaced all occurrences of #if* __WXMSW__ with WIN32,
and all occurrences of __WXMAC_OSX__ with MAC_OSX, and made
sure those are defined appropriately in the makefile and bitcoin-qt.pro.
util.h doesn't use REF, serialize.h does, creating a dependency of
serialize.h on util.h, but util.h already depends on serialize.h. To
resolve this circular dependency the function 'REF' has now been moved
closer to one of its two points of use.
Signed-off-by: Giel van Schijndel <me@mortis.eu>