mirror of
https://github.com/bitcoin/bitcoin.git
synced 2025-04-29 14:59:39 -04:00
Merge bitcoin/bitcoin#31551: [IBD] batch block reads/writes during AutoFile serialization
8d801e3efb optimization: bulk serialization writes in `WriteBlockUndo` and `WriteBlock` (Lőrinc)
520965e293 optimization: bulk serialization reads in `UndoRead`, `ReadBlock` (Lőrinc)
056cb3c0d2 refactor: clear up blockstorage/streams in preparation for optimization (Lőrinc)
67fcc64802 log: unify error messages for (read/write)[undo]block (Lőrinc)
a4de160492 scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant (Lőrinc)
6640dd52c9 Narrow scope of undofile write to avoid possible resource management issue (Lőrinc)
3197155f91 refactor: collect block read operations into try block (Lőrinc)
c77e3107b8 refactor: rename leftover WriteBlockBench (Lőrinc)

Pull request description:

This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043).

### Summary

We can serialize blocks and undo data to any `Stream` that implements the appropriate read/write methods. `AutoFile` is one of these, writing the results "directly" to disk (through the OS file cache). Batching these serializations in memory first, and only then reading/writing them to disk, is measurably faster (likely because of fewer native `fread`/`fwrite` calls or less locking, as [observed](https://github.com/bitcoin/bitcoin/pull/28226#issuecomment-1666842501) by Martinus in a similar change).

### Unlocking new optimization opportunities

Buffered writes will also enable batched obfuscation calculations (implemented in https://github.com/bitcoin/bitcoin/pull/31144): currently we have to copy the write input's `std::span` in order to obfuscate it, whereas batching lets us run those operations on the internal buffer directly.
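The summary above attributes the gains to replacing many small native `fwrite` calls with a few large ones. A standalone sketch of that batching idea (the class and function names here are our own illustration, not the PR's actual `BufferedWriter`):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Toy sketch of the batching idea: accumulate many small serialization writes
// in one memory buffer, then flush with a single fwrite, instead of calling
// fwrite once per field.
class ToyBufferedWriter {
    std::FILE* m_file;
    std::vector<unsigned char> m_buf;
    std::size_t m_pos{0};

public:
    ToyBufferedWriter(std::FILE* file, std::size_t size) : m_file{file}, m_buf(size) {}
    ~ToyBufferedWriter() { flush(); }

    void flush() {
        if (m_pos) std::fwrite(m_buf.data(), 1, m_pos, m_file);
        m_pos = 0;
    }

    void write(const unsigned char* src, std::size_t len) {
        while (len) {
            const std::size_t n = std::min(len, m_buf.size() - m_pos);
            std::memcpy(m_buf.data() + m_pos, src, n);
            m_pos += n; src += n; len -= n;
            if (m_pos == m_buf.size()) flush(); // touch the file only when the buffer fills
        }
    }
};

// Write `n` counting bytes through the buffered writer, then read them back.
std::vector<unsigned char> roundtrip(std::size_t n, std::size_t buf_size) {
    std::FILE* f = std::tmpfile();
    if (!f) return {};
    {
        ToyBufferedWriter w{f, buf_size};
        for (std::size_t i = 0; i < n; ++i) {
            const unsigned char b = static_cast<unsigned char>(i);
            w.write(&b, 1); // many tiny writes, few fwrite calls
        }
    } // destructor flushes the tail
    std::fseek(f, 0, SEEK_SET);
    std::vector<unsigned char> out(n);
    out.resize(std::fread(out.data(), 1, n, f));
    std::fclose(f);
    return out;
}
```

The same data reaches the file either way; the buffered path just issues far fewer I/O calls, which is where the measured speedup comes from.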
### Measurements (micro benchmarks, full IBDs and reindexes)

Microbenchmarks for `[Read|Write]BlockBench` show a ~**30%**/**68%** speedup with `macOS/Clang`, and ~**19%**/**24%** with `Linux/GCC` (the follow-up XOR batching improves these further):

<details>
<summary>macOS Sequoia - Clang 19.1.7</summary>

> Before:

|               ns/op |                op/s |    err% |     total | benchmark |
|--------------------:|--------------------:|--------:|----------:|:----------|
|        2,271,441.67 |              440.25 |    0.1% |     11.00 | `ReadBlockBench` |
|        5,149,564.31 |              194.19 |    0.8% |     10.95 | `WriteBlockBench` |

> After:

|               ns/op |                op/s |    err% |     total | benchmark |
|--------------------:|--------------------:|--------:|----------:|:----------|
|        1,738,683.04 |              575.15 |    0.2% |     11.04 | `ReadBlockBench` |
|        3,052,658.88 |              327.58 |    1.0% |     10.91 | `WriteBlockBench` |

</details>

<details>
<summary>Ubuntu 24 - GNU 13.3.0</summary>

> Before:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark |
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------|
|        6,895,987.11 |              145.01 |    0.0% |   71,055,269.86 |   23,977,374.37 |  2.963 |   5,074,828.78 |    0.4% |     22.00 | `ReadBlockBench` |
|        5,152,973.58 |              194.06 |    2.2% |   19,350,886.41 |    8,784,539.75 |  2.203 |   3,079,335.21 |    0.4% |     23.18 | `WriteBlockBench` |

> After:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark |
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------|
|        5,771,882.71 |              173.25 |    0.0% |   65,741,889.82 |   20,453,232.33 |  3.214 |   3,971,321.75 |    0.3% |     22.01 | `ReadBlockBench` |
|        4,145,681.13 |              241.21 |    4.0% |   15,337,596.85 |    5,732,186.47 |  2.676 |   2,239,662.64 |    0.1% |     23.94 | `WriteBlockBench` |

</details>

2 full IBD runs against master (compiled with GCC, where the gains seem more modest) for **888888** blocks (seeded from real nodes) indicate a ~**7%** total speedup.

<details>
<summary>Details</summary>

```bash
COMMITS="d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a 652b4e3de5c5e09fb812abe265f4a8946fa96b54"; \
STOP_HEIGHT=888888; DBCACHE=1000; \
C_COMPILER=gcc; CXX_COMPILER=g++; \
BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
(for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
hyperfine \
  --sort 'command' \
  --runs 2 \
  --export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$C_COMPILER.json" \
  --parameter-list COMMIT ${COMMITS// /,} \
  --prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
    cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF -DCMAKE_C_COMPILER=$C_COMPILER -DCMAKE_CXX_COMPILER=$CXX_COMPILER && \
    cmake --build build -j$(nproc) --target bitcoind && \
    ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
  --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
  "COMPILER=$C_COMPILER COMMIT=${COMMIT:0:10} ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
```

```
d2b72b1369 refactor: rename leftover WriteBlockBench
652b4e3de5 optimization: Bulk serialization writes in `WriteBlockUndo` and `WriteBlock`

Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b1369)
  Time (mean ± σ):     41528.104 s ± 354.003 s    [User: 44324.407 s, System: 3074.829 s]
  Range (min … max):   41277.786 s … 41778.421 s    2 runs

Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5)
  Time (mean ± σ):     38771.457 s ± 441.941 s    [User: 41930.651 s, System: 3222.664 s]
  Range (min … max):   38458.957 s … 39083.957 s    2 runs

Relative speed comparison
  1.07 ± 0.02  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b1369)
  1.00         COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5)
```

</details>

ACKs for top commit:
  maflcko:
    re-ACK 8d801e3efb 🐦
  achow101:
    ACK 8d801e3efb
  ryanofsky:
    Code review ACK 8d801e3efb. Most notable change is switching from BufferedReader to ReadRawBlock for block reads, which makes sense, and there are also various cleanups in blockstorage and test code.
  hodlinator:
    re-ACK 8d801e3efb

Tree-SHA512: 24e1dee653b927b760c0ba3c69d1aba15fa5d9c4536ad11cfc2d70196ae16b9228ecc3056eef70923364257d72dc929882e73e69c6c426e28139d31299d08adc
This commit is contained in:

commit 33df4aebae

7 changed files with 332 additions and 88 deletions
```diff
@@ -27,7 +27,7 @@ static CBlock CreateTestBlock()
     return block;
 }
 
-static void SaveBlockBench(benchmark::Bench& bench)
+static void WriteBlockBench(benchmark::Bench& bench)
 {
     const auto testing_setup{MakeNoLogFileContext<const TestingSetup>(ChainType::MAIN)};
     auto& blockman{testing_setup->m_node.chainman->m_blockman};
@@ -63,6 +63,6 @@ static void ReadRawBlockBench(benchmark::Bench& bench)
     });
 }
 
-BENCHMARK(SaveBlockBench, benchmark::PriorityLevel::HIGH);
+BENCHMARK(WriteBlockBench, benchmark::PriorityLevel::HIGH);
 BENCHMARK(ReadBlockBench, benchmark::PriorityLevel::HIGH);
 BENCHMARK(ReadRawBlockBench, benchmark::PriorityLevel::HIGH);
```
```diff
@@ -529,7 +529,7 @@ bool BlockManager::LoadBlockIndexDB(const std::optional<uint256>& snapshot_block
     }
     for (std::set<int>::iterator it = setBlkDataFiles.begin(); it != setBlkDataFiles.end(); it++) {
         FlatFilePos pos(*it, 0);
-        if (OpenBlockFile(pos, true).IsNull()) {
+        if (OpenBlockFile(pos, /*fReadOnly=*/true).IsNull()) {
             return false;
         }
     }
@@ -660,27 +660,30 @@ bool BlockManager::ReadBlockUndo(CBlockUndo& blockundo, const CBlockIndex& index
     const FlatFilePos pos{WITH_LOCK(::cs_main, return index.GetUndoPos())};
 
     // Open history file to read
-    AutoFile filein{OpenUndoFile(pos, true)};
-    if (filein.IsNull()) {
-        LogError("OpenUndoFile failed for %s", pos.ToString());
+    AutoFile file{OpenUndoFile(pos, true)};
+    if (file.IsNull()) {
+        LogError("OpenUndoFile failed for %s while reading block undo", pos.ToString());
         return false;
     }
+    BufferedReader filein{std::move(file)};
 
-    // Read block
-    uint256 hashChecksum;
-    HashVerifier verifier{filein}; // Use HashVerifier as reserializing may lose data, c.f. commit d342424301013ec47dc146a4beb49d5c9319d80a
     try {
+        // Read block
+        HashVerifier verifier{filein}; // Use HashVerifier, as reserializing may lose data, c.f. commit d3424243
+
         verifier << index.pprev->GetBlockHash();
         verifier >> blockundo;
+
+        uint256 hashChecksum;
         filein >> hashChecksum;
-    } catch (const std::exception& e) {
-        LogError("%s: Deserialize or I/O error - %s at %s\n", __func__, e.what(), pos.ToString());
-        return false;
-    }
 
-    // Verify checksum
-    if (hashChecksum != verifier.GetHash()) {
-        LogError("%s: Checksum mismatch at %s\n", __func__, pos.ToString());
+        // Verify checksum
+        if (hashChecksum != verifier.GetHash()) {
+            LogError("Checksum mismatch at %s while reading block undo", pos.ToString());
+            return false;
+        }
+    } catch (const std::exception& e) {
+        LogError("Deserialize or I/O error - %s at %s while reading block undo", e.what(), pos.ToString());
         return false;
     }
@@ -931,29 +934,34 @@ bool BlockManager::WriteBlockUndo(const CBlockUndo& blockundo, BlockValidationSt
     // Write undo information to disk
     if (block.GetUndoPos().IsNull()) {
         FlatFilePos pos;
-        const unsigned int blockundo_size{static_cast<unsigned int>(GetSerializeSize(blockundo))};
+        const auto blockundo_size{static_cast<uint32_t>(GetSerializeSize(blockundo))};
         if (!FindUndoPos(state, block.nFile, pos, blockundo_size + UNDO_DATA_DISK_OVERHEAD)) {
-            LogError("FindUndoPos failed");
+            LogError("FindUndoPos failed for %s while writing block undo", pos.ToString());
             return false;
         }
 
         {
             // Open history file to append
-            AutoFile fileout{OpenUndoFile(pos)};
-            if (fileout.IsNull()) {
-                LogError("OpenUndoFile failed");
+            AutoFile file{OpenUndoFile(pos)};
+            if (file.IsNull()) {
+                LogError("OpenUndoFile failed for %s while writing block undo", pos.ToString());
                 return FatalError(m_opts.notifications, state, _("Failed to write undo data."));
             }
+            BufferedWriter fileout{file};
 
             // Write index header
             fileout << GetParams().MessageStart() << blockundo_size;
-            // Write undo data
-            pos.nPos += BLOCK_SERIALIZATION_HEADER_SIZE;
-            fileout << blockundo;
-
-            // Calculate & write checksum
-            HashWriter hasher{};
-            hasher << block.pprev->GetBlockHash();
-            hasher << blockundo;
-            fileout << hasher.GetHash();
+            pos.nPos += STORAGE_HEADER_BYTES;
+            {
+                // Calculate checksum
+                HashWriter hasher{};
+                hasher << block.pprev->GetBlockHash() << blockundo;
+                // Write undo data & checksum
+                fileout << blockundo << hasher.GetHash();
+            }
+
+            fileout.flush(); // Make sure `AutoFile`/`BufferedWriter` go out of scope before we call `FlushUndoFile`
         }
 
     // rev files are written in block height order, whereas blk files are written as blocks come in (often out of order)
     // we want to flush the rev (undo) file once we've written the last block, which is indicated by the last height
@@ -986,29 +994,28 @@ bool BlockManager::ReadBlock(CBlock& block, const FlatFilePos& pos) const
     block.SetNull();
 
-    // Open history file to read
-    AutoFile filein{OpenBlockFile(pos, true)};
-    if (filein.IsNull()) {
-        LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
+    std::vector<uint8_t> block_data;
+    if (!ReadRawBlock(block_data, pos)) {
         return false;
     }
 
-    // Read block
     try {
-        filein >> TX_WITH_WITNESS(block);
+        // Read block
+        SpanReader{block_data} >> TX_WITH_WITNESS(block);
     } catch (const std::exception& e) {
-        LogError("%s: Deserialize or I/O error - %s at %s\n", __func__, e.what(), pos.ToString());
+        LogError("Deserialize or I/O error - %s at %s while reading block", e.what(), pos.ToString());
         return false;
     }
 
     // Check the header
     if (!CheckProofOfWork(block.GetHash(), block.nBits, GetConsensus())) {
-        LogError("%s: Errors in block header at %s\n", __func__, pos.ToString());
+        LogError("Errors in block header at %s while reading block", pos.ToString());
         return false;
     }
 
     // Signet only: check block solution
     if (GetConsensus().signet_blocks && !CheckSignetBlockSolution(block, GetConsensus())) {
-        LogError("%s: Errors in block solution at %s\n", __func__, pos.ToString());
+        LogError("Errors in block solution at %s while reading block", pos.ToString());
         return false;
     }
@@ -1023,7 +1030,7 @@ bool BlockManager::ReadBlock(CBlock& block, const CBlockIndex& index) const
         return false;
     }
     if (block.GetHash() != index.GetBlockHash()) {
-        LogError("%s: GetHash() doesn't match index for %s at %s\n", __func__, index.ToString(), block_pos.ToString());
+        LogError("GetHash() doesn't match index for %s at %s while reading block", index.ToString(), block_pos.ToString());
         return false;
     }
     return true;
@@ -1031,17 +1038,16 @@ bool BlockManager::ReadBlock(CBlock& block, const CBlockIndex& index) const
 
 bool BlockManager::ReadRawBlock(std::vector<uint8_t>& block, const FlatFilePos& pos) const
 {
-    FlatFilePos hpos = pos;
-    // If nPos is less than 8 the pos is null and we don't have the block data
-    // Return early to prevent undefined behavior of unsigned int underflow
-    if (hpos.nPos < 8) {
-        LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
+    if (pos.nPos < STORAGE_HEADER_BYTES) {
+        // If nPos is less than STORAGE_HEADER_BYTES, we can't read the header that precedes the block data
+        // This would cause an unsigned integer underflow when trying to position the file cursor
+        // This can happen after pruning or default constructed positions
+        LogError("Failed for %s while reading raw block storage header", pos.ToString());
         return false;
     }
-    hpos.nPos -= 8; // Seek back 8 bytes for meta header
-    AutoFile filein{OpenBlockFile(hpos, true)};
+    AutoFile filein{OpenBlockFile({pos.nFile, pos.nPos - STORAGE_HEADER_BYTES}, /*fReadOnly=*/true)};
     if (filein.IsNull()) {
-        LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
+        LogError("OpenBlockFile failed for %s while reading raw block", pos.ToString());
         return false;
     }
@@ -1052,22 +1058,21 @@ bool BlockManager::ReadRawBlock(std::vector<uint8_t>& block, const FlatFilePos&
         filein >> blk_start >> blk_size;
 
         if (blk_start != GetParams().MessageStart()) {
-            LogError("%s: Block magic mismatch for %s: %s versus expected %s\n", __func__, pos.ToString(),
-                     HexStr(blk_start),
-                     HexStr(GetParams().MessageStart()));
+            LogError("Block magic mismatch for %s: %s versus expected %s while reading raw block",
+                     pos.ToString(), HexStr(blk_start), HexStr(GetParams().MessageStart()));
             return false;
         }
 
         if (blk_size > MAX_SIZE) {
-            LogError("%s: Block data is larger than maximum deserialization size for %s: %s versus %s\n", __func__, pos.ToString(),
-                     blk_size, MAX_SIZE);
+            LogError("Block data is larger than maximum deserialization size for %s: %s versus %s while reading raw block",
+                     pos.ToString(), blk_size, MAX_SIZE);
             return false;
         }
 
         block.resize(blk_size); // Zeroing of memory is intentional here
         filein.read(MakeWritableByteSpan(block));
     } catch (const std::exception& e) {
-        LogError("%s: Read from block file failed: %s for %s\n", __func__, e.what(), pos.ToString());
+        LogError("Read from block file failed: %s for %s while reading raw block", e.what(), pos.ToString());
         return false;
     }
@@ -1077,22 +1082,23 @@ bool BlockManager::ReadRawBlock(std::vector<uint8_t>& block, const FlatFilePos&
 FlatFilePos BlockManager::WriteBlock(const CBlock& block, int nHeight)
 {
     const unsigned int block_size{static_cast<unsigned int>(GetSerializeSize(TX_WITH_WITNESS(block)))};
-    FlatFilePos pos{FindNextBlockPos(block_size + BLOCK_SERIALIZATION_HEADER_SIZE, nHeight, block.GetBlockTime())};
+    FlatFilePos pos{FindNextBlockPos(block_size + STORAGE_HEADER_BYTES, nHeight, block.GetBlockTime())};
     if (pos.IsNull()) {
-        LogError("FindNextBlockPos failed");
+        LogError("FindNextBlockPos failed for %s while writing block", pos.ToString());
         return FlatFilePos();
     }
-    AutoFile fileout{OpenBlockFile(pos)};
-    if (fileout.IsNull()) {
-        LogError("OpenBlockFile failed");
+    AutoFile file{OpenBlockFile(pos, /*fReadOnly=*/false)};
+    if (file.IsNull()) {
+        LogError("OpenBlockFile failed for %s while writing block", pos.ToString());
         m_opts.notifications.fatalError(_("Failed to write block."));
         return FlatFilePos();
     }
+    BufferedWriter fileout{file};
 
     // Write index header
     fileout << GetParams().MessageStart() << block_size;
+    pos.nPos += STORAGE_HEADER_BYTES;
     // Write block
-    pos.nPos += BLOCK_SERIALIZATION_HEADER_SIZE;
     fileout << TX_WITH_WITNESS(block);
     return pos;
 }
@@ -1201,7 +1207,7 @@ void ImportBlocks(ChainstateManager& chainman, std::span<const fs::path> import_
         if (!fs::exists(chainman.m_blockman.GetBlockPosFilename(pos))) {
             break; // No block files left to reindex
         }
-        AutoFile file{chainman.m_blockman.OpenBlockFile(pos, true)};
+        AutoFile file{chainman.m_blockman.OpenBlockFile(pos, /*fReadOnly=*/true)};
         if (file.IsNull()) {
            break; // This error is logged in OpenBlockFile
         }
```
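For context on the position check in `ReadRawBlock` above: every block on disk is preceded by an 8-byte storage header of 4 network magic bytes plus a 4-byte little-endian length, which is why positions smaller than `STORAGE_HEADER_BYTES` are rejected before seeking back. A hypothetical parser for that layout (constants and names here are ours; `f9 be b4 d9` is the mainnet magic, and 285 is the serialized size of the genesis block):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// 4 magic bytes + 4-byte little-endian payload size precede each stored block.
constexpr std::size_t kStorageHeaderBytes{4 + 4};

struct StorageHeader {
    std::array<std::uint8_t, 4> magic;
    std::uint32_t size;
};

StorageHeader parse_header(const std::uint8_t* p) {
    StorageHeader h{};
    for (int i = 0; i < 4; ++i) h.magic[i] = p[i];
    // The size field is serialized little-endian
    h.size = std::uint32_t{p[4]} | std::uint32_t{p[5]} << 8 |
             std::uint32_t{p[6]} << 16 | std::uint32_t{p[7]} << 24;
    return h;
}

// Example header: mainnet magic f9 be b4 d9 followed by size 285 (0x0000011d LE)
constexpr std::array<std::uint8_t, 8> kSampleHeader{0xf9, 0xbe, 0xb4, 0xd9, 0x1d, 0x01, 0x00, 0x00};
```

Reading a raw block at position `p` therefore means seeking to `p - 8`, checking the magic, reading the length, then reading that many payload bytes.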
```diff
@@ -75,10 +75,10 @@ static const unsigned int UNDOFILE_CHUNK_SIZE = 0x100000; // 1 MiB
 static const unsigned int MAX_BLOCKFILE_SIZE = 0x8000000; // 128 MiB
 
 /** Size of header written by WriteBlock before a serialized CBlock (8 bytes) */
-static constexpr size_t BLOCK_SERIALIZATION_HEADER_SIZE{std::tuple_size_v<MessageStartChars> + sizeof(unsigned int)};
+static constexpr uint32_t STORAGE_HEADER_BYTES{std::tuple_size_v<MessageStartChars> + sizeof(unsigned int)};
 
 /** Total overhead when writing undo data: header (8 bytes) plus checksum (32 bytes) */
-static constexpr size_t UNDO_DATA_DISK_OVERHEAD{BLOCK_SERIALIZATION_HEADER_SIZE + uint256::size()};
+static constexpr uint32_t UNDO_DATA_DISK_OVERHEAD{STORAGE_HEADER_BYTES + uint256::size()};
 
 // Because validation code takes pointers to the map's CBlockIndex objects, if
 // we ever switch to another associative container, we need to either use a
@@ -164,7 +164,7 @@ private:
  * blockfile info, and checks if there is enough disk space to save the block.
  *
  * The nAddSize argument passed to this function should include not just the size of the serialized CBlock, but also the size of
- * separator fields (BLOCK_SERIALIZATION_HEADER_SIZE).
+ * separator fields (STORAGE_HEADER_BYTES).
  */
 [[nodiscard]] FlatFilePos FindNextBlockPos(unsigned int nAddSize, unsigned int nHeight, uint64_t nTime);
 [[nodiscard]] bool FlushChainstateBlockFile(int tip_height);
@@ -400,7 +400,7 @@ public:
     void UpdatePruneLock(const std::string& name, const PruneLockInfo& lock_info) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
 
     /** Open a block file (blk?????.dat) */
-    AutoFile OpenBlockFile(const FlatFilePos& pos, bool fReadOnly = false) const;
+    AutoFile OpenBlockFile(const FlatFilePos& pos, bool fReadOnly) const;
 
     /** Translation to a filesystem path */
     fs::path GetBlockPosFilename(const FlatFilePos& pos) const;
```
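The renamed constants encode simple arithmetic: 4 magic bytes plus a 4-byte size field make the 8-byte storage header, and adding the 32-byte `uint256` checksum that follows undo data gives the 40-byte undo overhead. Restated numerically with our own names (upstream, `std::tuple_size_v<MessageStartChars>` is 4 and `uint256::size()` is 32):

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kMessageStartBytes{4};                  // network magic
constexpr std::uint32_t kSizeFieldBytes{sizeof(std::uint32_t)}; // serialized block size field
constexpr std::uint32_t kStorageHeaderBytes{kMessageStartBytes + kSizeFieldBytes};
constexpr std::uint32_t kChecksumBytes{32};                     // uint256 checksum after undo data
constexpr std::uint32_t kUndoDataDiskOverhead{kStorageHeaderBytes + kChecksumBytes};

static_assert(kStorageHeaderBytes == 8);
static_assert(kUndoDataDiskOverhead == 40);
```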
```diff
@@ -87,21 +87,29 @@ void AutoFile::write(std::span<const std::byte> src)
         }
         if (m_position.has_value()) *m_position += src.size();
     } else {
         if (!m_position.has_value()) throw std::ios_base::failure("AutoFile::write: position unknown");
         std::array<std::byte, 4096> buf;
-        while (src.size() > 0) {
+        while (src.size()) {
             auto buf_now{std::span{buf}.first(std::min<size_t>(src.size(), buf.size()))};
-            std::copy(src.begin(), src.begin() + buf_now.size(), buf_now.begin());
-            util::Xor(buf_now, m_xor, *m_position);
-            if (std::fwrite(buf_now.data(), 1, buf_now.size(), m_file) != buf_now.size()) {
-                throw std::ios_base::failure{"XorFile::write: failed"};
-            }
+            std::copy_n(src.begin(), buf_now.size(), buf_now.begin());
+            write_buffer(buf_now);
             src = src.subspan(buf_now.size());
-            *m_position += buf_now.size();
         }
     }
 }
 
+void AutoFile::write_buffer(std::span<std::byte> src)
+{
+    if (!m_file) throw std::ios_base::failure("AutoFile::write_buffer: file handle is nullptr");
+    if (m_xor.size()) {
+        if (!m_position) throw std::ios_base::failure("AutoFile::write_buffer: obfuscation position unknown");
+        util::Xor(src, m_xor, *m_position); // obfuscate in-place
+    }
+    if (std::fwrite(src.data(), 1, src.size(), m_file) != src.size()) {
+        throw std::ios_base::failure("AutoFile::write_buffer: write failed");
+    }
+    if (m_position) *m_position += src.size();
+}
+
 bool AutoFile::Commit()
 {
     return ::FileCommit(m_file);
```
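The `write_buffer` path above obfuscates in place with `util::Xor`, keyed by absolute file position, so processing a buffer chunk by chunk yields the same bytes as one big pass as long as the position is carried across chunks. A simplified stand-in (our own names, not the codebase's) demonstrating that the position-keyed XOR is chunking-invariant and self-inverse:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// XOR `data` against `key`, indexing the key by absolute file position.
void xor_at(std::vector<std::uint8_t>& data, const std::vector<std::uint8_t>& key, std::size_t file_pos) {
    if (key.empty()) return;
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] ^= key[(file_pos + i) % key.size()];
    }
}

// Obfuscate in `chunk`-sized pieces, advancing the position like write_buffer does.
std::vector<std::uint8_t> xor_chunked(std::vector<std::uint8_t> data, const std::vector<std::uint8_t>& key, std::size_t chunk) {
    std::size_t pos{0};
    while (pos < data.size()) {
        const std::size_t n = std::min(chunk, data.size() - pos);
        std::vector<std::uint8_t> part(data.begin() + pos, data.begin() + pos + n);
        xor_at(part, key, pos); // key offset follows the absolute position
        std::copy(part.begin(), part.end(), data.begin() + pos);
        pos += n;
    }
    return data;
}

// A fixed example key for the checks below (block files use an 8-byte key).
const std::vector<std::uint8_t> kKey{1, 2, 3, 4, 5, 6, 7, 8};
```

Because the operation is its own inverse, applying it twice at the same positions recovers the original data, regardless of the chunk sizes used on each pass.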
```diff
@@ -445,6 +445,9 @@ public:
     /** Wrapper around TruncateFile(). */
     bool Truncate(unsigned size);
 
+    //! Write a mutable buffer more efficiently than write(), obfuscating the buffer in-place.
+    void write_buffer(std::span<std::byte> src);
+
     //
     // Stream subset
     //
@@ -467,6 +470,8 @@ public:
     }
 };
 
+using DataBuffer = std::vector<std::byte>;
+
 /** Wrapper around an AutoFile& that implements a ring buffer to
  * deserialize from. It guarantees the ability to rewind a given number of bytes.
  *
@@ -481,7 +486,7 @@ private:
     uint64_t m_read_pos{0}; //!< how many bytes have been read from this
     uint64_t nReadLimit;    //!< up to which position we're allowed to read
     uint64_t nRewind;       //!< how many bytes we guarantee to rewind
-    std::vector<std::byte> vchBuf; //!< the buffer
+    DataBuffer vchBuf;      //!< the buffer
 
     //! read data from the source to fill the buffer
     bool Fill() {
@@ -523,7 +528,7 @@ private:
     }
 
 public:
-    BufferedFile(AutoFile& file, uint64_t nBufSize, uint64_t nRewindIn)
+    BufferedFile(AutoFile& file LIFETIMEBOUND, uint64_t nBufSize, uint64_t nRewindIn)
         : m_src{file}, nReadLimit{std::numeric_limits<uint64_t>::max()}, nRewind{nRewindIn}, vchBuf(nBufSize, std::byte{0})
     {
         if (nRewindIn >= nBufSize)
@@ -614,4 +619,87 @@ public:
     }
 };
 
+/**
+ * Wrapper that buffers reads from an underlying stream.
+ * Requires underlying stream to support read() and detail_fread() calls
+ * to support fixed-size and variable-sized reads, respectively.
+ */
+template <typename S>
+class BufferedReader
+{
+    S& m_src;
+    DataBuffer m_buf;
+    size_t m_buf_pos;
+
+public:
+    //! Requires stream ownership to prevent leaving the stream at an unexpected position after buffered reads.
+    explicit BufferedReader(S&& stream LIFETIMEBOUND, size_t size = 1 << 16)
+        requires std::is_rvalue_reference_v<S&&>
+        : m_src{stream}, m_buf(size), m_buf_pos{size} {}
+
+    void read(std::span<std::byte> dst)
+    {
+        if (const auto available{std::min(dst.size(), m_buf.size() - m_buf_pos)}) {
+            std::copy_n(m_buf.begin() + m_buf_pos, available, dst.begin());
+            m_buf_pos += available;
+            dst = dst.subspan(available);
+        }
+        if (dst.size()) {
+            assert(m_buf_pos == m_buf.size());
+            m_src.read(dst);
+
+            m_buf_pos = 0;
+            m_buf.resize(m_src.detail_fread(m_buf));
+        }
+    }
+
+    template <typename T>
+    BufferedReader& operator>>(T&& obj)
+    {
+        Unserialize(*this, obj);
+        return *this;
+    }
+};
+
+/**
+ * Wrapper that buffers writes to an underlying stream.
+ * Requires underlying stream to support write_buffer() method
+ * for efficient buffer flushing and obfuscation.
+ */
+template <typename S>
+class BufferedWriter
+{
+    S& m_dst;
+    DataBuffer m_buf;
+    size_t m_buf_pos{0};
+
+public:
+    explicit BufferedWriter(S& stream LIFETIMEBOUND, size_t size = 1 << 16) : m_dst{stream}, m_buf(size) {}
+
+    ~BufferedWriter() { flush(); }
+
+    void flush()
+    {
+        if (m_buf_pos) m_dst.write_buffer(std::span{m_buf}.first(m_buf_pos));
+        m_buf_pos = 0;
+    }
+
+    void write(std::span<const std::byte> src)
+    {
+        while (const auto available{std::min(src.size(), m_buf.size() - m_buf_pos)}) {
+            std::copy_n(src.begin(), available, m_buf.begin() + m_buf_pos);
+            m_buf_pos += available;
+            if (m_buf_pos == m_buf.size()) flush();
+            src = src.subspan(available);
+        }
+    }
+
+    template <typename T>
+    BufferedWriter& operator<<(const T& obj)
+    {
+        Serialize(*this, obj);
+        return *this;
+    }
+};
+
 #endif // BITCOIN_STREAMS_H
```
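The `BufferedReader` introduced in streams.h serves small reads from its internal buffer, passes an oversized remainder straight through to the source, and only then refills the buffer. A toy model of that read path over an in-memory source (names are ours; the real class templates over any stream providing `read()`/`detail_fread()`):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// In-memory stand-in for the underlying stream.
struct VecSource {
    const std::vector<std::uint8_t>& data;
    std::size_t pos{0};
    void read(std::uint8_t* dst, std::size_t n) { // exact read; caller sized it correctly
        std::copy_n(data.begin() + pos, n, dst);
        pos += n;
    }
    std::size_t fread_up_to(std::uint8_t* dst, std::size_t n) { // like detail_fread: may return short
        const std::size_t got = std::min(n, data.size() - pos);
        read(dst, got);
        return got;
    }
};

class ToyBufferedReader {
    VecSource& m_src;
    std::vector<std::uint8_t> m_buf;
    std::size_t m_buf_pos;

public:
    ToyBufferedReader(VecSource& src, std::size_t size) : m_src{src}, m_buf(size), m_buf_pos{size} {}

    void read(std::uint8_t* dst, std::size_t n) {
        if (const std::size_t available = std::min(n, m_buf.size() - m_buf_pos)) {
            std::copy_n(m_buf.begin() + m_buf_pos, available, dst); // serve from the buffer first
            m_buf_pos += available;
            dst += available;
            n -= available;
        }
        if (n) {
            m_src.read(dst, n); // oversized remainder bypasses the buffer
            m_buf_pos = 0;      // then refill for subsequent small reads
            m_buf.resize(m_src.fread_up_to(m_buf.data(), m_buf.size()));
        }
    }
};

std::vector<std::uint8_t> counting_bytes(std::size_t n) {
    std::vector<std::uint8_t> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<std::uint8_t>(i);
    return v;
}

// Read all of `data` in `step`-sized pieces through the buffered reader.
std::vector<std::uint8_t> buffered_read_all(const std::vector<std::uint8_t>& data, std::size_t buf_size, std::size_t step) {
    VecSource src{data};
    ToyBufferedReader reader{src, buf_size};
    std::vector<std::uint8_t> out(data.size());
    for (std::size_t off = 0; off < out.size(); off += step) {
        reader.read(out.data() + off, std::min(step, out.size() - off));
    }
    return out;
}
```

Whatever mix of buffer size and read size is used, the bytes delivered must match a direct sequential read, which is exactly the property the new `streams_tests` cases check against `AutoFile`.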
```diff
@@ -17,7 +17,7 @@
 #include <test/util/logging.h>
 #include <test/util/setup_common.h>
 
-using node::BLOCK_SERIALIZATION_HEADER_SIZE;
+using node::STORAGE_HEADER_BYTES;
 using node::BlockManager;
 using node::KernelNotifications;
 using node::MAX_BLOCKFILE_SIZE;
@@ -40,12 +40,12 @@ BOOST_AUTO_TEST_CASE(blockmanager_find_block_pos)
     };
     BlockManager blockman{*Assert(m_node.shutdown_signal), blockman_opts};
     // simulate adding a genesis block normally
-    BOOST_CHECK_EQUAL(blockman.WriteBlock(params->GenesisBlock(), 0).nPos, BLOCK_SERIALIZATION_HEADER_SIZE);
+    BOOST_CHECK_EQUAL(blockman.WriteBlock(params->GenesisBlock(), 0).nPos, STORAGE_HEADER_BYTES);
     // simulate what happens during reindex
     // simulate a well-formed genesis block being found at offset 8 in the blk00000.dat file
     // the block is found at offset 8 because there is an 8 byte serialization header
     // consisting of 4 magic bytes + 4 length bytes before each block in a well-formed blk file.
-    const FlatFilePos pos{0, BLOCK_SERIALIZATION_HEADER_SIZE};
+    const FlatFilePos pos{0, STORAGE_HEADER_BYTES};
     blockman.UpdateBlockInfo(params->GenesisBlock(), 0, pos);
     // now simulate what happens after reindex for the first new block processed
     // the actual block contents don't matter, just that it's a block.
@@ -54,7 +54,7 @@ BOOST_AUTO_TEST_CASE(blockmanager_find_block_pos)
     // 8 bytes (for serialization header) + 285 (for serialized genesis block) = 293
     // add another 8 bytes for the second block's serialization header and we get 293 + 8 = 301
     FlatFilePos actual{blockman.WriteBlock(params->GenesisBlock(), 1)};
-    BOOST_CHECK_EQUAL(actual.nPos, BLOCK_SERIALIZATION_HEADER_SIZE + ::GetSerializeSize(TX_WITH_WITNESS(params->GenesisBlock())) + BLOCK_SERIALIZATION_HEADER_SIZE);
+    BOOST_CHECK_EQUAL(actual.nPos, STORAGE_HEADER_BYTES + ::GetSerializeSize(TX_WITH_WITNESS(params->GenesisBlock())) + STORAGE_HEADER_BYTES);
 }
 
 BOOST_FIXTURE_TEST_CASE(blockmanager_scan_unlink_already_pruned_files, TestChain100Setup)
@@ -172,19 +172,19 @@ BOOST_AUTO_TEST_CASE(blockmanager_flush_block_file)
     FlatFilePos pos2{blockman.WriteBlock(block2, /*nHeight=*/2)};
 
     // Two blocks in the file
-    BOOST_CHECK_EQUAL(blockman.CalculateCurrentUsage(), (TEST_BLOCK_SIZE + BLOCK_SERIALIZATION_HEADER_SIZE) * 2);
+    BOOST_CHECK_EQUAL(blockman.CalculateCurrentUsage(), (TEST_BLOCK_SIZE + STORAGE_HEADER_BYTES) * 2);
 
     // First two blocks are written as expected
     // Errors are expected because block data is junk, thrown AFTER successful read
     CBlock read_block;
     BOOST_CHECK_EQUAL(read_block.nVersion, 0);
     {
-        ASSERT_DEBUG_LOG("ReadBlock: Errors in block header");
+        ASSERT_DEBUG_LOG("Errors in block header");
         BOOST_CHECK(!blockman.ReadBlock(read_block, pos1));
         BOOST_CHECK_EQUAL(read_block.nVersion, 1);
     }
     {
-        ASSERT_DEBUG_LOG("ReadBlock: Errors in block header");
+        ASSERT_DEBUG_LOG("Errors in block header");
         BOOST_CHECK(!blockman.ReadBlock(read_block, pos2));
         BOOST_CHECK_EQUAL(read_block.nVersion, 2);
     }
@@ -199,7 +199,7 @@ BOOST_AUTO_TEST_CASE(blockmanager_flush_block_file)
     // Metadata is updated...
     BOOST_CHECK_EQUAL(block_data->nBlocks, 3);
     // ...but there are still only two blocks in the file
-    BOOST_CHECK_EQUAL(blockman.CalculateCurrentUsage(), (TEST_BLOCK_SIZE + BLOCK_SERIALIZATION_HEADER_SIZE) * 2);
+    BOOST_CHECK_EQUAL(blockman.CalculateCurrentUsage(), (TEST_BLOCK_SIZE + STORAGE_HEADER_BYTES) * 2);
 
     // Block 2 was not overwritten:
     blockman.ReadBlock(read_block, pos2);
```
```diff
@@ -2,6 +2,8 @@
 // Distributed under the MIT software license, see the accompanying
 // file COPYING or http://www.opensource.org/licenses/mit-license.php.
 
+#include <flatfile.h>
+#include <node/blockstorage.h>
 #include <streams.h>
 #include <test/util/random.h>
 #include <test/util/setup_common.h>
@@ -553,6 +555,146 @@ BOOST_AUTO_TEST_CASE(streams_buffered_file_rand)
     fs::remove(streams_test_filename);
 }
 
+BOOST_AUTO_TEST_CASE(buffered_reader_matches_autofile_random_content)
+{
+    const size_t file_size{1 + m_rng.randrange<size_t>(1 << 17)};
+    const size_t buf_size{1 + m_rng.randrange(file_size)};
+    const FlatFilePos pos{0, 0};
+
+    const FlatFileSeq test_file{m_args.GetDataDirBase(), "buffered_file_test_random", node::BLOCKFILE_CHUNK_SIZE};
+    const std::vector obfuscation{m_rng.randbytes<std::byte>(8)};
+
+    // Write out the file with random content
+    {
+        AutoFile{test_file.Open(pos, /*read_only=*/false), obfuscation}.write(m_rng.randbytes<std::byte>(file_size));
+    }
+    BOOST_CHECK_EQUAL(fs::file_size(test_file.FileName(pos)), file_size);
+
+    {
+        AutoFile direct_file{test_file.Open(pos, /*read_only=*/true), obfuscation};
+
+        AutoFile buffered_file{test_file.Open(pos, /*read_only=*/true), obfuscation};
+        BufferedReader buffered_reader{std::move(buffered_file), buf_size};
+
+        for (size_t total_read{0}; total_read < file_size;) {
+            const size_t read{Assert(std::min(1 + m_rng.randrange(m_rng.randbool() ? buf_size : 2 * buf_size), file_size - total_read))};
+
+            DataBuffer direct_file_buffer{read};
+            direct_file.read(direct_file_buffer);
+
+            DataBuffer buffered_buffer{read};
+            buffered_reader.read(buffered_buffer);
+
+            BOOST_CHECK_EQUAL_COLLECTIONS(
+                direct_file_buffer.begin(), direct_file_buffer.end(),
+                buffered_buffer.begin(), buffered_buffer.end()
+            );
+
+            total_read += read;
+        }
+
+        {
+            DataBuffer excess_byte{1};
+            BOOST_CHECK_EXCEPTION(direct_file.read(excess_byte), std::ios_base::failure, HasReason{"end of file"});
+        }
+
+        {
+            DataBuffer excess_byte{1};
+            BOOST_CHECK_EXCEPTION(buffered_reader.read(excess_byte), std::ios_base::failure, HasReason{"end of file"});
+        }
+    }
+
+    fs::remove(test_file.FileName(pos));
+}
+
+BOOST_AUTO_TEST_CASE(buffered_writer_matches_autofile_random_content)
+{
+    const size_t file_size{1 + m_rng.randrange<size_t>(1 << 17)};
+    const size_t buf_size{1 + m_rng.randrange(file_size)};
+    const FlatFilePos pos{0, 0};
+
+    const FlatFileSeq test_buffered{m_args.GetDataDirBase(), "buffered_write_test", node::BLOCKFILE_CHUNK_SIZE};
+    const FlatFileSeq test_direct{m_args.GetDataDirBase(), "direct_write_test", node::BLOCKFILE_CHUNK_SIZE};
+    const std::vector obfuscation{m_rng.randbytes<std::byte>(8)};
+
+    {
+        DataBuffer test_data{m_rng.randbytes<std::byte>(file_size)};
+
+        AutoFile direct_file{test_direct.Open(pos, /*read_only=*/false), obfuscation};
+
+        AutoFile buffered_file{test_buffered.Open(pos, /*read_only=*/false), obfuscation};
+        BufferedWriter buffered{buffered_file, buf_size};
+
+        for (size_t total_written{0}; total_written < file_size;) {
+            const size_t write_size{Assert(std::min(1 + m_rng.randrange(m_rng.randbool() ? buf_size : 2 * buf_size), file_size - total_written))};
+
+            auto current_span = std::span{test_data}.subspan(total_written, write_size);
+            direct_file.write(current_span);
+            buffered.write(current_span);
+
+            total_written += write_size;
+        }
+        // Destructors of AutoFile and BufferedWriter will flush/close here
+    }
+
+    // Compare the resulting files
+    DataBuffer direct_result{file_size};
```
|
||||
{
|
||||
AutoFile verify_direct{test_direct.Open(pos, /*read_only=*/true), obfuscation};
|
||||
verify_direct.read(direct_result);
|
||||
|
||||
DataBuffer excess_byte{1};
|
||||
BOOST_CHECK_EXCEPTION(verify_direct.read(excess_byte), std::ios_base::failure, HasReason{"end of file"});
|
||||
}
|
||||
|
||||
DataBuffer buffered_result{file_size};
|
||||
{
|
||||
AutoFile verify_buffered{test_buffered.Open(pos, /*read_only=*/true), obfuscation};
|
||||
verify_buffered.read(buffered_result);
|
||||
|
||||
DataBuffer excess_byte{1};
|
||||
BOOST_CHECK_EXCEPTION(verify_buffered.read(excess_byte), std::ios_base::failure, HasReason{"end of file"});
|
||||
}
|
||||
|
||||
BOOST_CHECK_EQUAL_COLLECTIONS(
|
||||
direct_result.begin(), direct_result.end(),
|
||||
buffered_result.begin(), buffered_result.end()
|
||||
);
|
||||
|
||||
fs::remove(test_direct.FileName(pos));
|
||||
fs::remove(test_buffered.FileName(pos));
|
||||
}
|
||||
|
||||
BOOST_AUTO_TEST_CASE(buffered_writer_reader)
|
||||
{
|
||||
const uint32_t v1{m_rng.rand32()}, v2{m_rng.rand32()}, v3{m_rng.rand32()};
|
||||
const fs::path test_file{m_args.GetDataDirBase() / "test_buffered_write_read.bin"};
|
||||
|
||||
// Write out the values through a precisely sized BufferedWriter
|
||||
{
|
||||
AutoFile file{fsbridge::fopen(test_file, "w+b")};
|
||||
BufferedWriter f(file, sizeof(v1) + sizeof(v2) + sizeof(v3));
|
||||
f << v1 << v2;
|
||||
f.write(std::as_bytes(std::span{&v3, 1}));
|
||||
}
|
||||
// Read back and verify using BufferedReader
|
||||
{
|
||||
uint32_t _v1{0}, _v2{0}, _v3{0};
|
||||
AutoFile file{fsbridge::fopen(test_file, "rb")};
|
||||
BufferedReader f(std::move(file), sizeof(v1) + sizeof(v2) + sizeof(v3));
|
||||
f >> _v1 >> _v2;
|
||||
f.read(std::as_writable_bytes(std::span{&_v3, 1}));
|
||||
BOOST_CHECK_EQUAL(_v1, v1);
|
||||
BOOST_CHECK_EQUAL(_v2, v2);
|
||||
BOOST_CHECK_EQUAL(_v3, v3);
|
||||
|
||||
DataBuffer excess_byte{1};
|
||||
BOOST_CHECK_EXCEPTION(f.read(excess_byte), std::ios_base::failure, HasReason{"end of file"});
|
||||
}
|
||||
|
||||
fs::remove(test_file);
|
||||
}
|
||||
|
||||
BOOST_AUTO_TEST_CASE(streams_hashed)
|
||||
{
|
||||
DataStream stream{};
|
||||
|
|