Compare commits

...

14 commits

Author SHA1 Message Date
Pieter Wuille
eeb40c9257
Merge 72d3ca13b5 into 433412fd84 2025-01-07 01:14:29 +01:00
Pieter Wuille
72d3ca13b5 txgraph: (feature) expose ability to compare transactions
In order to make it possible for higher layers to compare transaction quality
(ordering within the implicit total ordering on the mempool), expose a comparison
function and test it.
2024-12-22 09:49:55 -05:00
Pieter Wuille
a16630a49c txgraph: (feature) destroying Ref means removing transaction
Before this commit, if a TxGraph::Ref object is destroyed, it becomes impossible
to refer to, but the actual corresponding transaction node in the TxGraph remains,
and remains indefinitely as there is no way to remove it.

Fix this by making the destruction of TxGraph::Ref trigger immediate removal of
the corresponding transaction in TxGraph, both in main and staging if it exists.
2024-12-22 09:49:54 -05:00
Pieter Wuille
82c947f165 txgraph: (feature) add staging support
In order to make it easy to evaluate proposed changes to a TxGraph, introduce a
"staging" mode, where mutators (AddTransaction, AddDependency, RemoveTransaction)
do not modify the actual graph, but just a staging version of it. That staging
graph can then be committed (replacing the main one with it), or aborted (discarding
the staging).
2024-12-22 09:49:13 -05:00
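The main/staging split described in this commit message can be illustrated with a toy container. This is only a sketch under stated assumptions: `StagedSet` and its `Add`/`Remove`/`Exists` members are hypothetical names that track bare integer ids, and copying the whole main set on `StartStaging()` is a simplification that the real TxGraph may well avoid.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <utility>

// Hypothetical miniature of the main/staging idea; it tracks only a set of ids.
class StagedSet {
    std::set<int> m_main;
    std::optional<std::set<int>> m_staging;
    // Mutators act on staging when it exists, and on main otherwise.
    std::set<int>& Top() { return m_staging ? *m_staging : m_main; }
public:
    bool HaveStaging() const { return m_staging.has_value(); }
    void StartStaging() { assert(!m_staging); m_staging = m_main; }
    // Committing replaces main with staging; aborting discards staging.
    void CommitStaging() { assert(m_staging); m_main = std::move(*m_staging); m_staging.reset(); }
    void AbortStaging() { assert(m_staging); m_staging.reset(); }
    void Add(int tx) { Top().insert(tx); }
    void Remove(int tx) { Top().erase(tx); }
    bool Exists(int tx, bool main_only) const
    {
        const std::set<int>& s = (main_only || !m_staging) ? m_main : *m_staging;
        return s.count(tx) > 0;
    }
};
```

While staging exists, `Exists(tx, true)` still reports the untouched main view, which mirrors the `main_only` flag used by the inspectors in the fuzz test.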
Pieter Wuille
b6a14f74c3 txgraph: (refactor) abstract out ClearLocator
Move a number of related modifications to TxGraphImpl into a separate
function for removal of transactions. This is preparation for a later
commit where this will be useful in more than one place.
2024-12-22 09:25:17 -05:00
Pieter Wuille
4c9fa2278b txgraph: (refactor) group per-graph data in ClusterSet
This is a preparation for a next commit where a TxGraph will start representing
potentially two distinct graphs (a main one, and a staging one with proposed
changes).
2024-12-22 09:25:17 -05:00
Pieter Wuille
b37d322447 txgraph: (optimization) special-case removal of tail of cluster
When transactions are removed from the tail of a cluster, we know the existing
linearization remains acceptable/optimal (if it already was), but may just need
splitting, so special case these into separate quality levels.
2024-12-22 09:25:17 -05:00
Pieter Wuille
60f4e41254 txgraph: (optimization) delay chunking while sub-acceptable
Chunk-based information (primarily, chunk feerates) is never accessed without
first bringing the relevant Clusters to an "acceptable" quality level. Thus,
while operations are ongoing and Clusters are not acceptable, we can omit
computing the chunkings and chunk feerates for Clusters.
2024-12-22 09:25:17 -05:00
Pieter Wuille
2d2cb1dc4c txgraph: (feature) make max cluster count configurable and "oversize" state
Instead of leaving the responsibility on higher layers to guarantee that
no connected component within TxGraph (a barely exposed concept, except through
GetCluster()) exceeds the cluster count limit, move this responsibility to
TxGraph itself:
* TxGraph retains a cluster count limit, but it becomes configurable at construction
  time (this primarily helps with testing that it is properly enforced).
* It is always allowed to perform mutators on TxGraph, even if they would cause the
  cluster count limit to be exceeded. Instead, TxGraph exposes an IsOversized()
  function, which queries whether it is in a special "oversize" state.
* During oversize state, many inspectors are unavailable, but mutators remain valid,
  so the higher layer can "fix" the oversize state before continuing.
2024-12-22 09:25:17 -05:00
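The "oversize" condition amounts to asking whether any connected component has more transactions than the limit. A rough sketch of that check follows, assuming a plain edge list as input and using a union-find structure; the fuzz test in this change instead peels components off with `FindConnectedComponent`, and `IsOversizedSketch` is a hypothetical name, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Returns true if any connected component of the graph on `n` transactions, with
// dependencies given as (parent, child) pairs, has more than `max_cluster_count` members.
bool IsOversizedSketch(size_t n, const std::vector<std::pair<size_t, size_t>>& edges,
                       size_t max_cluster_count)
{
    std::vector<size_t> root(n);
    std::iota(root.begin(), root.end(), size_t{0});
    std::vector<size_t> size(n, 1);
    // Find the representative of x's component, shortening paths along the way.
    auto find = [&](size_t x) {
        while (root[x] != x) {
            root[x] = root[root[x]];
            x = root[x];
        }
        return x;
    };
    // Merge the components of every parent/child pair (direction is irrelevant here).
    for (const auto& [par, chl] : edges) {
        size_t a = find(par), b = find(chl);
        if (a == b) continue;
        if (size[a] < size[b]) std::swap(a, b);
        root[b] = a;
        size[a] += size[b];
    }
    // A component is oversized if its representative counts too many members.
    for (size_t i = 0; i < n; ++i) {
        if (root[i] == i && size[i] > max_cluster_count) return true;
    }
    return false;
}
```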
Pieter Wuille
17b76ed4e1 txgraph: (tests) add internal sanity check function
To make testing more powerful, expose a function to perform an internal sanity
check on the state of a TxGraph. This is especially important as TxGraphImpl
contains many redundantly represented pieces of information:

* graph contains clusters, which refer to entries, but the entries refer back
* graph maintains pointers to Ref objects, which point back to the graph.

This lets us make sure they are always in sync.
2024-12-22 09:25:17 -05:00
Pieter Wuille
543a981912 txgraph: (tests) add simulation fuzz test
This adds a simulation fuzz test for txgraph, by comparing with a naive
reimplementation that models the entire graph as a single DepGraph, and
clusters in TxGraph as connected components within that DepGraph.
2024-12-22 09:25:14 -05:00
Pieter Wuille
b487030297 txgraph: (feature) add initial version
This adds an initial version of the txgraph module, with the TxGraph class.
It encapsulates knowledge about the fees, sizes, and dependencies between all
mempool transactions, but nothing else.

In particular, it lacks knowledge about txids, inputs, outputs, CTransactions,
... and so forth. Instead, it exposes a generic TxGraph::Ref type to reference
nodes in the TxGraph, which can be passed around and stored by layers on top.
2024-12-22 09:22:49 -05:00
Pieter Wuille
5f3d8d1f40 clusterlin: make IsAcyclic() a DepGraph member function
... instead of being a separate test-only function.
2024-12-21 19:20:56 -05:00
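The acyclicity test relies on one observation: in a DAG, the only transaction that is both an ancestor and a descendant of transaction i is i itself. A minimal sketch of that idea, assuming 32-bit masks in place of the real `SetType` and a hypothetical `MiniDepGraph` type that stores precomputed ancestor and descendant sets:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical miniature dependency graph: anc[i] and desc[i] are bitmasks of the
// ancestors and descendants of transaction i, each including i itself.
struct MiniDepGraph {
    std::vector<uint32_t> anc;
    std::vector<uint32_t> desc;

    bool IsAcyclic() const noexcept
    {
        for (size_t i = 0; i < anc.size(); ++i) {
            // On a cycle, some other transaction is both an ancestor and a descendant of i.
            if ((anc[i] & desc[i]) != (uint32_t{1} << i)) return false;
        }
        return true;
    }
};
```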
Pieter Wuille
29e3d06975 clusterlin: add FixLinearization function + fuzz test
This function takes an existing ordering for transactions in a DepGraph, and
makes it a valid linearization for it (i.e., topological). Any topological
prefix of the input remains untouched.
2024-12-21 19:20:56 -05:00
8 changed files with 2739 additions and 14 deletions


@@ -280,6 +280,7 @@ add_library(bitcoin_node STATIC EXCLUDE_FROM_ALL
signet.cpp
torcontrol.cpp
txdb.cpp
txgraph.cpp
txmempool.cpp
txorphanage.cpp
txrequest.cpp


@@ -309,6 +309,17 @@ public:
return a < b;
});
}
/** Check if this graph is acyclic. */
bool IsAcyclic() const noexcept
{
for (auto i : Positions()) {
if ((Ancestors(i) & Descendants(i)) != SetType::Singleton(i)) {
return false;
}
}
return true;
}
};
/** A set of transactions together with their aggregate feerate. */
@@ -1336,6 +1347,35 @@ std::vector<ClusterIndex> MergeLinearizations(const DepGraph<SetType>& depgraph,
return ret;
}
/** Make linearization topological, retaining its ordering where possible. */
template<typename SetType>
void FixLinearization(const DepGraph<SetType>& depgraph, Span<ClusterIndex> linearization) noexcept
{
// This algorithm can be summarized as moving every element in the linearization backwards
// until it is placed after all its ancestors.
SetType done;
const auto len = linearization.size();
// Iterate over the elements of linearization from back to front (i is distance from back).
for (ClusterIndex i = 0; i < len; ++i) {
/** The element at that position. */
ClusterIndex elem = linearization[len - 1 - i];
/** j represents how far from the back of the linearization elem should be placed. */
ClusterIndex j = i;
// Figure out which already-processed elements are ancestors of elem and must be moved before it.
SetType place_before = done & depgraph.Ancestors(elem);
// Find which position to place elem in (updating j), continuously moving the elements
// in between forward.
while (place_before.Any()) {
auto to_swap = linearization[len - 1 - (j - 1)];
place_before.Reset(to_swap);
linearization[len - 1 - (j--)] = to_swap;
}
// Put elem in its final position and mark it as done.
linearization[len - 1 - j] = elem;
done.Set(elem);
}
}
} // namespace cluster_linearize
#endif // BITCOIN_CLUSTER_LINEARIZE_H
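The back-to-front pass of FixLinearization can be tried out in isolation. What follows is a minimal sketch, assuming one 32-bit ancestor mask per transaction as a hypothetical stand-in for `DepGraph` and `SetType`; `AncestorMasks` and `FixLinearizationSketch` are illustrative names and not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for DepGraph: ancestors[i] is a bitmask of all ancestors of
// transaction i, including i itself. Supports up to 32 transactions.
using AncestorMasks = std::vector<uint32_t>;

// Reorder `lin` in place so that every transaction appears after all of its ancestors,
// keeping the existing relative order wherever possible.
void FixLinearizationSketch(const AncestorMasks& ancestors, std::vector<uint32_t>& lin)
{
    uint32_t done = 0; // Transactions already processed (the ones nearer the back).
    const size_t len = lin.size();
    // Walk from the back of the linearization to the front; i is the distance from the back.
    for (size_t i = 0; i < len; ++i) {
        const uint32_t elem = lin[len - 1 - i];
        size_t j = i; // Distance from the back at which elem will end up.
        // Ancestors of elem that currently sit behind it and must end up in front of it.
        uint32_t place_before = done & ancestors[elem];
        while (place_before != 0) {
            // Shift the element directly behind the open slot one step toward the front.
            const uint32_t to_swap = lin[len - 1 - (j - 1)];
            place_before &= ~(uint32_t{1} << to_swap);
            lin[len - 1 - j] = to_swap;
            --j;
        }
        // Put elem in its final position and mark it as processed.
        lin[len - 1 - j] = elem;
        done |= uint32_t{1} << elem;
    }
}
```

For a chain 0 → 1 → 2, the fully reversed input `{2, 1, 0}` comes out as `{0, 1, 2}`, while an input that is already topological is left untouched.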


@@ -122,6 +122,7 @@ add_executable(fuzz
tx_in.cpp
tx_out.cpp
tx_pool.cpp
txgraph.cpp
txorphan.cpp
txrequest.cpp
# Visual Studio 2022 version 17.12 introduced a bug


@@ -407,7 +407,7 @@ FUZZ_TARGET(clusterlin_depgraph_serialization)
SanityCheck(depgraph);
// Verify the graph is a DAG.
-assert(IsAcyclic(depgraph));
+assert(depgraph.IsAcyclic());
}
FUZZ_TARGET(clusterlin_components)
@@ -1118,3 +1118,58 @@ FUZZ_TARGET(clusterlin_merge)
auto cmp2 = CompareChunks(chunking_merged, chunking2);
assert(cmp2 >= 0);
}
FUZZ_TARGET(clusterlin_fix_linearization)
{
// Verify expected properties of FixLinearization() on arbitrary linearizations.
// Retrieve a depgraph from the fuzz input.
SpanReader reader(buffer);
DepGraph<TestBitSet> depgraph;
try {
reader >> Using<DepGraphFormatter>(depgraph);
} catch (const std::ios_base::failure&) {}
// Construct an arbitrary linearization (not necessarily topological for depgraph).
std::vector<ClusterIndex> linearization;
/** Which transactions of depgraph are yet to be included in linearization. */
TestBitSet todo = depgraph.Positions();
/** Whether the linearization constructed so far is topological. */
bool topological{true};
/** How long the prefix of the constructed linearization is which is topological. */
size_t topo_prefix = 0;
while (todo.Any()) {
// Figure out the index in all elements of todo to append to linearization next.
uint64_t val{0};
try {
reader >> VARINT(val);
} catch (const std::ios_base::failure&) {}
val %= todo.Count();
// Find which element in todo that corresponds to.
for (auto i : todo) {
if (val == 0) {
// Found it.
linearization.push_back(i);
// Track whether or not the linearization is topological for depgraph.
todo.Reset(i);
if (todo.Overlaps(depgraph.Ancestors(i))) topological = false;
topo_prefix += topological;
break;
}
--val;
}
}
assert(linearization.size() == depgraph.TxCount());
// Then make a fixed copy of linearization.
auto linearization_fixed = linearization;
FixLinearization(depgraph, linearization_fixed);
// Sanity check it (which includes testing whether it is topological).
SanityCheck(depgraph, linearization_fixed);
// If the linearization was topological already, FixLinearization cannot have modified it.
if (topological) assert(linearization_fixed == linearization);
// In any case, the topo_prefix long prefix of linearization cannot be changed.
assert(std::equal(linearization.begin(), linearization.begin() + topo_prefix,
linearization_fixed.begin()));
}
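The fuzz target above tracks how long a topological prefix the arbitrary input has, because that prefix must survive FixLinearization unchanged. The same bookkeeping can be written as a standalone helper; this is a sketch only, assuming 32-bit ancestor masks rather than `TestBitSet`, and `TopoPrefixLength` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns how many leading entries of `lin` form a topological prefix: every counted
// entry has all of its ancestors (anc[tx] includes tx itself) at or before its position.
size_t TopoPrefixLength(const std::vector<uint32_t>& anc, const std::vector<uint32_t>& lin)
{
    uint32_t placed = 0; // Transactions seen so far.
    size_t prefix = 0;
    for (uint32_t tx : lin) {
        placed |= uint32_t{1} << tx;
        // Stop at the first transaction that has an ancestor not yet placed.
        if ((anc[tx] & ~placed) != 0) break;
        ++prefix;
    }
    return prefix;
}
```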

src/test/fuzz/txgraph.cpp (new file, 651 lines)

@@ -0,0 +1,651 @@
// Copyright (c) The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
#include <txgraph.h>
#include <cluster_linearize.h>
#include <test/fuzz/fuzz.h>
#include <test/fuzz/FuzzedDataProvider.h>
#include <test/util/random.h>
#include <util/bitset.h>
#include <util/feefrac.h>
#include <algorithm>
#include <map>
#include <memory>
#include <set>
#include <stdint.h>
#include <utility>
using namespace cluster_linearize;
namespace {
/** Data type representing a naive simulated TxGraph, keeping all transactions (even from
* disconnected components) in a single DepGraph. Unlike the real TxGraph, this only models
* a single graph, and multiple instances are used to simulate main/staging. */
struct SimTxGraph
{
/** Maximum number of transactions to support simultaneously. Set this higher than txgraph's
* cluster count, so we can exercise situations with more transactions than fit in one
* cluster. */
static constexpr unsigned MAX_TRANSACTIONS = MAX_CLUSTER_COUNT_LIMIT * 2;
/** Set type to use in the simulation. */
using SetType = BitSet<MAX_TRANSACTIONS>;
/** Data type for representing positions within SimTxGraph::graph. */
using Pos = ClusterIndex;
/** Constant to mean "missing in this graph". */
static constexpr auto MISSING = Pos(-1);
/** The dependency graph (for all transactions in the simulation, regardless of
* connectivity/clustering). */
DepGraph<SetType> graph;
/** For each position in graph, which TxGraph::Ref it corresponds with (if any). Use shared_ptr
* so that a SimTxGraph can be copied to create a staging one, while sharing Refs with
* the main graph. */
std::array<std::shared_ptr<TxGraph::Ref>, MAX_TRANSACTIONS> simmap;
/** For each TxGraph::Ref in graph, the position it corresponds with. */
std::map<const TxGraph::Ref*, Pos> simrevmap;
/** The set of TxGraph::Ref entries that have been removed, but not yet Cleanup()'ed in
* the real TxGraph. */
std::vector<std::shared_ptr<TxGraph::Ref>> removed;
/** Whether the graph is oversized (true = yes, false = no, std::nullopt = unknown). */
std::optional<bool> oversized;
/** The configured maximum number of transactions per cluster. */
ClusterIndex max_cluster_count;
/** Construct a new SimTxGraph with the specified maximum cluster count. */
explicit SimTxGraph(ClusterIndex max_cluster) : max_cluster_count(max_cluster) {}
// Permit copying and moving.
SimTxGraph(const SimTxGraph&) noexcept = default;
SimTxGraph& operator=(const SimTxGraph&) noexcept = default;
SimTxGraph(SimTxGraph&&) noexcept = default;
SimTxGraph& operator=(SimTxGraph&&) noexcept = default;
/** Check whether this graph is oversized (contains a connected component whose number of
* transactions exceeds max_cluster_count). */
bool IsOversized()
{
if (!oversized.has_value()) {
// Only recompute when oversized isn't already known.
oversized = false;
auto todo = graph.Positions();
// Iterate over all connected components of the graph.
while (todo.Any()) {
auto component = graph.FindConnectedComponent(todo);
if (component.Count() > max_cluster_count) oversized = true;
todo -= component;
}
}
return *oversized;
}
/** Determine the number of (non-removed) transactions in the graph. */
ClusterIndex GetTransactionCount() const { return graph.TxCount(); }
/** Get the position where ref occurs in this simulated graph, or -1 if it does not. */
Pos Find(const TxGraph::Ref& ref) const
{
if (!ref) return MISSING;
auto it = simrevmap.find(&ref);
if (it != simrevmap.end()) return it->second;
return MISSING;
}
/** Given a position in this simulated graph, get the corresponding TxGraph::Ref. */
TxGraph::Ref& GetRef(Pos pos)
{
assert(graph.Positions()[pos]);
assert(simmap[pos]);
return *simmap[pos].get();
}
/** Add a new transaction to the simulation. */
TxGraph::Ref& AddTransaction(const FeeFrac& feerate)
{
assert(graph.TxCount() < MAX_TRANSACTIONS);
auto simpos = graph.AddTransaction(feerate);
assert(graph.Positions()[simpos]);
simmap[simpos] = std::make_shared<TxGraph::Ref>();
auto ptr = simmap[simpos].get();
simrevmap[ptr] = simpos;
return *ptr;
}
/** Add a dependency between two positions in this graph. */
void AddDependency(TxGraph::Ref& parent, TxGraph::Ref& child)
{
auto par_pos = Find(parent);
if (par_pos == MISSING) return;
auto chl_pos = Find(child);
if (chl_pos == MISSING) return;
graph.AddDependencies(SetType::Singleton(par_pos), chl_pos);
// This may invalidate our cached oversized value.
if (oversized.has_value() && !*oversized) oversized = std::nullopt;
}
/** Modify the transaction fee of a ref, if it exists. */
void SetTransactionFee(TxGraph::Ref& ref, int64_t fee)
{
auto pos = Find(ref);
if (pos == MISSING) return;
graph.FeeRate(pos).fee = fee;
}
/** Remove the transaction in the specified position from the graph. */
void RemoveTransaction(TxGraph::Ref& ref)
{
auto pos = Find(ref);
if (pos == MISSING) return;
graph.RemoveTransactions(SetType::Singleton(pos));
simrevmap.erase(simmap[pos].get());
// Remember the TxGraph::Ref corresponding to this position, because we still expect
// to see it when calling Cleanup().
removed.push_back(std::move(simmap[pos]));
simmap[pos].reset();
// This may invalidate our cached oversized value.
if (oversized.has_value() && *oversized) oversized = std::nullopt;
}
/** Destroy the transaction from the graph, including from the removed set. This will
* trigger TxGraph::Ref::~Ref. reset_oversize controls whether the cached oversized
* value is cleared (destroying does not clear oversizedness in TxGraph of the main
* graph while staging exists). */
void DestroyTransaction(TxGraph::Ref& ref, bool reset_oversize)
{
// Special case the empty Ref.
if (!ref) return;
auto pos = Find(ref);
if (pos == MISSING) {
// Wipe the ref, if it exists, from the removed vector. Use std::partition rather
// than std::erase because we don't care about the order of the entries that
// remain.
auto remove = std::partition(removed.begin(), removed.end(), [&](auto& arg) { return arg.get() != &ref; });
removed.erase(remove, removed.end());
} else {
graph.RemoveTransactions(SetType::Singleton(pos));
simrevmap.erase(simmap[pos].get());
simmap[pos].reset();
// This may invalidate our cached oversized value.
if (reset_oversize && oversized.has_value() && *oversized) {
oversized = std::nullopt;
}
}
}
/** Construct the set with all positions in this graph corresponding to the specified
* TxGraph::Refs. All of them must occur in this graph and not be removed. */
SetType MakeSet(std::span<TxGraph::Ref* const> arg)
{
SetType ret;
for (TxGraph::Ref* ptr : arg) {
auto pos = Find(*ptr);
assert(pos != Pos(-1));
ret.Set(pos);
}
return ret;
}
/** Get the set of ancestors (desc=false) or descendants (desc=true) in this graph. */
SetType GetAncDesc(TxGraph::Ref& arg, bool desc)
{
auto pos = Find(arg);
if (pos == MISSING) return {};
return desc ? graph.Descendants(pos) : graph.Ancestors(pos);
}
/** Given a set of Refs (given as a vector of pointers), expand the set to include all its
* ancestors (desc=false) or all its descendants (desc=true) in this graph. */
void IncludeAncDesc(std::vector<TxGraph::Ref*>& arg, bool desc)
{
std::vector<TxGraph::Ref*> ret;
for (auto ptr : arg) {
auto simpos = Find(*ptr);
if (simpos != MISSING) {
for (auto i : desc ? graph.Descendants(simpos) : graph.Ancestors(simpos)) {
ret.push_back(simmap[i].get());
}
} else {
ret.push_back(ptr);
}
}
// Deduplicate.
std::sort(ret.begin(), ret.end());
ret.erase(std::unique(ret.begin(), ret.end()), ret.end());
// Replace input.
arg = std::move(ret);
}
};
} // namespace
FUZZ_TARGET(txgraph)
{
SeedRandomStateForTest(SeedRand::ZEROS);
FuzzedDataProvider provider(buffer.data(), buffer.size());
/** Internal test RNG, used only for decisions which would require significant amount of data
* to be read from the provider, without realistically impacting test sensitivity. */
InsecureRandomContext rng(0xdecade2009added + buffer.size());
/** Variable used whenever an empty TxGraph::Ref is needed. */
TxGraph::Ref empty_ref;
// Decide the maximum number of transactions per cluster we will use in this simulation.
auto max_count = provider.ConsumeIntegralInRange<ClusterIndex>(1, MAX_CLUSTER_COUNT_LIMIT);
// Construct a real graph, and a vector of simulated graphs (main, and possibly staging).
auto real = MakeTxGraph(max_count);
std::vector<SimTxGraph> sims;
sims.reserve(2);
sims.emplace_back(max_count);
/** Function to pick any Ref (in either sim graph, either sim.removed, or empty). */
auto pick_fn = [&]() noexcept -> TxGraph::Ref& {
size_t tx_count[2] = {sims[0].GetTransactionCount(), 0};
/** The number of possible choices. */
size_t choices = tx_count[0] + sims[0].removed.size() + 1;
if (sims.size() == 2) {
tx_count[1] = sims[1].GetTransactionCount();
choices += tx_count[1] + sims[1].removed.size();
}
/** Pick one of them. */
auto choice = provider.ConsumeIntegralInRange<size_t>(0, choices - 1);
// Consider both main and (if it exists) staging.
for (size_t level = 0; level < sims.size(); ++level) {
auto& sim = sims[level];
if (choice < tx_count[level]) {
// Return from graph.
for (auto i : sim.graph.Positions()) {
if (choice == 0) return sim.GetRef(i);
--choice;
}
assert(false);
} else {
choice -= tx_count[level];
}
if (choice < sim.removed.size()) {
// Return from removed.
return *sim.removed[choice];
} else {
choice -= sim.removed.size();
}
}
// Return empty.
assert(choice == 0);
return empty_ref;
};
LIMITED_WHILE(provider.remaining_bytes() > 0, 200) {
// Read a one-byte command.
int command = provider.ConsumeIntegral<uint8_t>();
// Treat the lowest bit of a command as a flag (which selects a variant of some of the
// operations), and the second-lowest bit as a way of selecting main vs. staging, and leave
// the rest of the bits in command.
bool alt = command & 1;
bool use_main = command & 2;
command >>= 2;
// Provide convenient aliases for the top simulated graph (main, or staging if it exists),
// one for the simulated graph selected based on use_main (for operations that can operate
// on both graphs), and one that always refers to the main graph.
auto& top_sim = sims.back();
auto& sel_sim = use_main ? sims[0] : top_sim;
auto& main_sim = sims[0];
// Keep decrementing command for each applicable operation, until one is hit. Multiple
// iterations may be necessary.
while (true) {
if (top_sim.GetTransactionCount() < SimTxGraph::MAX_TRANSACTIONS && command-- == 0) {
// AddTransaction.
int64_t fee;
int32_t size;
if (alt) {
fee = provider.ConsumeIntegralInRange<int64_t>(-0x8000000000000, 0x7ffffffffffff);
size = provider.ConsumeIntegralInRange<int32_t>(1, 0x3fffff);
} else {
fee = provider.ConsumeIntegral<uint8_t>();
size = provider.ConsumeIntegral<uint8_t>() + 1;
}
FeeFrac feerate{fee, size};
// Create a real TxGraph::Ref.
auto ref = real->AddTransaction(feerate);
// Create a shared_ptr place in the simulation to put the Ref in.
auto& ref_loc = top_sim.AddTransaction(feerate);
// Move it in place.
ref_loc = std::move(ref);
break;
} else if (top_sim.GetTransactionCount() + top_sim.removed.size() > 1 && command-- == 0) {
// AddDependency.
auto& par = pick_fn();
auto& chl = pick_fn();
auto pos_par = top_sim.Find(par);
auto pos_chl = top_sim.Find(chl);
if (pos_par != SimTxGraph::MISSING && pos_chl != SimTxGraph::MISSING) {
// Determine if adding this would introduce a cycle (not allowed by TxGraph),
// and if so, skip.
if (top_sim.graph.Ancestors(pos_par)[pos_chl]) break;
}
top_sim.AddDependency(par, chl);
real->AddDependency(par, chl);
break;
} else if (top_sim.removed.size() < 100 && command-- == 0) {
// RemoveTransaction. Either all its ancestors or all its descendants are also
// removed (if any), to make sure TxGraph's reordering of removals and dependencies
// has no effect.
std::vector<TxGraph::Ref*> to_remove;
to_remove.push_back(&pick_fn());
top_sim.IncludeAncDesc(to_remove, alt);
// The order in which these ancestors/descendants are removed should not matter;
// randomly shuffle them.
std::shuffle(to_remove.begin(), to_remove.end(), rng);
for (TxGraph::Ref* ptr : to_remove) {
real->RemoveTransaction(*ptr);
top_sim.RemoveTransaction(*ptr);
}
break;
} else if (sel_sim.GetTransactionCount() > 0 && command-- == 0) {
// SetTransactionFee.
int64_t fee;
if (alt) {
fee = provider.ConsumeIntegralInRange<int64_t>(-0x8000000000000, 0x7ffffffffffff);
} else {
fee = provider.ConsumeIntegral<uint8_t>();
}
auto& ref = pick_fn();
real->SetTransactionFee(ref, fee);
for (auto& sim : sims) {
sim.SetTransactionFee(ref, fee);
}
break;
} else if (command-- == 0) {
// ~Ref.
std::vector<TxGraph::Ref*> to_destroy;
to_destroy.push_back(&pick_fn());
while (true) {
// Keep adding either the ancestors or descendants the already picked
// transactions have in both graphs (main and staging) combined. Destroying
// will trigger deletions in both, so to have consistent TxGraph behavior, the
// set must be closed under ancestors, or descendants, in both graphs.
auto old_size = to_destroy.size();
for (auto& sim : sims) sim.IncludeAncDesc(to_destroy, alt);
if (to_destroy.size() == old_size) break;
}
// The order in which these ancestors/descendants are destroyed should not matter;
// randomly shuffle them.
std::shuffle(to_destroy.begin(), to_destroy.end(), rng);
for (TxGraph::Ref* ptr : to_destroy) {
for (size_t level = 0; level < sims.size(); ++level) {
sims[level].DestroyTransaction(*ptr, level == sims.size() - 1);
}
}
break;
} else if (command-- == 0) {
// Cleanup.
auto cleaned = real->Cleanup();
if (sims.size() == 1 && !top_sim.IsOversized()) {
assert(top_sim.removed.size() == cleaned.size());
std::sort(cleaned.begin(), cleaned.end());
std::sort(top_sim.removed.begin(), top_sim.removed.end());
for (size_t i = 0; i < top_sim.removed.size(); ++i) {
assert(cleaned[i] == top_sim.removed[i].get());
}
top_sim.removed.clear();
} else {
assert(cleaned.empty());
}
break;
} else if (command-- == 0) {
// GetTransactionCount.
assert(real->GetTransactionCount(use_main) == sel_sim.GetTransactionCount());
break;
} else if (command-- == 0) {
// Exists.
auto& ref = pick_fn();
bool exists = real->Exists(ref, use_main);
bool should_exist = sel_sim.Find(ref) != SimTxGraph::MISSING;
assert(exists == should_exist);
break;
} else if (command-- == 0) {
// IsOversized.
assert(sel_sim.IsOversized() == real->IsOversized(use_main));
break;
} else if (command-- == 0) {
// GetIndividualFeerate.
auto& ref = pick_fn();
auto feerate = real->GetIndividualFeerate(ref);
bool found{false};
for (auto& sim : sims) {
auto simpos = sim.Find(ref);
if (simpos != SimTxGraph::MISSING) {
found = true;
assert(feerate == sim.graph.FeeRate(simpos));
}
}
if (!found) assert(feerate.IsEmpty());
break;
} else if (!main_sim.IsOversized() && command-- == 0) {
// GetMainChunkFeerate.
auto& ref = pick_fn();
auto feerate = real->GetMainChunkFeerate(ref);
auto simpos = main_sim.Find(ref);
if (simpos == SimTxGraph::MISSING) {
assert(feerate.IsEmpty());
} else {
// Just do some quick checks that the reported value is in range. A full
// recomputation of expected chunk feerates is done at the end.
assert(feerate.size >= main_sim.graph.FeeRate(simpos).size);
}
break;
} else if (!sel_sim.IsOversized() && command-- == 0) {
// GetAncestors/GetDescendants.
auto& ref = pick_fn();
auto result = alt ? real->GetDescendants(ref, use_main)
: real->GetAncestors(ref, use_main);
assert(result.size() <= max_count);
auto result_set = sel_sim.MakeSet(result);
assert(result.size() == result_set.Count());
auto expect_set = sel_sim.GetAncDesc(ref, alt);
assert(result_set == expect_set);
break;
} else if (!sel_sim.IsOversized() && command-- == 0) {
// GetCluster.
auto& ref = pick_fn();
auto result = real->GetCluster(ref, use_main);
// Check cluster count limit.
assert(result.size() <= max_count);
// Require the result to be topologically valid and not contain duplicates.
auto left = sel_sim.graph.Positions();
for (auto refptr : result) {
auto simpos = sel_sim.Find(*refptr);
assert(simpos != SimTxGraph::MISSING);
assert(left[simpos]);
left.Reset(simpos);
assert(!sel_sim.graph.Ancestors(simpos).Overlaps(left));
}
// Require the set to be connected.
auto result_set = sel_sim.MakeSet(result);
assert(sel_sim.graph.IsConnected(result_set));
// If ref exists, the result must contain it. If not, it must be empty.
auto simpos = sel_sim.Find(ref);
if (simpos != SimTxGraph::MISSING) {
assert(result_set[simpos]);
} else {
assert(result_set.None());
}
// Require the set not to have ancestors or descendants outside of it.
for (auto i : result_set) {
assert(sel_sim.graph.Ancestors(i).IsSubsetOf(result_set));
assert(sel_sim.graph.Descendants(i).IsSubsetOf(result_set));
}
break;
} else if (command-- == 0) {
// HaveStaging.
assert((sims.size() == 2) == real->HaveStaging());
break;
} else if (sims.size() < 2 && command-- == 0) {
// StartStaging.
sims.emplace_back(sims.back());
real->StartStaging();
break;
} else if (sims.size() > 1 && command-- == 0) {
// AbortStaging/CommitStaging.
if (alt) {
real->AbortStaging();
sims.pop_back();
// Reset the cached oversized value (if TxGraph::Ref destructions triggered
// removals of main transactions while staging was active, then aborting will
// cause it to be re-evaluated in TxGraph).
sims.back().oversized = std::nullopt;
} else {
real->CommitStaging();
sims.erase(sims.begin());
}
break;
} else if (main_sim.GetTransactionCount() > 0 && !main_sim.IsOversized() && command-- == 0) {
// CompareMainOrder.
auto& ref_a = pick_fn();
auto& ref_b = pick_fn();
auto sim_a = main_sim.Find(ref_a);
auto sim_b = main_sim.Find(ref_b);
// Both transactions must exist in the main graph.
if (sim_a == SimTxGraph::MISSING || sim_b == SimTxGraph::MISSING) break;
auto cmp = real->CompareMainOrder(ref_a, ref_b);
// Distinct transactions have distinct places.
if (sim_a != sim_b) assert(cmp != 0);
// Ancestors go before descendants.
if (main_sim.graph.Ancestors(sim_a)[sim_b]) assert(cmp >= 0);
if (main_sim.graph.Descendants(sim_a)[sim_b]) assert(cmp <= 0);
// Do not verify consistency with chunk feerates, as we cannot easily determine
// these here without making more calls to real, which could affect its internal
// state. A full comparison is done at the end.
break;
}
}
}
// After running all modifications, perform an internal sanity check (before invoking
// inspectors that may modify the internal state).
real->SanityCheck();
if (!sims[0].IsOversized()) {
// If the main graph is not oversized, verify the total ordering implied by
// CompareMainOrder.
// First construct two distinct randomized permutations of the positions in sims[0].
std::vector<SimTxGraph::Pos> vec1;
for (auto i : sims[0].graph.Positions()) vec1.push_back(i);
std::shuffle(vec1.begin(), vec1.end(), rng);
auto vec2 = vec1;
std::shuffle(vec2.begin(), vec2.end(), rng);
if (vec1 == vec2) std::next_permutation(vec2.begin(), vec2.end());
// Sort both according to CompareMainOrder. By having randomized starting points, the order
// of CompareMainOrder invocations is somewhat randomized as well.
auto cmp = [&](SimTxGraph::Pos a, SimTxGraph::Pos b) noexcept {
return real->CompareMainOrder(sims[0].GetRef(a), sims[0].GetRef(b)) < 0;
};
std::sort(vec1.begin(), vec1.end(), cmp);
std::sort(vec2.begin(), vec2.end(), cmp);
// Verify the resulting orderings are identical. This could only fail if the ordering was
// not total.
assert(vec1 == vec2);
// Verify that the ordering is topological.
auto todo = sims[0].graph.Positions();
for (auto i : vec1) {
todo.Reset(i);
assert(!sims[0].graph.Ancestors(i).Overlaps(todo));
}
assert(todo.None());
// For every transaction in the total ordering, find a random one before it and after it,
// and compare their chunk feerates, which must be consistent with the ordering.
for (size_t pos = 0; pos < vec1.size(); ++pos) {
auto pos_feerate = real->GetMainChunkFeerate(sims[0].GetRef(vec1[pos]));
if (pos > 0) {
size_t before = rng.randrange<size_t>(pos);
auto before_feerate = real->GetMainChunkFeerate(sims[0].GetRef(vec1[before]));
assert(FeeRateCompare(before_feerate, pos_feerate) >= 0);
}
if (pos + 1 < vec1.size()) {
size_t after = pos + 1 + rng.randrange<size_t>(vec1.size() - 1 - pos);
auto after_feerate = real->GetMainChunkFeerate(sims[0].GetRef(vec1[after]));
assert(FeeRateCompare(after_feerate, pos_feerate) <= 0);
}
}
}
assert(real->HaveStaging() == (sims.size() > 1));
// Try to run a full comparison, for both main_only=false and main_only=true in TxGraph
// inspector functions that support both.
for (int main_only = 0; main_only < 2; ++main_only) {
auto& sim = main_only ? sims[0] : sims.back();
// Compare simple properties of the graph with the simulation.
assert(real->IsOversized(main_only) == sim.IsOversized());
assert(real->GetTransactionCount(main_only) == sim.GetTransactionCount());
// If the graph (and the simulation) are not oversized, perform a full comparison.
if (!sim.IsOversized()) {
auto todo = sim.graph.Positions();
// Iterate over all connected components of the resulting (simulated) graph, each of which
// should correspond to a cluster in the real one.
while (todo.Any()) {
auto component = sim.graph.FindConnectedComponent(todo);
todo -= component;
// Iterate over the transactions in that component.
for (auto i : component) {
// Check its individual feerate against simulation.
assert(sim.graph.FeeRate(i) == real->GetIndividualFeerate(sim.GetRef(i)));
// Check its ancestors against simulation.
auto expect_anc = sim.graph.Ancestors(i);
auto anc = sim.MakeSet(real->GetAncestors(sim.GetRef(i), main_only));
assert(anc.Count() <= max_count);
assert(anc == expect_anc);
// Check its descendants against simulation.
auto expect_desc = sim.graph.Descendants(i);
auto desc = sim.MakeSet(real->GetDescendants(sim.GetRef(i), main_only));
assert(desc.Count() <= max_count);
assert(desc == expect_desc);
// Check the cluster the transaction is part of.
auto cluster = real->GetCluster(sim.GetRef(i), main_only);
assert(cluster.size() <= max_count);
assert(sim.MakeSet(cluster) == component);
// Check that the cluster is reported in a valid topological order (its
// linearization).
std::vector<ClusterIndex> simlin;
SimTxGraph::SetType done;
for (TxGraph::Ref* ptr : cluster) {
auto simpos = sim.Find(*ptr);
assert(sim.graph.Descendants(simpos).IsSubsetOf(component - done));
done.Set(simpos);
assert(sim.graph.Ancestors(simpos).IsSubsetOf(done));
simlin.push_back(simpos);
}
// Construct a chunking object for the simulated graph, using the reported cluster
// linearization as ordering, and compare it against the reported chunk feerates.
if (sims.size() == 1 || main_only) {
cluster_linearize::LinearizationChunking simlinchunk(sim.graph, simlin);
ClusterIndex idx{0};
for (unsigned chunknum = 0; chunknum < simlinchunk.NumChunksLeft(); ++chunknum) {
auto chunk = simlinchunk.GetChunk(chunknum);
// Require that the chunks of cluster linearizations are connected (this must
// be the case as all linearizations inside are PostLinearized).
assert(sim.graph.IsConnected(chunk.transactions));
// Check the chunk feerates of all transactions in the cluster.
while (chunk.transactions.Any()) {
assert(chunk.transactions[simlin[idx]]);
chunk.transactions.Reset(simlin[idx]);
assert(chunk.feerate == real->GetMainChunkFeerate(*cluster[idx]));
++idx;
}
}
}
}
}
}
}
// Sanity check again (because invoking inspectors may modify internal unobservable state).
real->SanityCheck();
}
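
The linearization check in the loop above can be tried in isolation. The following is a minimal sketch, not the test's actual code: it replaces the test's set type with plain `uint32_t` bitmasks (so it assumes fewer than 32 transactions), and `IsTopological` is a hypothetical helper name. As in the test, `anc[i]` and `desc[i]` include `i` itself.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns true if `order` lists every transaction of
// `component` exactly once, with each transaction after all of its ancestors.
// anc[i] / desc[i] are bitmasks of the ancestors / descendants of i, each
// including i itself. Positions must be below 32.
bool IsTopological(const std::vector<uint32_t>& anc,
                   const std::vector<uint32_t>& desc,
                   uint32_t component,
                   const std::vector<unsigned>& order)
{
    uint32_t done = 0;
    for (unsigned pos : order) {
        // No descendant (including pos itself) may have been emitted already,
        // and all of them must lie inside the component.
        if (desc[pos] & ~(component & ~done)) return false;
        done |= uint32_t{1} << pos;
        // Every ancestor (including pos itself) must have been emitted by now.
        if (anc[pos] & ~done) return false;
    }
    // Mirrors the separate check that the reported cluster equals the component.
    return done == component;
}
```

For a chain 0 -> 1 -> 2 (`anc = {1, 3, 7}`, `desc = {7, 6, 4}`, `component = 7`), the order `{0, 1, 2}` is accepted, while `{1, 0, 2}` is rejected because transaction 1 appears before its ancestor 0.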

@ -23,18 +23,6 @@ using namespace cluster_linearize;
using TestBitSet = BitSet<32>;
/** Check if a graph is acyclic. */
template<typename SetType>
bool IsAcyclic(const DepGraph<SetType>& depgraph) noexcept
{
for (ClusterIndex i : depgraph.Positions()) {
if ((depgraph.Ancestors(i) & depgraph.Descendants(i)) != SetType::Singleton(i)) {
return false;
}
}
return true;
}
/** A formatter for a bespoke serialization for acyclic DepGraph objects.
*
* The serialization format outputs information about transactions in a topological order (parents
@ -337,7 +325,7 @@ void SanityCheck(const DepGraph<SetType>& depgraph)
assert((depgraph.Descendants(child) & children).IsSubsetOf(SetType::Singleton(child)));
}
}
-if (IsAcyclic(depgraph)) {
+if (depgraph.IsAcyclic()) {
// If DepGraph is acyclic, serialize + deserialize must roundtrip.
std::vector<unsigned char> ser;
VectorWriter writer(ser, 0);

1823
src/txgraph.cpp Normal file

File diff suppressed because it is too large

166
src/txgraph.h Normal file

@ -0,0 +1,166 @@
// Copyright (c) The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
#include <compare>
#include <stdint.h>
#include <memory>
#include <vector>
#include <util/feefrac.h>
#ifndef BITCOIN_TXGRAPH_H
#define BITCOIN_TXGRAPH_H
static constexpr unsigned MAX_CLUSTER_COUNT_LIMIT{64};
/** Data structure to encapsulate fees, sizes, and dependencies for a set of transactions. */
class TxGraph
{
public:
/** Internal identifier for a transaction within a TxGraph. */
using GraphIndex = uint32_t;
/** Data type used to reference transactions within a TxGraph.
*
* Every transaction within a TxGraph has exactly one corresponding TxGraph::Ref, held by users
* of the class. Destroying the TxGraph::Ref removes the corresponding transaction.
*
* Users of the class can inherit from TxGraph::Ref. If all Refs are inherited this way, the
* Ref* pointers returned by TxGraph functions can be used as this inherited type.
*/
class Ref
{
// Allow TxGraph's GetRefGraph and GetRefIndex to access internals.
friend class TxGraph;
/** Which Graph the Entry lives in. nullptr if this Ref is empty. */
TxGraph* m_graph = nullptr;
/** Index into the Graph's m_entries. Only used if m_graph != nullptr. */
GraphIndex m_index = GraphIndex(-1);
public:
/** Construct an empty Ref (not pointing to any Entry). */
Ref() noexcept = default;
/** Test if this Ref is not empty. */
explicit operator bool() const noexcept { return m_graph != nullptr; }
/** Destroy this Ref. If it is not empty, the corresponding transaction is removed (in both
* main and staging, if it exists). */
virtual ~Ref();
// Support moving a Ref.
Ref& operator=(Ref&& other) noexcept;
Ref(Ref&& other) noexcept;
// Do not permit copy constructing or copy assignment. A TxGraph entry can have at most one
// Ref pointing to it.
Ref& operator=(const Ref&) = delete;
Ref(const Ref&) = delete;
};
protected:
// Allow TxGraph::Ref to call UpdateRef and UnlinkRef.
friend class TxGraph::Ref;
/** Inform the TxGraph implementation that a TxGraph::Ref has moved. */
virtual void UpdateRef(GraphIndex index, Ref& new_location) noexcept = 0;
/** Inform the TxGraph implementation that a TxGraph::Ref was destroyed. */
virtual void UnlinkRef(GraphIndex index) noexcept = 0;
// Allow TxGraph implementations (inheriting from it) to access Ref internals.
static TxGraph*& GetRefGraph(Ref& arg) noexcept { return arg.m_graph; }
static TxGraph* GetRefGraph(const Ref& arg) noexcept { return arg.m_graph; }
static GraphIndex& GetRefIndex(Ref& arg) noexcept { return arg.m_index; }
static GraphIndex GetRefIndex(const Ref& arg) noexcept { return arg.m_index; }
public:
/** Virtual destructor, so inheriting is safe. */
virtual ~TxGraph() = default;
/** Construct a new transaction with the specified feerate, and return a Ref to it.
* If a staging graph exists, the new transaction is only created there. */
[[nodiscard]] virtual Ref AddTransaction(const FeeFrac& feerate) noexcept = 0;
/** Remove the specified transaction. If a staging graph exists, the removal only happens
* there. This is a no-op if the transaction was already removed.
*
* TxGraph may internally reorder transaction removals with dependency additions for
* performance reasons. If together with any transaction removal all its descendants, or all
* its ancestors, are removed as well (which is what always happens in realistic scenarios),
* this reordering will not affect the behavior of TxGraph.
*
* As an example, imagine 3 transactions A,B,C where B depends on A. If a dependency of C on B
* is added, and then B is deleted, C will still depend on A. If the deletion of B is instead
* reordered before the addition of the C->B dependency, that addition has no effect, and C
* will not depend on A. If, together with the deletion of B, either A or C is deleted as well,
* there is no distinction.
*/
virtual void RemoveTransaction(Ref& arg) noexcept = 0;
/** Add a dependency between two specified transactions. If a staging graph exists, the
* dependency is only added there. Parent may not be a descendant of child already (but may
* be an ancestor of it already, in which case this is a no-op). If either transaction is
* already removed, this is a no-op. */
virtual void AddDependency(Ref& parent, Ref& child) noexcept = 0;
/** Modify the fee of the specified transaction, in both the main graph and the staging
* graph if it exists. Wherever the transaction does not exist (or was removed), this has no
* effect. */
virtual void SetTransactionFee(Ref& arg, int64_t fee) noexcept = 0;
/** Return a vector of pointers to Ref objects for transactions which have been removed from
* the graph, and have not been destroyed yet. This has no effect if a staging graph exists,
* or if the graph is oversized (see below). Each transaction is only reported once by
* Cleanup(). Afterwards, all reported Refs will be empty. */
[[nodiscard]] virtual std::vector<Ref*> Cleanup() noexcept = 0;
/** Create a staging graph (which cannot exist already). This acts as if a full copy of
* the transaction graph is made, upon which further modifications are made. This copy can
* be inspected, and then either discarded, or the main graph can be replaced by it by
* committing it. */
virtual void StartStaging() noexcept = 0;
/** Discard the existing active staging graph (which must exist). */
virtual void AbortStaging() noexcept = 0;
/** Replace the main graph with the staging graph (which must exist). */
virtual void CommitStaging() noexcept = 0;
/** Check whether a staging graph exists. */
virtual bool HaveStaging() const noexcept = 0;
/** Determine whether arg exists in the graph (i.e., was not removed). If main_only is false
* and a staging graph exists, it is queried; otherwise the main graph is queried. */
virtual bool Exists(const Ref& arg, bool main_only = false) noexcept = 0;
/** Determine whether the graph is oversized (contains a connected component of more than the
* configured maximum cluster count). If main_only is false and a staging graph exists, it is
* queried; otherwise the main graph is queried. Some of the functions below are not available
* for oversized graphs. The mutators above are always available. Removing a transaction by
* destroying its Ref while staging exists will not clear main's oversizedness until staging
* is aborted or committed. */
virtual bool IsOversized(bool main_only = false) noexcept = 0;
/** Get the feerate of the main-graph chunk that contains transaction arg. Returns the
* empty FeeFrac if arg does not exist in the main graph. The main graph must not be
* oversized. */
virtual FeeFrac GetMainChunkFeerate(const Ref& arg) noexcept = 0;
/** Get the individual transaction feerate of transaction arg. Returns the empty FeeFrac if
* arg does not exist in either main or staging. This is available even for oversized
* graphs. */
virtual FeeFrac GetIndividualFeerate(const Ref& arg) noexcept = 0;
/** Get pointers to all transactions in the connected component ("cluster") which arg is in.
* The transactions will be returned in a topologically-valid order of acceptable quality.
* If main_only is false and a staging graph exists, it is queried; otherwise the main graph
* is queried. The queried graph must not be oversized. Returns {} if arg does not exist in
* the queried graph. */
virtual std::vector<Ref*> GetCluster(const Ref& arg, bool main_only = false) noexcept = 0;
/** Get pointers to all ancestors of the specified transaction. If main_only is false and a
* staging graph exists, it is queried; otherwise the main graph is queried. The queried
* graph must not be oversized. Returns {} if arg does not exist in the queried graph. */
virtual std::vector<Ref*> GetAncestors(const Ref& arg, bool main_only = false) noexcept = 0;
/** Get pointers to all descendants of the specified transaction. If main_only is false and a
* staging graph exists, it is queried; otherwise the main graph is queried. The queried
* graph must not be oversized. Returns {} if arg does not exist in the queried graph. */
virtual std::vector<Ref*> GetDescendants(const Ref& arg, bool main_only = false) noexcept = 0;
/** Get the total number of transactions in the graph. If main_only is false and a staging
* graph exists, it is queried; otherwise the main graph is queried. This is available even
* for oversized graphs. */
virtual GraphIndex GetTransactionCount(bool main_only = false) noexcept = 0;
/** Compare two transactions according to the total order in the main graph (topological, and
* from high to low chunk feerate). Both transactions must be in the main graph. The main
* graph must not be oversized. */
virtual std::strong_ordering CompareMainOrder(const Ref& a, const Ref& b) noexcept = 0;
/** Perform an internal consistency check on this object. */
virtual void SanityCheck() const = 0;
};
/** Construct a new TxGraph with the specified limit on transactions within a cluster. That
* number cannot exceed MAX_CLUSTER_COUNT_LIMIT. */
std::unique_ptr<TxGraph> MakeTxGraph(unsigned max_cluster_count) noexcept;
#endif // BITCOIN_TXGRAPH_H
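
The reordering note on `RemoveTransaction` can be illustrated with a toy model. This is a sketch under stated assumptions, not the TxGraph implementation: `ToyGraph`, `TX_A`/`TX_B`/`TX_C`, `DependsOn` and `CDependsOnA` are hypothetical names, and the model assumes each transaction stores its full ancestor set, so that removing an intermediate transaction leaves the indirect dependencies in place.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical toy model (not the real TxGraph): three transactions, each
// storing its full ancestor set (including itself) as a bitmask.
constexpr unsigned TX_A{0}, TX_B{1}, TX_C{2};

struct ToyGraph {
    std::array<uint32_t, 3> anc{1, 2, 4};
    std::array<bool, 3> removed{false, false, false};

    // No-op if either transaction was already removed, as documented for
    // AddDependency. Otherwise child and all of its descendants inherit the
    // parent's ancestors.
    void AddDependency(unsigned parent, unsigned child)
    {
        if (removed[parent] || removed[child]) return;
        for (unsigned i = 0; i < 3; ++i) {
            if (!removed[i] && ((anc[i] >> child) & 1)) anc[i] |= anc[parent];
        }
    }

    // Drop tx from every ancestor set; other ancestors are left in place.
    void RemoveTransaction(unsigned tx)
    {
        if (removed[tx]) return;
        removed[tx] = true;
        for (auto& a : anc) a &= ~(uint32_t{1} << tx);
    }

    bool DependsOn(unsigned child, unsigned parent) const
    {
        return !removed[child] && !removed[parent] && ((anc[child] >> parent) & 1);
    }
};

// Replays the A,B,C example, with the removal of B either after or before
// the addition of the C->B dependency.
bool CDependsOnA(bool remove_b_first, bool also_remove_a)
{
    ToyGraph g;
    g.AddDependency(TX_A, TX_B);
    if (remove_b_first) g.RemoveTransaction(TX_B);
    g.AddDependency(TX_B, TX_C);
    if (!remove_b_first) g.RemoveTransaction(TX_B);
    if (also_remove_a) g.RemoveTransaction(TX_A);
    return g.DependsOn(TX_C, TX_A);
}
```

In this model `CDependsOnA(false, false)` is true and `CDependsOnA(true, false)` is false, which is the observable difference the comment describes. Once A is removed as well, both orders give the same result.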