Before this commit, we always prepared tracepoint arguments,
regardless of whether the tracepoint was in use. While we already made
sure not to include expensive arguments in our tracepoints, this
commit introduces gating to make sure the arguments are only prepared
if the tracepoints are actually used. This is a win-win improvement
to our tracing framework. For users not interested in tracing, the
overhead is reduced to a cheap 'greater than 0' comparison. As the
semaphore-gating technique used here is available in bpftrace, bcc,
and libbpf, users interested in tracing don't have to change their
tracing scripts while profiting from potential future tracepoints
passing slightly more expensive arguments. One example is mempool
tracepoints that pass serialized transactions. We've avoided the
serialization in the past because it was too expensive.
Under the hood, the semaphore-gating works by placing a 2-byte
semaphore in the '.probes' ELF section. The address of the semaphore
is contained in the ELF note providing the tracepoint information
(`readelf -n ./src/bitcoind | grep NT_STAPSDT`). Tracing toolkits
like bpftrace, bcc, and libbpf increment the semaphore at that address
upon attaching to the tracepoint. We only prepare the arguments and
reach the tracepoint if the semaphore is greater than zero. The
semaphore is decremented when detaching from the tracepoint.
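The gating logic described above can be sketched as a toy model in Python. This is not real USDT code: the `Tracepoint` class, its methods, and `serialize_tx` are hypothetical names used only to illustrate that the argument preparation is skipped while the semaphore is zero.

```python
class Tracepoint:
    """Toy model of a semaphore-gated tracepoint (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.semaphore = 0  # stands in for the 2-byte counter in '.probes'
        self.events = []    # stands in for data delivered to a tracing tool

    def attach(self):
        # A tracing tool increments the semaphore when it attaches.
        self.semaphore += 1

    def detach(self):
        # ... and decrements it again when it detaches.
        self.semaphore -= 1

    def fire(self, prepare_args):
        # The gate: a cheap 'greater than 0' comparison. The (possibly
        # expensive) argument preparation only runs if someone is attached.
        if self.semaphore > 0:
            self.events.append(prepare_args())


prepared = []  # records each time the arguments were actually prepared


def serialize_tx():
    # Hypothetical stand-in for an expensive argument, e.g. a serialized tx.
    prepared.append(True)
    return b"\x02\x00\x00\x00"


tp = Tracepoint("mempool:added")
tp.fire(serialize_tx)  # nobody attached: arguments are not prepared
tp.attach()
tp.fire(serialize_tx)  # attached: arguments are prepared and delivered
tp.detach()
tp.fire(serialize_tx)  # detached again: skipped
```

In this model the arguments are prepared exactly once, for the single call made while a tool was attached.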
This also extends the "Adding a new tracepoint" documentation to
include information about the semaphores and updated step-by-step
instructions on how to add a new tracepoint.
When the tracepoint was introduced in 8f37f5c2a5,
the connect_block duration was passed in microseconds `µs`.
The switch to a steady clock in fabf1cdb20
changed this to nanoseconds `ns`. As the test only checked that the
duration value is `> 0` as a plausibility check, this went unnoticed.
I detected this when setting up monitoring for block validation time
as part of the Great Consensus Cleanup Revival discussion.
This change casts the duration explicitly to nanoseconds, updates the
documentation, and adds an upper-bound check to the tracepoint interface
tests. The unit stays nanoseconds because it has been nanoseconds for the
last three releases: switching back now would 'break' the broken API
again, and there don't seem to be many users affected. The upper bound
is quite lax, as mining the block takes much longer than connecting the
empty test block. It is, however, able to detect an incorrect duration
unit.
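To illustrate why an upper bound catches a unit mix-up that a `> 0` check misses, here is a minimal sketch in Python. It is not the actual interface test: the helper name `is_plausible_duration`, the 2 second bound, and the 5 ms sample value are assumptions made for this example.

```python
NS_PER_US = 1_000
NS_PER_S = 1_000_000_000


def is_plausible_duration(value, unit_in_ns, upper_bound_s):
    """Interpret `value` in the given unit and check 0 < duration < bound."""
    duration_ns = value * unit_in_ns
    return 0 < duration_ns < upper_bound_s * NS_PER_S


# Assumed: connecting the test block took 5 ms, reported in nanoseconds.
reported = 5_000_000
# Assumed lax upper bound, e.g. the time it took to mine the block.
upper_bound_s = 2

# Old check: only '> 0', which passes no matter which unit is assumed.
old_check_passes = reported > 0

# Read as nanoseconds (the actual unit), 5 ms is well below the bound.
plausible_as_ns = is_plausible_duration(reported, 1, upper_bound_s)

# Read as microseconds (the previously documented unit), the same number
# would mean 5000 seconds, which exceeds the bound and exposes the mismatch.
plausible_as_us = is_plausible_duration(reported, NS_PER_US, upper_bound_s)
```

With these assumed numbers, the value is plausible as nanoseconds but not as microseconds, while the old check accepts it either way.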
Adds tracepoints for added, removed, replaced, and rejected transactions.
The removal reason is passed as a string instead of a numeric value, since
the benefits of not having to maintain a redundant enum-string mapping
seem to outweigh the small cost of string generation. The reject reason
is passed as a string as well, although here the string does not have to
be generated but is readily available.
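On the consumer side, a string reason can be read directly without a copy of the enum. The following Python sketch uses `ctypes` to decode a hypothetical event layout; the struct name `RemovedEvent`, the 16-byte reason buffer, and the sample value "expiry" are assumptions for illustration and not the layout used by the actual script.

```python
import ctypes


class RemovedEvent(ctypes.Structure):
    # Hypothetical layout: 32 raw hash bytes followed by a
    # NUL-terminated reason string in a fixed-size buffer.
    _fields_ = [
        ("hash", ctypes.c_ubyte * 32),
        ("reason", ctypes.c_char * 16),
    ]


# Fake raw event data standing in for what a tracing tool would deliver.
raw = bytes(range(32)) + b"expiry".ljust(16, b"\x00")
event = RemovedEvent.from_buffer_copy(raw)

# A c_char array field is returned as bytes up to the first NUL, so the
# reason can be decoded as-is, without an enum-to-string mapping.
reason = event.reason.decode("utf-8")
```

Had the reason been passed as a number, the script would need its own table mapping values to names, and that table would have to be kept in sync with the C++ enum.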
So far, tracepoint PRs typically included two demo scripts: a naive
bpftrace script to show raw tracepoint data and a bcc script for a more
refined view. However, as some of the ongoing changes to bpftrace
introduce a certain degree of unreliability (running some of the
existing bpftrace scripts was not possible with standard kernels and
bpftrace packages on the latest stable Ubuntu, Debian, and NixOS), this PR
includes only a single bcc script that fuses the functionality of the
former bpftrace and bcc scripts.
- mention 'Lost X events' workaround
- clarify flush tracepoint docs
- fix typo in tracepoint context
- clarify flush for prune
The documentation and examples for the `fFlushForPrune` argument
of the utxocache flush tracepoint weren't clear without looking
at the code.
See these comments: https://github.com/bitcoin/bitcoin/pull/22902#issuecomment-987094612
- doc: note that there can be temporary UTXO caches
Bitcoin Core uses temporary clones of its _main_ UTXO cache in some
places. The utxocache:add and :spent tracepoints are triggered when
temporary caches are changed too. This is now documented.
Previously, the `utxocache:flush` tracepoint was in the wrong scope and
reached every time `CChainState::FlushStateToDisk` was called, even when
there was no flushing of the cache. The tracepoint is now properly scoped
and will be reached during a full flush.
Inside the scope, the `fDoFullFlush` value will always be `true`, so it
doesn't need to be logged separately. Hence, it's dropped from the
tracepoint arguments.
The tracepoint `validation:block_connected` was introduced in #22006.
The first argument was the hash of the connected block as a pointer
to a C-style string. The last argument passed the hash of the
connected block as a pointer to 32 bytes. The hash was only passed as
a string to allow `bpftrace` scripts to print the hash. It was
(incorrectly) assumed that `bpftrace` cannot hex-format and print the
block hash given only the hash as bytes.
The block hash can be printed in `bpftrace` by calling
`printf("%02x")` for each byte of the hash in an `unroll () {...}`.
By starting from the last byte of the hash, it can be printed in
big-endian (the block-explorer format).
```C
// $hash points to the 32 hash bytes; start at the last byte
$p = $hash + 31;
unroll(32) {
  $b = *(uint8*)$p;
  printf("%02x", $b);
  $p -= 1;
}
```
See also: https://github.com/bitcoin/bitcoin/pull/22902#discussion_r705176691
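For comparison, the same byte reversal can be sketched in Python, as a bcc-style script might do it after copying the 32 hash bytes to user space. The helper name `block_hash_hex` and the sample input are hypothetical.

```python
def block_hash_hex(hash_bytes):
    """Format 32 raw hash bytes in big-endian (block-explorer) order."""
    if len(hash_bytes) != 32:
        raise ValueError("expected exactly 32 hash bytes")
    # Reverse the byte order, then hex-format every byte as two digits.
    return bytes(reversed(hash_bytes)).hex()


# Sample input: bytes 0x00..0x1f stand in for a real block hash.
sample = bytes(range(32))
formatted = block_hash_hex(sample)
```

The last byte of the input becomes the first two hex digits of the output, which matches walking the pointer backwards from `$hash + 31` in the bpftrace snippet.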
This is a breaking change to the block_connected tracepoint API. However,
this tracepoint has not yet been included in a release.
Both added files are extended in the following commits.
doc/usdt.md is based on earlier work by laanwj.
Co-authored-by: W. J. van der Laan <laanwj@protonmail.com>