doc: Fix and clarify description of ZMQ message format

This change stresses that all ZMQ messages share the same structure
and that they differ only in the format of the bodies. Previously this
was not clear.

Further, it removes the notion of endianness of 32-byte hashes,
as it was misleading, and replaces it with the term 'reversed byte
order' (as opposed to the natural or normal byte order produced by
hashing functions).

Additionally, it states that ZMQ 32-byte hashes are in the same format
as in RPC. Previously it incorrectly stated that the two were in
different formats.
Jiri Jakes 2025-02-14 15:18:12 +08:00
parent 2549fc6fd1
commit 7a93544cdc


@ -87,40 +87,69 @@ For instance:
-zmqpubrawtx=ipc:///tmp/bitcoind.tx.raw \
-zmqpubhashtxhwm=10000
Each PUB notification has a topic and body, where the header
corresponds to the notification type. For instance, for the
notification `-zmqpubhashtx` the topic is `hashtx` (no null
terminator). These options can also be provided in bitcoin.conf.
Notification types correspond to message topics (details in next section). For instance,
for the notification `-zmqpubhashtx` the topic is `hashtx`. These options can also be
provided in bitcoin.conf.
The topics are:
### Message format
`sequence`: the body is structured as the following based on the type of message:
All ZMQ messages share the same structure with three parts: _topic_ string,
message _body_, and _message sequence number_:
<32-byte hash>C : Blockhash connected
<32-byte hash>D : Blockhash disconnected
<32-byte hash>R<8-byte LE uint> : Transactionhash removed from mempool for non-block inclusion reason
<32-byte hash>A<8-byte LE uint> : Transactionhash added mempool
| topic | body | message sequence number |
|-----------|------------------------------------------------------|--------------------------|
| rawtx | <serialized transaction> | <4-byte LE uint> |
| hashtx | <reversed 32-byte transaction hash> | <4-byte LE uint> |
| rawblock | <serialized block> | <4-byte LE uint> |
| hashblock | <reversed 32-byte block hash> | <4-byte LE uint> |
| sequence | <reversed 32-byte block hash>C | <4-byte LE uint> |
| sequence | <reversed 32-byte block hash>D | <4-byte LE uint> |
| sequence | <reversed 32-byte transaction hash>R<8-byte LE uint> | <4-byte LE uint> |
| sequence | <reversed 32-byte transaction hash>A<8-byte LE uint> | <4-byte LE uint> |
Where the 8-byte uints correspond to the mempool sequence number.
where:
`rawtx`: Notifies about all transactions, both when they are added to mempool or when a new block arrives. This means a transaction could be published multiple times. First, when it enters the mempool and then again in each block that includes it. The messages are ZMQ multipart messages with three parts. The first part is the topic (`rawtx`), the second part is the serialized transaction, and the last part is a sequence number (representing the message count to detect lost messages).
- message sequence number represents message count to detect lost messages, distinct for each topic
- all transaction and block hashes are in _reversed byte order_ (i.e. with the bytes
produced by the hashing function reversed), the same format as the RPC interface and block
explorers use to display transaction and block hashes
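
As an illustration of the three-part structure described above, the following is a minimal sketch, not part of Bitcoin Core. It assumes the three frames have already been received as a list of `bytes` (for example from `recv_multipart()` on a pyzmq SUB socket). The helper names `parse_notification`, `hash_to_hex` and `missed_messages` are hypothetical.

```python
import struct


def parse_notification(frames):
    """Split a three-part notification into (topic, body, sequence).

    `frames` is assumed to be a list of three bytes objects:
    topic, body, and the 4-byte little-endian message sequence number.
    """
    if len(frames) != 3:
        raise ValueError(f"expected 3 frames, got {len(frames)}")
    topic_raw, body, seq_raw = frames
    if len(seq_raw) != 4:
        raise ValueError("message sequence number must be 4 bytes")
    (sequence,) = struct.unpack("<I", seq_raw)  # 4-byte LE uint
    return topic_raw.decode("ascii"), body, sequence


def hash_to_hex(hash_bytes):
    """Hex-encode a 32-byte hash taken from a hashtx or hashblock body.

    The bytes already arrive in reversed byte order, so no further
    reversal is applied before comparing with hashes shown by RPC.
    """
    if len(hash_bytes) != 32:
        raise ValueError("hash must be 32 bytes")
    return hash_bytes.hex()


def missed_messages(previous, current):
    """Number of messages lost between two sequence numbers of one topic.

    Assumes the counter wraps around at 2**32 (hypothetical handling).
    """
    return (current - previous - 1) % 2**32
```

Because the message sequence number is distinct for each topic, a client using this approach would keep one `previous` value per topic.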
| rawtx | <serialized transaction> | <uint32 sequence number in Little Endian>
#### rawtx
`hashtx`: Notifies about all transactions, both when they are added to mempool or when a new block arrives. This means a transaction could be published multiple times. First, when it enters the mempool and then again in each block that includes it. The messages are ZMQ multipart messages with three parts. The first part is the topic (`hashtx`), the second part is the 32-byte transaction hash, and the last part is a sequence number (representing the message count to detect lost messages).
Notifies about all transactions, both when they are added to mempool or when a new block
arrives. This means a transaction could be published multiple times: first when it enters
mempool and then again in each block that includes it. The body part of the message is the
serialized transaction.
| hashtx | <32-byte transaction hash in Little Endian> | <uint32 sequence number in Little Endian>
#### hashtx
Notifies about all transactions, both when they are added to mempool or when a new block
arrives. This means a transaction could be published multiple times: first when it enters
mempool and then again in each block that includes it. The body part of the message is the
32-byte transaction hash in reversed byte order.
`rawblock`: Notifies when the chain tip is updated. When assumeutxo is in use, this notification will not be issued for historical blocks connected to the background validation chainstate. Messages are ZMQ multipart messages with three parts. The first part is the topic (`rawblock`), the second part is the serialized block, and the last part is a sequence number (representing the message count to detect lost messages).
#### rawblock
| rawblock | <serialized block> | <uint32 sequence number in Little Endian>
Notifies when the chain tip is updated. When assumeutxo is in use, this notification will
not be issued for historical blocks connected to the background validation chainstate. The
body part of the message is the serialized block.
`hashblock`: Notifies when the chain tip is updated. When assumeutxo is in use, this notification will not be issued for historical blocks connected to the background validation chainstate. Messages are ZMQ multipart messages with three parts. The first part is the topic (`hashblock`), the second part is the 32-byte block hash, and the last part is a sequence number (representing the message count to detect lost messages).
#### hashblock
| hashblock | <32-byte block hash in Little Endian> | <uint32 sequence number in Little Endian>
Notifies when the chain tip is updated. When assumeutxo is in use, this notification will
not be issued for historical blocks connected to the background validation chainstate. The
body part of the message is the 32-byte block hash in reversed byte order.
**_NOTE:_** Note that the 32-byte hashes are in Little Endian and not in the Big Endian format that the RPC interface and block explorers use to display transaction and block hashes.
#### sequence
The 8-byte LE uints correspond to _mempool sequence number_ and the types of bodies are:
- `C` : block with this hash connected
- `D` : block with this hash disconnected
- `R` : transaction with this hash removed from mempool for non-block inclusion reason
- `A` : transaction with this hash added to mempool
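
The body layouts listed above could be decoded roughly as follows. This is a sketch under the assumption that the body of a `sequence` message is available as `bytes`. The function name `parse_sequence_body` and the tuple it returns are hypothetical choices, not an API of Bitcoin Core.

```python
import struct


def parse_sequence_body(body):
    """Decode the body of a `sequence` message.

    Returns (hash_hex, label, mempool_sequence), where mempool_sequence
    is None for block events (C, D) and an int for mempool events (R, A).
    """
    if len(body) < 33:
        raise ValueError("body too short for a 32-byte hash and a label")
    hash_hex = body[:32].hex()          # already in reversed byte order
    label = body[32:33].decode("ascii")
    if label in ("C", "D"):
        if len(body) != 33:
            raise ValueError("unexpected trailing bytes in block event")
        return hash_hex, label, None
    if label in ("R", "A"):
        if len(body) != 41:
            raise ValueError("mempool event must carry an 8-byte sequence")
        (mempool_sequence,) = struct.unpack("<Q", body[33:41])  # 8-byte LE uint
        return hash_hex, label, mempool_sequence
    raise ValueError(f"unknown sequence label: {label!r}")
```
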
### Implementing ZMQ client
ZeroMQ endpoint specifiers for TCP (and others) are documented in the
[ZeroMQ API](http://api.zeromq.org/4-0:_start).
@ -138,7 +167,7 @@ operating system configuration and must be configured prior to connection establ
For example, when running on GNU/Linux, one might use the following
to lower the keepalive setting to 10 minutes:
sudo sysctl -w net.ipv4.tcp_keepalive_time=600
sudo sysctl -w net.ipv4.tcp_keepalive_time=600
Setting the keepalive values appropriately for your operating environment may
improve connectivity in situations where long-lived connections are silently