Transaction Relay policies

mtrycz · July 10, 2020, 2:13pm

The latest BCHN release (v0.21.2) still has a 5-second random Poisson delay for relaying transactions, inherited from Core and ABC.

This slows down the propagation of transactions across the network. It is not a major problem on the live BCH mainnet, because other implementations have a faster relay policy, so it’s not urgent. Nonetheless, it’s impolite to rely on other implementations to make things work.

BCHN is looking at removing or at least reducing the propagation delay. There is a tradeoff to be made with bandwidth usage, because batching several transactions uses less bandwidth than forwarding each transaction right away (because of TCP overhead and round trips). This tradeoff has not been investigated yet.
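To make the batching side of that tradeoff concrete, here is a rough sketch that counts how many announcement messages one peer would receive with and without a delay. It is a toy model, not BCHN code: `simulate_relay` and its parameters are hypothetical names, and it assumes the flush times form a Poisson process (exponentially distributed gaps between flushes). It counts messages only and says nothing about actual bytes on the wire.

```python
import random

def simulate_relay(tx_times, mean_delay_s, rng):
    """Return the batches of transactions announced to a single peer.

    tx_times: sorted arrival times (seconds) of transactions at our node.
    mean_delay_s: mean gap between flushes; 0 means every transaction
                  is announced immediately in its own message.
    rng: a random.Random instance, so runs can be reproduced.
    """
    if mean_delay_s <= 0:
        return [[t] for t in tx_times]

    batches = []
    pending = []
    # Assumption: gaps between flushes are exponentially distributed.
    next_flush = rng.expovariate(1.0 / mean_delay_s)
    for t in tx_times:
        # Fire every flush timer that expired before this arrival.
        while next_flush <= t:
            if pending:
                batches.append(pending)
                pending = []
            next_flush += rng.expovariate(1.0 / mean_delay_s)
        pending.append(t)
    if pending:
        batches.append(pending)  # final flush after the last arrival
    return batches

rng = random.Random(1)
tx_times = [i * 0.5 for i in range(100)]  # one transaction every 0.5 s
immediate = simulate_relay(tx_times, 0, rng)
batched = simulate_relay(tx_times, 5.0, rng)
print(len(immediate), "messages without delay,", len(batched), "with a 5 s mean delay")
```

Fewer messages means less per-message overhead, at the cost of each transaction waiting for the next flush. How that translates into real bandwidth would still need to be measured.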

corgi, a novel test framework, was used to determine the effect of the local transaction delay on propagation times across a whole network. A “private” network of 256 nodes (with 8 outgoing connections each) in regtest mode was created in 20 AWS regions. Then several thousand transactions were propagated across the network, and the arrival time of each transaction on each node was recorded.

The results were compiled and the metric of “transaction propagation time” was simply defined as the time it takes for a transaction to reach all nodes.
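As a minimal sketch of that metric (not the actual corgi code): assume the recorded data is a mapping from transaction id to per-node arrival timestamps, and that the clock for a transaction starts at its earliest recorded arrival. The function names and the data layout are hypothetical.

```python
from statistics import mean

def propagation_time(arrivals, node_count):
    """Seconds from the first arrival to the arrival at the last node.

    arrivals: dict mapping node id -> arrival timestamp (seconds).
    Returns None if the transaction did not reach every node.
    """
    if len(arrivals) < node_count:
        return None
    return max(arrivals.values()) - min(arrivals.values())

def mean_propagation_time(log, node_count):
    """Mean propagation time over all transactions that reached every node."""
    times = [propagation_time(a, node_count) for a in log.values()]
    complete = [t for t in times if t is not None]
    return mean(complete) if complete else None

# Tiny made-up example with 3 nodes; tx3 never reaches node "c".
log = {
    "tx1": {"a": 0.0, "b": 2.0, "c": 4.0},
    "tx2": {"a": 10.0, "b": 11.0, "c": 12.0},
    "tx3": {"a": 1.0, "b": 2.0},
}
print(mean_propagation_time(log, node_count=3))
```

Transactions that never reach every node are skipped here; that is a choice made for this sketch, not a statement about how the real data was handled.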

The results were cleaned of outliers and charted. The mean propagation time on this small network is around 9 seconds, which is a lot. This is expected to increase with network size (all other parameters being equal).

In contrast, blocks are relayed instantly, and block propagation takes less than 0.5 seconds.

Further research on the tradeoff between the delay and bandwidth is yet to be done.

tom · July 10, 2020, 9:23pm

The design of the delay is to make sure that the peers that don’t instantly get the transaction from “me” are getting it from another peer if the distance is short.

So, for any INV we can divide our peers into 2 groups: the ones we instantly send it to, and the peers that get it from another peer (or from us, after the Poisson delay).

If we can measure the time it takes for the peers in that second group to send us the INV, then we can determine a better time for the delay.

So, for instance, if measurement shows that 65% of our peers in the second group have sent us the INV after 2100ms, then we should set the Poisson delay to 2250ms or so.

That way we get both the advantage of spreading the load and a faster relay. Theoretically, that is. You can then measure the throughput after that.
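The measurement idea above could be sketched like this. It is only an illustration: `suggested_delay_ms`, the 65% fraction and the 150ms margin are assumptions taken from the example numbers, and the input is assumed to be, for each second-group peer, the time in milliseconds between us first seeing the transaction and that peer sending us the INV.

```python
import math

def suggested_delay_ms(inv_delays_ms, fraction=0.65, margin_ms=150):
    """Suggest a relay delay from measured INV delays of second-group peers.

    Takes the smallest measured delay by which at least `fraction` of the
    peers had already sent us the INV (nearest-rank percentile), then adds
    a small safety margin.
    """
    if not inv_delays_ms:
        raise ValueError("need at least one measurement")
    ordered = sorted(inv_delays_ms)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1] + margin_ms

# Made-up measurements: 7 of 10 peers (at least 65%) answered within 2100 ms.
delays = [500, 900, 1200, 1500, 1800, 2000, 2100, 2600, 3200, 4000]
print(suggested_delay_ms(delays))  # 2250
```

Peers that never send the INV back are not represented in this input, so a real measurement would need to decide how to count them.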