BCH RPA — UltrafastSecp256k1 Performance Data (2026-05-18)
Hardware: Intel i5-14400F (10 cores / 16 threads), RTX 5060 Ti
Compiler: GCC 14.2.0, -O3 -march=native, Release+LTO
CPU Benchmarks
EC Grinding (CT ECDSA sign + double-SHA256 + prefix check)
| Threads | Speed | 8-bit grind time | 16-bit grind time |
|---|---|---|---|
| 1 thread | 78k/s | ~3ms | ~840ms |
| 16 threads | 590k/s | <0.5ms | ~111ms |
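
The grind times in the table are consistent with the mean cost of a prefix search: a `b`-bit prefix needs about 2^b attempts on average, so time is roughly 2^b divided by the attempt rate. A minimal sketch of that arithmetic follows; the helper names `expected_attempts` and `expected_grind_ms` are hypothetical and not part of the library, and the model assumes uniformly distributed hash output.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a library function): mean number of attempts
// needed to hit a `bits`-bit prefix, assuming each attempt matches
// independently with probability 2^-bits.
inline double expected_attempts(unsigned bits) {
    assert(bits < 64);
    return static_cast<double>(std::uint64_t{1} << bits);
}

// Hypothetical helper: mean grind time in milliseconds at a given
// attempt rate (attempts per second).
inline double expected_grind_ms(unsigned bits, double rate_per_s) {
    assert(rate_per_s > 0.0);
    return expected_attempts(bits) / rate_per_s * 1000.0;
}
```

For example, `expected_grind_ms(16, 78000.0)` gives about 840 ms and `expected_grind_ms(16, 590000.0)` about 111 ms, matching the 16-bit column. These are averages; an individual grind can finish much sooner or take several times longer.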
RPA Scan (ECDH + SHA256 midstate + pubkey derivation)
| Threads | Speed | vs. BCH mainnet (11.5 tx/s) |
|---|---|---|
| 1 thread | 20k tx/s | 1,739× real-time |
| 16 threads | 153k tx/s | 13,300× real-time |
| Historical sync (~2B txs, 16 threads) | - | ~4 hours |
GPU Benchmarks (RTX 5060 Ti)
EC Grinding
| Backend | Speed | 16-bit grind time |
|---|---|---|
| CPU 16 threads | 590k/s | ~111ms |
| GPU CUDA | ~7-8M/s | ~9ms |
| Speedup | - | 12× vs CPU |
Scan (BIP-352 pipeline adapted for RPA)
| Backend | Speed |
|---|---|
| CPU 16 threads | 153k tx/s |
| GPU CUDA | ~11M tx/s |
| Speedup | 72× |
BCH Mainnet Context
- Mainnet tx/day: ~1M ≈ 11.5 tx/s
- CPU scan (16 threads): 153k tx/s → 13,300× real-time
- GPU scan: ~11M tx/s → 957,000× real-time
- Historical sync (all-time ~2B txs):
  - CPU, 16 threads: ~4 hours
  - GPU: ~3 minutes
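
The real-time multiples and sync times above are simple ratios of the measured scan rates to the mainnet load. The sketch below reproduces them; the function names are hypothetical, and the inputs (~1M tx/day, ~2B historical txs) are the rough figures quoted in this document rather than measured chain data.

```cpp
#include <cassert>

// Hypothetical helper: average chain load in tx/s from a daily tx count.
inline double tx_per_second(double tx_per_day) {
    return tx_per_day / 86400.0;  // 86,400 seconds per day
}

// Hypothetical helper: how many times faster than the chain the scanner runs.
inline double realtime_multiple(double scan_rate, double chain_rate) {
    assert(chain_rate > 0.0);
    return scan_rate / chain_rate;
}

// Hypothetical helper: seconds needed to scan `total_txs` at `scan_rate` tx/s.
inline double sync_seconds(double total_txs, double scan_rate) {
    assert(scan_rate > 0.0);
    return total_txs / scan_rate;
}
```

With these, `realtime_multiple(153000.0, 11.5)` is about 13,300, `sync_seconds(2e9, 153000.0)` is about 3.6 hours (rounded up to ~4 hours above), and `sync_seconds(2e9, 11e6)` is about 182 seconds. The estimates cover scan compute only and ignore the time to fetch and parse transactions.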
Usage Example
// === RECEIVER: Generate paycode once ===
#include <cstdint>  // uint8_t
#include <cstring>  // std::memcpy
#include <string>   // std::string
#include <secp256k1/bch/rpa.hpp>
#include <secp256k1/bch/bch_scan.hpp>
#include <secp256k1/ct/point.hpp>
// Generate keys
secp256k1::fast::Scalar scan_sk = /* your scan private key */;
secp256k1::fast::Scalar spend_sk = /* your spend private key */;
// Build paycode (publish this — like a Bitcoin address but reusable)
secp256k1::bch::RpaPaycode pc{};
pc.version = 1; // P2PKH mainnet
pc.prefix_bits = 8; // 8-bit prefix filter (1/256 false positives)
pc.expiry = 0; // never expires
auto scan_pk = secp256k1::ct::generator_mul(scan_sk).to_compressed();
auto spend_pk = secp256k1::ct::generator_mul(spend_sk).to_compressed();
std::memcpy(pc.scan_pubkey.data(), scan_pk.data(), 33);
std::memcpy(pc.spend_pubkey.data(), spend_pk.data(), 33);
std::string paycode = secp256k1::bch::rpa_encode_paycode(pc);
// → "paycode:qyy..." publish on website, Twitter, etc.
// === SENDER: Create payment ===
uint8_t outpoint[36]; // txid[32] || vout[4 LE]
// Grind signature until prefix matches (CPU)
auto grind = secp256k1::bch::rpa_grind_cpu(
sender_input_privkey,
sighash32, // SIGHASH of the spending input
pc.prefix_bits,
pc.scan_pubkey.data(),
0 /* unlimited */);
if (grind.found) {
// Compute shared secret and derive payment address
auto secret = secp256k1::bch::rpa_sender_shared_secret(
sender_input_privkey, pc.scan_pubkey.data(), outpoint, 36);
auto pay_pk = secp256k1::bch::rpa_derive_payment_pubkey(
pc.spend_pubkey.data(), secret, 0);
// → pay_pk is the public key whose P2PKH address receives the BCH
// → grind.signature is the winning input signature
}
// GPU grinding (when SECP256K1_BUILD_CUDA enabled):
// auto result = secp256k1::cuda::bch::rpa_grind_gpu(
// sk32, msg32, prefix_bits, prefix_data, 0);
// === RECEIVER: Scan for payments ===
secp256k1::bch::RpaScanner scanner(pc, scan_sk);
// For each tx that passed prefix filter:
secp256k1::bch::ScanTx tx{};
tx.txid = /* 32-byte txid */;
tx.vout = 0;
tx.input_pubkey = /* sender's compressed input pubkey */;
tx.outputs.push_back(/* output pubkey to check */);
if (auto match = scanner.scan_tx(tx, 30)) {
// match->cashaddr → "bitcoincash:q..."
// match->key_index → which derivation index matched
// Spend: derive privkey = CKDpriv(spend_sk, shared_secret, match->key_index)
}
// Multi-threaded scan (153k tx/s on 16 threads):
auto matches = scanner.scan_batch_cpu(all_txs, /*max_key_index=*/30);
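
The example above leaves two small steps implicit: filling the 36-byte outpoint (`txid[32] || vout[4 LE]`) and comparing a bit prefix. The following is a minimal standalone sketch of both, not the library's implementation. `make_outpoint` and `prefix_matches` are hypothetical names, and the most-significant-bit-first comparison order is an assumption; which bytes the library actually compares is not shown in this document.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: serialize an outpoint as txid[32] || vout[4],
// with vout written in little-endian byte order.
inline std::array<std::uint8_t, 36> make_outpoint(const std::uint8_t* txid32,
                                                  std::uint32_t vout) {
    std::array<std::uint8_t, 36> out{};
    std::memcpy(out.data(), txid32, 32);
    for (int i = 0; i < 4; ++i) {
        out[32 + i] = static_cast<std::uint8_t>(vout >> (8 * i));
    }
    return out;
}

// Hypothetical helper: true when the first `bits` bits of `a` and `b`
// agree, counting from the most significant bit of byte 0 (assumed order).
// Both buffers must hold at least ceil(bits / 8) bytes.
inline bool prefix_matches(const std::uint8_t* a, const std::uint8_t* b,
                           unsigned bits) {
    assert(a != nullptr && b != nullptr);
    const unsigned full = bits / 8;  // whole bytes to compare
    const unsigned rem = bits % 8;   // leftover high bits of the next byte
    if (std::memcmp(a, b, full) != 0) return false;
    if (rem == 0) return true;
    const std::uint8_t mask = static_cast<std::uint8_t>(0xFF << (8 - rem));
    return ((a[full] ^ b[full]) & mask) == 0;
}
```

A sender could use something like `make_outpoint(txid, vout)` to fill the `outpoint` buffer before calling `rpa_sender_shared_secret`. With `prefix_bits = 8`, the comparison reduces to a single byte match, which is why roughly 1 in 256 unrelated transactions pass the filter.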
Build
# CPU only (default)
cmake -B build -DSECP256K1_BUILD_BCH=ON
cmake --build build --target rpa_wallet_example bench_bch
# CPU + CUDA grinding
cmake -B build -DSECP256K1_BUILD_BCH=ON -DSECP256K1_BUILD_CUDA=ON
cmake --build build
# Run example
./build/src/bch/rpa_wallet_example
# Run benchmark (16 threads, pinned to logical CPUs 0-15)
taskset -c 0-15 ./build/src/bch/bench_bch 16
Repository
https://github.com/shrec/UltrafastSecp256k1 (branch: dev)
License: MIT