BCH RPA — UltrafastSecp256k1 Performance Data

shrec · May 18, 2026, 5:54pm

BCH RPA — UltrafastSecp256k1 Performance Data (2026-05-18)

Hardware: Intel i5-14400F (10 cores / 16 threads; "16 cores" below means 16 hardware threads), RTX 5060 Ti

Compiler: GCC 14.2.0, -O3 -march=native, Release+LTO

CPU Benchmarks

EC Grinding (CT ECDSA sign + double-SHA256 + prefix check)

Threads	Speed	8-bit grind	16-bit grind
1 core	78k/s	~3ms	~840ms
16 cores	590k/s	<0.5ms	~111ms
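
The grind times in the table follow from the attempt rate: an n-bit prefix needs 2^n attempts on average. A minimal sketch of that arithmetic, assuming each attempt matches independently with probability 2^-n (the helper name `expected_grind_ms` is illustrative and not part of the library):

```cpp
#include <cmath>

// Expected time in milliseconds to grind a matching prefix.
// prefix_bits  : number of leading bits that must match
// rate_per_sec : grind attempts per second (e.g. 78e3 for one thread)
// Assumption: each attempt matches with probability 2^-prefix_bits,
// so the mean number of attempts is 2^prefix_bits.
double expected_grind_ms(unsigned prefix_bits, double rate_per_sec) {
    const double expected_attempts =
        std::ldexp(1.0, static_cast<int>(prefix_bits));  // 2^prefix_bits
    return expected_attempts / rate_per_sec * 1000.0;
}
```

With the rates above, `expected_grind_ms(8, 78e3)` is about 3.3 ms, `expected_grind_ms(16, 78e3)` about 840 ms, and `expected_grind_ms(16, 590e3)` about 111 ms. These are averages; a single grind can finish much sooner or later because the attempt count is geometrically distributed.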

RPA Scan (ECDH + SHA256 midstate + pubkey derivation)

Threads	Speed	BCH mainnet (11.5 tx/s)
1 core	20k tx/s	1,739× real-time
16 cores	153k tx/s	13,300× real-time
Historical sync (~2B txs, 16 cores)	—	~4 hours

GPU Benchmarks (RTX 5060 Ti)

EC Grinding

Backend	Speed	16-bit grind time
CPU 16 cores	590k/s	~111ms
GPU CUDA	~7-8M/s	~9ms
Speedup	—	12× vs CPU

Scan (BIP-352 pipeline adapted for RPA)

Backend	Speed
CPU 16 cores	153k tx/s
GPU CUDA	~11M tx/s
Speedup	72×

BCH Mainnet Context

Mainnet tx/day: ~1M ≈ 11.5 tx/s
CPU scan (16c): 153k tx/s → 13,300× real-time
GPU scan: ~11M tx/s → 957,000× real-time
Historical sync (all-time ~2B txs):
- CPU 16 cores: ~4 hours
- GPU: ~3 minutes

Usage Example

// === RECEIVER: Generate paycode once ===
#include <cstdint>   // uint8_t
#include <cstring>   // std::memcpy
#include <string>    // std::string
#include <secp256k1/bch/rpa.hpp>
#include <secp256k1/bch/bch_scan.hpp>
#include <secp256k1/ct/point.hpp>

// Generate keys
secp256k1::fast::Scalar scan_sk  = /* your scan private key */;
secp256k1::fast::Scalar spend_sk = /* your spend private key */;

// Build paycode (publish this — like a Bitcoin address but reusable)
secp256k1::bch::RpaPaycode pc{};
pc.version     = 1;       // P2PKH mainnet
pc.prefix_bits = 8;       // 8-bit prefix filter (~1/256 of unrelated txs pass as false positives)
pc.expiry      = 0;       // never expires
auto scan_pk  = secp256k1::ct::generator_mul(scan_sk).to_compressed();
auto spend_pk = secp256k1::ct::generator_mul(spend_sk).to_compressed();
std::memcpy(pc.scan_pubkey.data(),  scan_pk.data(),  33);
std::memcpy(pc.spend_pubkey.data(), spend_pk.data(), 33);

std::string paycode = secp256k1::bch::rpa_encode_paycode(pc);
// → "paycode:qyy..."  publish on website, Twitter, etc.


// === SENDER: Create payment ===
uint8_t outpoint[36] = {}; // txid[32] || vout[4 LE]; fill in before use

// Grind signature until prefix matches (CPU)
auto grind = secp256k1::bch::rpa_grind_cpu(
    sender_input_privkey,
    sighash32,           // SIGHASH of the spending input
    pc.prefix_bits,
    pc.scan_pubkey.data(),
    0 /* unlimited */);

if (grind.found) {
    // Compute shared secret and derive payment address
    auto secret = secp256k1::bch::rpa_sender_shared_secret(
        sender_input_privkey, pc.scan_pubkey.data(), outpoint, 36);
    auto pay_pk = secp256k1::bch::rpa_derive_payment_pubkey(
        pc.spend_pubkey.data(), secret, 0);
    // → pay_pk is the P2PKH address to send BCH to
    // → grind.signature is the winning input signature
}

// GPU grinding (when SECP256K1_BUILD_CUDA enabled):
// auto result = secp256k1::cuda::bch::rpa_grind_gpu(
//     sk32, msg32, prefix_bits, prefix_data, 0);


// === RECEIVER: Scan for payments ===
secp256k1::bch::RpaScanner scanner(pc, scan_sk);

// For each tx that passed prefix filter:
secp256k1::bch::ScanTx tx{};
tx.txid        = /* 32-byte txid */;
tx.vout        = 0;
tx.input_pubkey = /* sender's compressed input pubkey */;
tx.outputs.push_back(/* output pubkey to check */);

if (auto match = scanner.scan_tx(tx, 30)) {
    // match->cashaddr   → "bitcoincash:q..."
    // match->key_index  → which derivation index matched
    // Spend: derive privkey = CKDpriv(spend_sk, shared_secret, match->key_index)
}

// Multi-threaded scan — 153k tx/s on 16 cores:
auto matches = scanner.scan_batch_cpu(all_txs, /*max_key_index=*/30);
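
For readers wondering what the prefix check in the grind and scan steps boils down to, here is a rough sketch of an n-bit prefix comparison. It is not taken from the library: the function name, the most-significant-bit-first ordering, and the choice of which bytes get compared are all assumptions for illustration, and the hashing step is left out.

```cpp
#include <cstdint>

// Returns true when the first `bits` bits of `hash` equal the first `bits`
// bits of `prefix`, reading each byte most-significant bit first.
// Both buffers must hold at least ceil(bits / 8) bytes.
// Hypothetical helper; the library's real check may differ.
bool prefix_matches(const std::uint8_t* hash,
                    const std::uint8_t* prefix,
                    unsigned bits) {
    const unsigned full_bytes = bits / 8;
    const unsigned rem_bits = bits % 8;
    // Compare the whole bytes covered by the prefix.
    for (unsigned i = 0; i < full_bytes; ++i) {
        if (hash[i] != prefix[i]) return false;
    }
    if (rem_bits == 0) return true;
    // Compare only the top `rem_bits` bits of the next byte.
    const std::uint8_t mask =
        static_cast<std::uint8_t>(0xFF << (8 - rem_bits));
    return (hash[full_bytes] & mask) == (prefix[full_bytes] & mask);
}
```

With `prefix_bits = 8` this is a single byte comparison, so a scanner can discard about 255 of every 256 unrelated transactions before doing any ECDH work.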

Build

# CPU only (default)
cmake -B build -DSECP256K1_BUILD_BCH=ON
cmake --build build --target rpa_wallet_example bench_bch

# CPU + CUDA grinding
cmake -B build -DSECP256K1_BUILD_BCH=ON -DSECP256K1_BUILD_CUDA=ON
cmake --build build

# Run example
./build/src/bch/rpa_wallet_example

# Run benchmark (16 cores)
taskset -c 0-15 ./build/src/bch/bench_bch 16

Repository

https://github.com/shrec/UltrafastSecp256k1 (branch: dev), MIT License

shrec · May 18, 2026, 6:03pm

git clone https://github.com/shrec/UltrafastSecp256k1
cd UltrafastSecp256k1
git checkout dev

cmake -B build -DSECP256K1_BUILD_BCH=ON \
      -DSECP256K1_BUILD_BENCH=ON \
      -DSECP256K1_BUILD_EXAMPLES=ON \
      -DCMAKE_BUILD_TYPE=Release \
      -DSECP256K1_MARCH=native
cmake --build build -j$(nproc)

# Example — end-to-end RPA
./build/src/bch/rpa_wallet_example

# Benchmark
taskset -c 0-15 ./build/src/bch/bench_bch 16

shrec · May 18, 2026, 6:28pm

@bitcoincashautist @2qx @im_uname @Bastian

BitcoinCashPodcast · May 18, 2026, 7:21pm

Am I right that this means you can scan the entire blockchain for RPA transactions matching a specific wallet on a GPU in 3 minutes?

shrec · May 18, 2026, 7:24pm

Yes, you are right. With this it’s possible to build a Frigate-like service for BCH.

shrec · May 18, 2026, 7:26pm

Frigate also runs on my engine, for Bitcoin.

BitcoinCashPodcast · May 18, 2026, 7:26pm

That is incredibly awesome.

I really can’t wait for BCH to have these kinds of usability and privacy tools embedded by default everywhere.

Super important area of research and development.

shrec · May 18, 2026, 7:29pm

https://github.com/sparrowwallet/frigate is the same thing for Bitcoin, built on my engine and in production, by CraigRaw.

ABLA · May 19, 2026, 5:26am

These are great results; one of my biggest concerns with all kinds of stealth addresses was the trade-off between scanning time and privacy.

shrec · May 19, 2026, 7:44am

On GPU, RPA scan throughput is in the same class as the existing Silent Payments scan pipeline.
The extra grinding work does not materially change the throughput profile in this benchmark.