Research: Compact Position Lists for Set Membership Filtering

I was curious whether it’s possible to AND two Bloom filters (one computed on block elements, one computed on wallet elements) as a way to eliminate a block as NOT containing any transactions of interest. A good elimination rate requires huge filters, but element extraction doesn’t. What if we can combine the best of both worlds?

Position-List Pre-filtering for Bloom Filter Protocols

Abstract

We observe that BIP-37 style Bloom filters with arbitrarily large bit arrays can be represented as sorted position lists, where storage depends only on element count. This enables a two-phase filtering protocol: fast elimination via sorted merge comparison, followed by BIP-37 extraction on surviving blocks. The construction adds a near-free elimination layer using only integer comparisons.

1. Position Lists

A Bloom filter sets k bits per element in a bit array of size m. When m is large and elements are few, most bits are zero—storing the bit positions is more compact than storing the array.

For a set S of n elements with k hash functions h_i: element → [0, 2^64), define the position list:

P(S) = sort( ⋃_{x ∈ S} { h_1(x), …, h_k(x) } )

Storage is k · n · 8 bytes regardless of hash space size. Making m = 2^64 costs nothing extra while reducing collision probability to negligible levels.
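A minimal sketch of the construction (the seeded BLAKE2b here is an illustrative choice of hash family, not something the text prescribes):

```python
import hashlib

def hash64(x: bytes, seed: int) -> int:
    # One of k independent 64-bit hash functions h_i; seeding BLAKE2b
    # via its salt parameter is an illustrative choice.
    h = hashlib.blake2b(x, digest_size=8, salt=seed.to_bytes(8, "little"))
    return int.from_bytes(h.digest(), "little")

def position_list(elements, k=3):
    # P(S): sorted union of the k hash positions of every element.
    return sorted({hash64(e, i) for e in elements for i in range(k)})
```

For n elements this yields at most k · n sorted 64-bit integers, i.e. at most k · n · 8 bytes on the wire.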

2. Two-Phase Filtering

Setup: Server maintains a position list per block. Client sends a position list for their watch set.

Phase 1 — Elimination:

Compare sorted lists via merge. If no position matches, the block contains nothing relevant.

def has_overlap(block_pos, client_pos):
    i, j = 0, 0
    while i < len(block_pos) and j < len(client_pos):
        if block_pos[i] == client_pos[j]:
            return True
        elif block_pos[i] < client_pos[j]:
            i += 1
        else:
            j += 1
    return False

Cost: O(n_b + n_c) integer comparisons. False positive rate is the probability that any of the k · n_b block positions lands in the set of k · n_c client positions:

P_fp = 1 − (1 − k · n_c / 2^64)^(k · n_b) ≈ (k · n_c)(k · n_b) / 2^64

For k = 3, n_b = 160,000, n_c = 1,000:

P_fp ≈ (3,000 × 480,000) / 2^64 ≈ 7.8 × 10^-11
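A quick numeric check of this estimate, using the values from the text:

```python
k, n_b, n_c = 3, 160_000, 1_000

# Probability that at least one of the k*n_b block positions lands
# among the k*n_c client positions in a 2^64 hash space.
p_exact = 1 - (1 - k * n_c / 2**64) ** (k * n_b)

# First-order approximation used above.
p_approx = (k * n_c) * (k * n_b) / 2**64

print(p_approx)  # ≈ 7.8e-11
```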

Most blocks are eliminated here. Only blocks with a matching position proceed to Phase 2.

Phase 2 — Extraction (on match):

This is standard BIP-37: fold client positions into a Bloom filter sized for practical transmission, then test block elements:

def extract_matches(block_elements, client_pos, bloom_size):
    # Fold 64-bit positions into smaller filter
    bloom = bytearray(bloom_size // 8)
    for pos in client_pos:
        bit = pos % bloom_size
        bloom[bit // 8] |= 1 << (bit % 8)
    
    # BIP-37 style: server tests each element
    return [e for e in block_elements if test_bloom(bloom, e)]

The same client position list serves both phases—Phase 1 uses the raw 64-bit positions, Phase 2 folds them to a practical size.

3. Relationship to BIP-37

BIP-37 has the server test every transaction against a client-provided Bloom filter. This works but requires O(k · n_b) hash operations per block unconditionally.

| Aspect | BIP-37 | This construction |
| --- | --- | --- |
| Elimination | None | Position list merge |
| Extraction | Bloom filter | Same (folded positions) |
| Server cost per block | O(k · n_b) hashes | O(n_b + n_c) comparisons |
| Server cost on match | | O(k · n_b) hashes |
| False positive source | Bloom filter size | Two stages: near-zero elimination, standard Bloom extraction |

The contribution is Phase 1 elimination, which makes BIP-37’s expensive test rare rather than universal. On a chain where clients match perhaps 1 in 10,000 blocks, almost all work becomes simple integer comparison.

4. Properties

| Property | Value |
| --- | --- |
| Server storage per block | k · n_b · 8 bytes |
| Client subscription size | k · n_c · 8 bytes |
| Elimination cost | O(n_b + n_c) comparisons |
| Extraction cost | O(k · n_b) hash operations |
| False positive (elimination) | ≈ 10^-10 for typical parameters |
| False positive (extraction) | Standard Bloom rate for folded size |

5. Conclusion

Position lists in a large hash space provide cheap elimination with near-perfect accuracy. When elimination fails, the same positions fold to a standard BIP-37 Bloom filter for element extraction. The construction requires no novel primitives—only sorted lists, merge comparison, and modular arithmetic. It can be viewed as adding an efficient pre-filter to the existing BIP-37 protocol.


One nice feature this enables: delegation.

6. Delegation

The two-phase structure enables a useful delegation pattern. Consider three parties:

  • A (Archive): Maintains full blocks and publishes position lists per block
  • B (Broker): An intermediary with bandwidth but not full trust
  • C (Client): Wants transactions matching their watch set, has limited bandwidth

Protocol:

  1. C sends their position list to B (once, or on update)
  2. B monitors A’s published block position lists
  3. For each block, B performs Phase 1 elimination locally
  4. On match, B downloads the full block from A
  5. B performs Phase 2 extraction using C’s (folded) positions
  6. B sends only matching elements to C

What B learns: The same probabilistic fingerprint that BIP-37 reveals—B can test addresses against C’s filter. This is unavoidable for any scheme where B filters on C’s behalf.

What B cannot do: Claim false negatives. If B returns no matches for a block, C can independently verify by obtaining that block’s position list and checking Phase 1. This requires only n_b · 8 bytes, not the full block.

Bandwidth analysis:

| Path | Per-block cost | Notes |
| --- | --- | --- |
| A → B | n_b · 8 bytes | Position list, every block |
| A → B | Full block | Only on Phase 1 match (~10^-11 of blocks) |
| B → C | Matching elements | Only actual matches |
| C → B | n_c · 8 bytes | Once per subscription update |

For a client watching 1,000 addresses across a chain of 800,000 blocks where only 500 blocks contain relevant transactions:

  • Without delegation: C downloads position lists for all blocks
  • With delegation: C sends 8 KB once, receives ~500 small responses

The broker B bears the monitoring cost but can amortize it across many clients—each additional client requires only one extra merge operation per block.

Trust model: C trusts B for availability (B might not respond) but not integrity (B cannot forge matches or hide them undetectably). This matches the trust model of light client protocols generally.

This writeup turned out weird because of the path of discovery. The path was:

  1. Start with Bloom filters (BIP-37) and Neutrino (BIP-157)
  2. To eliminate negatives cheaply, can we somehow check if there’s intersection?
  3. Filters need to match in size so we can AND the bits for testing, and they need to be sparse enough to have a good elimination rate.
  4. Notice sparse Bloom filters waste space → store positions instead
  5. Notice we don’t need k > 1 in large hash space → simplify to k = 1
  6. Notice we don’t need folding to produce a smaller filter for element extraction → just do direct lookup
  7. Realize this is just… hashing things and comparing the hashes

We (me & my buddy Claude) reinvented one of the oldest tricks in database systems by starting from the wrong end.


The actual contribution:

BIP-37: Client sends filter, server tests every element against it.

BIP-157 (Neutrino): Server sends filter per block, client tests their elements against it.

Both do O(n) membership tests. Neither does intersection.

This construction: Both sides have filters (position lists). Check intersection first. Only do membership tests on match.

The insight is that both parties having a filter enables a cheap intersection test that neither BIP-37 nor BIP-157 exploits.


Why didn’t Neutrino or BIP-37 do this?

Neutrino’s design assumes the client downloads the block filter and tests locally. The client already has their own addresses—no need to represent them as a filter, just test directly.

The intersection approach only helps when:

  1. A server (or broker) is filtering on behalf of a client, AND
  2. You want to avoid per-element testing in the common (no match) case

BIP-37 has the server do the work but doesn’t give the server the option to precompute a block filter to intersect against and cheaply eliminate no-match blocks.

Neutrino has block filters but pushes the work to the client.

This construction combines: server-side filtering (like BIP-37) + precomputed block filters (like Neutrino) + intersection for cheap elimination.


The construction uses truncated hashes for set comparison, as in BIP-152 (Compact Block Relay). Where BIP-152 computes set difference to minimize transaction transfer between full nodes, we compute set intersection to identify relevant blocks for light clients.

What are Bloom filters even good for?

Membership testing when you’re space-constrained and can tolerate false positives.

The key property: constant size regardless of what you’re testing against.

A Bloom filter built from n elements is O(n) bits. But you can test any element against it—elements you’ve never seen, elements from an unbounded universe. The filter doesn’t grow with the query set.


Where this matters:

Caching/deduplication: “Have I seen this URL before?” You have a stream of billions of URLs. You can’t store them all. A Bloom filter gives you a compact summary that answers “probably yes” or “definitely no.”

Database query optimization: Before doing an expensive disk lookup, check a Bloom filter. If it says “definitely not in this block,” skip the I/O.

Network routing: “Should I forward this to peer X?” Check if X’s interest filter matches, without storing X’s full interest set.

In all these cases:

  • The query stream is unbounded or unknown
  • You need sublinear space
  • False positives are tolerable (you just do extra work)
  • False negatives are not (you’d miss data)

Why Bloom filters don’t fit block filtering well:

In our setting, both sets are known and bounded:

  • Server knows block elements
  • Client knows watch addresses

Neither side has an unbounded query stream. Both sides can afford to store their full set (as hashes). So Bloom filters’ main advantage—sublinear space for unbounded queries—doesn’t apply.

BIP-37 uses Bloom filters to compress the client’s watch set for transmission. But the space savings (~10 bits/element vs 64 bits/element) come at the cost of:

  • Higher false positive rates
  • No efficient intersection operation
  • More complex implementation (k hash functions, bit manipulation)

The tradeoff:

| | Bloom filter | Hash set |
| --- | --- | --- |
| Space | ~10 bits/element | 64 bits/element |
| Membership test | O(k) | O(1) or O(log n) |
| Intersection | O(m) bit ops, same-size only | O(n_1 + n_2) merge |
| False positive rate | ~1% (tunable) | ~10^-16 (fixed) |
| Good for | Unbounded query stream | Bounded set comparison |

Bloom filters are the right choice when you’re testing many unknown elements against a fixed set and space is tight.

Hash sets are the right choice when you’re comparing two known sets and operations matter more than space.

BIP-37 picked Bloom filters for the bandwidth savings. But the operational cost—no cheap intersection—made it unsuitable for scalable server-side filtering.

With all that out of the way, we arrive at an ideal indexer-free architecture that scales!


Position-Keyed Pub/Sub for Light Clients

Abstract

We describe a simple architecture for light wallet services: clients subscribe by sending hashes of their addresses, and a broker routes matching transactions using an index from position to subscriber set. Per-block work is O(n_b) hash lookups regardless of client count. The construction scales to 10^12 subscribed positions before false routing becomes significant, supports 10^6+ concurrent clients, and requires no novel primitives—only hashing and hash table lookup.

1. The Problem

Light clients want to learn about transactions relevant to their addresses without downloading the full chain.

Existing approaches:

BIP-37 (Bloom filters): Client sends a Bloom filter to a full node. Server tests every transaction against every client’s filter. Cost: O(c · k · n_b) per block for c clients. Doesn’t scale.

BIP-157/158 (Neutrino): Server computes a compact filter per block. Client downloads filters and tests locally. On match, client downloads the full block to extract relevant transactions. Scales server-side but pushes substantial bandwidth and CPU to clients.

Electrum-style indexing: Server maintains full address-to-transaction index. Scales but requires substantial server infrastructure and full chain indexing.

We want: server cost independent of client count, minimal client work, simple implementation.

2. Construction

Client subscription:

Client hashes each address to a 64-bit position:

def subscribe(addresses):
    return [hash64(addr) for addr in addresses]

Client sends positions to broker once. Cost: 8 n_c bytes.
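hash64 must be the same function on both sides; a sketch using truncated SHA-256, one of several reasonable choices:

```python
import hashlib

def hash64(addr: bytes) -> int:
    # Truncated SHA-256 as an illustrative deterministic 64-bit hash;
    # SipHash or xxHash would serve equally well if both sides agree.
    return int.from_bytes(hashlib.sha256(addr).digest()[:8], "big")
```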

Broker index:

Broker maintains a reverse index from position to subscribers:

from collections import defaultdict

index: dict[int, set[Client]] = defaultdict(set)

def add_subscription(client, positions):
    for pos in positions:
        index[pos].add(client)

def remove_subscription(client, positions):
    for pos in positions:
        index[pos].discard(client)

Per-block routing:

For each block element, hash it and check the index:

def process_block(block):
    for element in block.elements:
        pos = hash64(element.address)
        if pos in index:
            for client in index[pos]:
                client.notify(element)

Cost: O(n_b) hash lookups per block, independent of client count.

3. Scaling Analysis

Broker work per block:

| Operation | Cost |
| --- | --- |
| Hash each element | O(n_b) |
| Index lookup | O(n_b) expected |
| Notify clients | O(matches) |

Total: O(n_b) plus notifications. Client count doesn’t appear.

Memory for index:

c clients with n_c addresses each: c · n_c index entries.

For 10^6 clients × 100 addresses = 10^8 entries.

At ~50 bytes per entry (position + client pointer + overhead): ~5 GB.

Fits in RAM on a modest server.

False routing rate:

A block element routes to the wrong client if its hash collides with a subscribed position:

P(false route) = c · n_c / 2^64

For 10^8 positions: P ≈ 5 × 10^-12 per element.

Expected false routes per block (n_b = 4,000): ≈ 2 × 10^-8.

Negligible.
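The arithmetic, checked numerically:

```python
c, n_c, n_b = 10**6, 100, 4_000

p_element = c * n_c / 2**64   # per-element false-route probability
per_block = n_b * p_element   # expected false routes per block

print(p_element)  # ≈ 5.4e-12
print(per_block)  # ≈ 2.2e-8
```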

Limits:

| Total positions | False routes per block |
| --- | --- |
| 10^8 | 10^-8 |
| 10^10 | 10^-6 |
| 10^12 | 10^-4 |
| 10^14 | 10^-2 |

Up to 10^12 positions (e.g., 1 billion clients × 1,000 addresses) before false routing exceeds 0.01%.

4. Comparison

| Approach | Server work/block | Client bandwidth | On-match cost | Scales to |
| --- | --- | --- | --- | --- |
| BIP-37 | O(c · k · n_b) | Matching txs only | | ~100 clients |
| BIP-157/158 | O(n_b) once | Every filter + full block on match | Download & parse ~1.5 MB | ∞ (work on client) |
| This | O(n_b) | Matching txs only | | 10^6+ clients |

BIP-37 server cost grows linearly with clients—impractical at scale, which is why most nodes disable it.

BIP-157/158 pushes work to clients. For each block, client downloads and tests a filter (~20 KB). On match, client downloads the full block (~1.5 MB) and parses it to extract relevant transactions.

For a wallet matching 1 in 1,000 blocks over 800,000 blocks:

| Cost | BIP-157/158 | This |
| --- | --- | --- |
| Filter downloads | 800K × 20 KB = 16 GB | 0 |
| Block downloads | 800 × 1.5 MB = 1.2 GB | 0 |
| Data received | Parse 800 full blocks | 800 txs directly |

This construction: constant server work per block, clients receive only matching transactions, no filter downloads, no block downloads.

5. Delegation and Trust

The architecture naturally supports delegation:


What broker learns:

Client positions (hashes of addresses). This is a probabilistic fingerprint—broker can test candidate addresses against it. Same privacy model as BIP-37.

What broker can do:

  • Withhold transactions (lie by omission)
  • Learn which addresses clients watch

What broker cannot do:

  • Forge transactions (clients verify signatures)
  • Claim false negatives undetectably (if proofs are provided)

Mitigation for omission attacks:

Broker provides merkle proofs for each transaction. Client can verify inclusion in block.

For stronger guarantees: client queries multiple independent brokers, or samples random blocks to audit broker honesty.
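A minimal inclusion check, assuming Bitcoin-style double-SHA-256 merkle trees and a proof expressed as (sibling hash, sibling-is-left) pairs from leaf to root:

```python
import hashlib

def sha256d(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def verify_inclusion(tx_hash: bytes, merkle_root: bytes, proof) -> bool:
    # Walk from the leaf toward the root, hashing with each sibling
    # in the order the proof dictates.
    h = tx_hash
    for sibling, sibling_is_left in proof:
        h = sha256d(sibling + h) if sibling_is_left else sha256d(h + sibling)
    return h == merkle_root
```

The client checks each forwarded transaction against a block header obtained independently of the broker.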

6. Privacy Extensions

Multiple brokers:

Client splits addresses across k brokers. Each broker sees only a fraction of the watch set.

def split_subscription(addresses, brokers):
    for i, addr in enumerate(addresses):
        broker = brokers[i % len(brokers)]
        broker.subscribe(hash64(addr))

No single broker learns full address set.

Subscription rotation:

Client periodically unsubscribes and resubscribes from different IP, Tor circuit, etc. Broker cannot link identities across rotations (unless address set is unique enough to fingerprint).

Dummy subscriptions:

Client subscribes to random positions alongside real ones. Broker cannot distinguish. False positives go to client who discards them.

7. Implementation Notes

Hash function:

Any 64-bit hash with good distribution. SipHash, xxHash, or truncated SHA-256 all work.

Must be deterministic: client and broker must agree on hash(address).

Element definition:

What gets hashed? Options:

  • Output script (most common)
  • Output script hash (P2SH, P2WSH)
  • Public key (P2PK)
  • Any combination

Define canonically for the chain.

Subscription updates:

Clients may add addresses (new receive addresses) or remove them (spent, rotated).

# Incremental update
broker.add_positions(client, new_positions)
broker.remove_positions(client, old_positions)

No need to resend full subscription.

Persistence:

Broker can persist index to disk for restart. Simple key-value store: position → list of client IDs.

Client subscriptions can be replayed on reconnect.

8. Relationship to Prior Work

BIP-37 (2012): Introduced client-side Bloom filters for server-side filtering. Recognized as problematic due to DoS potential and O(c) server scaling.

BIP-157/158 (2017): Neutrino. Server-side filter construction, client-side testing. Elegant but requires clients to download and process every block’s filter, plus download full blocks on match.

Electrum protocol: Full address indexing on server. Works but requires substantial infrastructure (full index of all addresses ever used).

Compact block relay, BIP-152: Uses 48-bit truncated hashes for set reconciliation between peers. Similar hash-based set comparison, different application (block propagation vs. light client filtering).

This construction combines ideas:

  • Hash-based element identification (like compact blocks)
  • Server-side filtering (like BIP-37)
  • O(n_b) per-block work (like BIP-157 server)
  • Zero client work (like BIP-37 client)

The key observation: a reverse index from position to client turns filtering into routing.

9. Properties

| Property | Value |
| --- | --- |
| Client subscription | 8 · n_c bytes |
| Broker memory | ~50 bytes per subscribed position |
| Per-block work | O(n_b) hash lookups |
| Per-match work | O(1) routing |
| False routing rate | ≈ c · n_c / 2^64 per element |
| Max positions | ~10^12 before FP > 0.01% |

10. Conclusion

Light client filtering is a routing problem, not a filtering problem. Clients subscribe to hashes of their addresses. A broker indexes subscriptions and routes matching transactions. Per-block cost is constant in client count.

The construction is trivial—hash addresses, build an index, do lookups—but this simplicity is the point. No Bloom filters, no compact filters, no per-client work. Just hashing and hash tables.

The hardest part is already solved: hash tables scale.

Appendix A: Offline Client Support

Online clients receive matches via pub/sub. For offline clients, the broker maintains a lightweight cache: just the list of block heights where matches occurred.

Design:

from collections import defaultdict

# Broker state
index: dict[int, set[Client]]                        # position -> clients
pending: dict[Client, list[int]] = defaultdict(list) # client -> block heights

def process_block(block):
    matched_clients = set()
    for element in block.elements:
        pos = hash64(element.address)
        if pos in index:
            for client in index[pos]:
                if client.online:
                    client.notify(element)
                else:
                    matched_clients.add(client)
    
    # Store only the block height
    for client in matched_clients:
        pending[client].append(block.height)

def on_reconnect(client):
    for height in pending[client]:
        block = get_block(height)
        for element in block.elements:
            if hash64(element.address) in client.positions:
                client.notify(element)
    pending[client].clear()

Storage cost:

A block height is 4 bytes. For a client matching 1 in 1,000 blocks:

| Offline duration | Matched blocks | Storage |
| --- | --- | --- |
| 1 day | ~0.14 | ~1 byte |
| 1 month | ~4 | ~16 bytes |
| 1 year | ~52 | ~208 bytes |
| 10 years | ~520 | ~2 KB |

For 1 million clients with 10% offline averaging 1 month: ~1.6 MB total.
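A quick check of that total, taking ~4 cached heights per offline client (the one-month figure above):

```python
clients = 10**6
offline_clients = clients // 10   # 10% of clients offline
heights_per_client = 4            # ~1 month at 1 match per 1,000 blocks
bytes_per_height = 4

total_bytes = offline_clients * heights_per_client * bytes_per_height
print(total_bytes)  # 1600000, i.e. ~1.6 MB
```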

Comparison with alternatives:

| Approach | Storage per client | On reconnect |
| --- | --- | --- |
| Queue transactions | ~KB to MB | Send queue |
| Client scans position lists | None | Client downloads ~35 MB for 30 days |
| Store block heights | ~16 bytes for 30 days | Broker extracts from listed blocks |

Edge cases:

Long offline periods: Storage remains small. Cap the list if needed and fall back to position list scanning for oldest blocks.

Client never reconnects: Garbage collect subscriptions and pending lists after inactivity threshold (e.g., 30 days).

High-volume wallets: Clients matching many blocks (exchanges, pools) accumulate longer lists. Consider tiered service or dedicated infrastructure.