I was curious about whether it’s possible to AND two Bloom filters (one computed on block elements, one computed on wallet elements) as a way to eliminate a block as NOT having any transactions of interest. But to have a good elimination rate we need huge filters. But we don’t need huge filters for element extraction. What if we can combine the best of both worlds?
Position-List Pre-filtering for Bloom Filter Protocols
Abstract
We observe that BIP-37 style Bloom filters with arbitrarily large bit arrays can be represented as sorted position lists, where storage depends only on element count. This enables a two-phase filtering protocol: fast elimination via sorted merge comparison, followed by BIP-37 extraction on surviving blocks. The construction adds a near-free elimination layer using only integer comparisons.
1. Position Lists
A Bloom filter sets k bits per element in a bit array of size m. When m is large and elements are few, most bits are zero—storing the bit positions is more compact than storing the array.
For a set S of n elements with k hash functions hi: element → [0, 264), define the position list:
P(S) = sort( ⋃x ∈ S {h1(x), …, hk(x)} )
Storage is k · n · 8 bytes regardless of hash space size. Making m = 264 costs nothing extra while reducing collision probability to negligible levels.
2. Two-Phase Filtering
Setup: Server maintains a position list per block. Client sends a position list for their watch set.
Phase 1 — Elimination:
Compare sorted lists via merge. If no position matches, the block contains nothing relevant.
def has_overlap(block_pos, client_pos):
i, j = 0, 0
while i < len(block_pos) and j < len(client_pos):
if block_pos[i] == client_pos[j]:
return True
elif block_pos[i] < client_pos[j]:
i += 1
else:
j += 1
return False
Cost: O(nb + nc) integer comparisons. False positive rate is the probability that any of k · nb block positions lands in the set of k · nc client positions:
Pfp = 1 − (1 − k · nc / 264)k · nb ≈ (k · nc)(k · nb) / 264
For k = 3, nb = 160,000, nc = 1,000:
Pfp ≈ (3,000 × 480,000) / 264 ≈ 7.8 × 10−11
Most blocks are eliminated here. Only blocks with a matching position proceed to Phase 2.
Phase 2 — Extraction (on match):
This is standard BIP-37: fold client positions into a Bloom filter sized for practical transmission, then test block elements:
def extract_matches(block_elements, client_pos, bloom_size):
# Fold 64-bit positions into smaller filter
bloom = bytearray(bloom_size // 8)
for pos in client_pos:
bit = pos % bloom_size
bloom[bit // 8] |= 1 << (bit % 8)
# BIP-37 style: server tests each element
return [e for e in block_elements if test_bloom(bloom, e)]
The same client position list serves both phases—Phase 1 uses the raw 64-bit positions, Phase 2 folds them to a practical size.
3. Relationship to BIP-37
BIP-37 has the server test every transaction against a client-provided Bloom filter. This works but requires O(k · nb) hash operations per block unconditionally.
| Aspect | BIP-37 | This construction |
|---|---|---|
| Elimination | None | Position list merge |
| Extraction | Bloom filter | Same (folded positions) |
| Server cost per block | O(k · nb) hashes | O(nb + nc) comparisons |
| Server cost on match | — | O(k · nb) hashes |
| False positive source | Bloom filter size | Two stages: near-zero elimination, standard Bloom extraction |
The contribution is Phase 1 elimination, which makes BIP-37’s expensive test rare rather than universal. On a chain where clients match perhaps 1 in 10,000 blocks, almost all work becomes simple integer comparison.
4. Properties
| Property | Value |
|---|---|
| Server storage per block | k · nb · 8 bytes |
| Client subscription size | k · nc · 8 bytes |
| Elimination cost | O(nb + nc) comparisons |
| Extraction cost | O(k · nb) hash operations |
| False positive (elimination) | ≈ 10−10 for typical parameters |
| False positive (extraction) | Standard Bloom rate for folded size |
5. Conclusion
Position lists in a large hash space provide cheap elimination with near-perfect accuracy. When elimination fails, the same positions fold to a standard BIP-37 Bloom filter for element extraction. The construction requires no novel primitives—only sorted lists, merge comparison, and modular arithmetic. It can be viewed as adding an efficient pre-filter to the existing BIP-37 protocol.


