CAAS: Continuous Audit as a Service — executable security evidence for open-source crypto infrastructure

Hi everyone,

I would like to share the audit model I have been building around UltrafastSecp256k1.

The model is called CAAS — Continuous Audit as a Service.

The idea is simple:

Security claims should not live only in a static PDF. They should become reproducible, executable evidence.

This is not an argument against human review. Human review is valuable. The problem is treating a one-time external audit report as the final source of truth for a living open-source cryptographic project.

A codebase changes. A PDF does not.

CAAS is my attempt to make audit evidence continuous, public, reproducible, and extensible.

Motivation

In many crypto projects, the normal audit model looks like this:

  1. a project reaches a snapshot;
  2. an external auditor reviews that snapshot;
  3. a PDF report is published;
  4. development continues;
  5. the audited snapshot becomes old;
  6. future bugs or regressions may appear outside the audited state.

This model can be useful, but it has limitations.

It is expensive.
It is static.
It is often difficult for third parties to reproduce.
It can become an authority signal rather than an evidence system.
It can exclude independent open-source developers who cannot pay for large one-time audits.

My view is not “do not audit.”

My view is:

Audits should become executable, reproducible, permanent, and cumulative.

Core principle

The core rule of CAAS is:

Every security claim must map to a test, and every test must map to evidence.

For example:

  • “This path is constant-time” → CT analysis, timing tests, secret-taint checks
  • “Invalid public keys are rejected” → invalid-input regression tests
  • “This matches the reference implementation” → differential tests
  • “This historical exploit class is covered” → exploit PoC regression
  • “This backend matches another backend” → backend parity checks
  • “This bug cannot return” → permanent regression test

A claim without a test is not evidence. It is only an intention.
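
To make the mapping concrete, here is a minimal sketch of a claim-to-evidence registry with a check that flags unbacked claims. The file name, claim ids, paths, and schema are all invented for illustration; the project's actual traceability format may differ.

    # claims_map.py -- hypothetical claim-to-evidence registry (illustrative only)
    CLAIMS = {
        "scalar_mul_constant_time": {
            "tests": ["tests/ct/test_scalar_mul_timing.py"],
            "artifacts": ["artifacts/ct/scalar_mul_timing.json"],
        },
        "invalid_pubkey_rejected": {
            "tests": ["tests/regression/test_invalid_pubkey.py"],
            "artifacts": ["artifacts/regression/invalid_pubkey.json"],
        },
    }

    def unbacked_claims(claims):
        """Return claim ids that map to no test: intentions, not evidence."""
        return [name for name, entry in claims.items() if not entry["tests"]]

    if __name__ == "__main__":
        missing = unbacked_claims(CLAIMS)
        if missing:
            raise SystemExit(f"claims without tests: {missing}")
        print("every claim maps to at least one test")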

Bug handling philosophy

CAAS does not claim that the code is bug-free.

That would not be credible for any serious cryptographic library.

The actual security posture is different:

Bugs can exist. When they are found, they must become reproducible, fixed, documented, and permanently added to the audit corpus.

The lifecycle is:

bug / exploit / missing assumption
→ reproduce
→ write test or PoC
→ fix
→ add regression coverage
→ document
→ run continuously

This means a bug report is not treated as a reputation failure. It becomes audit memory.
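
As an illustration of the "write test → add regression coverage" steps, here is a self-contained sketch of a permanent regression test for one classic bug class: accepting an off-curve public key. The parser here is a hypothetical stand-in, not the library's API; only the curve constants are real.

    # test_invalid_pubkey_regression.py -- hypothetical regression test (illustrative)
    P = 2**256 - 2**32 - 977  # secp256k1 field prime

    def is_on_curve(x, y):
        """Check y^2 = x^3 + 7 (mod p); anything else must be rejected."""
        return (y * y - (x * x * x + 7)) % P == 0

    def parse_pubkey(x, y):
        """Hypothetical stand-in for the library's public-key parser."""
        if not is_on_curve(x, y):
            raise ValueError("point not on secp256k1 curve")
        return (x, y)

    def test_off_curve_point_is_rejected():
        # Known-good point: the secp256k1 generator G.
        gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
        gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
        parse_pubkey(gx, gy)  # must parse
        try:
            parse_pubkey(gx, (gy + 1) % P)  # corrupted y: off the curve
        except ValueError:
            return  # rejected, as required
        raise AssertionError("off-curve point was accepted")

    if __name__ == "__main__":
        test_off_curve_point_is_rejected()
        print("invalid-input regression passed")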

A useful summary is:

Bug-free software is not a credible claim. Public, reproducible bug handling is.

What CAAS currently includes

In UltrafastSecp256k1, CAAS includes multiple layers:

  • unit tests
  • integration tests
  • exploit-style regression tests
  • differential tests against reference implementations
  • Wycheproof coverage
  • invalid-input tests
  • fuzzing
  • sanitizer builds
  • static analysis
  • constant-time checks
  • backend parity checks
  • C ABI negative tests
  • benchmark evidence
  • source graph analysis
  • audit traceability
  • bug capsule generation
  • CI gates
  • local audit dashboard

The important point is that these tools are not used as marketing labels. They are used to produce artifacts that can be inspected, reproduced, challenged, and extended.
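
To illustrate the CI-gate layer, a gate can be as small as a script that fails the build when a required artifact is missing or reports failure. The artifact names and JSON schema below are invented for illustration:

    # gate_check.py -- hypothetical CI gate (artifact names are illustrative)
    import json
    import pathlib
    import sys

    REQUIRED = [
        "artifacts/unit_tests.json",
        "artifacts/wycheproof.json",
        "artifacts/ct_checks.json",
    ]

    def main():
        missing = [p for p in REQUIRED if not pathlib.Path(p).exists()]
        if missing:
            print("gate FAILED, missing artifacts:", *missing, sep="\n  ")
            sys.exit(1)
        for p in REQUIRED:
            report = json.loads(pathlib.Path(p).read_text())
            if report.get("status") != "pass":
                print(f"gate FAILED: {p} reports {report.get('status')!r}")
                sys.exit(1)
        print("all gates passed")

    if __name__ == "__main__":
        main()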

Exploit corpus and regression memory

A major part of CAAS is the exploit corpus.

When a known attack class, CVE, paper, or historical implementation bug is relevant, the goal is to turn it into a permanent regression test.

If the library is vulnerable, the test fails first, the bug is fixed, and the test remains forever.

If the library is not vulnerable, the PoC still becomes evidence that the attack class was considered and tested.

This changes the role of security research.

Instead of being only a report, research becomes executable memory.
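
A classic example of this is ECDSA nonce reuse. The sketch below turns the key-recovery algebra into a dependency-free test; signing is simulated at the scalar level (r stands in for the x-coordinate of k·G), so it demonstrates the exploit class rather than reproducing the project's actual PoC:

    # poc_nonce_reuse.py -- hypothetical exploit PoC regression (illustrative)
    N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # curve order

    def sign(z, d, k, r):
        """ECDSA signing equation: s = k^-1 * (z + r*d) mod n."""
        return (pow(k, -1, N) * (z + r * d)) % N

    def recover_key(z1, s1, z2, s2, r):
        """Recover the private key d from two signatures sharing a nonce."""
        k = ((z1 - z2) * pow(s1 - s2, -1, N)) % N
        return ((s1 * k - z1) * pow(r, -1, N)) % N

    def test_nonce_reuse_recovers_key():
        d, k, r = 0xC0FFEE, 0x123456789, 0xABCDEF  # toy key, nonce, stand-in r
        z1, z2 = 111, 222                          # two distinct message hashes
        s1 = sign(z1, d, k, r)
        s2 = sign(z2, d, k, r)  # same k reused: the fatal mistake
        assert recover_key(z1, s1, z2, s2, r) == d

    if __name__ == "__main__":
        test_nonce_reuse_recovers_key()
        print("nonce-reuse PoC: private key recovered as expected")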

Source graph and traceability

CAAS also includes a source-graph layer.

The goal is to answer questions like:

  • Which functions are covered by tests?
  • Which functions are high-risk?
  • Which functions are secret-bearing?
  • Which files have audit gaps?
  • Which backends are missing parity?
  • Which claims map to which artifacts?
  • Which bugs became regression tests?

This matters because large projects cannot rely only on human memory.

Security review should be queryable.
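
As a sketch of what "queryable" can mean in practice, here is a toy query over a source-graph export. The JSON schema is invented for illustration:

    # query_graph.py -- hypothetical query over a source-graph export
    import json

    # Invented schema: each function records risk tags and covering tests.
    GRAPH = json.loads("""
    {
      "functions": [
        {"name": "scalar_mul", "secret_bearing": true,  "tests": ["t_ct_mul"]},
        {"name": "parse_sig",  "secret_bearing": false, "tests": []},
        {"name": "ecdh",       "secret_bearing": true,  "tests": []}
      ]
    }
    """)

    # Query: secret-bearing functions with no covering test = audit gaps.
    gaps = [f["name"] for f in GRAPH["functions"]
            if f["secret_bearing"] and not f["tests"]]
    print("secret-bearing functions without tests:", gaps or "none")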

Local CAAS dashboard

I also added a local CAAS web dashboard.

After the audit pipeline runs and artifacts are collected, a Python script can launch a local web panel. The dashboard visualizes the audit state in HTML.

It can show:

  • gate status
  • collected artifacts
  • exploit coverage
  • traceability data
  • benchmark evidence
  • bug capsules
  • known gaps
  • provenance metadata
  • audit summaries

This is important because generating evidence is not enough. Evidence must also be inspectable.
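
For illustration, the smallest possible version of such a panel is a script that serves the collected artifacts as an HTML page. This toy sketch assumes a hypothetical artifacts/ directory and is not the project's actual dashboard:

    # serve_dashboard.py -- toy sketch of a local audit panel (illustrative)
    import http.server
    import pathlib

    ARTIFACT_DIR = pathlib.Path("artifacts")  # hypothetical layout

    class Dashboard(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            # List every collected JSON artifact with its size.
            rows = "".join(
                f"<li>{p.name} ({p.stat().st_size} bytes)</li>"
                for p in sorted(ARTIFACT_DIR.glob("*.json"))
            )
            body = f"<h1>CAAS artifacts</h1><ul>{rows}</ul>".encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        print("serving on http://127.0.0.1:8000")
        http.server.HTTPServer(("127.0.0.1", 8000), Dashboard).serve_forever()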

The goal is:

executable evidence + reviewable evidence.

Runtime vs audit dependencies

One common misunderstanding is that CAAS makes the production library heavy.

It does not.

There are two separate layers:

Production engine:

  • lightweight runtime
  • static/shared library
  • C ABI
  • no required external runtime audit tooling

CAAS:

  • Python scripts
  • CI tools
  • analyzers
  • fuzzing
  • formal/spec tools
  • dashboard
  • reports

CAAS dependencies belong to the development and audit pipeline. They are not part of the production runtime.

A wallet, node, service, or downstream project does not need to ship the CAAS toolchain just to use the engine.

Why this matters for BCH

Bitcoin Cash has a strong practical focus: usable payments, low fees, local control, and applications that ordinary users can actually use.

For that kind of ecosystem, open-source security should not depend only on expensive institutional trust signals.

A smaller project, an independent developer, or a wallet team should be able to say:

Here are the claims.
Here are the tests.
Here are the artifacts.
Here are the known limitations.
Here is how to reproduce the evidence.
Here is how to add a missing test.

That does not replace expert review. It makes expert review more effective.

A human reviewer should be able to say:

This assumption is missing.

And then that assumption should become a test, a documented limitation, or a CI gate.

What CAAS is not

CAAS is not a magic guarantee.

It does not claim:

  • no bugs exist;
  • external review is useless;
  • every future attack is known;
  • physical side channels are solved;
  • all platforms have identical guarantees;
  • a project should never receive a formal third-party audit.

The honest claim is narrower:

Here is the current evidence. You can reproduce it, challenge it, and extend it.

Static audit vs living audit corpus

A static audit asks:

Was this snapshot reviewed?

A living audit corpus asks:

What claims are currently covered?
What changed?
What regressed?
Which attack classes are tested?
Which limitations are known?
Which gaps remain?
Can I reproduce the evidence?

Both models can coexist.

But for open-source crypto infrastructure, I believe the second model is essential.

A PDF can certify a moment.

CAAS tries to defend a process.

Current status

The current CAAS work around UltrafastSecp256k1 includes:

  • 207 exploit tests
  • 414 audit run entries
  • dedicated Wycheproof CI
  • CAAS Stage 2e active
  • Bitcoin Core compatibility testing
  • 693/693 Bitcoin Core tests passing
  • benchmark evidence on real Bitcoin Core workloads
  • local HTML dashboard for audit artifact inspection

The project is still evolving, and I am actively adding missing tests, improving documentation, and importing useful reference tests from Bitcoin Core / libsecp256k1 where applicable.

What I am looking for

I would appreciate adversarial feedback from the BCH community.

Useful feedback includes:

  • missing historical secp256k1 / ECDSA / Schnorr bugs;
  • missing exploit classes;
  • missing Wycheproof or reference vectors;
  • invalid-input cases that should be covered;
  • benchmark methodology objections;
  • constant-time assumptions that need stronger evidence;
  • unclear security claims;
  • documentation gaps;
  • BCH-specific cryptographic workflows that should be tested;
  • reusable-address / stealth-address / SRPA-related cases that need coverage.

If a finding is valid, my goal is to turn it into one of:

  • a regression test;
  • an exploit PoC;
  • a differential test;
  • a benchmark correction;
  • a documented limitation;
  • a CI gate;
  • a code fix.

Repository

Project:

The goal is not to ask anyone to trust the project blindly.

The goal is to make the claims reproducible, make bugs permanent regressions, and make review continuous.

Do not trust the maintainer.
Do not trust a PDF blindly.
Reproduce the evidence.