Storing Data On-Chain in an Input: Example + Demo

Hi all,

I just want to give an example of how inputs can be used to store up to ~1650 bytes of data (~10K bytes after the Pay-2-Script CHIP upgrade). This might open up a few neat use-cases (and it doesn’t pollute the UTXO set because the data lives on an input):

  1. Storing Lockscript Parameters that can be recovered when importing Wallets.

    We can, technically, do this with OP_RETURN, but we’re constrained to 220 bytes. This means we would often have to spread the data across several transactions. In comparison, with the 2026 VM upgrade, storing on an input gives us nearly 10KB, which is probably sufficient for most contract payloads.

  2. Account-based Datastore

    To store on an input, you must be able to unlock the UTXO. Assuming the UTXO is locked with a user’s public key, this means that others cannot spam data onto the store (technically, you can protect against this with OP_RETURN by signing the payloads but, given the 220-byte size constraint, it becomes a bit difficult).

  3. Having a permissioned data-store that must satisfy smart-contract conditions.

    To give an example, imagine you had a frontend-only catalogue website that you were trying to keep relatively decentralized by just leveraging Fulcrum servers and IPFS. You wanted anyone to be able to post their item on the home page, but you wanted to protect against spam by incurring a cost (e.g. 0.05 BCH). You could, technically, set up a smart contract that uses an input as the datastore and have the remaining smart-contract code validate that it sends at least 0.05 BCH back to itself.
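To make the third use-case a bit more concrete, here is a rough off-chain simulation in Python of the check such a contract might perform. Everything in it is an assumption for illustration: the `Output` type, the `pays_back_to_contract` name, and the choice to look at one fixed output index are all hypothetical, and a real contract would do the equivalent with introspection opcodes in its own bytecode.

```python
from dataclasses import dataclass

SATS_PER_BCH = 100_000_000
MIN_POSTING_FEE_SATS = 5_000_000  # 0.05 BCH, the example cost from the post


@dataclass(frozen=True)
class Output:
    """Hypothetical, simplified view of a transaction output."""
    locking_bytecode: bytes
    value_sats: int


def pays_back_to_contract(
    contract_locking_bytecode: bytes,
    outputs: list[Output],
    output_index: int = 0,
    min_sats: int = MIN_POSTING_FEE_SATS,
) -> bool:
    """Return True if the output at `output_index` sends at least
    `min_sats` back to the contract's own locking bytecode."""
    if not 0 <= output_index < len(outputs):
        return False
    output = outputs[output_index]
    return (
        output.locking_bytecode == contract_locking_bytecode
        and output.value_sats >= min_sats
    )


# Placeholder bytecode standing in for the contract's real locking bytecode.
contract = bytes.fromhex("a914") + bytes(20) + bytes.fromhex("87")
accepted = pays_back_to_contract(contract, [Output(contract, 5_000_000)])
rejected = pays_back_to_contract(contract, [Output(contract, 4_999_999)])
```

With these placeholder values, `accepted` is true and `rejected` is false, since the second transaction falls one satoshi short of the 0.05 BCH cost.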

One downside to storing on an input is that you need a “Setup Transaction”. But, after that, we should just be able to chain many items off that initial setup transaction, creating a database of sorts.

How we can do this

At its most basic, setting up a datastore that just uses a public key (similar to a P2PK) is pretty trivial. Using CashASM, we can just use a Redeem Script as follows:

OP_DROP // Drop the data from the stack (we ignore it)
<key.public_key> // Push the user's public key onto the stack
OP_CHECKSIG // Check the signature

… and to unlock simply:

<key.schnorr_signature.all_outputs> // Push the signature
<data> // Push the data (e.g. 1500 bytes)
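As a sanity check on the size budget, here is a small Python sketch that assembles the unlocking bytecode for the scripts above and measures it. It assumes the Redeem Script is wrapped in P2SH (so the redeem script itself is pushed last), a 65-byte Schnorr signature (64 bytes plus the sighash type byte) and a 33-byte compressed public key. The function names are hypothetical and the signature and key are zero-filled placeholders, so this only measures sizes; it does not produce a spendable input.

```python
OP_DROP = 0x75
OP_CHECKSIG = 0xAC
MAX_UNLOCKING_BYTES = 1650  # the ~1650-byte figure quoted in the post


def encode_push(data: bytes) -> bytes:
    """Encode `data` as a minimal script push."""
    n = len(data)
    if n == 0:
        return b"\x00"  # OP_0
    if n == 1 and 1 <= data[0] <= 16:
        return bytes([0x50 + data[0]])  # OP_1 .. OP_16
    if n == 1 and data[0] == 0x81:
        return b"\x4f"  # OP_1NEGATE
    if n <= 75:
        return bytes([n]) + data  # direct push, opcode is the length
    if n <= 0xFF:
        return b"\x4c" + bytes([n]) + data  # OP_PUSHDATA1
    if n <= 0xFFFF:
        return b"\x4d" + n.to_bytes(2, "little") + data  # OP_PUSHDATA2
    return b"\x4e" + n.to_bytes(4, "little") + data  # OP_PUSHDATA4


def redeem_script(public_key: bytes) -> bytes:
    """OP_DROP <public_key> OP_CHECKSIG, as in the post."""
    return bytes([OP_DROP]) + encode_push(public_key) + bytes([OP_CHECKSIG])


def unlocking_bytecode(signature: bytes, data: bytes, public_key: bytes) -> bytes:
    """<signature> <data> <redeem_script> (the P2SH wrapping is assumed)."""
    return (
        encode_push(signature)
        + encode_push(data)
        + encode_push(redeem_script(public_key))
    )


# Zero-filled placeholders with realistic lengths.
signature = bytes(65)
public_key = b"\x02" + bytes(32)
data = bytes(1500)

unlocking = unlocking_bytecode(signature, data, public_key)
print(len(unlocking), len(unlocking) <= MAX_UNLOCKING_BYTES)
```

Under these assumptions a 1500-byte payload fits within the ~1650-byte figure, and the overhead of the signature, redeem script and push opcodes leaves roughly 1544 bytes for the data itself.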

I’m not sure we would be able to write this in CashScript as it does depend upon direct Stack Manipulation (but maybe there’s a trick that can be used).

I’ve got a PoC of the above on the CashConnect homepage: https://cashconnect.developers.cash/

(Template itself is here: Draft: Initial commit (!1) · Merge requests · cashconnect-js / cashconnect-js · GitLab, but it’s not very well done and mostly fluffs the fee.)

There are many methods, see this paper: https://ledgerjournal.org/ojs/ledger/article/download/101/93/613

You could probably have a system where data could be partitioned and compute a Merkle root over partitions, then users would just provide a new leaf + new root + proofs that they only changed the 1 leaf.
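One way to read this, as a rough sketch only: the same sibling path is checked against both the old root and the new root, which shows that nothing except the one leaf changed. The Python below assumes single SHA-256 hashing and a power-of-two number of partitions to keep it short. The function names are hypothetical, and an on-chain version would have to do the equivalent in script.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_proof(leaves: list[bytes], index: int) -> tuple[list[bytes], bytes]:
    """Return (sibling path for `leaves[index]`, Merkle root).

    Assumes len(leaves) is a power of two.
    """
    n = len(leaves)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("number of leaves must be a power of two")
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        proof.append(level[index ^ 1])  # sibling of the current node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof, level[0]


def root_from_proof(leaf: bytes, index: int, proof: list[bytes]) -> bytes:
    """Recompute the root from one leaf and its sibling path."""
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node


def verify_single_leaf_update(
    old_root: bytes,
    new_root: bytes,
    old_leaf: bytes,
    new_leaf: bytes,
    index: int,
    proof: list[bytes],
) -> bool:
    """True if swapping old_leaf for new_leaf at `index`, with every
    sibling unchanged, takes old_root to new_root."""
    return (
        root_from_proof(old_leaf, index, proof) == old_root
        and root_from_proof(new_leaf, index, proof) == new_root
    )


partitions = [b"part-0", b"part-1", b"part-2", b"part-3"]
proof, old_root = merkle_proof(partitions, 2)

updated = list(partitions)
updated[2] = b"part-2-v2"
_, new_root = merkle_proof(updated, 2)

ok = verify_single_leaf_update(old_root, new_root, b"part-2", b"part-2-v2", 2, proof)
```

Here `ok` is true. If a second partition were changed as well, the sibling path would no longer lead to the claimed new root and the check would fail.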

My suggested way is to do an anonymous, encrypted backup to both the cloud and, if you have that option, to your own storage. That’s what Flowee Pay does.

Thanks for the link! Really good read, particularly the part on “Sniping UTXOs” - which my example above is susceptible to.

You could probably have a system where data could be partitioned and compute a Merkle root over partitions, then users would just provide a new leaf + new root + proofs that they only changed the 1 leaf.

Haven’t quite wrapped my head around this yet. Is the idea that the lockingBytecode remains the same (for easy lookups of the “account”)? This is one of the drawbacks of the Data Hash Method mentioned - the address changes for each piece of data, so it becomes hard to look up.

My initial thoughts were we could just incorporate a datasig check against the publicKey to keep the address/lockingBytecode consistent and protect against the sniping attack - but I’m not sure that’s the most optimal way to do it yet.

I’ve seen IPFS proposed a lot for this (and looked at OrbitDB - https://orbitdb.org/ - a while ago). I think the big problem is that malicious users/bots could spam a lot of junk data onto any anonymous store so, to protect against that, I think a fee of some kind would probably be necessary.

For example, with OrbitDB, one idea I explored a little bit was that wallets pay an up-front fee that grants them X bytes on a given peer (storage provider). E.g. 10,000 sats = 100KB.

Ultimately though, I ditched this because OrbitDB is (or was) a pure JS library (not TypeScript) and there were no assurances that the peer (storage provider) would actually remain online.

That was my initial thought, too. With checkdatasig you could have your locking script simply require that the pushed data is signed with the committed key.

Didn’t think much about the details, but yeah, you could totally have a constant address and store the Merkle root in an NFT commitment. It would be a simpler variant of this: Enabling .BCH domains similar to .eth with ENS - #17 by bitcoincashautist, where the whole tree would have a single owner.

This is obviously a direct exploit to cover.
But it is quite manageable, honestly. A quite small max file size, only properly signed files, only encrypted loads, and no public index are obvious ways to counter it. You can check the implementation in the Flowee thehub repo if you’re curious.

Personally I’m not a fan of IPFS for backup of money, because there literally is nobody you can hold accountable for the file going poof.
