Payment Protocol

This is in relation to the Flowee Pay research project (here).

The payment protocol is a way for two wallets to talk to each other and agree on an actual payment from one to the other. The outcome is that the receiver actually has the transaction as a file, and it has probably also been sent to the network for mining.

This inter-wallet conversation brings a lot of benefits that improve security and usability. We know these benefits can be had, so let’s make them our goal:

  • Merchant gets notified about progress in steps instead of one “done”.
    • Makes social engineering near impossible.
  • Payments are checked by merchant before sending to miners.
    • Risky transactions are either rejected or a higher fee is demanded.
  • Double Spend risk goes down considerably.
  • Customer doesn’t require Internet connection.
  • The customer wallet automatically records extra information like who is being paid.
  • The customer wallet learns both the BCH value and the Euro/USD value and thus
    can inform the customer if the exchange rate is fair.
  • There is never any doubt, for either party, whether a payment has succeeded or failed.
  • A return-address can automatically be included to refund within the law.
  • Merchant can request payment to any script or use any transaction features.

The majority of these are rather easy to get already using the existing BIP70 spec. But BIP70 is showing its age. Specific issues with (pure) BIP70:

  • Transport layer is essentially impossible to change due to the security design, specifically its x509 usage.
    • Wallet to wallet payments are unsupported.
    • Payments over NFC or Bluetooth are not supported.
  • Signing messages is required while the transport layer duplicates this, overcomplicating things.
  • Offline payments need more work: if a merchant doesn’t have a UTXO we want to spend, it should be possible to supply that.
  • Requests unpack the transaction, overcomplicating things. Just send the transaction.
  • The sending of a transaction from the wallet to the merchant uses a semi-hardcoded endpoint: same server as the request, but a hard-coded location for the call.

A better design can likely solve all of those quite easily while striving for our goals.


Going to try and revive this discussion with something I’ve been thinking about recently.

BIP70/JPP currently does not allow specification of inputs - only outputs. This would restrict many of the cool Smart-Contract use-cases that CashTokens might allow us.

To accommodate this, it might be worth considering using LibAuth Templates (libauth/src/lib/transaction/fixtures/templates at v2 · bitauth/libauth · GitHub) as the format for describing the transactions (as opposed to only allowing specification of outputs).


Yeah, this needs a revival. The community seems to be ready for this now.

Making an updated payment protocol is basically going to take 3 stages:

  1. Decide which properties to put in a message, which ones are optional, etc.
    This is relevant in order to support actual usecases, like Jim’s post above indicates.
  2. Figure out the message-flow. For instance bip21 (the QR) is one ‘message’ in one direction only, where bip70 takes multiple rounds of messages in both directions.
    How many rounds are needed to do what we want to do? We also need to decide on things like where a finished transaction gets sent and who offers it to the full nodes.
  3. Actually decide on the formatting of the messages. JSON, XML, something else. I think most people hate protocol-buffers for this kind of usecase. PB is meant for high-speed, high volume. It has no real advantages for a payment protocol.
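To make stages 1 and 3 a bit more concrete, here is a minimal sketch of what one message could look like if JSON were chosen. Every field name (`payee_name`, `outputs`, `fiat_amount`, `fiat_currency`, `memo`) is an assumption made up for illustration; nothing here is an agreed format. It only shows required versus optional properties and a lossless encode/decode round trip.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class Output:
    script_hex: str   # hypothetical: locking script the payee wants paid
    value_sats: int   # amount in satoshis


@dataclass
class PaymentRequest:
    payee_name: str                       # hypothetical required property
    outputs: List[Output]                 # hypothetical required property
    fiat_amount: Optional[str] = None     # optional, e.g. "4.50"
    fiat_currency: Optional[str] = None   # optional, e.g. "EUR"
    memo: Optional[str] = None            # optional free text


def encode(request: PaymentRequest) -> str:
    """Serialize to JSON, leaving out optional properties that are unset."""
    fields = {k: v for k, v in asdict(request).items() if v is not None}
    return json.dumps(fields, sort_keys=True)


def decode(text: str) -> PaymentRequest:
    """Parse JSON produced by encode() back into a PaymentRequest."""
    raw = json.loads(text)
    outputs = [Output(**o) for o in raw.pop("outputs")]
    return PaymentRequest(outputs=outputs, **raw)
```

Keeping the optional properties out of the encoded message when they are unset keeps simple payments small, while more complex usecases can add what they need.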

I’ll dust off my notes and research over the next week or so and see if I can start this.
In the meantime I’d love it if people can comment on usecases they would like to see supported.

Thanks for writing these, Tom. I don’t have any immediate feedback/thoughts on the above yet. I hope to write a bit more about this in the coming weeks.

One additional thing that I did want to jot down for consideration (so we don’t forget) is what a suitable URL protocol handler scheme might be. Currently, we use the BIP70 bitcoincash:?r=someEndpoint form for both BIP70 and JPP (we use HTTP headers to identify which protocol is actually used). We may want to stick with this form?
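For reference, pulling the request endpoint out of a URI of that form takes only a few lines. This is a sketch using Python's standard `urllib.parse`; the function name `request_endpoint` and the example URLs are made up for illustration, and the HTTP header negotiation mentioned above is not covered.

```python
from urllib.parse import urlsplit, parse_qs
from typing import Optional


def request_endpoint(uri: str) -> Optional[str]:
    """Return the value of the r= parameter of a bitcoincash: URI, if any."""
    parts = urlsplit(uri)
    if parts.scheme != "bitcoincash":
        raise ValueError("not a bitcoincash: URI")
    values = parse_qs(parts.query).get("r")
    return values[0] if values else None
```

A URI that carries a plain address and no `r` parameter simply yields `None`, so a wallet can fall back to the classic address-based flow.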

However, there is some interest in creating Web Wallets (as PWAs) to ease concerns over App-Store censorship (which might be a problem BCH, or crypto generally, may face in future). I’m having trouble finding definitive information on this, but it looks like PWAs might have a limitation in that the protocol handlers either need to use:

  1. Schemes from this whitelist (enforced by the spec; see here: HTML Standard)
  2. Or a web+ prefix, as in web+someProtoHandler, to evade the explicit browser whitelisting

This feature is still listed as experimental and doesn’t have full browser support anyway, so maybe we just retain the bitcoincash: scheme. Anyway, wanted to make note of it in case anyone had any thoughts.
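The two options above can be expressed as a small check. This is a sketch based on my reading of the HTML Standard's rules for custom protocol handlers; the `SAFELISTED` set below is only an abbreviated excerpt of the list in the spec, so consult the standard for the full and current list.

```python
import re

# Abbreviated excerpt of the safelisted schemes; NOT the complete list.
SAFELISTED = frozenset({"bitcoin", "geo", "irc", "magnet", "mailto", "sms", "tel", "xmpp"})

# "web+" followed by one or more ASCII lowercase letters.
WEB_PLUS = re.compile(r"web\+[a-z]+")


def registrable(scheme: str) -> bool:
    """True if a web app could register a handler for this scheme."""
    normalized = scheme.lower()
    return normalized in SAFELISTED or WEB_PLUS.fullmatch(normalized) is not None
```

Under these rules `bitcoincash` on its own would not pass, while `web+bitcoincash` would, which is the limitation described above.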

Some further reading:

HTML Standard (Whitelisted Schemes)

I think anyonecanpay transactions are a good usecase to start with if we’re looking to design an upgrade to BIP70 and get similar security benefits for more complex transactions like CashTokens or smart contracts.

Similar to current BIP70, anyonecanpay transactions don’t need specific inputs defined by the application/merchant; that’s up to the wallet/user. Also, like BIP70, we want unbroadcast and partially signed transactions to be sent back to the application/merchant to broadcast.

And similar to CashTokens and smart contracts, we need the wallet to be able to freeze the pledged coins unless explicitly allowed by the user (and perhaps with an already signed refund transaction, so the application can be aware of the spending transaction).

The breakdown would go along these lines:

  1. consolidate a number of coins into a single “pledge” utxo.
  2. partially sign a transaction sending that pledge utxo to the intended recipients (using anyonecanpay sighash flag)
  3. send back the utxo and partial transaction signature to the merchant/application.
  4. make sure those UTXOs stay frozen unless explicitly spent by the user.
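Steps 2 and 4 of that breakdown can be sketched in a few lines. The sighash constants are the standard flag values (Bitcoin Cash signatures additionally carry the FORKID bit); the `PledgeWallet` class and its methods are hypothetical and only illustrate the "stay frozen unless the user explicitly allows it" rule, not any existing wallet's API.

```python
SIGHASH_ALL = 0x01
SIGHASH_FORKID = 0x40          # set on Bitcoin Cash signatures
SIGHASH_ANYONECANPAY = 0x80    # sign only this input, others may be added

# Sighash type byte for a pledge signature: commits to all outputs,
# but lets anyone add further inputs to the transaction.
PLEDGE_SIGHASH = SIGHASH_ALL | SIGHASH_FORKID | SIGHASH_ANYONECANPAY


class PledgeWallet:
    """Hypothetical coin tracker that refuses to auto-spend pledged UTXOs."""

    def __init__(self, outpoints):
        self.outpoints = set(outpoints)   # (txid, vout) pairs we own
        self.frozen = set()               # pledged outpoints

    def freeze(self, outpoint):
        if outpoint not in self.outpoints:
            raise KeyError(outpoint)
        self.frozen.add(outpoint)

    def spendable(self):
        """Outpoints that coin selection is allowed to use."""
        return self.outpoints - self.frozen

    def release(self, outpoint, user_confirmed):
        """Unfreeze only after an explicit user decision."""
        if not user_confirmed:
            raise PermissionError("pledged coin needs explicit user approval")
        self.frozen.discard(outpoint)
```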

Things can get more complicated quickly, so I’m curious if there’s a way to define something general enough to handle this usecase or more complex ones.


A discussion on telegram gave me a new usecase that might be useful to consider in payment protocol next.

A person gets repeated payments, for instance their paycheck. The typical payment protocol is not really useful in the pay-roll usecase, as the timings would not work in the standard payment-request way of working.

What would be a neat solution to this is that the payee uses the payment protocol to register an output-script (or address) with the financial department sometime before the next payment is due. Perhaps when their wallet receives the previous payment, it can automatically reach out and register a new output to send the next payment to.

I think this fits the payment protocol because the action taken is essentially half of any normal payment protocol exchange. The missing part is the finishing of the payment, as that may be a week later. I think it would be useful to allow the first part (please pay me on X) to be usable without the verification part in this case.

Maybe the TXID of the Nth payment could be used to generate the N+1th address, where only the payer and payee are able to generate it. Something like: A and B do a “handshake” and create a shared secret AB. Then they use AB+R to generate the 1st address, then they use AB+TXID0 to generate the next one… and the next one, and so on.
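A rough sketch of that chaining idea follows. It assumes "AB+X" means a keyed hash of X under the shared secret (HMAC-SHA256 is my choice here, not something specified above), and it stops at a 32-byte per-payment value. Turning that value into an actual address would need elliptic-curve math on the payee's public key, which is left out.

```python
import hashlib
import hmac
from typing import List, Sequence


def payment_tweak(shared_secret: bytes, context: bytes) -> bytes:
    """Per-payment 32-byte value that only holders of the secret can compute."""
    return hmac.new(shared_secret, context, hashlib.sha256).digest()


def tweak_chain(shared_secret: bytes, r: bytes, txids: Sequence[bytes]) -> List[bytes]:
    """Tweak for the 1st payment from R, then one per previous payment's TXID."""
    contexts = [r] + list(txids)
    return [payment_tweak(shared_secret, c) for c in contexts]
```

Because each value depends only on the shared secret and public data (R or an earlier TXID), both sides can compute the next destination without another round of communication.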

What about RPA? No need for the handshake and also provides recoverability from seed.


I’m trying to find solutions that are low cost and expected to have wide usage. Naturally people can go with whatever solution they like, but a couple of common and cheap-to-operate solutions will win in the end, in my opinion.

One thing to note here is that an xpub can be made for any sub-section of the derivation-tree. The xpub itself encodes the depth. So you could give a different xpub to different parties and could make all of those end up in your (not standard) HD wallet. But that’s just an aside. It has little to do with the payment protocol 🙂
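To illustrate the point about depth: in the BIP32 serialization the payload is 78 bytes, with a 4-byte version followed by a 1-byte depth. The sketch below is a small hand-written Base58Check codec (the helper names are mine) that reads that byte back out.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def _checksum(payload: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def b58check_encode(payload: bytes) -> str:
    data = payload + _checksum(payload)
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))   # leading zero bytes become "1"
    return "1" * pad + out


def b58check_decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    data = b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")
    payload, check = data[:-4], data[-4:]
    if _checksum(payload) != check:
        raise ValueError("bad checksum")
    return payload


def xpub_depth(xpub: str) -> int:
    """Depth in the derivation tree: 0 for a master key, 1 for its children, ..."""
    payload = b58check_decode(xpub)
    if len(payload) != 78:
        raise ValueError("not a BIP32 extended key")
    return payload[4]
```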


I’ll post the link here for future reference: BCH Reusable Address Proposal

This looks much like Monero/CryptoNote addresses, which also encode 2 keys: one which lets you find the outputs, the other which lets you open them.
The partitioning trick of RPAs is neat, but the scaling properties are still pretty much the same: you have to inspect each element in a set one by one, performing EC mul ops on each one.
Looks like Monero was looking in the same direction: partition the set that needs to be scanned. For that they came up with something called “view tags”.
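The general shape of such partitioning can be sketched without any real cryptography. The assumption here is that every candidate carries some cheap, publicly visible tag and the recipient only runs the expensive test on candidates whose tag starts with their prefix. How the tag is actually produced differs between RPA and Monero's view tags and is not modelled; `expensive_match` is a stand-in for the EC work.

```python
from typing import Callable, Iterable, List, Tuple


def scan(
    candidates: Iterable[Tuple[bytes, object]],
    prefix: bytes,
    expensive_match: Callable[[object], bool],
) -> Tuple[List[object], int]:
    """Return (matching payloads, number of expensive checks performed)."""
    found = []
    checked = 0
    for tag, payload in candidates:
        if not tag.startswith(prefix):
            continue                  # cheap byte comparison, no EC work
        checked += 1                  # only now pay for the expensive test
        if expensive_match(payload):
            found.append(payload)
    return found, checked
```

With a one-byte prefix and uniformly distributed tags, roughly 1 in 256 candidates reaches the expensive step, but every candidate still has to be looked at, which is the scaling concern raised above.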

Although on BCH it could scale better if you have access to the whole UTXO set and don’t need/want the history, compared to a simple filter against a list of keys it is still orders of magnitude slower.
Wallet and wallet infrastructure scaling properties are horrible, and (ab)using Electrum servers has the potential to become a burden on light wallet infrastructure.

Trade-offs can be made, but someone, somewhere will have to do those EC mul ops, or you’ll have to somehow communicate with the recipient so they know where/when to look.


I’ve been thinking through various transaction validation and propagation scenarios with regards to SPV wallets over the past week or so. In particular, I was curious what happens when there’s a very long unconfirmed transaction chain in mempool, but a miner only mines part of the chain (let’s say just the first parent, for simplicity). I eventually realized that this case would have no bearing on whether or not the user must be informed of their transaction being processed. For the most part, users/wallets can simply trust that the transaction will “eventually” get confirmed, and 0-conf is “good enough” to let the user spend the resulting utxos.

I appreciate the prior thought that Tom has put into the subject as it clarified a few things that I had questions about.

In the past, I’ve thought a bit about how BCH would work in a low- or no-connectivity environment. Think of a music festival like Burning Man or Electric Forest in the USA. The culture at these types of events is extremely well-suited toward the type of p2p trade that BCH or fiat cash facilitates. You wouldn’t be caught dead at one of these events without fiat cash. But cellular or wifi coverage can be spotty at best. So how could a BCH economy still function here?

The answer is wallet-to-wallet payments, and so we arrive here, at the need to define a new payment protocol that is more robust than BIP70.

Wallet-to-wallet payments would likely involve sending the entire raw transaction data for a fully valid, signed transaction over a channel such as audio, NFC, or ad-hoc wifi. Ultimately, it would be the responsibility of the receiver of the funds to ensure that the transaction is eventually broadcast to the network, but the sender could also assist. All receivers of an offline unconfirmed transaction chain would likely want to rebroadcast all of the transactions in the chain opportunistically, as connectivity allows.
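When rebroadcasting a whole offline chain, parents need to go out before the transactions that spend them. A small sketch of that ordering, using Python's standard `graphlib` (3.9+); the `parents` mapping (txid to the set of txids it spends from) is a hypothetical wallet-side structure.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def broadcast_order(parents: Dict[str, Set[str]]) -> List[str]:
    """Order held transactions so every parent precedes its children.

    Parent txids we do not hold ourselves (for example already confirmed
    ones) are ignored for ordering purposes.
    """
    held = set(parents)
    graph = {txid: {p for p in deps if p in held} for txid, deps in parents.items()}
    return list(TopologicalSorter(graph).static_order())
```

For a simple chain where c spends b and b spends a, this yields a, b, c, which is the order a node would accept them in.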

The double-spend risk comes from the sender broadcasting a transaction that uses the same inputs that were used for the offline transaction, but setting the outputs to a wallet still under the sender’s control, before the receiver is able to connect to the network to broadcast. For transactions among friends, this risk is acceptable. But for relatively anonymous transactions in a public setting, there’s high risk of fraud. How can this be mitigated? (I know there is a bit of irony here as this is exactly what mining solves)
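Detecting that situation after the fact is straightforward once the receiver is back online: two different transactions conflict if they spend at least one common outpoint. A minimal sketch, with transactions represented as plain dicts (a made-up representation, with `inputs` as a list of `(txid, vout)` pairs):

```python
from typing import Dict, Set, Tuple

Outpoint = Tuple[str, int]


def conflicting_outpoints(tx_a: Dict, tx_b: Dict) -> Set[Outpoint]:
    """Outpoints spent by both transactions."""
    return set(tx_a["inputs"]) & set(tx_b["inputs"])


def is_double_spend(tx_a: Dict, tx_b: Dict) -> bool:
    """True if two distinct transactions compete for the same coin."""
    return tx_a["txid"] != tx_b["txid"] and bool(conflicting_outpoints(tx_a, tx_b))
```

This only tells the receiver that a conflict exists. It does not decide which of the two transactions will end up mined.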

Verifiable paper wallets are one solution, but I think this is actually a rather clumsy solution. Certainly there must be a way that offline payments can be made safe enough that they can be acceptable in the environment I propose?

Zero-Conf Escrow may be a hint? But it seems ZCEs require full node participation, so maybe not suitable for offline wallet-to-wallet transactions.

Hey, so nice to see you here! Great post too. 👍

In Flowee Pay the wallet will re-broadcast the transaction after receiving a block that (surprisingly) didn’t contain the transaction. It will do this until the transaction is mined (or removed by the user).
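That behaviour boils down to a small loop run on every new block. The sketch below is not Flowee Pay's actual code, just the logic as described, with a hypothetical `pending` mapping (txid to raw transaction) and a `broadcast` callback supplied by the caller.

```python
from typing import Callable, Dict, Set


def on_new_block(
    pending: Dict[str, bytes],
    block_txids: Set[str],
    broadcast: Callable[[bytes], None],
) -> None:
    """Drop mined transactions and re-broadcast the ones still waiting."""
    for txid in list(pending):           # copy keys, we mutate while iterating
        if txid in block_txids:
            del pending[txid]            # mined, nothing left to do
        else:
            broadcast(pending[txid])     # not in this block, offer it again
```

Removal by the user would simply be a `del pending[txid]` from the UI side, after which the loop no longer touches that transaction.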

For something like Burning Man, I’d say that one person would set up a full node that is directly connected to the Internet, and anyone else can connect to it locally (which is likely cheap) in a mesh style. We may want to test such a setup and write a wiki page on that so people can try it out in real life, maybe. But I digress.

This is a requirement for SPV because only a full node can verify that a certain random transaction wasn’t just spending made-up money. I think that is a bigger risk than the double spend risk. You need to verify that all the inputs being spent are actually real coins.
By far the easiest way to do that is to ask a full node that is close to you in terms of networking.

I fully agree with your assessment that a payment protocol would need to send a full transaction, and the first iteration I wrote a couple of years back started from that position. My local name for it is not pp (for payment protocol) but “p2p-pp” 😉

So the trade-offs here are that for peer to peer payments the receiver will have to accept risks about money not existing if they are not connected to a full node.
Additionally, they may need to accept risks of double spends if the receiver is unable to broadcast the transaction until later.

Most of these risks are probably quite acceptable in real life settings; after all, we (society) managed with cheques and paper prints of credit cards for decades. Most people don’t steal in a face-to-face setup.
I expect that in the end such a solution will be acceptable to end-users.

I was thinking that to mitigate this, any time a wallet spends unconfirmed UTXOs “offline,” the sender/receiver would negotiate and share information about any missing transactions that they don’t already know about. Later on, the wallet would also “opportunistically” download updated transaction data from a real node, sanity checking the local state as it gets live data. The wallet interface could show the user some kind of “confidence” level depending on how many of the utxos it was able to verify itself based on trusted (previously-known) data, and how trustworthy the “negotiated” data seems (based on blockheights/confirmation counts).

I agree with the idea that most people won’t try to cheat in a face-to-face transaction. But just in case, the wallet should try to retain and verify as much information as appropriate so that the affected users can do diligent detective work if something goes wrong. As soon as the receiver gets some kind of connectivity to a full node, everything it learned while offline can be verified.

I think these techniques would be “good enough” for covering minor internet outages. For example, I want to send my coworker $5, but cell reception in the break room sucks for everybody. I could still scan their QR code and send a signed transaction to their wallet directly. Our wallets would broadcast the transaction as soon as they have connectivity again, which could be less than a minute later.

In a scenario where there could be multiple hours or days before having the ability to connect to a node, users could have a setting for some “tolerance” level on unconfirmed/untrusted transaction chains. In a localized environment like Burning Man, the circulating UTXO set would likely be relatively small (over a multi-day period). So even if many users were unable to connect to full nodes, they would still eventually propagate each relevant transaction amongst themselves. In a way, this somewhat emulates the same p2p full node protocol, but over sneakernet, with a “meta-mempool” of opportunistically shared transaction data. 😛

This payment protocol could also use a scheme similar to CashID to request metadata about the transaction. Users could perform their own “KYC” and associate these notes with the transaction data. To increase trust in case someone attempts a double-spend, the protocol could request a phone number, a Session Messenger identifier, an email address, or something along those lines. Then users could communicate out-of-band to resubmit new “valid” transactions to each other.

Maybe there’s value in a protocol like Nostr here, where if a wallet detects a double-spend on the unconfirmed transaction chain, it can resubmit the transactions it was originally responsible for, and notify other wallets via a shared relay. So when those other wallets come back online, they can know where the old utxos went, and resubmit their own transactions accordingly. Eventually, as all involved wallets come back online, the unconfirmed transaction chain will “heal” and all funds will eventually settle on-chain. And of course, the wallet app could be responsible for the user’s Nostr keys. (Sidenote: maybe a similar scheme can be used for wallets to gossip about which cashfusion server to use, without changing the cashfusion protocol at all, just allowing a communication mechanism to choose a server.)