Authors:
Andrew Groot (groot@softwareverde.com)
Josh Green (josh@softwareverde.com)
Initial Feedback From:
Calin Culianu
Tom Zander
Overview
This document’s intent is to suggest paths forward for supporting additional transaction types when generating double-spend proofs, providing coverage for double-spend attacks that could otherwise trivially go undetected. It also seeks to identify scenarios that nodes (and third-party zero-confirmation services) should recognize when determining whether a zero-confirmation transaction should be considered inherently untrustworthy.
The existing beta message format for double spend proofs (see dsproof-beta) allows for efficient, canonical proofs to be generated that inform recipients of the existence of additional transactions spending a particular output. It is currently in use on the Bitcoin Cash network (as a beta) and is an excellent step in the direction of safer zero-confirmation transactions. At present, however, only P2PKH (pay-to-public-key-hash) outputs are supported, allowing malicious users to side-step double-spend detection by using other types of scripts. Fortunately, the general approach to double-spend proofs used by the dsproof-beta message is sound, and a relatively small number of changes would allow for a greater range of scripts to be proven double-spent. Additionally, by introducing clear recommendations about how to interpret the safety and legitimacy of zero-confirmation transactions in a post-double-spend-proof ecosystem, middleware and wallet developers will be able to provide consistent, trustworthy information to end-users about the risk associated with any given transaction.
Background and Motivation
The existing approach for P2PKH outputs relies on a few properties of P2PKH scripts:
- there is only one signature to validate
- the only difference between inputs that double-spend an output would be the signature in the unlocking script
- that signature is in a predictable location
Because of these properties, it is easy for a node with the double-spend transaction to extract the signature from the offending unlocking script and directly compute the hash-prevoutputs, hash-sequence, and hash-outputs values. That signature can then be added directly to the push-data list at the end of the spender information in the double-spend proof message. Validators of the double-spend proof then have all of the values needed to create the signature preimage and verify that such a transaction would unlock the double-spent output.
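As a rough illustration of how the three hash values might be computed, the sketch below assumes the BIP143-style preimage layout used by BCH signatures (double SHA-256 over the serialized outpoints, sequence numbers, and outputs). The function names and the tuple-based transaction representation are hypothetical and chosen only for this example.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as used throughout the signature preimage."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def var_int(n: int) -> bytes:
    """Bitcoin-style variable-length integer encoding."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


# Assumed representation: each input is (txid_in_serialized_byte_order, output_index, sequence).
def hash_prevoutputs(inputs) -> bytes:
    return double_sha256(b"".join(txid + struct.pack("<I", index) for txid, index, _ in inputs))


def hash_sequence(inputs) -> bytes:
    return double_sha256(b"".join(struct.pack("<I", sequence) for _, _, sequence in inputs))


# Assumed representation: each output is (value_in_satoshis, locking_script_bytes).
def hash_outputs(outputs) -> bytes:
    return double_sha256(
        b"".join(struct.pack("<q", value) + var_int(len(script)) + script for value, script in outputs)
    )
```

A node holding the full double-spend transaction could compute these once and copy them into the spender section of the proof, so that validators never need the transaction itself.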
In order to support P2PK (pay-to-public-key), a very similar approach could be taken, but for P2SH (pay-to-script-hash) and Multisig transactions, there are additional variables that come into play. The script length is variable in these cases, but the existing dsproof-beta format already provides room for additional push-data values, which could be used to push any necessary data from each transaction’s unlocking script. In the worst-case scenario, the entire unlocking script could be provided in the push-data for each spender (this works due to the push-only rule and minimal data requirements in BCH). This, however, is still not sufficient, as P2SH and Multisig unlocking scripts are capable of requiring multiple signatures to be validated.
While a transaction having multiple signatures is not necessarily a problem for double-spend proofs, the signature preimage can vary based on the hash type designated when generating the signature. This means that the hash fields (hash-prevoutputs, hash-sequence, and hash-outputs) could have different values depending on the signature. Fortunately, hash-prevoutputs and hash-sequence only have two possible values for a given transaction, one of which is all zero (0x00) bytes, meaning there’s no need to transmit that value. However, the hash-outputs field has three possible states, two of which have non-zero values. In order to validate a double-spend proof for a transaction with two such signatures (i.e. hash types with SIGHASH_SINGLE or SIGHASH_ALL), a validator would need to have both hash-outputs values, which there is currently no way to send in the dsproof-beta message.
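The dependence of each hash field on the hash type can be sketched as follows. This is a minimal illustration assuming the BIP143-style rules that BCH signatures follow; the function name and the string labels it returns are hypothetical.

```python
SIGHASH_ALL = 0x01
SIGHASH_NONE = 0x02
SIGHASH_SINGLE = 0x03
SIGHASH_FORKID = 0x40
SIGHASH_ANYONECANPAY = 0x80


def preimage_hash_variants(hash_type: int, input_index: int, output_count: int):
    """Return which variant of (hash-prevoutputs, hash-sequence, hash-outputs)
    a signature with the given hash type commits to.

    "all"    -> hash over every input/output of the transaction
    "single" -> hash over only the output at the same index as the input
    "zero"   -> 32 zero bytes
    """
    base_type = hash_type & 0x1F
    anyone_can_pay = bool(hash_type & SIGHASH_ANYONECANPAY)

    prevoutputs = "zero" if anyone_can_pay else "all"

    if anyone_can_pay or base_type in (SIGHASH_SINGLE, SIGHASH_NONE):
        sequence = "zero"
    else:
        sequence = "all"

    if base_type not in (SIGHASH_SINGLE, SIGHASH_NONE):
        outputs = "all"
    elif base_type == SIGHASH_SINGLE and input_index < output_count:
        outputs = "single"
    else:
        outputs = "zero"

    return prevoutputs, sequence, outputs
```

For example, `preimage_hash_variants(SIGHASH_ALL | SIGHASH_FORKID, 0, 2)` yields `("all", "all", "all")`, while a `SIGHASH_SINGLE | SIGHASH_FORKID` signature on the same input yields `("all", "zero", "single")`. The two distinct non-zero hash-outputs variants are what the single hash-outputs field cannot carry.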
Independent of handling multiple signatures, however, the hash type possibilities pose an additional threat to double-spend proofs. The values SIGHASH_SINGLE, SIGHASH_NONE, and SIGHASH_ANYONECANPAY all present levels of transaction malleability. A malicious sender could use combinations of these types of signatures to create double-spends whose proofs would fail to be convincing. For example, in the most extreme case of SIGHASH_NONE | SIGHASH_ANYONECANPAY, all of the other inputs and all of the outputs of the transaction can be changed without altering the signature, meaning a double-spend transaction can be trivially created by anyone but a double-spend proof would appear to represent an identical transaction (with identical push data). Extending double spend proofs to allow for duplicate proofs with identical push-data would open up denial-of-service attacks via broadcasting a double spend proof for a transaction that does not exist. This is why the double-spend proof beta message currently only supports “P2PKH outputs with all inputs signed SIGHASH_ALL without ANYONECANPAY.”
This issue is exacerbated by the fact that SPV wallets are unable to easily and definitively make such determinations in a broader set of use cases. While for P2PKH the unlocking script is structured and understandable, SPV wallets often do not have the previous output’s locking script available to guarantee that the format they see is, in fact, part of a proper P2PKH script. For P2SH scripts this is rendered even more difficult, as simply iterating through the unlocking script and redeem script looking for signature-like structures does not necessarily guarantee that a transaction is not easily double-spendable by the creator. A simple example of this is a locking script that does not execute a checksig. In this scenario it is possible that the script still pushes what appears to be a public key and a value that appears to be a signature, but these are ultimately not treated as such, fooling the SPV wallet into believing the script was a regular P2PKH script. There is therefore a class of transaction for which convincing double-spend proofs cannot be created but which are in fact easily double-spendable. This then leads to the question: what do wallets do when they receive these kinds of transactions? If a double-spend proof has not been received for a transaction, does that mean it is not double-spent, or that there is no known way to create a convincing double-spend proof? Further guidance is likely needed to ensure that these cases are handled sanely by node and wallet implementations.
The above shows that while double-spend proofs are useful for what are likely the most common transactions, malicious actors still have plenty of opportunity to take advantage of the limitations. It also shows that the changes necessary to make double-spend proofs fully useful will involve both risk analysis and technical solutions. The following sections aim to address both concerns.
1: Establishing Risk Assessment Conventions: Is Double-Spend Provable
Before expanding double-spend proofs to handle more cases, it is prudent to first address how the existence of double-spend proofs affects end-user perception of zero-confirmation transactions. Presently, it’s unclear how this should be handled, and, as previously mentioned, not all double-spends are easily provable. A full node may be able to determine that two transactions spend the same output, but the only other parties that would be able to definitively verify the double spend would be other full nodes that also receive the full transaction. Since full nodes do not (and should not) relay double-spend transactions, nodes that are interacting with SPV nodes (or, more importantly, those supporting middleware services that provide transaction information to SPV nodes) need a method for determining how likely it is that a transaction 1) has not been double-spent, or 2) was double-spent but the proof was not relayed because a double-spend proof was not possible to create.
For the existing double-spend proof format, the rules for whether a double-spend proof is supported for a given input are very straightforward. If, and only if, the following items are true, a double-spend proof is currently supported for the input:
- The spent output is in a confirmed transaction (mined in a block)
- The locking script for the previous output is in the P2PKH format
- The unlocking script signature has the hash type SIGHASH_ALL without SIGHASH_ANYONECANPAY
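The three conditions above can be expressed as a simple predicate. The sketch below is illustrative only: the `InputInfo` type, its field names, and the `"p2pkh"` label are hypothetical, and it assumes the caller has already classified the previous output’s locking script.

```python
from dataclasses import dataclass

SIGHASH_ALL = 0x01
SIGHASH_ANYONECANPAY = 0x80


@dataclass
class InputInfo:
    prevout_confirmed: bool    # the spent output is in a mined transaction
    prevout_script_type: str   # hypothetical label, e.g. "p2pkh" or "p2sh"
    signature_hash_type: int   # hash type byte taken from the signature


def is_beta_dsproof_supported(tx_input: InputInfo) -> bool:
    """True only if all three dsproof-beta conditions hold for this input."""
    base_type = tx_input.signature_hash_type & 0x1F
    anyone_can_pay = bool(tx_input.signature_hash_type & SIGHASH_ANYONECANPAY)
    return (
        tx_input.prevout_confirmed
        and tx_input.prevout_script_type == "p2pkh"
        and base_type == SIGHASH_ALL
        and not anyone_can_pay
    )
```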
Given this, the following algorithm could be used to determine the risk level an otherwise valid zero-confirmation transaction possesses:
- Are all of the transaction’s inputs spending outputs in a way that is known to allow for a double-spend proof?
- If not, the transaction is not considered double-spend provable, and the risk level should be increased.
- If so, for each of the parent transactions, increase the perceived risk if any of the following conditions are met:
a. The parent transaction is a zero-confirmation transaction
b. The parent transaction is a zero-confirmation transaction and is not double-spend-provable
c. The parent transaction itself spends zero-confirmation transactions.
d. (Optionally) If the transaction has a high chaining depth.
Giving concrete values to this risk level is left for each implementer to determine based on the needs of the application being supported. In the simplest case, any of the above risk increases could be seen as reason to wait for at least one block confirmation before accepting the transaction.
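One possible shape for the assessment described above is sketched here. The unit weights, the `ParentTx` type, and the `max_chain_depth` parameter are assumptions made for illustration; the source leaves concrete values to each implementer. The chaining depth is attached to each parent here, which is one reading of the optional condition.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ParentTx:
    is_zero_conf: bool              # parent is itself unconfirmed
    is_double_spend_provable: bool  # parent's inputs all support a proof
    spends_zero_conf: bool          # parent spends unconfirmed outputs
    chain_depth: int = 0            # length of the unconfirmed chain (assumed metric)


def assess_risk(all_inputs_provable: bool,
                parents: Iterable[ParentTx],
                max_chain_depth: Optional[int] = None) -> int:
    """Return a non-negative risk score; 0 means no risk increase was triggered."""
    if not all_inputs_provable:
        # Not double-spend provable: raise the risk level.
        return 1

    risk = 0
    for parent in parents:
        if parent.is_zero_conf:
            risk += 1
        if parent.is_zero_conf and not parent.is_double_spend_provable:
            risk += 1
        if parent.spends_zero_conf:
            risk += 1
        if max_chain_depth is not None and parent.chain_depth > max_chain_depth:
            risk += 1
    return risk
```

In the simplest policy mentioned above, a caller would treat any non-zero score as a reason to wait for a block confirmation.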
A less risky transaction may still be double-spent, and a more risky transaction may ultimately be mined in the next block, but this assessment provides a mechanism for using the knowledge that double-spend proofs provide in the evaluation of transaction trustworthiness. It allows for clearer end-user guidance, which in turn theoretically leads to increased confidence in accepting BCH transactions.
2: Extending Double-Spend Proofs: Message Format Changes
The following changes to the dsproof message format would allow for support of any non-trivially double-spent transaction outputs:
- Always requiring the non-empty versions of hash-prevoutputs and hash-sequence, instead of using the value actually required by the signature.
- Expanding the hash-outputs field to be a list of hash-outputs values, prefixed by their determining hash type bit value (for now, always SIGHASH_ALL (0x01) followed by SIGHASH_SINGLE (0x03)).
- Using the following push-data requirements by locking script type:
a. P2PKH: Include all but the last push value (the signature but not the public key)
b. P2SH: Include all but the last push value (everything except the redeem script)
c. All other scripts: Include all of the values pushed in the unlocking script
This convention reduces the amount of data contained within double spend proofs for the most common script types while still maintaining flexibility for the less-used types.
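The push-data convention can be summarized in a few lines. This sketch assumes the unlocking script has already been parsed into an ordered list of pushed values; the function name and the script type labels are hypothetical.

```python
from typing import List, Sequence


def select_push_data(locking_script_type: str, unlocking_pushes: Sequence[bytes]) -> List[bytes]:
    """Choose which pushed values from the unlocking script go into the proof."""
    if locking_script_type in ("p2pkh", "p2sh"):
        # P2PKH: drop the trailing public key; P2SH: drop the trailing redeem script.
        return list(unlocking_pushes[:-1])
    # All other scripts: include every pushed value.
    return list(unlocking_pushes)
```

For a P2PKH input with pushes `[signature, public_key]`, only `[signature]` is kept.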
Additionally, the changes to the hash-outputs field affect the canonical order of the spenders. To address this, the hash-outputs values should be compared in order, the same way the single value was previously: in numerically ascending order of the hash, interpreted as 256-bit little-endian integers. If those are the same, the hash-prevoutputs values are compared the same way.
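The ordering rule can be sketched as a sort key. This assumes each spender carries its hash-outputs values in the fixed order described above; the function name is hypothetical.

```python
from typing import Sequence, Tuple


def spender_sort_key(hash_outputs_values: Sequence[bytes], hash_prevoutputs: bytes) -> Tuple[int, ...]:
    """Key for canonical spender ordering.

    Each 32-byte hash is interpreted as a 256-bit little-endian integer.
    The hash-outputs values are compared first, in order, with
    hash-prevoutputs as the final tie-breaker.
    """
    key = [int.from_bytes(h, "little") for h in hash_outputs_values]
    key.append(int.from_bytes(hash_prevoutputs, "little"))
    return tuple(key)
```

Sorting two spenders with `sorted(spenders, key=...)` using this key puts the numerically smaller one first. Note that little-endian interpretation differs from plain byte-wise comparison: a hash beginning with 0x01 followed by zeros is smaller than one consisting of zeros followed by a final 0x01.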
The above changes ensure both that the unlocking script for the double-spend transaction can be fully and accurately re-created, and that the data is available to validate any signatures that might be encountered during validation of the double spend proof.
With these changes, the top-level message format would remain the same but the spender format for each transaction would become:
Field | Length | Format | Description |
---|---|---|---|
tx-version | 4 bytes | unsigned int | Copy of the transaction’s version field |
sequence | 4 bytes | unsigned int | Copy of the sequence field of the input |
locktime | 4 bytes | unsigned int | Copy of the transaction’s locktime field |
hash-prevoutputs | 32 bytes | sha256 | Transaction hash of prevoutputs for hash type SIGHASH_ALL \| SIGHASH_FORKID. |
hash-sequence | 32 bytes | sha256 | Transaction hash of sequences for hash type SIGHASH_ALL \| SIGHASH_FORKID. |
hash-outputs-count | variable | var-int | The number of hash-outputs values to follow |
hash-outputs-hashes (see below) | variable | byte-array | A list of serialized hash-outputs values |
list-size | variable | var-int | Number of items in the push-data list |
push-data | variable | byte-array | Raw byte-array of a push-data item, for instance a signature |
For hash-outputs-hashes referenced above, the following format is used for each relevant hash type bit:
Field | Length | Format | Description |
---|---|---|---|
hash-type-byte | 1 byte | byte | The hash type value the following hash corresponds to. |
hash-outputs | 32 bytes | sha256 | The output(s) hash variant resulting from the preceding hash type. |
At present this would always be the hashes for SIGHASH_ALL (0x01) and SIGHASH_SINGLE (0x03), in that order. Both are always required.
This would result in a static increase of 35 bytes per spender, for a total of 70 additional bytes per double-spend proof. These extra bytes correspond to the hash-outputs-count field (1 byte), the two hash-type-byte fields (2 bytes), and the extra hash-outputs field (32 bytes). For non-P2PKH scripts, the push-data array would also vary more in size, depending on the unlocking script of the transactions.
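A spender section in the proposed layout might be serialized as in the sketch below. It assumes little-endian integers and that each push-data item is prefixed with a var-int length, as is conventional for byte-arrays in Bitcoin-style messages; the function and parameter names are hypothetical.

```python
import struct
from typing import Sequence, Tuple


def var_int(n: int) -> bytes:
    """Bitcoin-style variable-length integer encoding."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def serialize_spender(tx_version: int, sequence: int, locktime: int,
                      hash_prevoutputs: bytes, hash_sequence: bytes,
                      hash_outputs_by_type: Sequence[Tuple[int, bytes]],
                      push_data: Sequence[bytes]) -> bytes:
    """Serialize one spender using the proposed field order."""
    out = struct.pack("<III", tx_version, sequence, locktime)
    out += hash_prevoutputs + hash_sequence
    out += var_int(len(hash_outputs_by_type))          # hash-outputs-count
    for hash_type_byte, hash_outputs in hash_outputs_by_type:
        out += bytes([hash_type_byte]) + hash_outputs  # hash-type-byte + hash-outputs
    out += var_int(len(push_data))                     # list-size
    for item in push_data:
        out += var_int(len(item)) + item               # assumed length-prefixed item
    return out
```

With two hash-outputs entries (0x01 and 0x03) and a single 71-byte signature, this produces 216 bytes, compared with 181 bytes for the same data with a single bare hash-outputs field, which matches the 35-byte increase described above.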
3: Expanding the Double-Spend Risk Assessment
With any expansion of the double-spend proof format, the risk assessment defined above should be expanded to match the new reality of double-spend proofs.
With the above expansions to the double-spend proof format, the following cases now need to be acknowledged (as they are permitted by the newly expanded script types):
- Unlocking scripts that result in no signature being executed (e.g. a P2SH that acts like a password)
- Unlocking scripts that contain a signature that cannot be obtained easily without execution (e.g. custom P2SH scripts)
- Unlocking scripts that contain multiple signatures (e.g. raw multisig)
- Non-standard scripts that could result in any of the above
These cases are likely to only be able to be handled by full nodes. As a result, the double-spend proofs are unlikely to be valuable on their own to SPV nodes, making this risk assessment even more important.
One additional scenario is when an unlocking script exclusively uses signature hash types that do not cover the whole transaction (i.e. anything other than SIGHASH_ALL without SIGHASH_ANYONECANPAY). There may be circumstances in which a transaction with one or more such scripts would still yield a valid double-spend proof, but further analysis is required for that to be determined. Future expansions of the risk assessment may choose to address those cases, provided there is sufficient evidence that they are safe.
As a result, the high-level assessment algorithm can remain. The assessment of whether a double-spend proof can be generated for a particular input then becomes:
- During the validation of the input, at least one signature was found
a. If no signature verifications were required to validate the input, the prevout is potentially double-spendable with the same script; this transaction is high risk
- At least one signature has the hash type SIGHASH_ALL without SIGHASH_ANYONECANPAY
a. With this the script is sufficiently tied to its containing transaction. Any double-spending transaction would therefore have a different unlocking script and therefore a double-spend proof could be created.
Note that while transactions that fail the assessment are not guaranteed to have double-spend proofs (and should therefore be considered risky), a double-spend proof is still possible if the double-spend transaction does match these rules.
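The expanded per-input check could look like the sketch below. It assumes a full node records the hash type byte of every signature actually verified while executing the input’s scripts; the function name and that list are hypothetical.

```python
from typing import Sequence

SIGHASH_ALL = 0x01
SIGHASH_ANYONECANPAY = 0x80


def is_input_double_spend_provable(verified_hash_types: Sequence[int]) -> bool:
    """Apply the two expanded rules to a single input.

    verified_hash_types: hash type bytes of the signatures that were checked
    during validation of the input (empty if no signature check ran).
    """
    if not verified_hash_types:
        # No signature verification was required: high risk.
        return False
    return any(
        (hash_type & 0x1F) == SIGHASH_ALL and not (hash_type & SIGHASH_ANYONECANPAY)
        for hash_type in verified_hash_types
    )
```

A multisig input verified with hash types `[0x43, 0x41]` passes, since one signature commits to the whole transaction, while `[0x43, 0xC1]` does not.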
Final Thoughts
With these changes, full nodes and 3rd-party zero-confirmation risk assessment services should be able to relay proofs for nearly all common double-spend attempt methods, while also being able to identify transactions that are inherently double-spendable. If these changes are accepted (in spirit) by the BCH network, then the next steps would be creating a series of test-vectors that demonstrate the various risk-states for unconfirmed transactions for nodes and services to use as implementation tests.