Transaction malleability: MalFix, SegWit, SIGHASH_NOINPUT, SIGHASH_SPENDANYOUTPUT, etc

bitjson · February 16, 2021, 5:08pm

Is anyone still interested in an all-encompassing solution to third-party transaction malleability for Bitcoin Cash?

Some background reading:

As I understand it, the primary expected benefit of some large-scale “malleability fix” would be:

To enable applications that rely on off-chain unconfirmed transaction chaining.

So: is malleability solved?

BCH has already rolled out a number of upgrades addressing malleability in various places. Are we done? Are there any other “use-case breaking” sources of malleability we should fix?

bitjson · February 5, 2021, 9:09pm

I wanted to bring up this discussion because a transaction malleability solution could also replace half of the PMv3 proposal:

If anyone is interested in this topic, we should adjust that spec sooner rather than later, since it’s unlikely we’ll want to implement both hashed witnesses and a “malleability fix”.

freetrader · February 5, 2021, 10:24pm

My understanding was that 3rd party malleability was completely fixed over a range of BCH upgrades, but I could certainly be wrong and it’s a good question.

No more third-party malleability: By completing the BIP62 malleability fixes, it will be possible to make P2PKH-spending transactions that cannot be malleated by any third party (miners, relay nodes, etc.). This removes the need for OP_CHECKLOCKTIMEVERIFY timeout clauses. We can also engineer smart contracts to be immune from malleability.

From BIP62 & Schnorr by @markblundeberg

BigBlockIfTrue · May 13, 2021, 4:48pm

Third-party malleability has been eliminated by a sequence of network upgrades. See here for an overview.

Second-party malleability can be avoided by using Schnorr signatures.

So there is essentially no more malleability to be fixed. Compared to SegWit, the only disadvantage I am aware of is that you cannot sign a child transaction before the parent transaction is signed. But that is a different problem than malleability.

bitjson · February 17, 2021, 12:10am

So there is essentially no more malleability to be fixed.

That’s my impression too – it’s something contract authors (or compilers) will always have to review to avoid griefing, but I think pretty much any unlocking script can be validated-enough to remove malleability vectors.

Compared to SegWit, the only disadvantage I am aware of is that you cannot sign a child transaction before the parent transaction is signed.

And even that use case is easily solved with covenants. Not only do you not need to know the parent transaction ID before signing, you can pre-sign dozens of CashChannel authorizations even if your CashChannel doesn’t yet have a high enough balance to pay them all. The receiver can simply process the authorizations whenever you’ve topped it up (and the proper payment time has been reached).

TierNolan · April 10, 2021, 11:11pm

The fact that the question has to be asked shows the problem with the current solution. There is no way to be sure that malleability is actually fixed. The current solution is to list all possible sources of malleability and then close them off one by one.

Segwit solves malleability directly and I think the best solution would be to extract that feature out of the segwit system.

Peter Rizun has a video where he explains his problems with segwit. Specifically, he feels that segwit degrades the commitment to transaction signatures.

This has been interpreted by some to say that the Bitcoin Core software doesn’t actually check segwit signatures. This is obviously untrue, but it is almost a meme at this point.

Peter Rizun’s key insight is that segwit makes the transaction signatures less valuable to miners than under the old rules.

The reason for this is that, under segwit, you can prove which UTXOs are spent without inherently providing the signatures. Under pre-segwit Bitcoin, the miners need the signature data to confirm the merkle root.

If hard forks are being considered, I think a compromise would get the best of both worlds.

txid → hash(Transaction, including signatures)
ctid → hash(Transaction, excluding signatures (scriptSig = 0))

The txid would be used for the merkle root in each block. This means that you must provide the signatures in order to prove which transactions were included in the block. Miners would need the full transactions (including signatures) to update their UTXO sets.

The canonical txid (ctid) would be used for transaction inputs. This could be a new transaction version.

This eliminates the second class citizen problem for signatures, but means that changing the signatures (or scriptSig) can’t invalidate transaction chains.

That looks to me to give the best of both worlds.
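
To make the txid/ctid split concrete, here is a minimal Python sketch. It assumes the classic transaction serialization and double SHA-256, and it takes "scriptSig = 0" to mean an empty scriptSig with a zero length byte. The names `Tx`, `TxIn`, `TxOut`, `serialize`, `txid` and `ctid` are made up for illustration and are not from any existing library or spec.

```python
import hashlib
from dataclasses import dataclass, replace
from typing import Tuple


def hash256(data: bytes) -> bytes:
    """Double SHA-256, as used for transaction ids."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def compact_size(n: int) -> bytes:
    """Compact variable length integer."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")


@dataclass(frozen=True)
class TxIn:
    prev_hash: bytes          # 32 raw bytes identifying the parent transaction
    prev_index: int
    script_sig: bytes         # unlocking script (contains the signatures)
    sequence: int = 0xFFFFFFFF


@dataclass(frozen=True)
class TxOut:
    value: int
    script_pubkey: bytes


@dataclass(frozen=True)
class Tx:
    version: int
    inputs: Tuple[TxIn, ...]
    outputs: Tuple[TxOut, ...]
    locktime: int = 0


def serialize(tx: Tx) -> bytes:
    out = tx.version.to_bytes(4, "little") + compact_size(len(tx.inputs))
    for txin in tx.inputs:
        out += txin.prev_hash + txin.prev_index.to_bytes(4, "little")
        out += compact_size(len(txin.script_sig)) + txin.script_sig
        out += txin.sequence.to_bytes(4, "little")
    out += compact_size(len(tx.outputs))
    for txout in tx.outputs:
        out += txout.value.to_bytes(8, "little")
        out += compact_size(len(txout.script_pubkey)) + txout.script_pubkey
    return out + tx.locktime.to_bytes(4, "little")


def txid(tx: Tx) -> bytes:
    """Hash of the full transaction, including signatures (merkle tree)."""
    return hash256(serialize(tx))


def ctid(tx: Tx) -> bytes:
    """Hash with every scriptSig emptied (ASSUMED meaning of 'scriptSig = 0')."""
    stripped = tuple(replace(txin, script_sig=b"") for txin in tx.inputs)
    return hash256(serialize(replace(tx, inputs=stripped)))


# Two versions of one payment that differ only in the scriptSig bytes.
a = Tx(2, (TxIn(b"\x11" * 32, 0, b"\x01\xaa"),), (TxOut(1000, b"\x51"),))
b = Tx(2, (TxIn(b"\x11" * 32, 0, b"\x02\xbb\xcc"),), (TxOut(1000, b"\x51"),))
print(txid(a) != txid(b))  # True: the merkle tree commits to the signatures
print(ctid(a) == ctid(b))  # True: a child referencing the ctid stays valid
```

In this sketch a child transaction would put `ctid(parent)` in its `prev_hash`, so malleating the parent's scriptSig changes the parent's txid but not the reference the child signed over.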

BigBlockIfTrue · April 11, 2021, 9:51pm

This is like saying “there is no way to be sure that smart contracts contain no bugs”.

The Bitcoin Cash scripting language now is powerful enough to write locking scripts in a non-malleable way. At this point, if some transaction is still malleable, then this is a bug in that specific smart contract, not a bug in Bitcoin Cash.

It’s not hard to find and solve this type of bug while writing the smart contract. In fact, analysing existing locking scripts this way is how we ended up with the original list of malleability issues that we have now solved.

TierNolan · April 12, 2021, 1:13pm

I disagree. If you keep a smart contract short, then it is less likely to have bugs.

When auditing two implementations of a smart contract, the shorter one (or at least the less complex one) is less likely to have a bug. It is reasonable for an auditor to say something like “The code does not conform to a consistent coding style, which makes review more difficult. This increases the risk of an edge case that was not considered.”

Making the txid for referencing previous transactions depend only on parts of the transaction that are signed means that malleability post signing is impossible.

This is inherently safer than looking at the signature scripts and listing all the ways they could be modified in a way that will invalidate the signature. If you miss one, then the transaction malleability returns.

Do you mean using the new opcodes? If so, then fair enough. The NOINPUT version of sighash can be emulated, which gets most of the benefit.

If you mean BIP-62, then you are probably right, but it is a higher risk than just directly fixing the problem.

bitjson · May 13, 2021, 4:48pm

Thanks for bringing this back up @TierNolan!

I’ve been thinking about malleability a lot recently, and I’m now thinking it’s worth addressing further.

First I want to acknowledge: in the Bitcoin Cash world, we have both covenants and reasonably secure zero-confirmation transactions; malleability is mostly an inconvenience.

The original use case for which SegWit was designed (pre-signing chains of off-chain transactions) can be easily accomplished with BCH covenants. The CashChannels implementation even offers some additional functionality which isn’t possible on BTC. If someone bothered to do it, it would be quite easy to build off-chain micropayment networks like Lightning Network using covenants. (And once BCH transaction introspection lands, those covenant settlement transactions could use fewer bytes than Lightning Network settlement transactions.)

All that said, malleability is still holding back some meaningful improvements. From the PMv3 thread:

Malleability makes contracts less efficient and harder to validate – most non-trivial contracts must carefully validate all unlocking bytecode data to prevent vulnerabilities introduced by malleation, and this validation both bloats the contract and makes it harder to review for security. (For example, most covenants which use OP_SPLIT are vulnerable to a sort of “padding” attack which is not intuitive to first-time contract authors.)

The primary blocker to deduplication in transactions is unlocking bytecode malleability – because unlocking bytecode typically contains signatures (and signatures can’t sign themselves), unlocking bytecode is excluded from transaction signing serialization (“sighash”) algorithms. This is also the reason why unlocking bytecode must contain only push operations – the result of any non-push-including unlocking bytecode is a “viable malleation” for that unlocking bytecode. But if unlocking bytecode is signed, transaction introspection operations offer an opportunity to further reduce transaction sizes via deduplication. In a sense, if non-push operations could be used in unlocking bytecode, transactions would effectively have safe, efficient, zero-cost decompression via introspection opcodes.

So, as @TierNolan mentions above, a malleability solution would allow contracts to be smaller (in both byte size and operation count) and easier to audit. It would also unlock a new category of transaction size optimizations which, unlike generalized compression with zstd, gzip, etc., don’t increase CPU usage, and with which transactions would even remain size-optimized during use (they stay “compressed” even during evaluation and in normalized databases).

I also wrote a bit more in the PMv3 thread about other considerations, but in short, I think there is a surprisingly simple solution: instead of removing the unlocking bytecode from the transaction hash (SegWit), we can just start signing the unlocking bytecode too. This would prevent all possible types of transaction malleability. (And enable some other contract features.)

There’s more discussion in the PMv3 thread, and the precise implementation is here.

I’d love to hear what you all think about that approach!

bitcoincashautist · December 31, 2021, 8:33am

Instead of removing the unlocking bytecode from the transaction hash (SegWit), we can remove signatures from the unlocking bytecode, but we don’t need to bring them outside the input, they can stay attached to the input. This way, signatures can sign everything except themselves and other signatures, and we don’t have to break the TX format. Here’s an alternative proposal I drafted to capture this idea. Note that we wouldn’t get signature compression as we would with PMv3, but the proposal would make them compressible on other layers.

We can extend the input format to contain an optional data attachment, inserted as a prefix to the unlocking script.

transaction inputs
- input 0
  - previous output transaction hash, 32 raw bytes
  - previous output index, 4-byte uint
  - unlocking script length, compact variable length integer
  - unlocking script
    - PFX_SIGNATURES, 1-byte constant 0xEF
      - signature 0 length, compact variable length integer
      - signature 0, variable number of raw bytes
      - …
      - signature N length
      - signature N
    - real unlocking script, variable number of raw bytes
  - sequence number, 4-byte uint
- …
- input N

We will refer to the full prefix with its arguments as “detached signature annotation”.
Unupgraded software, unaware of the input format change, will interpret the annotation as part of the unlocking script.
As a consequence:

- Unupgraded node software would fork the blockchain because, from its point of view, such an unlocking script starts with a disabled opcode.
- Unupgraded non-node software should already know how to deal with disabled opcodes found on the blockchain, so it should not break when encountering them. From its point of view, 0xEF could be some new data push opcode followed by random data.

Signature Preimage Format

Simply put, when a detached signature is used, the preimage will be everything indicated by the hash type, but with the detached signature annotation(s) excluded.
In place of each input’s detached signature annotation, only the detached signature prefix byte and the number of detached signatures for that input will be included.

This definition allows for some future upgrade to transaction format that would add another input field using the same PreFiX byte approach.
Reordering multiple PFX fields would invalidate the signature because the signature hash would commit to the specific order and number of detached signatures.
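
As a rough illustration of the difference between what goes on the wire and what gets signed, here is a Python sketch of the unlocking script field in both forms. Several things in it are assumptions rather than part of the draft above: the signature count is encoded as a compact variable length integer in the preimage, an input without detached signatures carries no annotation at all, and the function names are invented. The layout listing does not say how a parser finds the end of the signature list, so the sketch only builds the fields and does not try to parse them.

```python
from typing import Sequence

PFX_SIGNATURES = 0xEF  # 1-byte constant from the input layout above


def compact_size(n: int) -> bytes:
    """Compact variable length integer."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")


def wire_unlocking_field(signatures: Sequence[bytes], real_script: bytes) -> bytes:
    """Unlocking script field as laid out in the listing: prefix byte,
    length-prefixed signatures, then the real unlocking script."""
    if not signatures:
        return real_script  # ASSUMPTION: the annotation is simply absent
    field = bytes([PFX_SIGNATURES])
    for sig in signatures:
        field += compact_size(len(sig)) + sig
    return field + real_script


def preimage_unlocking_field(signatures: Sequence[bytes], real_script: bytes) -> bytes:
    """What the signature hash would commit to for this field: the prefix
    byte and the signature count stand in for the whole annotation."""
    if not signatures:
        return real_script  # ASSUMPTION: nothing to replace
    # ASSUMPTION: the count is encoded as a compact variable length integer.
    return bytes([PFX_SIGNATURES]) + compact_size(len(signatures)) + real_script


script = b"\x21" + b"\x02" * 33           # placeholder push of a 33-byte key
sig_a, sig_b = b"\xaa" * 65, b"\xbb" * 65  # placeholder 65-byte signatures

# Swapping the signature bytes changes the wire form but not the preimage.
print(wire_unlocking_field([sig_a], script) != wire_unlocking_field([sig_b], script))  # True
print(preimage_unlocking_field([sig_a], script) == preimage_unlocking_field([sig_b], script))  # True
# Adding a signature changes the count, and therefore the preimage.
print(preimage_unlocking_field([sig_a], script) != preimage_unlocking_field([sig_a, sig_b], script))  # True
```

Since the real unlocking script stays in the preimage, any third-party change to it would invalidate the signatures, while the signatures themselves never have to sign their own bytes.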

For the 6 valid signature hash types in Bitcoin Cash, this means:

- SIGHASH_ALL | SIGHASH_FORKID: Signature hash commits to the entire TX except detached signatures.
- SIGHASH_NONE | SIGHASH_FORKID: Signature hash commits to the entire input side of the TX except detached signatures.
- SIGHASH_SINGLE | SIGHASH_FORKID: Signature hash commits to the entire input side of the TX except detached signatures, and to the output with the same index.
- SIGHASH_ALL | SIGHASH_ANYONECANPAY | SIGHASH_FORKID: Signature hash commits to its own entire input (except detached signatures) and to all transaction outputs.
- SIGHASH_NONE | SIGHASH_ANYONECANPAY | SIGHASH_FORKID: Signature hash commits only to its own entire input (except detached signatures).
- SIGHASH_SINGLE | SIGHASH_ANYONECANPAY | SIGHASH_FORKID: Signature hash commits only to its own entire input (except detached signatures) and to the output with the same index.

bitcoincashautist · April 4, 2022, 12:45pm

@bitjson since you opened the P2SH32 topic, I had a fresh look at BU’s “nextchain” box and saw they did some research on that front, which might be of interest for further research into our blockchain theory: Transaction Changes

Separating WHAT (state transformation) from HOW (exact unlocking bytecode) is valuable insight.

Recognizing that a transaction is fundamentally exactly and only a transformation of blockchain UTXO state allows for a clean input script malleability solution. From the point of view of the blockchain, all valid transactions that effect the same UTXO state transformation are equivalent, since the UTXO state is the only data that subsequent transactions can access.

If a transaction consumes (and removes) the same coins, and produces the same outputs, the final UTXO will be exactly the same regardless of how the transaction accomplished this. Therefore, for example, it does not matter how a transaction satisfies its input constraint scripts, only that it does so.
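
Here is a toy Python model of that idea, not the nextchain design itself. It assumes a made-up identifier, `state_id`, that hashes only the consumed outpoints and the produced outputs, and it skips script validation entirely. All names (`Output`, `Tx`, `state_id`, `apply_tx`) are hypothetical.

```python
import hashlib
from typing import Dict, NamedTuple, Tuple

Outpoint = Tuple[bytes, int]  # (id of the creating transaction, output index)


class Output(NamedTuple):
    value: int
    locking_script: bytes


class Tx(NamedTuple):
    spends: Tuple[Tuple[Outpoint, bytes], ...]  # (outpoint, unlocking bytecode)
    outputs: Tuple[Output, ...]


def state_id(tx: Tx) -> bytes:
    """Hypothetical id covering WHAT the transaction does (coins consumed,
    outputs produced) and ignoring HOW the inputs were unlocked."""
    h = hashlib.sha256()
    h.update(len(tx.spends).to_bytes(4, "little"))
    for (prev_id, index), _unlocking in tx.spends:
        h.update(prev_id + index.to_bytes(4, "little"))
    h.update(len(tx.outputs).to_bytes(4, "little"))
    for out in tx.outputs:
        h.update(out.value.to_bytes(8, "little"))
        h.update(len(out.locking_script).to_bytes(4, "little"))
        h.update(out.locking_script)
    return h.digest()


def apply_tx(utxos: Dict[Outpoint, Output], tx: Tx) -> Dict[Outpoint, Output]:
    """Return the UTXO set after the transaction (no script validation here)."""
    new = dict(utxos)
    for outpoint, _unlocking in tx.spends:
        del new[outpoint]  # raises KeyError if the coin is not spendable
    sid = state_id(tx)
    for i, out in enumerate(tx.outputs):
        new[(sid, i)] = out
    return new


coin: Outpoint = (b"\x00" * 32, 0)
utxos = {coin: Output(5000, b"\x51")}
outs = (Output(4000, b"\x52"), Output(900, b"\x53"))

# Same coins consumed, same outputs produced, different unlocking bytecode.
tx_a = Tx(((coin, b"\x01\xaa"),), outs)
tx_b = Tx(((coin, b"\x02\xbb\xcc"),), outs)

print(apply_tx(utxos, tx_a) == apply_tx(utxos, tx_b))  # True
```

The two resulting UTXO sets only come out identical because the new outpoints are keyed by an id that ignores the unlocking bytecode; keyed by a hash of the full transaction, they would differ.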