CHIP 2021-01 Allow Transactions to be smaller in size

I am definitely a fan of this CHIP and think it can be grouped into the next protocol upgrade, but it does not warrant a protocol upgrade by itself. My thought process is simple. Existing miners can use the exact same code they have today and keep the extra padding in their coinbase transaction; it doesn’t require them to change anything. New miners or users will have a better experience, and we aren’t continuing with a limit that wasn’t well specified. I complained about this issue to ABC, and would love to see it simply be != 64.

Then as a bonus there are a few transactions that are < 64 bytes that would be allowed.

3 Likes

Concur except I also prefer the more precise !=64 assertion because it helps with institutional memory. Imagine one day the issue with the merkle tree somehow goes away - this restriction can be easily removed with little risk. However, imagine also that prior to then some issue pops up in an unrelated “improvement” that would break if a tx is <65 bytes but it’s either not detected or documented because there’s already this wider scope restriction. Now when the attempt is made to remove the restriction because the merkle tree caveat no longer applies - we hit surprise runtime issues.

This is the kind of technical debt we find in long term “enterprisey” ™ projects and I’d prefer to nip it in the bud rather than introduce the potential for it.

Mostly I agree 1000% with Tom’s consideration of project-wide cost of future adoption and the desire to make things simpler for future stake-holders. This is critical and has clearly not been a community-wide priority given the current state of much of the core software. Little by little we should remedy this whenever possible and certainly not introduce more of the issue when not absolutely necessary.

2 Likes

I find that a very compelling argument, and frankly the first real argument to decide between the two options.

Together with the same feeling from quest I’m tempted to change the proposal to be “NOT 64 bytes”. I’ll leave this topic open here for a while longer to allow others to find it and comment before I change it, though.

3 Likes

Perhaps ≠64 makes more sense if you think the limit is going to be removed eventually, and >64 makes more sense if you think the limit will remain permanently. ≠64 is more work now, >64 is potentially more work in the future. The technical debt only exists if you anticipate future removal of the limit.
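To make the difference between the two candidate rules concrete, here is a minimal sketch. The function names are made up for illustration and are not taken from any node implementation; the only fact used is that the two rules disagree on sizes below 64 bytes.

```python
# Hypothetical helper names; a sketch of the two candidate rules only.
MERKLE_AMBIGUOUS_SIZE = 64  # size of two concatenated 32-byte hashes


def allowed_not_equal(tx_size: int) -> bool:
    """'!= 64' rule: only the exact ambiguous size is rejected."""
    return tx_size != MERKLE_AMBIGUOUS_SIZE


def allowed_greater_than(tx_size: int) -> bool:
    """'> 64' rule: every size up to and including 64 bytes is rejected."""
    return tx_size > MERKLE_AMBIGUOUS_SIZE


def differing_sizes(max_size: int = 100) -> list[int]:
    """Sizes on which the two rules give different answers."""
    return [
        n
        for n in range(1, max_size + 1)
        if allowed_not_equal(n) != allowed_greater_than(n)
    ]


if __name__ == "__main__":
    diff = differing_sizes()
    print(f"rules disagree on sizes {diff[0]}..{diff[-1]}")
```

So the whole debate is about sizes 1 through 63: both rules reject exactly 64 and both accept everything from 65 up.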

Personally I don’t think it’s likely that the limit is ever going to be removed, because removal requires significant changes across the ecosystem and there is now no more incentive to make these changes.

The origin of this story is, in its simplest form, that the merkle-proof message sends over a list of hashes, and whether any given hash is a leaf or an inner node is decided by an assumption rather than stated explicitly. That assumption can go wrong when a transaction is exactly 64 bytes, the size of two concatenated hashes.

The simplest solution is to avoid having such transactions.
The technically more correct solution is to fix the message-format for merkle-proofs and make things explicit there. A simple one byte addition per hash to indicate its type.
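As a rough illustration of both points, the sketch below shows why 64 bytes is the ambiguous size (an inner merkle node hashes the 64-byte concatenation of its two children with the same double SHA-256 used for a txid) and what a one-byte type marker per hash could look like. The `LEAF`/`NODE` values and `proof_entry` are hypothetical, not an existing message format, and the 64 bytes used here are just two hashes glued together, not a valid transaction.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used for txids and merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def inner_node(left: bytes, right: bytes) -> bytes:
    """Hash of an inner merkle node: sha256d over the two child hashes."""
    return sha256d(left + right)


def txid_internal(raw_tx: bytes) -> bytes:
    """Transaction id in internal byte order: sha256d over the raw bytes."""
    return sha256d(raw_tx)


# Two arbitrary child hashes; their concatenation is exactly 64 bytes.
left = sha256d(b"left child")
right = sha256d(b"right child")
sixty_four_bytes = left + right

# A verifier that only sees hashes cannot tell these two cases apart.
same_hash = inner_node(left, right) == txid_internal(sixty_four_bytes)

# Hypothetical explicit encoding: one type byte in front of every hash.
LEAF, NODE = 0x00, 0x01


def proof_entry(kind: int, digest: bytes) -> bytes:
    """Wire entry of 33 bytes: type marker followed by the 32-byte hash."""
    return bytes([kind]) + digest


if __name__ == "__main__":
    print("leaf and node indistinguishable by hash:", same_hash)
    print("entry size with type byte:", len(proof_entry(LEAF, left)))
```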

In future I’m sure that the message-format will change as the medium will change. People might start to send stuff over JSON as a simple example.

As such I’m quite optimistic that as time goes on we fix this properly, in the right layer. It may take 10 years as communication layers are upgraded, but I do think that will happen. Probably in much less than 10 years.

And when that happens, there no longer is any need to forbid the 64 bytes tx.
Remember, the only reason this solution was picked is because it had practically zero impact.

3 Likes

I’ve reviewed the CHIP and support it as written. My interest is as a development stakeholder.

2 Likes

I reviewed this and would support activation of this CHIP (revision 75b97e22a3dc295de7373255025f38cd0911b866) in the first network upgrade after May 2021.

1 Like

I renamed the CHIP, adding CHIP to the filename and generally making it clearer, which made the link above fail.

Here is the new link: CHIP-2021-01-Allow Smaller Transactions.md

Are 63 bytes and smaller allowed now? The wording is ambiguous: the word “limit” in “the limit that transactions shall not be 64 bytes in size” could perhaps better be replaced with “restriction” or “rule”. The impact section still says changing the minimum from 100 to 65. But other sections appear to have changed in line with the argumentation for allowing 63 bytes and smaller.

1 Like

Thank you for proof reading it, I agree the wording was not as clear as it could have been. I pushed a commit to clarify and used the word “rule”.

I updated the last edit date to today, but did not change the version number, since this is a wording change and not a change that affects the spec.

Awemany (working as a BU dev) and I originally brought this problem to the BCH community’s attention. When ABC chose to limit at 100 bytes I pointed out this was idiocy way back before the original fork, and was ignored. So BU is happy to support a change that fixes yet-another-dumb-decision autocratically made during the ABC days.

3 Likes

Is this ready for implementation? (for May 2023 activation)

2 Likes

In my opinion it is ready for implementation in BCHN.

3 Likes

The technically more correct solution is to fix the message-format for merkle-proofs and make things explicit there. A simple one byte addition per hash to indicate its type.

I think we should go that way. Or even better: make a tagged hash, to avoid using hashes in a different context (so that “some 512-bit tag” can be used to set the first 512-bit block for SHA-256). Also because merkle branch is not the only thing hashed by SHA-256d. There are also block headers (80 bytes per header), and it is possible to create some block header, that could be interpreted as a valid transaction. Of course, attacking in this way is hard, because it would require mining a lot of bytes, but technically, it can be quite well aligned (also because it is possible to set sequence to 0xffffffff, and then use locktime as a nonce to mine a transaction).
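For readers unfamiliar with the idea, a tagged hash can be sketched as below. This follows the construction known from BIP340, where the SHA-256 digest of the tag is repeated twice so that the prefix fills exactly one 512-bit SHA-256 block. The tag strings are invented for illustration, and nothing here is part of the CHIP or of current BCH consensus.

```python
import hashlib


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """SHA-256 with a 64-byte, tag-derived prefix (BIP340-style)."""
    tag_digest = hashlib.sha256(tag.encode("utf-8")).digest()
    prefix = tag_digest + tag_digest  # 64 bytes = one 512-bit block
    return hashlib.sha256(prefix + msg).digest()


if __name__ == "__main__":
    payload = bytes(80)  # same 80 bytes hashed in two different contexts
    # Hypothetical tag names, chosen only for this example.
    as_header = tagged_hash("example/block-header", payload)
    as_tx = tagged_hash("example/transaction", payload)
    print("contexts give different hashes:", as_header != as_tx)
```

With distinct tags per context, the same bytes can no longer produce the same hash as a transaction, a merkle node, and a block header.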

I think it would break way too many things, without a clear benefit.
If quantum computing ever becomes a threat, we’d need to move to 384-bit hashes which would break stuff all the same but the benefit would be: survival of the blockchain :smiley: So, in the same go we would be able to fix this too.

Is it, though? I don’t think it has enough degrees of freedom for it to even be theoretically possible to match a valid TX. I entertained this idea while working on the group tokens (“unforgeable groups”) CHIP and had the idea of also allowing coinbase TXs to create a new group, where it would be generated from the previous block hash instead of the TXID.

PS oh, but you mean using an 80-byte TX to match a block header… in a sort of collision attack? Ok, suppose you manage it and now there’s a TX and a block with the same hash, what’s the consequence?

Would you want to work on a new message type in the current p2p layer, or start working on a new message layer altogether in order to get this rolling?

Ok, suppose you manage it and now there’s a TX and a block with the same hash, what’s the consequence?

I don’t have that much computing power. But I know that it is technically possible, in the same way as it is technically possible to create a 64-byte transaction that would be interpreted as a merkle branch with two concatenated 32-byte hashes. Here is how:

+-----------------------+-------------------------+
| blockHeader(80)       | bitcoinTransaction(80)  |
+-----------------------+-------------------------+
| version(4)            | version(4)              |
| previousBlockHash(1)  | inputCount(1)           |
| previousBlockHash(31) | previousTransaction(31) |
| merkleRoot(1)         | previousTransaction(1)  |
| merkleRoot(4)         | previousOutput(4)       |
| merkleRoot(1)         | emptyInputScript(1)     |
| merkleRoot(4)         | sequenceNumber(4)       |
| merkleRoot(1)         | outputCount(1)          |
| merkleRoot(8)         | amount(8)               |
| merkleRoot(1)         | outputScriptSize(1)     |
| merkleRoot(12)        | outputScript(12)        |
| timestamp(4)          | outputScript(4)         |
| targetBits(4)         | outputScript(4)         |
| nonce(4)              | locktime(4)             |
+-----------------------+-------------------------+

By using locktime as a nonce, miners can mine a transaction whose hash starts with many zero bits, and can try to trick users into believing that they mined a valid block. Only full nodes would know that this is not the case and that the block is fake. However, to mount any serious attack on full nodes, the same 80 bytes would have to form a valid block header and a valid transaction at the same time; then software bugs in full nodes could cause some chaos. But that is very, very hard, probably comparable to the 64-byte transaction attack. To make it serious, a lot of bytes have to be fixed so that the data is valid from both points of view: as a transaction and as a block header. That’s why a serious attack in the near future is quite unlikely, but in theory it is possible.
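The overlap in the table can be checked mechanically. The sketch below packs 80 bytes using the transaction layout from the table and then reads the same bytes with the block header layout. The field values are placeholders chosen for readability; the result is not a valid transaction or a valid header, and the helper names are made up for this example.

```python
import struct

# Little-endian, no padding. Both layouts cover exactly 80 bytes.
HEADER_FMT = "<I32s32sIII"     # version, prev hash, merkle root, time, bits, nonce
TX_FMT = "<IB32sIBIBqB20sI"    # fields in the order shown in the table


def build_tx_bytes(locktime: int) -> bytes:
    """Pack a one-input, one-output transaction shape with placeholder data."""
    return struct.pack(
        TX_FMT,
        1,                 # version
        1,                 # input count
        bytes(range(32)),  # previous transaction hash (placeholder)
        0,                 # previous output index
        0,                 # input script length (empty script)
        0xFFFFFFFF,        # sequence number
        1,                 # output count
        1000,              # amount
        20,                # output script size
        bytes(20),         # output script (placeholder)
        locktime,          # locktime, doubling as the header nonce
    )


def view_as_header(raw: bytes) -> dict:
    """Interpret the same 80 bytes with the block header layout."""
    version, prev_hash, merkle_root, timestamp, bits, nonce = struct.unpack(
        HEADER_FMT, raw
    )
    return {
        "version": version,
        "prev_block_hash": prev_hash,
        "merkle_root": merkle_root,
        "timestamp": timestamp,
        "bits": bits,
        "nonce": nonce,
    }


if __name__ == "__main__":
    raw = build_tx_bytes(locktime=12345)
    header = view_as_header(raw)
    print(len(raw), header["nonce"], header["prev_block_hash"][0])
```

The header nonce comes out equal to the locktime, and the first byte of the "previous block hash" is the transaction's input count, matching the rows of the table.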

When it comes to the consequences, I am still not sure, because I need to read more code. Or maybe we could prepare a test version where, for example, SHA-256 is reduced to 16 rounds; then it may be possible to mount this kind of attack and see how a node would behave.

1 Like

Awesome, thanks for laying it out clearly like that!

We can observe something here: there are at least 32 bytes you can’t possibly match other than by brute-forcing the previous transaction hash (because you must roll the previous transaction contents until you hit your desired hash in this one) which turns this problem into a preimage attack against 256 bits.

If you wanted to make it “easier”, you’d roll both the previous block and the previous transaction simultaneously, which would turn it into a collision search with 128 bits of security but made harder by the fact you need to also match the 0s with the collision. And after finding that so those bits “click”, you’d still need to mine this block, all inside a 10-minute window.

Btw, the nonce alone is no longer enough to hit a block, and I think difficulty grew enough that miners started rolling parts of the version, timestamp and merkle root too! BTC is doing about 200 EH/s, which translates into some 2^77 attempts per 10-minute block, so you need at least 10 bytes of wiggle room. With BCH it’s lower but still some 1.3 EH/s, meaning 2^69 attempts and requiring about 9 bytes.
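The arithmetic behind those numbers can be reproduced with a few lines. The hashrates are the figures quoted in this post, not current measurements, and the function names are only for this example.

```python
import math


def attempts_per_block(hashrate_hs: float, block_seconds: float = 600.0) -> float:
    """Expected number of hash attempts during one block interval."""
    return hashrate_hs * block_seconds


def bits_needed(attempts: float) -> float:
    """Bits of search space required to cover that many attempts."""
    return math.log2(attempts)


def bytes_needed(attempts: float) -> int:
    """Whole bytes of freely variable data required."""
    return math.ceil(bits_needed(attempts) / 8)


if __name__ == "__main__":
    EH = 1e18  # hashes per second in one exahash
    for name, rate in (("BTC", 200 * EH), ("BCH", 1.3 * EH)):
        n = attempts_per_block(rate)
        print(f"{name}: about 2^{bits_needed(n):.1f} attempts, {bytes_needed(n)} bytes")
```

Since the header nonce is only 4 bytes, the remaining bytes have to come from other fields, which is why miners roll more than the nonce.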

Got a supportive statement from a miner, and permission to quote him:

I do support the TX size one, removing the only case that create confusion instead of a whole range is the way to go.

– checksum0, Bitcoin miner since 2010

1 Like

Regarding the whole != 64 vs >64 debate – I think there is a reason to prefer >64 bytes as the rule.

I’m thinking of lib authors here when I say that for software that authors txns, >64 is a better rule for human reasons and software engineering reasons.

Let me explain why. It’s generally easier on software to have a minimum length of things, this way you are more likely to “trigger” the rule and get an error early if you inadvertently violate the rule. A magical forbidden value of precisely 64 may be “surprising” to some software. 61 works, 62 works, 63 works, etc… but 64, when hit occasionally, fails.

This can be exasperating for someone that is not familiar with all the BCH rules but is building libs to build txns for BCH – if you only hit the magical 64 value 25% of the time, it is strange to you why your txn is rejected and is very surprising behavior. So, the != 64 rule is less likely to fail early. >64 is more likely to be triggered and more likely to “alert” software authors that there is a length rule, so you better watch out!

So for this reason I prefer >64, as @BigBlockIfTrue suggested initially.

3 Likes