Restrict transaction version numbers

markblundeberg · September 3, 2020, 2:54am

There are many ideas under consideration about introducing new transaction formats. For keeping things clean it is best if the new transaction format can use a never-before-seen transaction version number. Since the transaction version number is the first four bytes of a transaction, this means that a transaction parser can trivially branch between ‘legacy format’ and ‘new format’ based on the first four bytes, without needing any context.
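A minimal sketch of that context-free branch, assuming a hypothetical new-format version number. The constant `kNewFormatVersion` and the function names are invented for illustration; no such number has been assigned.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical value, for illustration only.
constexpr int32_t kNewFormatVersion = 123456;

enum class TxFormat { Legacy, New };

// Read the 4-byte little-endian version field at the start of a serialized transaction.
int32_t ReadTxVersion(const std::vector<uint8_t> &raw) {
    assert(raw.size() >= 4 && "a serialized transaction starts with a 4-byte version");
    const uint32_t v = static_cast<uint32_t>(raw[0]) |
                       (static_cast<uint32_t>(raw[1]) << 8) |
                       (static_cast<uint32_t>(raw[2]) << 16) |
                       (static_cast<uint32_t>(raw[3]) << 24);
    return static_cast<int32_t>(v);
}

// Choose a parser using nothing but the bytes of the transaction itself.
TxFormat DetectFormat(const std::vector<uint8_t> &raw) {
    return ReadTxVersion(raw) == kNewFormatVersion ? TxFormat::New : TxFormat::Legacy;
}
```

This only works if no legacy-format transaction carrying `kNewFormatVersion` can ever have been mined.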

Currently there is only a policy-level rule that transactions shall be version 1 or 2. (OP_CSV and BIP68 are only active for tx version >= 2, which is why version 2 is allowed.) However, any transaction version is permitted by consensus.

Thus there is a problem: right now if we (say) propose that a new transaction format will use version=123456 (never used as of today), then a disruptive person can mine a block that includes examples of legacy transactions with version=123456, prior to the proper activation of the new format. Then, the new transaction format will be reusing a prior-seen transaction version, and this creates a technical nightmare for nodes and wallets that now cannot parse transactions without having context of which block those transactions have appeared in or will appear in. Context-free parsing is very valuable and worth keeping.

Hence, a proposal:

An upgrade shall be made that restricts transaction version numbers to be 1 or 2, at consensus layer.

Estimated negative impacts:

Miners: No harm, since miners generally use transaction version=1 for the coinbase. Unlike the block header version, there is no advantage / meaning assigned to the coinbase tx version.
Wallets/ecosystem: No harm for wallets which are already constrained by the version=1,2 rule when generating new transactions.
Full node development burden: This would be a routine upgrade with no weird side effects and can be easily included as an MTP-activated rule in ContextualCheckTransaction (or equivalent for other codebases).
Future development: This change is purely meant to simplify future development of hard forks, but at the same time it hampers the options available to future soft forks. If it is anticipated that BCH will ossify and convert to a soft-fork-only network like BTC, then the proposed version restriction should have a predetermined sunset date.
The actual introduction of a new tx format will require a great deal of work from all ecosystem players. Thus if the entire concept of a new tx format is opposed, then this version restriction may also be opposed as it increases the likelihood of a new tx format being introduced.
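As a rough sketch of what such an MTP-activated rule could look like: the function name, its signature, and the `activationTime` parameter below are assumptions made for illustration, not the actual ContextualCheckTransaction interface of any node.

```cpp
#include <cassert>
#include <cstdint>

// Consensus-level version check, gated on median time past (MTP).
// `activationTime` is a hypothetical parameter; no activation date is implied.
bool CheckTxVersionAtConsensus(int32_t txVersion, int64_t medianTimePast, int64_t activationTime) {
    assert(medianTimePast >= 0 && activationTime >= 0); // Unix timestamps
    if (medianTimePast < activationTime) {
        return true; // before activation, any version is valid by consensus
    }
    return txVersion == 1 || txVersion == 2;
}
```

Before activation the function accepts every version, which matches today's consensus behaviour; afterwards only versions 1 and 2 pass.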

tom · September 3, 2020, 6:54am

Sounds like a good idea.

I tested this idea when I worked on FlexTrans (it was a heavy-handed alternative to SegWit) and realized that testnet3 was full of transactions that used random version numbers. This should not be an issue at all; it is just a good idea to base activation on a time and/or height, and to activate only on the BCH branches of testnets.

im_uname · September 3, 2020, 7:23am

I doubt very many people oppose a new tx format per se, people just all have different ideas what the new format looks like.

On topic: An attacker might attempt to consume all the lower version numbers before activation, but I guess consuming all 4 billion possible versions is quite hard - so concept ack.

tom · September 4, 2020, 7:17am

Since you bring it up: it is relevant to point out that this kind of effort (and I know, I went through a lot of steps previously) is an absolutely massive strain on the system. 100% of the infrastructure, wallets, etc. all need to get the new code, which is not trivial to test, and such an effort can’t have a 3-month notice. More like an 18-month notice.

This is then a really good point: if you need to convince an entire ecosystem to upgrade all of their software, the bar lies really high for the requirements of this format. It really has to be worth the effort for everyone. Not just a minority that wants some non-money feature.

bitjson · January 20, 2021, 6:25pm

Hey all, I agree that new transaction versions need a lot of time for review and implementation, so I’d like to get the conversation started on enabling contract-validated token systems for BCH. I think it’s possible without raising any current VM limits or requiring miners to track any new type of data: we just need the ability to create fixed-size inductive proofs and some way to read/manipulate the currently-incompatible integers in transaction serializations.

This change might also be a good time to begin restricting version numbers.

I’m proposing May 2022 as the deployment date. I’d appreciate any questions, comments, or feedback you have. Thanks!

tom · January 22, 2021, 9:00pm

A CHIP: 2021-01-Restrict Transaction Versions.md

BigBlockIfTrue · February 14, 2021, 1:00pm

Reviewed, looks good to me. Recommend activation of this CHIP in the first network upgrade after May 2021.

proteusguy · March 12, 2021, 6:00pm

Does this overlap/have anything to do with the seemingly random versions that miners assign to their blocks? Is there a story behind this chaos that I’m not aware of?

tom · March 12, 2021, 7:33pm

Only in passing. The block version numbers look more sane if you look at them as hexadecimal. Also check the BIP9 spec.
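To illustrate the hexadecimal reading: BIP9 fixes the top three bits of the block version to 001 and leaves the lower 29 bits for signalling, so a version that looks arbitrary in decimal (536870912) is simply 0x20000000. The function names in this sketch are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

constexpr uint32_t kTopMask = 0xE0000000u; // the top three bits of the version
constexpr uint32_t kTopBits = 0x20000000u; // BIP9 requires them to be 001

// True if the block version follows the BIP9 "version bits" layout.
bool UsesVersionBits(int32_t blockVersion) {
    return (static_cast<uint32_t>(blockVersion) & kTopMask) == kTopBits;
}

// True if a BIP9-style version has signalling bit `bit` (0..28) set.
bool SignalsBit(int32_t blockVersion, int bit) {
    assert(bit >= 0 && bit < 29);
    return UsesVersionBits(blockVersion) &&
           ((static_cast<uint32_t>(blockVersion) >> bit) & 1u) != 0;
}

// Format the version the way it is easiest to read: as 8 hex digits.
std::string ToHex(int32_t blockVersion) {
    char buf[11];
    std::snprintf(buf, sizeof(buf), "0x%08x", static_cast<unsigned int>(blockVersion));
    return buf;
}
```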

freetrader · March 13, 2021, 1:26pm

BigBlockIfTrue wrote: “Reviewed, looks good to me. Recommend activation of this CHIP in the first network upgrade after May 2021.”

Likewise

andrewstone · March 16, 2021, 2:32pm

I support this change, so BU does so unless a BUIP is explicitly raised against it.

TierNolan · April 5, 2021, 12:44am

As long as 50%+ of the miners support this rule, it isn’t possible for someone to flood the network with transactions that are not version 1 or 2.

I realise that Bitcoin Cash is less opposed to hard forks. Bear in mind that making it a consensus rule that only version 1 or 2 transactions are valid means that new transaction versions require a hard fork too.

The IsStandard check means that most miners will refuse to mine transactions that aren’t 1 or 2 anyway.

References

Gitlab source

bool IsStandardTx(const CTransaction &tx, std::string &reason, bool allowMultipleOpReturn) {
    if (tx.nVersion > CTransaction::MAX_STANDARD_VERSION || tx.nVersion < 1) {
        reason = "version";
        return false;
    }

and

Gitlab source

    static const int32_t MAX_STANDARD_VERSION = 2;

tom · April 5, 2021, 2:00pm

Hi TierNolan.

Just want to say that you nailed it and all you write is correct. Thank you for adding!

Personally I prefer to not talk about “hard fork” and just call it a protocol upgrade. This makes clear that such a change does not cause the chain to split.

Again, thank you for adding your post here!

freetrader · April 5, 2021, 5:54pm

Correct, but as @markblundeberg points out in the description, it’s more about not having unwanted tx versions be mined after announcement of a proposal.

TierNolan · April 5, 2021, 10:00pm

I was thinking of someone just creating transactions. That is mostly protected by miners using the IsStandard rules. But agreed, even then a non-standard enforcing miner could mine those transactions, if a spammer could get them to the miner.

The only way to actually prevent the spam would be to have a majority of miners do a covert soft fork.

Essentially, 50%+ of miners announce that for the next 60 days, they will reject any blocks with transaction versions other than 1 or 2. During those 60 days, a formal rule could be discussed and put in place.

Is there a list of what versions have already been used?

Something like this would mean that it doesn’t require a hard fork to add a new version number later.

if (block_height < 750000) {
  if (version < 1 || version > 2) {
    return false;
  }
}

Every year, 50k could be added to the threshold unless there is another transaction version required.

The other side of the argument is what happens to people who have non-standard transactions but can’t spend them due to locktimes in the future. Arguably, it is their own fault for using unspecified version numbers.

bitcoincashautist · December 9, 2021, 9:29am

How big of a problem is this? Those dealing with new not-yet-mined TX-es can parse them using the new rules even if the same ver. appeared in the past, because whatever block they’ll land in, it will certainly be higher than the block where some new TX format was activated.

If you’re doing mined TXes verification, then you need to be aware of block height anyway to know which consensus rules to apply for verification.

I guess the problem would be for non-node software which is fed already verified TX-es and not verifying it by itself, and if it assumes that the TX format will be the same regardless of block height and what’s written in the version field. Do we know how much software has this assumption?
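To make the context dependence concrete, here is a rough sketch of what a parser would have to do if an already-mined version number were reused for the new format. All constants and names are hypothetical, and the unconfirmed case assumes activation has already happened.

```cpp
#include <cassert>
#include <cstdint>

enum class TxFormat { Legacy, New };

// All three constants are hypothetical and chosen only for illustration.
constexpr int32_t kReusedVersion = 123456;    // version already seen on-chain in legacy txs
constexpr int32_t kActivationHeight = 800000; // height at which the new format would activate
constexpr int32_t kUnconfirmed = -1;          // marker for a tx that is not yet in a block

// Context-dependent format selection: the version alone is not enough.
TxFormat SelectFormat(int32_t txVersion, int32_t minedHeight) {
    assert(minedHeight >= kUnconfirmed);
    if (txVersion != kReusedVersion) {
        return TxFormat::Legacy;
    }
    if (minedHeight == kUnconfirmed) {
        // Assumes activation is in the past, so the tx can only land in a post-activation block.
        return TxFormat::New;
    }
    return minedHeight >= kActivationHeight ? TxFormat::New : TxFormat::Legacy;
}
```

Software that is handed a bare transaction without a height has no value to pass as `minedHeight`, which is the assumption in question.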

I was talking with @im_uname and @freetrader about this, and we agreed that a decision should be made ASAP on whether to commit to breaking this assumption for the May 2023 upgrade, and then communicated to stakeholders, so they can start updating their software “today” not to make assumptions which won’t hold in the future.

I agreed to write some kind of technical bulletin that can be circulated among stakeholders. Consider this information gathering in preparation for a “Need to create new tx format for 2023 upgrade?” forum post.

tom · December 16, 2021, 11:40am

This whole thing leaves a bad taste in my mouth as this was discussed at length 18 months ago with all the relevant stakeholders agreeing. Not blaming anyone, just sad to see that something this low on impact and this high in return would miss the 2022 slot.

The quoted part from Mark is something I agree with; it’s a technical nightmare for wallets should this not be done WELL in front of any transaction format change.

The timing is the point here: there are thousands of places that parse transactions, including many JavaScript libraries and tons of websites. Practically all of them just ignore the version, as they would break otherwise. Visitors to a website will call the website broken if it marks as unreadable a transaction that has been included in a block but lied about its version number.

To roll out both the restriction-of-version-number and the actual new tx format at the same time means you need to push the entire ecosystem an update that says “NOW you can trust the version number, and a version N has to be routed to different code”.

That is literally a nightmare scenario.

The alternative, as Mark also said, means that those pieces of code should know more context that they currently don’t need or have. Most places that parse transactions are at best SPV and can’t do something like “MTP”.

Both scenarios make introducing a new tx format a lot more work, and a lot more fragile in the months between the release of such software and the actual BCH upgrade. Doing the version number restriction first means that such software deployments completely avoid this complexity, as they can be certain that a tx of version NEW is going to be in the new format.

A self-contained transaction that without context can be judged to be well-formed is a pretty useful thing for software developers.

If the people in charge would be suggesting now that in 2023 an incompatible transaction format is going to be activated and no version safety exists until that same day, then I fear a lot of libraries will not get written or released as the foundation is too unstable to build on. It’s just too hard to do so correctly.

bitcoincashautist · December 16, 2021, 12:50pm

I agree, it’s unfortunate that TX version lock didn’t make it, as it would make it possible to later use not-yet-seen version numbers to signal some consensus rule.

I have drafted the bulletin mentioned above, it’s already been reviewed and approved by freetrader and matricz:

Once merged, it can be a starting point for any public discussions regarding the TX FORMAT.

tom · December 16, 2021, 2:05pm

Maybe you can include the observations there, that the risk can be lowered massively by staggering the tx-format versioning upgrade and the actual tx-format change upgrade.

bitcoincashautist · December 19, 2021, 10:29am

Added a subsection:

Versioning The FORMAT As a Feature

There would be value in paying off the technical debt by versioning the FORMAT, so any future upgrades that would prefer to change it would not have to take the whole burden of changing the format.

Old software doesn’t expect the version field to have anything to do with the format of the TX it’s being fed.
First step would be locking the version field using consensus, so it could later be used to signal any future consensus rules that would come after the version lock.
Just the version lock wouldn’t break old software so only new, version-aware, software would benefit from that.

It would still help with general preparedness for a future, breaking, change.
From then onward, we could use part of the version field as an upgrade counter and match it 1-to-1 with applicable consensus specification.
As a consequence:

Any version value seen before the version lock would relate many-to-many with “prehistorical” consensus specifications, therefore be indeterminate.
However, it would narrow it down to pre-lock era.
Post-lock, the counter part of the version field would be 1-to-1 with newer consensus specifications.
We’d have other bits free to allow us to have multiple kinds of TXes exist as part of the same consensus specification, which would make the whole version field relate many-to-1 with consensus specifications.

Alongside the counter, we could use a flag to signal whether an upgrade is breaking or non-breaking, e.g. when we increment the counter for a new HF, if it’s non-breaking then we don’t toggle the flag, and if it is, then we toggle the flag.
Old version-aware software would know the old state of the flag so could know whether it can safely process counter+1 TXes even if it knows it can’t fully understand them.

In that scenario, upgraded software could know what rules apply to the TX even without knowing the block height, but it still needs to be upgraded so that it could match the version with applicable consensus rules.

As the consensus upgrade counter part of the version field would be the most important and long-lived feature, the 4-byte* uint should be split into 16 bits for the counter, 1 bit for the breaking flag, and 8 bits for the type of transaction.
The remainder* would be reserved for future use.

(*) Because the version field was never locked, and if we want to enable context-free transaction parsing, we’d have to use only the not-seen-prior-to-lock numbers to encode the newVersion so the new version would have a range smaller than 4 bytes.
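A minimal sketch of that split. The subsection gives only the field widths (16 + 1 + 8 bits, with the remaining 7 reserved), so the bit positions used here are an assumption, as are all names; the sketch also ignores the footnote's restriction to values not seen before the lock.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (not specified in the text): bits 0-15 counter,
// bit 16 breaking flag, bits 17-24 tx type, bits 25-31 reserved (zero).
struct NewVersion {
    uint16_t upgradeCounter; // incremented once per consensus upgrade
    bool breaking;           // toggled only when an upgrade is breaking
    uint8_t txType;          // kind of transaction within one specification
};

uint32_t Pack(const NewVersion &v) {
    return static_cast<uint32_t>(v.upgradeCounter) |
           (static_cast<uint32_t>(v.breaking) << 16) |
           (static_cast<uint32_t>(v.txType) << 17);
}

NewVersion Unpack(uint32_t raw) {
    assert((raw >> 25) == 0 && "reserved bits must be zero in this sketch");
    NewVersion v;
    v.upgradeCounter = static_cast<uint16_t>(raw & 0xFFFFu);
    v.breaking = ((raw >> 16) & 1u) != 0;
    v.txType = static_cast<uint8_t>((raw >> 17) & 0xFFu);
    return v;
}

// Old version-aware software: a counter+1 tx is safe to process only if the
// breaking flag still has the state this software already knows.
bool CanSafelyProcess(const NewVersion &known, const NewVersion &seen) {
    if (seen.upgradeCounter == known.upgradeCounter) {
        return true;
    }
    return seen.upgradeCounter == known.upgradeCounter + 1 &&
           seen.breaking == known.breaking;
}
```

With this layout, counter 3, flag set, and type 5 pack to 0xB0003.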