Possible ABIv3 as default contract interface

I am looking for feedback on a potential future EIP regarding a standardized compact representation for calldata in EVM languages. The design I’ve been kicking around is as follows:

To illustrate, in ABIv3 a call to foo(bool[12]) would encode as 0x047f for [ false, false, false, false, false, true, true, true, true, true, true, true ].

In the header byte (zeroth byte), the first three bits are the encoding version identifier, 000. The next five bits are the function identifier 00100, in this case, 4. If the function ID is 31 or greater, all five bits are set and the RLP encoding of the function ID is appended to the header byte.

Next is 7f, which is the RLP encoding of the integer representing the values in the bool array, one bit per element. Notice that in this case the first five bools are false, which makes the integer 000001111111, or 127, i.e. 0x7f.
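To make the header and boolean packing concrete, here is a minimal sketch in Python. The function names are illustrative, not taken from the proof of concept, and the RLP helper handles only non-negative integers, which is all this example needs:

```python
# Hypothetical sketch of the ABIv3 header byte plus boolean-array packing
# described above (names are illustrative, not from the PoC).

def rlp_encode_int(n: int) -> bytes:
    """Minimal RLP encoding for a non-negative integer."""
    if n == 0:
        return b"\x80"              # the empty string encodes zero
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(payload) == 1 and payload[0] <= 0x7f:
        return payload              # single bytes 0x00..0x7f encode themselves
    return bytes([0x80 + len(payload)]) + payload

def encode_call(function_id: int, bools: list) -> bytes:
    assert 0 <= function_id < 31    # IDs >= 31 would append an RLP extension
    header = bytes([(0b000 << 5) | function_id])  # version 000, then 5-bit ID
    packed = 0
    for b in bools:                 # first element lands in the highest bit
        packed = (packed << 1) | int(b)
    return header + rlp_encode_int(packed)

call = encode_call(4, [False] * 5 + [True] * 7)
print(call.hex())                   # -> 047f, matching the example above
```

The call with function ID 4 and the twelve booleans from the example reproduces the 0x047f encoding.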

Boolean arrays are a special case. Everything else is much more normal. Values are straightforwardly encoded as RLP strings, except for arrays and tuples which are encoded as RLP lists.

An example call to foo((string,bool,bool,int72)[2],uint8) (given function ID 27) is 0x1bcac44180010ac44201800181ff.
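A minimal RLP reader makes the example above easier to follow. My reading of the bytes: the header 0x1b carries function ID 27, ca is the two-element array, each c4... is a tuple (the first appears to be ("A", false, true, 10), the second ("B", true, false, 1)), and the trailing 81ff is the uint8 value 255. This decoder handles only the short RLP forms, which is enough here:

```python
# Hedged walk-through of the example call, using a minimal RLP reader.
# The decoded values are my reading of the bytes, not authoritative.

def rlp_decode(data: bytes, i: int = 0):
    """Decode one RLP item at offset i; return (item, next_offset).
    Strings come back as bytes, lists as Python lists. Handles only
    the short forms (payloads < 56 bytes), enough for this example."""
    b = data[i]
    if b <= 0x7f:                       # a single byte encodes itself
        return data[i:i + 1], i + 1
    if b <= 0xb7:                       # short string
        n = b - 0x80
        return data[i + 1:i + 1 + n], i + 1 + n
    n = b - 0xc0                        # short list
    end, items, j = i + 1 + n, [], i + 1
    while j < end:
        item, j = rlp_decode(data, j)
        items.append(item)
    return items, end

call = bytes.fromhex("1bcac44180010ac44201800181ff")
func_id = call[0] & 0x1f                # low 5 bits of the header byte
arr, j = rlp_decode(call, 1)            # the (string,bool,bool,int72)[2]
last, _ = rlp_decode(call, j)           # the trailing uint8
print(func_id, arr, last)
```

Running this yields function ID 27, the array `[[b'A', b'', b'\x01', b'\n'], [b'B', b'\x01', b'', b'\x01']]`, and `b'\xff'` for the uint8.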

Note that this RLP-based design eliminates the need for zero-padding, 0xff sign extension, dynamic element offsets, and the fixed four-byte hash selector.

It seems to me that this could obviate a significant amount of the bespoke calldata hacks being used by Layer 2s and could benefit composability and adoption (not to mention gas costs). ABIv2 standardization seems to have failed, but this could revive it. Standardization is also beneficial for enabling tools to figure out what the heck is going on.

I should mention that I have a proof-of-concept implementation in Java: GitHub - esaulpaugh/abiv3 (ABIv3 proof of concept for Ethereum).

I don’t think this scheme is going to bring significant savings in the general case.

Calldata is already (relatively) cheap, especially the zero bytes (4 gas vs. 16 gas). The cost of RLP decoding will be on the same order as the savings you get from eliminating the padding; you may actually end up paying more. And with RLP, while you make the input shorter, you also get more non-zero bytes than in the original, which may end up costing more than the zeros you removed.
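The zero/non-zero distinction can be made concrete with the EIP-2028 calldata prices (4 gas per zero byte, 16 per non-zero byte). A quick sketch comparing an ABIv2-padded word against a bare RLP item:

```python
# Back-of-the-envelope calldata gas arithmetic, using the EIP-2028
# prices cited above: 4 gas per zero byte, 16 gas per non-zero byte.

def calldata_gas(data: bytes) -> int:
    return sum(4 if b == 0 else 16 for b in data)

# A uint256 value of 1: ABIv2 pads it to a 32-byte word.
abiv2_word = (1).to_bytes(32, "big")     # 31 zero bytes + 0x01
rlp_item   = b"\x01"                     # RLP: one non-zero byte

print(calldata_gas(abiv2_word))          # 31*4 + 16 = 140 gas
print(calldata_gas(rlp_item))            # 16 gas
```

The padded word costs 140 gas to the RLP item's 16, but as noted above, the on-chain cost of decoding the compact form eats into that gap.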

Second, with RLP you lose random access to calldata. The encoding is variable-length, so instead of being able to go straight to the parameter you want, you need to decode everything before it. With the current scheme the cost of random access is O(1) if the input is completely static and proportional to the nesting level for dynamic arrays. So even if it ends up being useful in some special cases, I don’t think it would be a good idea to replace ABIv2 with this scheme in general.

RLP also does not do much for strings (or bytes) and I think it’s common to pass around larger chunks of data as bytes.

From what I’ve seen, the real savings in those bespoke schemes are usually in removing redundant parts of the input and that can only be done if you know the structure of your input. It won’t be solved by a generic encoding scheme.

Boolean arrays are a special case.

That might actually be something worth optimizing, since the current encoding of a boolean array wastes a lot of space: you could pack eight booleans into a single byte instead of eight non-zero bytes (each in its own padded word). But it could just as well be added to ABIv2 without creating a completely new scheme.

The general case will almost certainly be a function call in Layer 2 rollups, which are limited not by computation but by calldata length. I want to make sure we don’t judge a future encoding by today’s usage patterns, which are of course significantly influenced by the limitations of current techniques as well as by the fact that Ethereum has not yet scaled. ABIv3 is explicitly (as I conceive of it) intended for post-scaling Ethereum. If Ethereum doesn’t scale, there is no point in working on ABIv3 (or Ethereum itself, really).

Skipping an RLP item generally involves reading one byte (e.g. 0x84). The data is contiguous, so there’s no pointer chasing. For the uber-random-access-sensitive, there are ABIv2 and custom calldata hacks, both of which will still be available. Alternatively, ABIv3 could be designed so that array elements are given fixed widths and not RLP-encoded, at least for common types like integers. If not mandatorily, then optionally (indicated by the encoding version bits). Or people could just pass large arrays to contracts as bytes, which I’m sure people already do.
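The "skip via one byte" claim can be sketched directly: for RLP items, the length prefix alone tells you where the next item starts, and a list can be stepped over without descending into its elements. A minimal illustration:

```python
# Sketch of skipping an RLP item by reading its length prefix.
# Handles both the short and long RLP forms.

def rlp_skip(data: bytes, i: int) -> int:
    """Return the offset just past the RLP item starting at offset i."""
    b = data[i]
    if b <= 0x7f:
        return i + 1                  # single-byte item
    if b <= 0xb7:
        return i + 1 + (b - 0x80)     # short string: prefix + payload
    if b <= 0xbf:
        n = b - 0xb7                  # long string: prefix + length bytes
        return i + 1 + n + int.from_bytes(data[i + 1:i + 1 + n], "big")
    if b <= 0xf7:
        return i + 1 + (b - 0xc0)     # short list: no recursion needed
    n = b - 0xf7                      # long list
    return i + 1 + n + int.from_bytes(data[i + 1:i + 1 + n], "big")

data = bytes.fromhex("8401020304c20102")  # a 4-byte string, then a list
i = rlp_skip(data, 0)                     # skip the 0x84... string
print(i)                                  # -> 5, the start of the list
```

Skipping is O(1) per item, so walking to the k-th top-level argument costs k prefix reads, not a full decode of everything before it.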

One-byte values 0x01 to 0x7f cost 1 non-zero byte with RLP but the equivalent of 8.75 non-zero bytes using ABIv2. 64-bit values cost 9 with RLP and the equivalent of 14 non-zero bytes with ABIv2. And that’s assuming they’re not negative. A simple -1 costs two non-zero bytes in RLP (0x81ff) but 32 non-zero bytes with ABIv2.
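Those equivalences check out arithmetically if a zero byte counts as a quarter of a non-zero byte (4 gas vs. 16 gas). A quick verification, using the 0x81ff form for -1 as given above:

```python
# Verifying the byte-cost equivalences above, in units of non-zero
# calldata bytes (16 gas each; a zero byte costs 4 gas, i.e. 0.25 units).

def nonzero_byte_units(data: bytes) -> float:
    return sum(1 if b else 0.25 for b in data)

one_byte_abiv2 = (0x7f).to_bytes(32, "big")      # 31 zeros + 1 non-zero
print(nonzero_byte_units(one_byte_abiv2))        # 31*0.25 + 1 = 8.75

u64_abiv2 = (2**64 - 1).to_bytes(32, "big")      # 24 zeros + 8 non-zero
print(nonzero_byte_units(u64_abiv2))             # 24*0.25 + 8 = 14.0

minus_one_abiv2 = (-1).to_bytes(32, "big", signed=True)  # 32 x 0xff
print(nonzero_byte_units(minus_one_abiv2))       # 32.0
minus_one_rlp = bytes.fromhex("81ff")            # the post's RLP form of -1
print(nonzero_byte_units(minus_one_rlp))         # 2.0
```

(The 64-bit worst case assumes all eight value bytes are non-zero, which is the comparison the 14-byte figure implies.)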

Now we could discourage the use of negative numbers for all time, or we could have EVM languages use a sensible encoding by default.

I think it is indefensible for Ethereum to expect developers to bit hack every contract they put out. The waste of man-hours is enormous, the risk of human error is significant (both in the writing and the reading of such hacks), and the penalty to interoperability and composability is probably incalculable. All of which is unnecessary in the event that the auto-generated ABIv3 is cheaper than the developer’s manual bit twiddling.

And this is all before even considering the possibility of an ABIv3 codec precompile or other on-chain library. Further, this is without consideration to whether a standardized format is more amenable to batching and compression than a collection of differently-designed custom bit hacks. My intuition is that RLP will be larger than expertly hand-optimized bit level hacking, but also more compressible, narrowing the gap.

Note that, at the language level, it’s possible not to decode calldata arguments just once per function call, but rather to decode them on demand on access, to pass references to calldata around, and to slice calldata arguments, all of which can be significant optimizations regardless of the concrete encoding scheme. Any scheme for a new ABI encoding standard will have to keep allowing those features (and they are not in opposition to getting rid of padding or to a more compact calldata layout). For that reason, reusing RLP encoding is simply not a suitable candidate (i.e. linear rather than constant cost for accessing elements is a no-go).

(I don’t necessarily want to say that the proposal in ABI v3 · Issue #2542 · vyperlang/vyper · GitHub is perfect and exhaustive just yet, but it demonstrates that saving space while maintaining constant-time random access is feasible, so there is little reason to consider less optimal layouts for the particular use case of ABI encoding.)

I very much think that developers should be punished harshly for writing functions which accept such a long list of arguments that linear costs become significant, especially in light of the fact that RLP items can be skipped via a trivial amount of computation.

Now, array element accesses are another story. I think a compelling argument can be made that arrays containing fixed-width elements should be allowed, if not required. At least above a certain length. That is something that will have to be addressed.

However, again, tuples which contain hundreds of top-level elements are an anti-pattern which should not be given consideration.

Lastly, I will say that my conception of ABIv3 is predicated on the assumption that Ethereum will scale (i.e. that gas will become inexpensive). If Ethereum never scales, then I think the ABI algorithm is the least of our concerns. And speaking of scaling, by all accounts the bottleneck for Layer 2 (widely touted as the future of Ethereum) will continue to be calldata length, not computation. If purely Layer 1 applications want to stick with ABIv2, they should be allowed to.