User-defined types and operators

chriseth · August 12, 2021, 4:52pm

Over the course of the previous weeks, a solution to the question of User-defined value types and surprisingly also some others has solidified in the team. I’m creating a new topic since this is about more than just user-defined value types. Here is the combined proposal that solves the following problems:

user-defined value types (the main topic of this thread) and whether or not / how they should support operators
a special unchecked integer type (requested as an alternative to the “checked” keyword, and also a good fit for a for-loop index variable)
support for different ways to implement fixed point type arithmetic
and maybe even more.

Specification

Introduce the following new “type” statement:

type <name> is <built-in value type>;

This statement defines a new type that is identical to “built-in value type” with regards to stack, storage, memory and calldata representation, ABI validation, cleanup and is replaced by “built-in value type” in the ABI (just as a contract type is replaced by “address”). Explicit conversion is only possible from and to the built-in value type, no implicit conversion is possible.
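As a minimal sketch of the statement above: the names price, toPrice and toRaw are made up for illustration, and the code uses the syntax proposed here, so it is not expected to compile with a released compiler.

```solidity
// Proposed syntax from this post; all names are hypothetical.
type price is uint128;

// Explicit conversion from the underlying type to the new type.
function toPrice(uint128 raw) pure returns (price) {
  return price(raw);
}

// Explicit conversion from the new type back to the underlying type.
function toRaw(price p) pure returns (uint128) {
  return uint128(p);
}

// Neither of these would be accepted under the rules above:
//   price p = someUint128;  // no implicit conversion
//   uint64 x = uint64(p);   // only the underlying type uint128 is allowed
```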

In addition to this, the “using” statement will be extended in the following two ways:

The “using” statement can be given at file-level, but with the following restrictions:

The “for” part cannot use “*”, i.e. the type has to be mentioned explicitly.
The “using” statement has to be in the same file as the type it applies to.
There can only be one such “using” statement per type.

If such using statements at file level are used, importing the type will also automatically apply the using statement to the type. This means you can define functions for a type without having to import these functions as their own identifiers and thus you do not clobber your scope with functions you want to call as members only anyway.

The functions part can be of the following form: using {f, M, myadd as +} for <type>;, where f can be a function, M can be a module and myadd can be a function.

This essentially allows you to define custom operators on custom types. If you use these for operators, there are some restrictions about which functions you can use, but this has not been fully worked out yet.

Examples

Using these mechanisms, you can clearly define which operators you want to define for your user-defined value types and it is also rather easy for another person to look up the exact code of the operator, if the type is known.

Timestamp.sol:

type timestamp is int;
// The functions "add", "diff" and "lessThan" are already visible here, so we
// can bind them to the type already. It is recommended to put the
// "using" directive as close to the type definition as possible.
using {add as +, diff as -, lessThan as <} for timestamp;

// You can only add an integer to a timestamp, not another timestamp.
// The result is a timestamp.
function add(timestamp ts, int diff) pure returns (timestamp) {
  // makes use of builtin overflow checks and it's obvious what the
  // meaning of the + below is, because just regular int types are involved
  return timestamp(int(ts) + diff);
}
// A timestamp difference is not a timestamp, it is a time difference.
// We could use a new type here, but for simplicity, let's just use `int`.
function diff(timestamp a, timestamp b) pure returns (int) {
  return int(a) - int(b);
}
function lessThan(timestamp a, timestamp b) pure returns (bool) {
  return int(a) < int(b);
}

TimestampUser.sol:

import {timestamp} from "Timestamp.sol";

function earlier(timestamp a, timestamp b) pure returns (timestamp) {
  // This uses the operator defined via "lessThan" in Timestamp.sol
  // without this function being visible in this source unit.
  return a < b ? a : b;
}

This also allows you to define fixed point numbers with regular infix operators instead of functions, which makes formulas much more readable. It also allows you to define an integer type that does not have overflow checks:

WrappingUint.sol:

type wrappingUint is uint;
using {add as +, sub as -, inc as ++, mul as *} for wrappingUint;

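function add(wrappingUint a, wrappingUint b) pure returns (wrappingUint) {
  // "unchecked" is needed for the addition to wrap; otherwise the
  // built-in overflow check on uint would still revert.
  unchecked { return wrappingUint(uint(a) + uint(b)); }
}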
// and so on.

If the functions for operators only consist of conversions from / to the underlying type and a simple operator, they are most likely inlined by the optimizer and thus they come at no additional gas cost.
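The operators elided by "and so on" could be filled in along these lines. This is only a sketch in the proposed syntax: sub and mul are the names used in the using directive above, inc is left out because the prefix/postfix question is still open, and the unchecked blocks are an assumption about how the wrapping would be achieved, given that plain uint arithmetic is checked since 0.8.0.

```solidity
// Sketch only, in the proposed syntax; continues WrappingUint.sol.
function sub(wrappingUint a, wrappingUint b) pure returns (wrappingUint) {
  // Wraps around below zero instead of reverting.
  unchecked { return wrappingUint(uint(a) - uint(b)); }
}

function mul(wrappingUint a, wrappingUint b) pure returns (wrappingUint) {
  // Keeps only the low 256 bits of the product.
  unchecked { return wrappingUint(uint(a) * uint(b)); }
}
```

Each body is still just conversions plus one simple operator, so the inlining argument above should apply to them as well.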

Open Questions

How do custom operators work if the two types are different? Is it enough that one of the types is the “using type”? Should compilation fail if there are two functions that could match after implicit conversion? Should it fail if two functions match even without implicit conversion? (Note that you can always import the function itself from the module and use functional-style notation.)

How to distinguish the prefix and postfix increment and decrement operators? First thing that comes to mind: using {postfixInc as .++, prefixInc as ++.} for <type>;

Is it OK to define multiple functions of the same name for an operator with different storage locations, as long as when the function is resolved it is unique (for this we need Ranked overload resolution · Issue #1256 · argotorg/solidity · GitHub)?
Example:

function add(StructName memory a, StructName memory b) ...
function add(StructName storage a, StructName storage b) ...
function add(StructName calldata a, StructName memory b) ...

fvictorio · August 13, 2021, 7:42pm

My two cents: in the other thread you said:

It would be very helpful for us if some people in this thread could share example code that would benefit from this feature. It does not have to be code that actually makes use of the feature, but code that you think could be rewritten using the feature would already be very very valuable for us.

Since operator overloading is a… controversial feature, let’s say, and since the semantics are complicated by unchecked, I think it would make sense to have a good amount of examples that justify adding them.

maxareo · August 14, 2021, 9:51am

I happen to have one example from a recent use case. It uses the FixedPoint library by Uniswap, which is responsible for fixed point operations in their DEX application. In a different scenario, a user-defined type Fraction can be built on top of it. It has simple operations such as * and /. Addition and subtraction may also be needed, but they are currently not implemented in that library for some reason. An example of this Fraction is given below. Operator overloading in this scenario seems to provide a good amount of convenience.

import "FixedPoint.sol";

contract FractionExample {

    struct uq128x128 {
        uint256 _x;
    }

    type Fraction is uq128x128;

    using {mul as *, div as /} for Fraction;

    function mul(Fraction frac, int256 y) pure returns (Fraction) {
        return FixedPoint.muli(frac, y);
    }

    function mul(Fraction frac1, Fraction frac2) pure returns (Fraction) {
        return FixedPoint.muluq(frac1, frac2);
    }

    function div(Fraction frac1, Fraction frac2) pure returns (Fraction) {
        return FixedPoint.divuq(frac1, frac2);
    }

    function div(uint256 y, Fraction frac) pure returns (Fraction) {
        return mul(FixedPoint.reciprocal(frac), y);
    }
}

rumkin · September 19, 2021, 2:52pm

IMO there should be one more question, related to the ABI: how will this affect method signatures and the resulting ABI object?

I’d like to have information about types in the generated JSON ABI and to use it in other environments, e.g. to convert the JSON into TypeScript or Golang code with types inferred from the ABI. Such output requires the names of the custom types and the types they are built on.

chriseth · September 20, 2021, 8:51am

We did not change anything with respect to the ABI: If there is a more “specialized” Solidity type, it will be part of the ABI JSON under the key “internalType”. It is like that for contracts or enums already. In the future, we might consider providing more information about Solidity types, but I would prefer to do that outside of the ABI so that it can be used for all types, not only those present in function parameters or return values.
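To make that concrete, here is a sketch under the rules of the proposal: the type is replaced by its underlying type in the ABI, and the Solidity-level name appears under “internalType”. The contract Clock and the function next are hypothetical, the conversions use the proposed syntax, and the JSON in the comment is what one would expect rather than actual compiler output.

```solidity
// Sketch only; Clock and next are made-up names.
type timestamp is int;

contract Clock {
  // Expected ABI entry (not actual compiler output):
  // {"type":"function","name":"next","stateMutability":"pure",
  //  "inputs":[{"name":"ts","type":"int256","internalType":"timestamp"}],
  //  "outputs":[{"name":"","type":"int256","internalType":"timestamp"}]}
  // The function selector would be derived from "next(int256)",
  // i.e. from the underlying type, not from the name "timestamp".
  function next(timestamp ts) external pure returns (timestamp) {
    return timestamp(int(ts) + 1);
  }
}
```

A code generator for TypeScript or Go could then read “internalType” to recover the custom type name and “type” to recover the underlying type.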