Reducing Calldata Size with Optional/Default Function Parameters

NoahZinsmeister · November 15, 2021, 4:51pm

Hi! Hoping this is the right venue for this topic, please let me know if not.

I’d like to discuss the possibility of native support for optional and/or default parameters on public functions in Solidity. After a bit of searching, I couldn’t find a discussion of this issue in the forum, so here goes.

Consider the following function:

function foo(uint256 bar? = 10) public {
  ...
}

The optional bar parameter indicates that the compiler should generate every other function in the matrix of all possible signatures for this function, i.e.

function foo() external;

This generated function would simply call the “parent” function with the specified default value. If no default value was specified, the global default value for the type would be used. Attempts to override the implementation for this generated function would probably have to be a compiler error?

The motivation behind this proposal is to make it as easy as possible for developers to minimize the use of unnecessary calldata. This is particularly relevant in the context of EVM-/Solidity-compatible L2s, where calldata costs represent the lion’s share of transaction costs.

This proposal obviously begs the question “why not simply write the overloaded function yourself?” While of course possible, this can significantly bloat contracts, especially as the number of parameter permutations increases, and introduces the possibility of errors.

I’m sure there are things I haven’t thought of here, happy to discuss!

cameel · November 15, 2021, 7:29pm

Yes, this is a good place for this discussion. We already have a feature request for this (Optional function parameters #232) but the forum is a good place to brainstorm ideas and gauge interest before submitting a more concrete proposal for a new feature.

This specific solution was already proposed in one of the comments in that issue. Some of the problems raised there:

Default values other than zero will not work for calldata parameters. This is because the contract cannot write to calldata so it cannot insert the default when calling a function. It can only forward a value that’s already in calldata.
It won’t work with constructors. There can only be one constructor.
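
A rough sketch of the first limitation, written in today's Solidity with a hand-written stub standing in for the generated one. The contract and function names are made up for illustration and are not taken from the feature request. A stub can build a non-zero default in memory and pass it on, but it has no way to produce a calldata value that the caller did not send:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example contract, used only to illustrate the limitation.
contract DefaultForReferenceType {
    // 'Full' version with a memory parameter.
    function total(uint256[] memory xs) public pure returns (uint256 sum) {
        for (uint256 i = 0; i < xs.length; i++) {
            sum += xs[i];
        }
    }

    // Stand-in for a generated stub: builds the default [1, 2] in memory
    // and forwards it to the full version.
    function total() public pure returns (uint256) {
        uint256[] memory defaults = new uint256[](2);
        defaults[0] = 1;
        defaults[1] = 2;
        return total(defaults);
    }

    // If `xs` were declared `calldata` instead, the stub above could not be
    // written: a memory array cannot be converted to calldata, and the
    // contract cannot write new bytes into calldata.
}
```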

Also, one of the things to consider is that while the overhead of this might not be all that large (it would add one external function stub for each optional parameter), it still adds code that cannot be removed by the optimizer, and it might make it easier to hit the contract size limit even if you do not care about the deployment cost.

An alternative proposal that comes to my mind would be to keep generating just one function but include the default values in the ABI JSON to let the off-chain caller insert them if they’re not specified by the user. The compiler could do the same when the function is called by a contract on chain. You’d only have to worry about them when making a low-level call.

Your point about lowering calldata cost is an interesting one though. Having clients insert the value does not solve that. The choice between one or the other solution seems more like a trade-off, depending on whether you care more about code size or calldata size so maybe we really need both.

rumkin · November 18, 2021, 3:59pm

Just an idea: this could be a “client-side” solution, where the compiler produces an ABI that includes the default parameter value. The drawback is that such a solution affects the ABI standard, so it could only be done through a more complex process involving an EIP.

As an alternative, some kind of virtual calldata could be added, which could be modified only once by the target contract and would be append-only. There would then be two calldatas: one passed by the user and another extended by the contract.

nventuro · November 22, 2021, 2:24pm

Unfortunately, as you mention in the next paragraph, this doesn’t actually solve the underlying problem of reducing calldata size. Keep in mind that this proposal is not about language ergonomics, but rather optimizing for a very specific problem.

I think this is the key part: we already know of a highly effective way to tackle this problem. Modifying how calldata parameters work, introducing sentinel values, etc., all seem rather complex and difficult to get right, and introduce a bunch of problems of their own. What if the solution is to simply automate the manual process @NoahZinsmeister describes here?

In my mental model of Solidity, public functions have two basic components: argument decoding and copying (moving data from calldata into the stack/memory) and execution. External functions are slightly different in that arguments can remain in calldata instead of being copied, but the model sort of applies. Many programmers build on this by making an external function perform basic authorization and sanity checks, and then dispatch to an internal version that has the actual implementation. See for example in OZ’s ERC20 how approve dispatches to _approve with msg.sender as the sender, or how both transfer and transferFrom call _transfer.

So given this, would it be feasible to implement a form of default parameters where what actually happens is that multiple selectors are automatically generated, calling the ‘full’ version of the function with the default values filled in?

cameel · November 22, 2021, 3:02pm

Sounds feasible to me as long as you can accept the limitation that the optional parameters must either be memory or the default value must be zero.

I’ll adjust the title because clearly what you all really care about is the calldata reduction and defaults are just a means to that end. The current title suggests that this is a more general discussion about optionals/defaults in the language which would include ergonomics too.

NoahZinsmeister · November 22, 2021, 5:38pm

Great point, definitely something I’d overlooked. Important to note regardless of which direction this goes.

I think this makes a lot of sense, and is basically what I was driving at in the original post.

cameel · November 22, 2021, 6:04pm

I think that it might be a good idea to just go with optionals here. It already achieves the goal of calldata reduction, syntax is very straightforward (f(<type> <parameter>?)) and it does not limit you to memory arguments. It also decouples this feature from the discussion about defaults and side-steps any objections from people who might have strong opinions on how defaults should work. You can always assign the default in the function body (well, maybe a built-in to differentiate zero from a defaulted value might be needed here).

frangio · November 22, 2021, 11:54pm

What are the controversial aspects about argument defaults? It sounds pretty straightforward to me.

cameel · November 23, 2021, 12:16am

I was thinking about the discussion in #232. There were various opinions, including people arguing that default values are bad in general or people wanting defaults for constructors (which can’t work using the mechanism proposed here as you can only have one constructor). Also the discussion in this thread really focuses only on the calldata optimization, ignoring all other use cases for defaults. I suspect that in some cases the other mechanism with the caller just inserting the default might be preferable.

nventuro · November 23, 2021, 12:39am

@cameel can you provide a simple example of how the mechanism you’re proposing would work in a function with e.g. a default value for a uint?

cameel · November 24, 2021, 12:11am

The difference is just which side inserts the default - the caller or the callee. Let’s say we have a function defined like this:

function f(uint x = 42) public {}

In my proposal the caller inserts the value so if you called f() in a contract, the compiler would replace it for you with f(42) when generating bytecode. That’s it. It’s really what I think of as the typical way of implementing defaults in a programming language.

The biggest complication is that on Ethereum it’s not just the compiler encoding calls but also external tools so there must also be some way for a tool to know what the default is. To solve this, we could include the defaults in the ABI JSON. The tool would let you omit the parameter but it would still be there in transaction data.

Anyway, I just realized that using the f(uint x? = 42) syntax as proposed by @NoahZinsmeister is a rather neat way out. If we use it to represent a default inserted by the callee, we still leave the possibility of adding the other kind of default open with the f(uint x = 42) syntax. We don’t have to ever implement it but just leaving the possibility open means that we do not have to worry about which one is better after all.

cameel · November 24, 2021, 12:32am

One more potential issue. I dismissed it at first because I completely forgot that we also have syntax for calls with named parameters. If you have N optional parameters, the compiler would have to generate not N but 2^N - 1 extra functions. For example this function:

function g(
    uint a? = 1, uint b? = 2, uint c? = 3, uint d? = 4, uint e? = 5,
    uint f? = 6, uint g? = 7, uint h? = 8, uint i? = 9, uint j? = 10
) public {}

could potentially be called as g({e: 42}). To support every combination of omitted parameters the compiler would have to generate 1023 external functions even in this relatively simple case. Functions with more parameters would lead to serious bloat, even if you never actually use or need named arguments.

We can just limit it to only permutations that omit all parameters after a given position. It’s not a big issue if you only care about the savings because omitting parameters in the middle does not save you any calldata anyway and you can get some actual savings by just manually creating an overload. Still, this is a good example of why syntax-wise someone might prefer the mechanism with the default inserted by the caller - it does not have this limitation at all.
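
To make the trailing-only restriction concrete, here is a sketch in today's Solidity of what the generated code might be equivalent to for a function with three defaulted parameters. The contract, function, and event names are hypothetical. With only trailing parameters allowed to be omitted there are 3 extra stubs rather than 2^3 - 1 = 7:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical hand-written equivalent of
// `h(uint a? = 1, uint b? = 2, uint c? = 3)` in the proposed syntax.
contract TrailingDefaults {
    event Called(uint256 a, uint256 b, uint256 c);

    // The 'full' function with every parameter explicit.
    function h(uint256 a, uint256 b, uint256 c) public {
        emit Called(a, b, c);
    }

    // One stub per omitted suffix of the parameter list.
    function h(uint256 a, uint256 b) public { h(a, b, 3); }
    function h(uint256 a) public { h(a, 2, 3); }
    function h() public { h(1, 2, 3); }
}
```

Each stub has a distinct arity, so overload resolution is unambiguous and each one gets its own selector.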

nventuro · November 24, 2021, 2:51am

Thanks for the explanation. However, unless I’m missing something, the default being inserted by the caller doesn’t address the underlying request of a way to reduce calldata size, does it? The caller still has to include the ‘42’ where required, so it just makes for a nicer interface.

Now that I go over my first message again I realize that perhaps my proposal was not very clearly worded, so I’ll provide a simple example to better show what I mean.

Let’s say we have a function defined like this:

function f(uint x = 42) public {}

What I propose we do is have the compiler treat that as if the code were the following:

function f(uint x) public {}
function f() public { f(42); }

That is, it’d automatically add two external functions: one with the explicit argument, and one with no argument that automatically calls the first one with the default value. This is essentially the same process @NoahZinsmeister describes as being error-prone when done manually.

This implementation results in slightly more complicated client code as we’d force it to deal with overloads (which many implementations struggle with, particularly those in JavaScript), but we get the full calldata savings since no extra data has to be sent and we don’t have any magic sentinel values or restrictions on default values as the detection of intent to use a default value happens while looking at the selector.

Such a thing can of course be implemented via a transpiler, but that’d be opening a huge can of worms.
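
As a small illustration of where the savings come from, the two-function expansion above can be instrumented to report how much calldata each entry point received. This is only a sketch; the contract and event names are made up. With standard ABI encoding, an external call to f(uint256) carries a 4-byte selector plus one 32-byte word (36 bytes), while a call to f() carries just the 4-byte selector:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract that logs the calldata size seen by each overload.
contract CalldataSize {
    event Seen(uint256 x, uint256 calldataBytes);

    // Called externally as f(x): msg.data is selector + one 32-byte word.
    function f(uint256 x) public {
        emit Seen(x, msg.data.length);
    }

    // Called externally as f(): msg.data is the selector only. The internal
    // call below does not change msg.data, so the logged length stays 4.
    function f() public {
        f(42);
    }
}
```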

nventuro · November 25, 2021, 10:31pm

Something I’ve been thinking about with @frangio is that my proposal could lead to radically larger code size, since the creation code now has to account for all of these newly added selectors and simple wrappers. That’d be problematic as it’d require size reduction techniques such as usage of delegatecall libraries, which would be fine on an L2 but not on L1.

Consider as an example this struct from Balancer v2’s swap interface:

struct FundManagement {
    address sender;
    bool fromInternalBalance;
    address payable recipient;
    bool toInternalBalance;
}

There, sender is typically msg.sender and recipient is typically sender. This would result in 2^4=16 variants, as all four parameters have either two possible values (the booleans) or a default and an explicit one (the addresses). Enums similarly have multiple (but finite) possible values.

A neat pattern that sidesteps this problem entirely involves using an auxiliary contract. The ‘main’ contract remains exactly the same as before, except we add a fallback function that delegatecalls to the aux contract, forwarding all calldata (assume main did not originally have a fallback). The aux contract is simply a massive jump table with all of the automatically generated selectors for all of the different permutations, delegate-calling back into the ‘original’ function in the main contract by filling in all of the parameters.

So if e.g. someone made a call with swapDefaultSenderFalseFromInternalBalanceFalseToInternalBalance(address recipient) (which only exists in the aux contract), the main contract would delegatecall aux in its fallback, which would then delegatecall main back with swap({ sender: msg.sender, fromInternalBalance: false, recipient: recipient, toInternalBalance: false}).
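
A minimal sketch of that flow, assuming a simplified swap that only emits an event. The interface, contract, and event names other than the long selector name above are hypothetical, there is only one generated variant instead of the full table, and a real swap would need its own checks on who may act as sender:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

interface ISwap {
    function swap(
        address sender,
        bool fromInternalBalance,
        address payable recipient,
        bool toInternalBalance
    ) external;
}

contract Main is ISwap {
    address private immutable aux;

    event Swapped(address sender, bool fromInternalBalance, address recipient, bool toInternalBalance);

    constructor(address aux_) {
        aux = aux_;
    }

    // The 'original' function; placeholder body for the sketch.
    function swap(
        address sender,
        bool fromInternalBalance,
        address payable recipient,
        bool toInternalBalance
    ) external override {
        emit Swapped(sender, fromInternalBalance, recipient, toInternalBalance);
    }

    // Any selector Main does not know is forwarded to the aux contract.
    fallback() external {
        address target = aux;
        assembly {
            calldatacopy(0, 0, calldatasize())
            let ok := delegatecall(gas(), target, 0, calldatasize(), 0, 0)
            returndatacopy(0, 0, returndatasize())
            switch ok
            case 0 { revert(0, returndatasize()) }
            default { return(0, returndatasize()) }
        }
    }
}

contract Aux {
    // Runs via delegatecall from Main, so address(this) is Main and
    // msg.sender is still the original caller.
    function swapDefaultSenderFalseFromInternalBalanceFalseToInternalBalance(
        address payable recipient
    ) external {
        (bool ok, ) = address(this).delegatecall(
            abi.encodeWithSelector(ISwap.swap.selector, msg.sender, false, recipient, false)
        );
        require(ok, "swap failed");
    }
}
```

Aux does not need to know Main's address, so it can be deployed first and passed to Main's constructor.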

cameel · December 3, 2021, 10:10pm

However, unless I’m missing something, the default being inserted by the caller doesn’t address the underlying request of a way to reduce calldata size, does it?

Yeah, it doesn’t. I never said it does. I was only saying that both proposals are valid ways to implement defaults in the language and choosing one prevents us from implementing the other so we must carefully consider if we want to choose this one and forgo the other.

But then I realized that this is not necessarily the case because the original proposal here uses ? after the parameter name, which still leaves the possibility of using syntax without ? for the other one.

Now that I go over my first message again I realize that perhaps my proposal was not very clearly worded

No, it was very clear, I understand what you mean. You just asked how my proposal would work so I elaborated but it was not meant to be an alternative way to save calldata - just an alternative way to implement defaults in the language.

we don’t have any magic sentinel values or restrictions on default values

But we do have restrictions on functions themselves:

Can’t use non-zero defaults with calldata.
Can’t have defaults in a constructor.
Can’t put an argument without a default after one with a default. At least not without having the compiler generate 2^N functions or introducing some new custom encoding scheme where only non-defaulted parameters are encoded.

I’m not saying the other proposal saves more calldata (it does not save any), I’m just saying that it’s a more flexible implementation of defaults because it does not have any of these restrictions. That’s irrelevant now though because we don’t have to choose after all.

cameel · December 3, 2021, 10:46pm

This would result in 2^4=16 variants, as all four parameters have either two possible values

Yeah, that’s what I meant in this comment:

I think that the reasonable solution is to simply not allow a parameter without a default after one with a default. This way they must all be clustered at the end. You don’t save any calldata by providing defaults for ones in the middle anyway.

A neat pattern that sidesteps this problem entirely involves using an auxiliary contract. The ‘main’ contract remains exactly the same as before, except we add a fallback function that delegatecalls to the aux contract, forwarding all calldata (assume main did not originally have a fallback). The aux contract is simply a massive jump table with all of the automatically generated selectors for all of the different permutations, delegate-calling back into the ‘original’ function in the main contract by filling in all of the parameters.

I’d say it’s more of a usage pattern than a solution. I mean, it solves the problem of bloating your contract but you don’t need compiler-level support (other than support for the defaults) to do this. You can just not provide any defaults in your original contract and then have a function with defaults in the auxiliary contract. Applied to the example from my post it would look like this (well, maybe with a delegatecall; I’m using a normal call here for simplicity):

contract C {
    function g(
        uint a, uint b, uint c, uint d, uint e,
        uint f, uint g, uint h, uint i, uint j
    ) public
    {
        // ...
    }
}

contract D {
    // Named `target` so that it is not shadowed by the parameter `c` below.
    C target = C(address(0x1234567890123456789012345678901234567890));

    function g(
        uint a? = 1, uint b? = 2, uint c? = 3, uint d? = 4, uint e? = 5,
        uint f? = 6, uint g? = 7, uint h? = 8, uint i? = 9, uint j? = 10
    ) external
    {
        target.g(a, b, c, d, e, f, g, h, i, j);
    }
}

swapDefaultSenderFalseFromInternalBalanceFalseToInternalBalance(address recipient)

Or do you mean that the “automatically generated selectors” would not just have different parameter combinations but also different names and default values would be embedded in the name?

Not sure if there’s a point in encoding defaults in the name but just encoding the information about which arguments are actually used by a given variant would be enough to distinguish them. For example (still using my example with function g()) generated functions could have signatures like this:

g_0000000000()
g_0000000001(uint)
g_0000000010(uint)
g_0000000011(uint, uint)
g_0000000100(uint)
...

Then you’d actually save on calldata. At least as long as you don’t generate so many functions that you get a selector collision :)).

But the problem I see here is that this is getting reeeally specific to a particular use case - with an assumption of an extra contract, a naming scheme for signatures, and all. I think the original proposal was fine but I have doubts if this variant is worth standardizing at the compiler level. You might be better off implementing a proxy contract that gets arguments as bytes with values encoded in a custom way, unpacks them on chain and passes them into the actual call. This could save you even more because you could use some variable-length encoding scheme for parameters.

frangio · January 12, 2022, 5:39pm

Yeah this is how default arguments tend to work in other languages. This would be fine.

frangio · January 13, 2022, 6:00pm

I also agree that the approach shared by @nventuro above does not require compiler-level support and probably shouldn’t have any sort of compiler level support for a long time.

It shouldn’t delay optional function parameters if those are a widely requested feature.