There was a similar question in the previous AMA. See the answer to “Does it make sense to cache arr.length?”
Solidity currently doesn’t perform any caching of values read from storage in memory. However, in some cases, values are cached on the stack. In particular, we can sometimes replace sload(key) and mload(key) with a stack variable.

The default codegen and its optimizer cannot really cache values read from storage, but the default optimizer can still occasionally avoid loading the same value from storage multiple times. For example, in the following contract, the assembly would only contain a single sload.
contract C {
    uint x;

    function read_twice() public returns (uint a, uint b) {
        a = x;
        b = x;
    }
}
One can verify this by looking at the assembly generated by solc --asm --optimize contract.sol.
However, the upcoming Yul codegen and Yul optimizer can cache values in significantly more cases. Examples of this can be found in our test suite. One can verify this by looking at the IR generated by the upcoming compilation pipeline: solc --ir-optimized --optimize contract.sol. Feel free to make suggestions about optimization opportunities that we miss in your contracts, especially when the rules are generic.
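As a schematic illustration (not actual compiler output), the kind of rewrite the Yul optimizer can perform on repeated storage reads looks roughly like this:

```yul
// before optimization
let a := sload(0)
let b := sload(0)

// after the optimizer resolves the redundant load (schematic):
// the second read is replaced by the stack variable a
let a := sload(0)
let b := a
```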
Also, in principle, storage values can often be ‘cached’ in memory and only written back to storage at the end. This, however, may complicate existing code. This is illustrated in the following contract, where we sort a storage array:
contract C {
    uint[3] arr;

    function sort_arr() public {
        uint[3] memory arr_copy = arr;
        // perform the sorting on the memory copy,
        // e.g., with a simple bubble sort
        for (uint i = 0; i < 2; i++)
            for (uint j = 0; j < 2 - i; j++)
                if (arr_copy[j] > arr_copy[j + 1])
                    (arr_copy[j], arr_copy[j + 1]) = (arr_copy[j + 1], arr_copy[j]);
        arr = arr_copy;
    }
}
The above example can save some storage loads and storage writes (i.e., sload and sstore, respectively) when compared to performing the sorting directly on the storage array.
Regarding the last question about the current state of these efforts and the challenges faced: most of our current efforts around optimization revolve around the Yul optimizer. This is because Solidity will switch to the Yul compilation pipeline at some point in the near future, and the Yul optimizer is more modular and powerful than the bytecode-based optimizer. Doing this in the current codegen and bytecode-based optimizer is also difficult, because the generated bytecode is devoid of high-level information. For example, function calls and if statements are lowered to the jump or jumpi opcodes. Additionally, the bytecode contains stack operations, such as dup5 or swap10. Both of these make the analysis harder, and therefore the bytecode optimizer less powerful.
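As a schematic illustration (not actual compiler output), an if statement that is explicit and easy to analyze in Yul becomes a conditional jump to a numeric tag at the bytecode level, so the control-flow structure has to be reconstructed before any analysis:

```yul
// Yul: structured control flow
if lt(x, 10) { y := 1 }

// Bytecode (schematic): the same logic as a conditional jump
//   <evaluate lt(x, 10)>
//   ISZERO
//   PUSH <tag_after_body>
//   JUMPI
//   <body: y := 1>
//   JUMPDEST          ; tag_after_body
```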
An example of the challenges we face is determining when a cached value is invalidated. In the following example of caching storage reads, the optimizer currently doesn’t infer that the function f only writes to storage slot 100, and that the value read from storage slot 0 is therefore safe to reuse.
function f() {
    sstore(100, 0)
}
let x := sload(0)
// assume that f cannot be inlined
f()
// can we replace the following by y := x?
let y := sload(0)
The above example is extremely simple, and the Yul inliner would inline the function, allowing the replacement y := x. However, with more complicated functions that write to a storage slot, this is harder to reason about, especially when the slot that is written depends on an argument of the function, for example when writing to a certain array index.
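To illustrate (a hypothetical snippet; the function set and its slot computation are made up for this example), consider a function whose written slot depends on its argument:

```yul
function set(i) {
    // the written slot depends on the argument i
    sstore(add(1, i), 42)
}
let x := sload(0)
set(calldataload(0))
// replacing this by y := x is only safe if the optimizer can prove
// that add(1, i) can never equal 0; since i comes from calldata and
// add wraps around on overflow, that proof is not possible here
let y := sload(0)
```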
An even more interesting situation occurs when trying to cache memory loads (i.e., mload) in the stack. Here is a Yul snippet showing how Solidity copies function arguments from calldata to memory.
mstore(64, 128)
let _1 := 0
// Does not invalidate location 64
// Writes to memory location [128, 128 + calldatasize())
// Reasonable bound to calldatasize(): 2**32
calldatacopy(128, _1, calldatasize())
// Can this be replaced by z := 128?
let z := mload(64)
Even though a human can easily see that the memory location [64, 96) does not get modified after the first mstore, it is extremely hard for the optimizer to make this inference. Note that, for real-world contracts, it is not easy even for humans to make such claims. We experimented with using an SMT solver to make such inferences here. Even though it is functional, it creates a new set of problems: trusting an external tool (here, the Z3 SMT solver) for generating assembly code, compilation and optimization performance, verifying the results of the solver, and producing deterministic results. Ideally, we would want a simpler solver written specifically to help perform such optimizations. An attempt towards this can be found here.