Solidity Team AMA #2 on Wed, 10th of March 2021

Hi everybody,

We are the Solidity team and this is our second AMA (but the first one hosted on this very forum)! You can find a writeup of the first AMA, which was conducted in 2020, here.

Your questions will be answered throughout the day (European time zones) on Wednesday, 10th of March 2021.

A few guidelines & tips for a smooth AMA

  • Feel free to post your questions beforehand to add them to the queue already! We won’t answer questions added after 10th of March and will close this thread once the AMA has concluded.
  • Ask questions using the blue Reply button at the very bottom.
  • Only ask one question per reply. If you have multiple questions, use multiple replies.
  • Check existing questions before you post yours to avoid asking the same question multiple times.
  • Avoid replying to somebody else’s post. If you’d like to discuss a related topic in more detail, create a new topic in this forum.
  • Keep your questions on topic; we will only reply to questions concerning Solidity and the Solidity ecosystem and will remove off-topic posts.

Topic

Ask any question you might have about the Solidity compiler, language design, Yul, or, for example, as recently discussed on Twitter, the SMT Checker.

Useful links

Looking forward to hearing your questions! :incoming_envelope:


… And if you want to get to know the team better, have a look at our “Meet the Solidity Team” post! :man_technologist: :woman_technologist:


Hi Team…

I’ve been using Solidity version 0.6.8. We haven’t launched the contracts yet, but we plan to do so in the next week or so. I want to know if you think it’s worth updating to 0.8.2.

The reason I am asking is that not many things would have to change. The things that need to change are:

  • instead of SafeMath, we would use plain + and -.
  • in some places, explicit conversion is disallowed (so address(-1), uint(-1) and things like these have to change).
  • address(bytes20(addr)) changes to payable(addr)

Do you think it makes sense to update to 0.8, in the sense that the 0.8 compiler might optimize much better and emit fewer opcodes, resulting in a gas cost reduction? If so, then it’s worth it. If not, the changes I pointed out above won’t make the code much cleaner or safer.

Thanks a lot in advance.

It seems like we don’t have to use the public modifier for constructors in the 0.8 versions.

Is this the appropriate behaviour?

If you could design Solidity from scratch again, what would be different and which aspects would you keep the same?

Reading and writing storage is of course very gas expensive. So caching these operations in memory seems like a very useful optimization, but AFAIK Solidity doesn’t attempt to do this. It’s of course possible to cache manually using local variables, but it makes the code more ugly and harder to understand, especially if the same state is accessed across multiple functions. Have you considered implementing such an optimization and what are the challenges there?


Are there plans to allow trailing commas in multi line expressions such as function calls? Many other languages support this, including JavaScript, Go, and Python.

One pattern I keep running across is the attempt to write functions to be stateless or as close to stateless as you can get to avoid storage costs. The result is a lot of things getting packed into calldata, but with odd workarounds to try and accommodate the restrictions of calldata. Are there any plans to expand calldata manipulations, for example, being able to .pop calldata arrays?

Or are there other ideas for packing and slicing calldata more easily? I’m really curious about any thoughts in this area.

My question may be a bit broad and could be summarized as: What’s the current state of YUL?

Like, to what extent is YUL already used when compiling solidity code? Since --optimize-yul is deprecated and the help says --optimize enables the YUL optimizer it seems that YUL is already integrated in the default compilation pipeline. I can imagine it is more nuanced and would be interested to understand the current state of the transition to YUL better.

Also, looking through solidity’s Github issues I noticed proposals to add inheritance, structs, tuples etc to YUL. These are proposals / feature requests and I understand that many potential features may still be undecided but I wonder how many higher level features the team is planning to add to YUL? Is the plan to keep YUL a lower level language used as an intermediate language or do you foresee YUL being used as a language to implement contracts directly?


One of my favorite features that’s been recently added to Solidity is the “immutable” keyword, for setting storage variables in the constructor that are written directly to the stored bytecode. This is a really powerful feature for cutting down on gas costs.

Are there any plans to expand upon this behavior? Structs are a useful tool for avoiding “stack too deep” errors, would it be possible to have immutable structs?

Are there plans to allow trailing commas in multi line expressions such as function calls? Many other languages support this, including JavaScript, Go, and Python.

Can you be more specific, please? Are you talking about f(a,b,c,) or things like a=2,b=3?
In general, I don’t see a big benefit in either of the two. If you could explain the motivation, we could discuss it.

My question may be a bit broad and could be summarized as: What’s the current state of YUL?

Like, to what extent is YUL already used when compiling solidity code? Since --optimize-yul is deprecated and the help says --optimize enables the YUL optimizer it seems that YUL is already integrated in the default compilation pipeline. I can imagine it is more nuanced and would be interested to understand the current state of the transition to YUL better.

Also, looking through solidity’s Github issues I noticed proposals to add inheritance, structs, tuples etc to YUL. These are proposals / feature requests and I understand that many potential features may still be undecided but I wonder how many higher level features the team is planning to add to YUL? Is the plan to keep YUL a lower level language used as an intermediate language or do you foresee YUL being used as a language to implement contracts directly?

At the current stage, Yul is used heavily for internal routines like the ABI coder, overflow-checked arithmetic and everything more complicated that has been added over the course of the last year. You can access these routines as an isolated file when requesting the evm.bytecode.generatedSources or evm.deployedBytecode.generatedSources fields via Standard-Json.

Apart from internal routines, we also re-wrote the entire Solidity code generator to go through Yul. Since the beginning of the year, we have 100% coverage on our semantic tests. You can request the generated Yul code using solc --ir or solc --ir-optimized --optimize. These options only export the Yul code. To switch over the whole compilation pipeline, you can use solc --experimental-via-ir.

We are still working on carrying all metadata and debugging information across the new pipeline and we would also like to improve the optimizer so that the new code is cheaper in most cases and at least not much more expensive in rare cases. We are currently conducting gas tests to get a good picture.

If anyone has Solidity code that is already compiling in 0.8.x, we would be grateful for a pointer so that we can tune the optimizer for “real world” code.

There has been some discussion in connection with YulPlus, but most of the proposals are on ice. One thing we would still like to improve is the memory management. If you make memory management more explicit, it is easier to move stack variables to memory or reuse unused memory.

Yul’s purpose is to be an intermediate or assembly language that is auditable. It can be written by hand (see inline assembly), but should usually be generated by other programs.


In general, it is always recommended to use the latest release of the compiler, if at all possible. If, for whatever reason, you want to stick with older versions, be sure to check the List of Known Bugs in the Solidity documentation, since bug fixes will not be backported across breaking releases.

That being said, what you can expect gas-wise can vary for any contract at hand. There have been some significant improvements in the optimizer in the latest releases, but on the other hand changes like checked arithmetic and ABIEncoderV2 by default may also increase the gas cost of a given contract.
In case your contracts previously used SafeMath and ABIEncoderV2 already, you can expect gas savings. If not, you can still switch back to the old ABI encoder using pragma abicoder v1; and use unchecked blocks in places where you had not used SafeMath before, and you may arrive at gas savings that way. In general, when using the same amount of safety features, gas usage is expected to improve with newer versions.
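To make the migration points from the question and the opt-outs mentioned above a bit more concrete, here is a minimal sketch for 0.8.x. The contract and function names are made up for illustration, and the `pragma abicoder v1;` line is only needed if you actually want the old encoder.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;
// Optional: switch back to the old ABI coder (v2 is the default since 0.8.0).
pragma abicoder v1;

// Hypothetical contract, for illustration only.
contract MigrationSketch {
    // 0.6.x: uint256(-1). In 0.8.x, use type(T).max instead.
    uint256 public constant MAX = type(uint256).max;

    // Checked by default: reverts on overflow, no SafeMath needed.
    function addChecked(uint256 a, uint256 b) public pure returns (uint256) {
        return a + b;
    }

    // Opt out of the checks where wrapping is intended
    // or where overflow is known to be impossible.
    function addWrapping(uint256 a, uint256 b) public pure returns (uint256) {
        unchecked {
            return a + b;
        }
    }

    // 0.6.x: address(-1). In 0.8.x, go through uint160 explicitly.
    function maxAddress() public pure returns (address) {
        return address(type(uint160).max);
    }

    // Converting address to address payable is done with payable(addr).
    function toPayable(address addr) public pure returns (address payable) {
        return payable(addr);
    }
}
```

Whether the unchecked variant is worth it depends on the contract: it only saves the cost of the overflow check, so it is best reserved for spots where the check is provably redundant.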


Specifying the visibility of constructors, i.e. the use of public or internal for constructors, was deprecated in 0.7.0. The reason is that the mechanism overlaps with specifying that a contract itself is abstract or not. I.e. the only difference between a contract with internal and public constructor was that instantiating a contract with internal constructor directly (without inheriting from it) was impossible. But that’s also what it means for a contract to be abstract, so declaring a constructor internal has exactly the same effect as declaring the contract abstract.

Hence, to avoid having multiple ways to express the same thing in the language, constructors are now public by default. In cases in which you would previously have made the constructor internal, the recommendation is to make the contract itself abstract instead.

Specifying the visibility of a constructor will likely be disallowed entirely in Solidity 0.9.
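As a small sketch of that recommendation (all names here are hypothetical), a base contract that used to declare an internal constructor can be marked abstract instead, and the constructor no longer carries a visibility specifier:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Before 0.7.0 one might have written:
//     contract Base { constructor(uint256 v) internal { ... } }
// Hypothetical names, for illustration only.
abstract contract Base {
    uint256 public value;

    // No visibility specifier needed.
    constructor(uint256 initialValue) {
        value = initialValue;
    }
}

contract Derived is Base {
    // The base constructor is invoked as part of Derived's construction.
    constructor() Base(42) {}
}

contract Factory {
    function create() public returns (Derived) {
        // `new Base(1)` would not compile, because Base is abstract.
        return new Derived();
    }
}
```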

If you could design Solidity from scratch again, what would be different and which aspects would you keep the same?

Most design decisions were forced by the design of the EVM, so there was actually not too much wiggle room on the semantic side.
Some random details that I personally find annoying but are hard to change now:

  • event identifiers use a full 32 bytes
  • some aspects of the ABI
  • function identifiers could be a bit longer
  • we don’t need all the different bit-width types and small types do not provide a big benefit

Also some of the gas costs were not foreseeable at the point Solidity was designed. For example, we thought libraries might be a good way to split code, but delegatecall turned out just way too expensive for that purpose.


There was a similar question in the previous AMA. See the answer for “Does it make sense to cache arr.length?”

Solidity currently doesn’t perform any caching of values read from storage in memory. However, in some cases, values are cached on the stack. In particular, we can sometimes replace sload(key) and mload(key) with a variable on the stack.

The default codegen and its optimizer cannot really cache values read from storage, but the default optimizer can still occasionally avoid loading the same value from storage multiple times. For example, in the following contract, the assembly would only contain a single sload.

contract C {
	uint x;
	function read_twice() public returns (uint a, uint b) {
		a = x;
		b = x;
	}
}

One can verify this by looking at the assembly generated by solc --asm --optimize contract.sol.

However, the upcoming Yul codegen and Yul optimizer can cache significantly more cases. Examples of this can be found in our test suite. One can verify this by looking at the IR generated in the upcoming compilation pipeline: solc --ir-optimized --optimize contract.sol. Feel free to make suggestions on optimization opportunities that we miss in your contracts, especially when the rules are generic.

Also, in principle, operations on storage can often be ‘cached’ in memory, with the result written back to storage at the end. This, however, may complicate existing code. This is illustrated in the following contract where we sort a storage array:

contract  C {
	uint[3] arr;
	function sort_arr() public {
		uint[3] memory arr_copy = arr;

		// now perform sorting on arr_copy

		arr = arr_copy;
	}
}

The above example can save some storage-loads and storage-writes (i.e., sload and sstore respectively) when compared to performing the sorting directly on the storage array.

Regarding the last question about the current state of these efforts and challenges faced: most of our current efforts around optimization revolve around the Yul optimizer. This is because at some point in the near future, Solidity will switch to the Yul compilation pipeline, and the Yul optimizer is more modular and powerful than the bytecode-based optimizer. Also, doing this in the current codegen and bytecode-based optimizer is difficult, because the generated bytecode is devoid of high-level information. For example, function calls and the if statement would involve the jump or jumpi opcode. Additionally, the bytecode contains stack operations, such as dup5 or swap10. Both of these make the analysis harder, and therefore make the bytecode optimizer less powerful.

An example of the challenges we face is determining when a cached value is invalidated. In the following example for caching storage reads, the optimizer currently doesn’t infer that the function f only writes to storage slot 100 and that the value previously read from storage slot 0 is therefore still safe to reuse.

function f()
{
    sstore(100, 0)
}
let x := sload(0)
// assume that f cannot be inlined
f()
// can we replace the following by y := x?
let y := sload(0)

The above example is extremely simple and the Yul inliner would inline the function, allowing the replacement y := x. However, with more complicated functions that write to a storage slot, this is harder to reason about, especially when the slot that is written depends on the argument of the function, for example when writing a certain array index.

An even more interesting situation occurs when trying to cache memory loads in the stack, i.e., mload. Here is a Yul snippet on how Solidity copies function arguments from calldata to memory.

mstore(64, 128)
let _1 := 0
// Does not invalidate location 64
// Writes to memory location [128, 128 + calldatasize())
// Reasonable bound to calldatasize(): 2**32
calldatacopy(128, _1, calldatasize())
// Can this be replaced by z := 128?
let z := mload(64)

Even though a human can easily see that the memory location [64, 96) does not get modified after the first mstore, it is extremely hard for the optimizer to make this inference. Note that, for real-world contracts, it is not easy even for humans to make such claims. We experimented with using an SMT solver to make such inferences here. Even though it is functional, it creates a new set of problems: trusting an external tool (here the Z3 SMT solver) for generating assembly code, compilation and optimization performance, verifying the results of the solver, and producing deterministic results. Ideally we would want to have a simpler solver that is written to help perform such optimizations. An attempt towards this can be found here.


I’d also add that you’ll be missing out on some nice features that have been added since 0.6.8:

  • Free functions and file-level constants.
  • Calldata parameters in internal functions, not only external ones (cheaper compared to memory arguments).
  • Assertion failures, out-of-bounds errors, overflow/underflow errors, etc. don’t eat up all the remaining gas in your transaction. You also get an error code indicating which of these happened. This does come at a cost of a slight increase in the bytecode size but will make your life a bit easier when debugging failures. You can even catch them and do your own error handling. This does not make much sense for assertion failures but might be a valid use case if you’re relying on built-in safe math for input validation instead of treating it as a sanity check.
  • SMTChecker got support for a ton of new functionality so if you’re using it, you definitely should upgrade.
  • You can change state mutability when overriding a virtual function.
  • Natspec for state variables.
  • Error messages and warnings now have proper error codes.
  • Lots of smaller improvements like fallback function being able to accept/return data via parameters, gwei as a unit or type(T).min/type(T).max helpers. Also tons of bugfixes which might not seem very important until you run into them yourself.
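The point about catching built-in errors can be sketched as follows. This is a minimal example under a few assumptions: the contract and function names are hypothetical, it requires at least 0.8.1 for the `catch Panic` clause, and it routes the addition through an external call on `this` because try/catch only works on external calls and contract creation.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.1;

// Hypothetical contract, for illustration only.
contract PanicSketch {
    // External so that it can be called via `this` inside try/catch.
    function add(uint256 a, uint256 b) external pure returns (uint256) {
        return a + b; // reverts with Panic(0x11) on overflow
    }

    function tryAdd(uint256 a, uint256 b)
        public
        view
        returns (bool ok, uint256 result, uint256 panicCode)
    {
        try this.add(a, b) returns (uint256 sum) {
            return (true, sum, 0);
        } catch Panic(uint256 code) {
            // 0x11 is the code for arithmetic overflow/underflow.
            return (false, 0, code);
        }
    }
}
```

Failures that are not panics (for example a plain revert with a reason string) are not matched by this clause and propagate to the caller as usual.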

I’m talking about f(a,b,c,), and probably only if it’s broken across many lines. A trailing comma would make it slightly easier to add another parameter at the end, or to swap/remove the last one. It’s not only useful for function calls, but also event definitions, struct instantiations, etc. Consider the following:

event SomeEvent(
    uint256 a,
    uint256 b
);

If I want to swap the order of a and b, I cannot just swap the lines. I also have to add a comma after b and remove the one after a. It’s of course only a tiny inconvenience, but still an inconvenience.

Other arguments I’ve heard are smaller diffs and a little bit more consistency.

I’m curious about your impression of the current state of 3rd party tooling around Solidity and more generally smart contract development, e.g., editor support, debuggers, linters, etc. Are you mostly happy or do you think it is lacking? Do you have any favorite tools, do you miss anything in particular? Is there a lot of exchange between the Solidity team and third party teams or is everyone mostly working on their own?

Ok, I see what you are getting at. Yeah, this is a controversial topic, also inside the team. My personal take is that this also makes it easier to remove elements without causing an error for overloaded functions or events.

One thing that might not be obvious to everyone is what the difference between public and internal constructors actually was and why visibility was used to express it. For functions the distinction is very clear. External functions are a part of the contract’s ABI and their parameters need to be encoded in a special way. They must be callable from the outside so they’re always included in the dispatch code that checks the selector and determines where to pass control. Internal functions on the other hand have fewer restrictions on their parameters (they can accept and return types that cannot be ABI-encoded) and they might be entirely skipped by the compiler if it determines that they’re unused.

For a constructor this distinction does not apply, because it does not even exist as a callable function after the contract is deployed. The only way you could “call” a constructor would be by using the new C() syntax, but that does not have the same semantics as a function call. The constructor is only included in the bytecode sent in the data field of the transaction that creates the contract. We call this the creation bytecode. It runs once and its only purpose is to produce another piece of bytecode. This is the deployed (or runtime) bytecode and that’s what actually ends up on the chain. In the simplest case the deployed bytecode is embedded directly in the creation bytecode and just copied. But it can also be modified during construction. This happens, for example, if your code contains immutable variables, whose values are embedded directly in the deployed bytecode.

You could stretch the concept and say that when creating a contract you’re calling its constructor externally and when your constructor calls another one from an inherited contract it’s an internal call. A contract with an external constructor would be one you effectively cannot inherit from, while one with an internal constructor would be one you cannot create. Only the latter case is useful in practice, which is why at first we made it possible to make the constructor either internal or public. At some point we independently introduced abstract as another restriction you could put on contracts and we thought it was a clearer notion that also covered the same use case and more. This made specifying constructor visibility completely redundant.
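To illustrate the remark about immutable variables and the creation bytecode, here is a minimal sketch with hypothetical names. The constructor runs only as part of the creation bytecode, and the value it assigns ends up embedded in the deployed bytecode rather than in a storage slot.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contracts, for illustration only.
contract Owned {
    // Assigned once during construction and embedded in the deployed bytecode.
    address public immutable owner;

    constructor() {
        owner = msg.sender;
    }
}

contract Deployer {
    function deploy() public returns (Owned) {
        // `new` runs the creation bytecode of Owned, constructor included.
        // Inside that constructor, msg.sender is this Deployer contract.
        return new Owned();
    }
}
```

Reading owner afterwards does not need an sload, which is where the gas savings of immutable variables come from.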
