Adding HashMap-style storage layout to Solidity

Have there been discussions in the past on adding enumerable HashMaps to Solidity?

Current mapping types are very gas-efficient but have several other drawbacks:

  1. Impossible to enumerate keys/values
  2. Unintuitive for devs coming from other languages
  3. Inaccessible to off-chain analysis tools

If there is interest in discussing this, I would be happy to help work on it (I have no experience with CPP, but I’ll manage).
I have a possible spec in mind (that still requires work), but I didn’t wanna share anything before I understand the history and see whether there is interest in implementing such a thing.

This is an inherent limitation of how mappings work. Mappings are relatively cheap because they’re not implemented as a complicated data structure but instead treat the whole storage as a massive array where collisions are unlikely as long as the keys you calculate are random enough.

Impossible to enumerate keys/values

The simple way around it is to store the keys separately if you actually need them. Implementing an iterable mapping is pretty simple. There’s even an example of that in the docs: Iterable Mappings.

You can also find a robust implementation of EnumerableMap in OpenZeppelin.

Inaccessible to off-chain analysis tools

Well, they’re accessible if you have access to a node and can inspect storage. But if you mean that they cannot be easily returned from functions, the issue is that mappings can be massive. A very common use case is to store something for every address that interacts with a contract.

I have a possible spec in mind (that still requires work), but I didn’t wanna share anything before I understand the history and see whether there is interest in implementing such a thing.

I think this is something that should be done in external libraries, not in the compiler. I’d recommend taking a look at OpenZeppelin’s EnumerableMap I linked above to see if you can improve it.

It’s not something that we’d add to the language itself though unless it’s an elementary building block for other data structures as the mapping currently is. More complex structures should be built by library authors and the goal of the compiler is only to give them tools to make that possible. For example we’d rather work on finally adding generics to the language than implement specific data structures ourselves.

I should have mentioned that I am familiar with current solutions. I am also familiar with mapping’s storage layout and the technical details of why they are not enumerable/iterable.

Current solutions like the ones linked are:

  1. Storage-inefficient (e.g. 5 slots per key-value pair for the IterableMapping)
  2. Run-time expensive (require iterating over keys array)
  3. Clumsy to implement (using a library for something most devs expect out-of-the-box; evidently - OpenZeppelin themselves don’t use these in their ERC-20/ERC-721 implementations)

Regarding data accessibility - due to the storage layout scheme used for mappings, even having direct access to the LevelDB doesn’t let you enumerate mappings. Actually - you can’t associate slots to variables even if you traverse the entire storage trie because the keys are ran through keccak to generate the slot.

What I have in mind is more akin to Java’s HashMap:

I believe we can use a scheme that lets us enumerate more cheaply and uses up to 3 slots per key/value pair (the more pairs, the closer the ratio trends to 2 slots per pair). And I’m sure we can even come up with something better than my trivial attempt :slight_smile: