Yul is a low-level language that can be used inline in Solidity via an assembly block, as a standalone language, and as a compilation target. Currently, the default dialect of Yul is the EVM dialect, so to harness its power you must first gain a deep understanding of how the EVM works and then master the layout standards that Solidity imposes.
Since the EVM is a stack-based virtual machine, it operates by a set of instructions that can be categorized into:
1- Stack Instructions
- The set of instructions that manipulate the position of values on the stack.
- Since Yul manages local variables and control flow, stack opcodes that interfere with these features are not available in Yul, except for a built-in `pop` function to drop variables.
- Examples of stack opcodes: `pushN`, `dupN`, `swapN`, and `jump`.
2- Arithmetic Instructions
- Pop two or more values from the stack, perform an arithmetic operation on them, and then push the result.
- Examples of arithmetic opcodes: `add`, `div`, `mul`, and `mod`.
3- Comparison Instructions
- Pop one or two values from the stack, perform a comparison, and push the result: either false (0) or true (1).
- Examples of comparison opcodes: `lt`, `gt`, `eq`, and `iszero`.
4- Bitwise Instructions
- Pop one or two values from the stack and perform a bitwise operation on them.
- Examples of bitwise opcodes: `and`, `or`, `xor`, and `not`.
5- Memory Instructions
- Read from and write to memory.
- Examples of memory opcodes: `mstore`, `mload`, and `mstore8`.
6- Read Context Instructions
- Read from the global state and the execution context.
- Examples of read context opcodes: `caller`, `sload`, and `chainid`.
7- Write Context Instructions
- Write to the global state and the execution context.
- Examples of write context opcodes: `call`, `create`, and `sstore`.
You can find a list of all opcodes used in Yul here.
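To make the categories above concrete, here is a minimal sketch that touches several of them from an assembly block. The contract name `OpcodeTour` and its functions are hypothetical and exist only for illustration; the compiler version in the pragma is an assumption.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract OpcodeTour {
    // Returns the larger of a and b without branching.
    function max(uint256 a, uint256 b) external pure returns (uint256 result) {
        assembly {
            // comparison: gt yields 1 when a > b, otherwise 0
            let aIsLarger := gt(a, b)
            // arithmetic and bitwise: b ^ ((a ^ b) * flag) is a when flag is 1, b when flag is 0
            result := xor(b, mul(xor(a, b), aIsLarger))
        }
    }

    // Writes a value to memory and reads it back.
    function roundTrip(uint256 value) external pure returns (uint256 out) {
        assembly {
            let ptr := mload(0x40) // memory: read the free memory pointer
            mstore(ptr, value)     // memory: write 32 bytes at ptr
            out := mload(ptr)      // memory: read the same 32 bytes back
        }
    }

    // Read context: the caller and the chain id.
    function context() external view returns (address who, uint256 chain) {
        assembly {
            who := caller()
            chain := chainid()
        }
    }
}
```

Note that no `push`, `dup`, or `swap` appears anywhere: Yul variables such as `aIsLarger` and `ptr` stand in for the stack positions.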
Note: we will switch between EVM instructions and Solidity layouts a lot in this article.
Master Solidity Layout for Efficient Assembly Coding
As per Solidity documentation, there are five standard layouts that every developer must be aware of. The crucial aspects of layouts are:
1- Storage Layout
Storage is persistent between function calls; writing to and reading from storage are the most expensive operations in terms of gas.
Contract storage is simply a key-value mapping: it maps a 32-byte key, which represents the position of a variable in storage, to a 32-byte value at that position, as in `sstore(key, value)`.
1–1. Layout of Statically-Sized Variables in Storage:
- Storage is organized in 32-byte slots; the first state variable is stored in slot zero.
- If the second variable fits into the same slot, it is packed into that slot, with the first item right-aligned and the next one placed to its left; otherwise it is stored in the next slot.
- Constant and immutable variables don't occupy a storage slot: constants are resolved at compile time, and immutables are written into the contract's code at construction time.
- Structs with statically-sized members follow the same rules, and their members can be packed to save gas as long as they fit within 32 bytes. The declaration of the struct type doesn't occupy any slot in storage, as it is only a blueprint for struct instances.
contract FixedSizeVariables {
uint256 private value1; // value1 = 1 in slot 0
uint256[2] private value2; // value2[0] = 2 & value2[1] = 3 in slot 1 & 2
uint128 private value3; // value3 = 4 in slot 3
uint128 private value4; // value4 = 5 in slot 3
uint8 private value5; // value5 = 6 in slot 4
uint8 private value6; // value6 = 7 in slot 4
}
// Storage Layout:
// 0x00: 0x0000000000000000000000000000000000000000000000000000000000000001
// 0x01: 0x0000000000000000000000000000000000000000000000000000000000000002
// 0x02: 0x0000000000000000000000000000000000000000000000000000000000000003
// 0x03: 0x0000000000000000000000000000000500000000000000000000000000000004
// 0x04: 0x0000000000000000000000000000000000000000000000000000000000000706
Let's assume each variable holds the value stated in the comments above:
- State variable `value1` is 1. Since the EVM operates on 32-byte words, 1 is left-padded with zeros to bytes32 and written with the `0x` prefix; it occupies slot 0.
- A fixed-size array of 2 elements, each of type `uint256`, occupies 2 slots: slot 1 and slot 2.
- State variables `value3` and `value4` are both of type `uint128`, so they are packed into one slot, which is slot number 3. `value3`, which is equal to 4, is right-aligned, the next variable is placed to its left, and so on.
The sizes of the value types are:
— uint256: 32 bytes.
— uint128: 16 bytes.
— uint64: 8 bytes.
— uint32: 4 bytes
— uint16: 2 bytes.
— uint8: 1 byte.
— bytes32: 32 bytes.
— address: 20 bytes.
— bool: 1 byte.
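The packing of `value3` and `value4` into slot 3 can be observed from inline assembly. The sketch below is a hypothetical contract (`PackedReader` and `readSlot3` are made-up names) that reuses the article's variables with their assumed values, loads the shared slot once, and separates the two halves using the `.slot` and `.offset` members that Solidity exposes for state variables in assembly.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract PackedReader {
    uint256 private value1 = 1;                  // slot 0
    uint256[2] private value2 = [uint256(2), 3]; // slots 1 and 2
    uint128 private value3 = 4;                  // slot 3, lower 16 bytes
    uint128 private value4 = 5;                  // slot 3, upper 16 bytes

    function readSlot3() external view returns (uint128 low, uint128 high) {
        assembly {
            // both variables live in the same slot, so one sload is enough
            let word := sload(value3.slot)
            // mask with 128 one-bits: 2^128 - 1
            let mask := sub(shl(128, 1), 1)
            // .offset is the byte offset inside the slot (0 for value3, 16 for value4)
            low := and(shr(mul(value3.offset, 8), word), mask)
            high := and(shr(mul(value4.offset, 8), word), mask)
        }
    }
}
```

Under these assumptions `readSlot3` should return 4 and 5, the two halves of the slot 3 word shown in the storage layout above.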
1–2. Layout of Dynamically-Sized Variables in Storage:
Reserving fixed slots doesn't work for dynamically-sized arrays and mappings because there is no way of knowing how many slots to reserve. Instead:
- A mapping element is stored at the slot obtained by concatenating the key and the mapping's storage slot, then hashing the result with keccak256.
- An array's length is stored in the slot the array was declared in; the array elements are stored sequentially somewhere else in storage, starting at the hash of the slot number where the array is declared.
- Bytes and strings of at most 31 bytes are packed into one slot together with their length: the data is left-aligned and the right-most byte holds the length multiplied by two. Longer values are stored the same way as arrays.
- Using elements smaller than 32 bytes in dynamically-sized variables may increase your contract's gas usage. This is because the EVM operates on 32 bytes at a time, so it consumes more gas to reduce the size of an element from 32 bytes to the desired size.
- Structs with dynamically-sized members follow the same storage rules, including the higher gas cost for elements smaller than 32 bytes.
contract DynamicSizeVariables {
mapping(address => uint256) private _balances; // account -> balance, slot 0
uint256[] private _values; // slot 1
string private _name; // slot 2
}
// Storage Layout:
// 0x00: 0x0000000000000000000000000000000000000000000000000000000000000000
// 0x01: 0x0000000000000000000000000000000000000000000000000000000000000002
// 0x02: 0x4a65726f6d65000000000000000000000000000000000000000000000000000c
// mapping elements:
// 0x3ddcac31351e0705625963ec259851464733fec321375bc6bada6a59752ea7c4: 0x00000000000000000000000000000000000000000000000000000000000004b0
// 0xbabeeff9e42c6a75123df37ff2f874914fb38fdf5076178f847844476f22232a: 0x0000000000000000000000000000000000000000000000000000000000000171
// array elements [50, 60]:
// 0xb10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6: 0x0000000000000000000000000000000000000000000000000000000000000032
// 0xb10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf7: 0x000000000000000000000000000000000000000000000000000000000000003c
Mapping in Slot 0:
- The slot a mapping occupies stays empty. Since `_balances` is the first state variable, it occupies slot 0 with an empty bytes32 as follows:
0x00: 0x0000000000000000000000000000000000000000000000000000000000000000
- Let's assume that the key address in the `_balances` mapping is `0x266626BC2bb7C645ce958DA731E2C3F4705E8d87`. An address occupies 20 bytes, so we pad it to 32 bytes by adding 12 zero bytes (24 hex zeros) on the left-most side as follows:
// Please note that the address is written in lowercase
000000000000000000000000266626bc2bb7c645ce958da731e2c3f4705e8d87
- Since the mapping occupies slot 0, the 32-byte representation of the slot index is:
0000000000000000000000000000000000000000000000000000000000000000
- Concatenate the key and the slot index of the mapping:
000000000000000000000000266626bc2bb7c645ce958da731e2c3f4705e8d870000000000000000000000000000000000000000000000000000000000000000
- Hash the concatenation with keccak256 to get the storage location of the element:
3ddcac31351e0705625963ec259851464733fec321375bc6bada6a59752ea7c4
- Let's assume that the balance of address `0x266626BC2bb7C645ce958DA731E2C3F4705E8d87` in the `_balances` mapping is 1200 (0x4b0), so we pad it to bytes32:
00000000000000000000000000000000000000000000000000000000000004b0
- Let's take a second key in the `_balances` mapping to sum up all the steps:
// address of the second account is
0x266626bc2bb7c645cc958cc731e2c34705e7f87
// pad the address to 32 bytes, without the 0x prefix
000000000000000000000000266626bc2bb7c645cc958cc731e2c34705e7f87
// index of mapping slot which is slot 0
0000000000000000000000000000000000000000000000000000000000000000
// concatenate key to the slot index
000000000000000000000000266626bc2bb7c645cc958cc731e2c34705e7f870000000000000000000000000000000000000000000000000000000000000000
// keccak256 of the concatenation is:
babeeff9e42c6a75123df37ff2f874914fb38fdf5076178f847844476f22232a
// balance of the address is 369 (0x171), padded to bytes32
0000000000000000000000000000000000000000000000000000000000000171
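The mapping steps above (pad the key, append the slot index, hash) can be sketched in inline assembly as follows. The contract `MappingSlot` and its functions are hypothetical; `_balances` is assumed to be the first state variable so that it sits in slot 0, as in the article.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract MappingSlot {
    mapping(address => uint256) private _balances; // slot 0

    function setBalance(address account, uint256 amount) external {
        _balances[account] = amount;
    }

    // Computes keccak256(pad32(account) . pad32(slot of _balances)).
    function balanceSlot(address account) public view returns (bytes32 slot) {
        assembly {
            mstore(0x00, account)         // key, left-padded to 32 bytes
            mstore(0x20, _balances.slot)  // slot index of the mapping
            slot := keccak256(0x00, 0x40) // hash the 64-byte concatenation
        }
    }

    // Reads the balance straight from the computed slot.
    function balanceOf(address account) external view returns (uint256 value) {
        bytes32 slot = balanceSlot(account);
        assembly {
            value := sload(slot)
        }
    }
}
```

After calling `setBalance`, `balanceOf` should return the same amount, which shows that the high-level mapping access and the hand-computed slot refer to the same storage location. The two `mstore` calls use the scratch space at `0x00` and `0x20`, which is enough for a 64-byte hash input.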
Array in Slot 1:
- The length of the `_values` array [50, 60] is 2, and it is declared in slot 1, so the slot of declaration stores the array's length on the right-most side.
// declared in slot 1 with 2 elements in length
0x01: 0x0000000000000000000000000000000000000000000000000000000000000002
- The array elements representation will be stored sequentially at the hash of the slot index of the array declaration which is slot 1, as follows:
Keccak256(0000000000000000000000000000000000000000000000000000000000000001) = b10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6
This is where the array element at index 0 is stored. Now it's time to store the element itself, which has the value 50, as follows:
0000000000000000000000000000000000000000000000000000000000000032
- The second element with a value of 60 will be stored right after the first element by incrementing the hash of the declaration slot, as follows:
storage location of the first element with index 0 was:
b10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6
storage location of the second element with index 1 will increment the hash to be: b10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf7
and bytes32 representation of 60 is:
000000000000000000000000000000000000000000000000000000000000003c
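The array lookup described above can be sketched the same way. `ArraySlot` is a hypothetical contract; the `_pad` variable is an assumption added only to push `_values` into slot 1 so that it matches the article's layout.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract ArraySlot {
    uint256 private _pad;      // slot 0, placeholder so that _values lands in slot 1
    uint256[] private _values; // slot 1 holds the length

    constructor() {
        _values.push(50);
        _values.push(60);
    }

    function elementAt(uint256 index) external view returns (uint256 length, uint256 value) {
        assembly {
            // the declaration slot stores the array length
            length := sload(_values.slot)
            // bounds check: revert without data if index >= length
            if iszero(lt(index, length)) {
                revert(0, 0)
            }
            // elements start at keccak256(pad32(slot)) and follow sequentially
            mstore(0x00, _values.slot)
            value := sload(add(keccak256(0x00, 0x20), index))
        }
    }
}
```

With the values from the article, `elementAt(0)` should return a length of 2 and the value 50, and `elementAt(1)` the value 60.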
String in Slot 2:
- The bytes32 representation of the `_name` value Jerome is
4a65726f6d650000000000000000000000000000000000000000000000000000
Its length of 6 characters multiplied by 2 equals 12, which is 0x0c in hexadecimal. This value is placed in the right-most byte, and with the `0x` prefix slot 2 becomes:
0x4a65726f6d65000000000000000000000000000000000000000000000000000c
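A short string can be decoded from its slot by reversing these steps. The following is a sketch that assumes the string is at most 31 bytes long, so that data and length share one slot; `ShortString` and `raw` are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract ShortString {
    string private _name = "Jerome";

    // Assumes the short-string layout (at most 31 bytes of data).
    function raw() external view returns (bytes32 word, uint256 length) {
        assembly {
            word := sload(_name.slot)
            // the right-most byte holds length * 2, so halve it
            length := shr(1, and(word, 0xff))
        }
    }
}
```

For "Jerome", `length` should come back as 6 and the left-most bytes of `word` should be `4a65726f6d65`.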
1–3. Layout of Inherited State Variables in Storage:
- For contracts that use inheritance, the order of state variables is determined by the C3-linearized order of contracts, starting with the parent contract and then the child contract.
- If a child contract has multiple parents, the order of storage starts with the most base-ward contract and proceeds in the order of inheritance.
- State variables from different inherited contracts can share the same storage slot when the packing rules above allow it.
contract First {
uint256 private x; // x = 0
}
contract Second {
uint256 private y; // y = 1
}
contract Third is First, Second {
uint256 private z; // z = 2
}
// storage Layout
// 0x00: 0x0000000000000000000000000000000000000000000000000000000000000000
// 0x01: 0x0000000000000000000000000000000000000000000000000000000000000001
// 0x02: 0x0000000000000000000000000000000000000000000000000000000000000002
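The slot order of inherited variables can be checked with the `.slot` member. This sketch is a variation of the contracts above with the variables made `internal` (an assumption, so that the child contract can reference them); the function `slots` is hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract First {
    uint256 internal x;
}

contract Second {
    uint256 internal y;
}

// Hypothetical variation of the article's example, for illustration only.
contract Third is First, Second {
    uint256 internal z;

    // Returns the storage slot index assigned to each variable.
    function slots() external view returns (uint256 sx, uint256 sy, uint256 sz) {
        assembly {
            sx := x.slot
            sy := y.slot
            sz := z.slot
        }
    }
}
```

Following the order First, Second, Third, `slots` should report 0, 1, and 2.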
2- Errors Layout
Solidity has a set of predefined errors, but starting from v0.8.4 developers can define custom errors by name and argument types. As a general rule, an error is encoded as the first 4 bytes of the hash of the error signature, followed by the error data, if any.
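// bytes4(keccak256('InsufficientBalance(uint256,uint256)')), left-aligned in 32 bytes
bytes32 constant insufficientBalanceSelector = 0xcf47918100000000000000000000000000000000000000000000000000000000;
// bytes4(keccak256('UnauthorizedCaller()')), left-aligned in 32 bytes
bytes32 constant unauthorizedCallerSelector = 0x5c427cd900000000000000000000000000000000000000000000000000000000;
error InsufficientBalance(uint256 available, uint256 required);
error UnauthorizedCaller();
// `view` rather than `pure`, because the function reads the caller and storage
function transfer(address to, uint256 amount) public view {
    assembly {
        if eq(caller(), to) {
            mstore(0x00, unauthorizedCallerSelector)
            revert(0x00, 0x04)
        }
        // balance slot = keccak256(caller . mapping slot); the balances mapping is assumed to be in slot 0
        mstore(0x00, caller())
        mstore(0x20, 0x00)
        let callerBalance := sload(keccak256(0x00, 0x40))
        if lt(callerBalance, amount) {
            // selector (4 bytes) followed by the two 32-byte arguments
            let ptr := mload(0x40)
            mstore(ptr, insufficientBalanceSelector)
            mstore(add(ptr, 0x04), callerBalance)
            mstore(add(ptr, 0x24), amount)
            revert(ptr, 0x44)
        }
    }
}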
- Kindly focus on error handling and disregard the details of the assembly code, as it will be explained later; but if you have any questions, feel free to post them in the issues tab in GitHub Repo.
- The hash of the error `InsufficientBalance(uint256,uint256)` is cf4791818fba6e019216eb4864093b4947f674afada5d305e57d598b641dad1d
- Taking the 4 left-most bytes as a selector: cf479181
- Padding to bytes32 and adding the `0x` prefix: 0xcf47918100000000000000000000000000000000000000000000000000000000
- The hash of the error `UnauthorizedCaller()` is 5c427cd9530cc2f15c24eb9ab95a0c7157bdefd597f18e0b4b4ed82a60681983
- Taking the 4 left-most bytes as a selector: 5c427cd9
- Padding to bytes32 and adding the `0x` prefix: 0x5c427cd900000000000000000000000000000000000000000000000000000000
- First sanity check: if the caller's address equals the destination's address, tested in the code block `if eq(caller(), to)`, we store the error selector at memory offset 0 with `mstore(0x00, unauthorizedCallerSelector)` and revert the function execution, returning the 4 bytes of the error to the end user.
- Second sanity check: if the caller's balance is less than the amount to be transferred, the function reverts with the `InsufficientBalance` error and the `available` and `required` amounts as data for the end user.
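For comparison, the same revert data can be produced without assembly. The sketch below is a hypothetical contract (`ErrorEncoding`, `encoded`, and `fail` are made-up names) that builds the selector-plus-arguments payload with `abi.encodeWithSelector` and raises the error with a plain `revert` statement.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

error InsufficientBalance(uint256 available, uint256 required);

// Hypothetical contract, for illustration only.
contract ErrorEncoding {
    // Returns the 4-byte selector followed by two 32-byte words (68 bytes in total).
    function encoded(uint256 available, uint256 required) external pure returns (bytes memory) {
        return abi.encodeWithSelector(InsufficientBalance.selector, available, required);
    }

    // Reverts with the same payload that encoded() returns.
    function fail(uint256 available, uint256 required) external pure {
        revert InsufficientBalance(available, required);
    }
}
```

The bytes returned by `encoded` start with the selector `cf479181` derived above, followed by the two padded amounts, which is the layout an assembly `revert` has to reproduce by hand.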
3- Memory Layout
While reading from and writing to memory is cheaper than storage, you still have to consider the cost carefully when writing to memory, because the cost of expanding memory grows quadratically; you can read more about gas in this guide.
Reading from memory is limited to a width of 256 bits, while writing can be either 8 bits or 256 bits wide. Solidity reserves four 32-byte slots as follows:
- `0x00` (32 bytes): scratch space
- `0x20` (32 bytes): scratch space
- `0x40` (32 bytes): free memory pointer
- `0x60` (32 bytes): zero slot
The 64 bytes of scratch space are used for hashing methods, so anything written there can be overwritten and should only be short-lived; the zero slot should never be written to. When coding in inline assembly, longer-lived writes to memory should always start at the free memory pointer, which is why we first load it with `mload(0x40)`.
It is worth noting that variables are stored differently in memory than in storage:
- Elements of memory arrays always occupy multiples of 32 bytes, and this is even true for `bytes1[]`. A dynamic array is referenced by a pointer; at that location one slot holds the length, followed by one slot for each element sequentially.
- `string` and `bytes` are also referenced by a pointer; one slot stores the length, and the actual data follows, tightly packed and aligned to the left. A value of up to 32 bytes therefore involves 3 slots: the pointer, the length, and one slot of data.
Example of how variables are stored differently in memory:
uint8[4] public ids;
In storage: the above array occupies 1 slot (4 * 1 byte = 4 bytes, which fit into a single 32-byte slot)
In memory: the same array occupies 4 slots ( 4 * 32 = 128 bytes)
struct Person {
uint256 amount;
uint256 id;
uint8 rank;
uint8 deposit;
}
In storage: 3 slots in total, one for each uint256 and one shared by the two uint8 values.
In memory: 1 slot for each variable, 4 slots in total.
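The memory rules above can be sketched by allocating a two-element `uint256[]` by hand. `MemoryAlloc` and `pair` are hypothetical names; the code reads the free memory pointer, writes the length and the elements, and then advances the pointer past the data it used.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract MemoryAlloc {
    function pair(uint256 a, uint256 b) external pure returns (uint256[] memory arr) {
        assembly {
            arr := mload(0x40)           // start of free memory becomes the array pointer
            mstore(arr, 2)               // first slot: the length
            mstore(add(arr, 0x20), a)    // second slot: element 0
            mstore(add(arr, 0x40), b)    // third slot: element 1
            mstore(0x40, add(arr, 0x60)) // move the free memory pointer past the 3 slots
        }
    }
}
```

Updating the pointer at `0x40` is what keeps later Solidity allocations from overwriting the array.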
4- Calldata Layout
As per the ABI standard, calldata starts with the function selector: the first four bytes of the Keccak-256 hash of the function signature. The signature is the function name followed by the parenthesized list of parameter types; the return type of a function is not part of this signature.
Parameter types are separated by a single comma with no spaces, and each argument is padded to 32 bytes. If an argument is of dynamic size, its 32-byte slot holds a pointer to the dynamic value.
Solidity supports all the ABI types with the exception of tuples; on the other hand, some Solidity types are not supported by the ABI but are represented by alternative types as follows:
- address payable: represented as address
- contract: represented as address
- enum: represented as uint8
- struct: represented as a tuple
How to Encode Different Argument Types and Hash the Function Selector
function baz(uint32 x, bool y) public pure returns (bool r) {
r = x > 32 || y;
}
- The 32-byte hash of the signature of the function above is `keccak256('baz(uint32,bool)')`, which equals 0xcdcd77c0992ec5bbfc459984220f8c45084cc24d9b6efed1fae540db8de801d2
- Taking the four left-most bytes as the function selector or ID: `0xcdcd77c0`
- Encode the first parameter; let's say it has a value of `69` (0x45), padded to 32 bytes: `0x0000000000000000000000000000000000000000000000000000000000000045`
- The second parameter is `true`, which always has the value of `1`, padded to 32 bytes: `0x0000000000000000000000000000000000000000000000000000000000000001`
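The selector and the two padded words can be built and read back on-chain. In this sketch the contract `BazCalldata` and the function `inspect` are hypothetical: `encodeBaz` produces the calldata for the `baz` call described above, and `inspect` shows how the selector and the arguments of its own call are laid out by reading them with `calldataload`.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract BazCalldata {
    // 4-byte selector followed by two 32-byte words: 68 bytes in total.
    function encodeBaz() external pure returns (bytes memory) {
        return abi.encodeWithSignature("baz(uint32,bool)", uint32(69), true);
    }

    // Reads the raw calldata of this very call.
    function inspect(uint32, bool) external pure returns (bytes4 selector, uint256 first, uint256 second) {
        assembly {
            // keep only the 4 left-most bytes of the first calldata word
            selector := and(calldataload(0), shl(224, 0xffffffff))
            first := calldataload(4)   // first argument, right after the selector
            second := calldataload(36) // second argument, 32 bytes later
        }
    }
}
```

Calling `inspect(69, true)` should return the selector of `inspect(uint32,bool)` (not the one of `baz`), followed by 69 and 1.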
function bar(bytes3[2] memory) public pure {}
- `keccak256('bar(bytes3[2])')` is fce353f601a3db60cb33e4b6ef4f91e4465eaf93c292b64fcde1bf4ba6819b6a
- Function selector: `0xfce353f6`
- The first element of the array, of value `abc`, is encoded as `0x6162630000000000000000000000000000000000000000000000000000000000`
- The second element of the array, of value `def`, is encoded as `0x6465660000000000000000000000000000000000000000000000000000000000`
function sam(bytes memory, bool, uint[] memory) public pure {}
If we wanted to call `sam` with the arguments `"dave"`, `false`, and `[1,2,3]`:
- `keccak256('sam(bytes,bool,uint256[])')` is 0xa5643bf27e2786816613d3eeb0b62650200b5a98766dfcfd4428f296fb56d043, noting that type `uint[]` is encoded as type `uint256[]`
- The function selector: `0xa5643bf2`
- The first argument is dynamic, so its slot is a pointer to the dynamic parameter, measured in bytes from the start of the arguments block: `0x0000000000000000000000000000000000000000000000000000000000000060`
- The second argument is false, which is always zero: `0x0000000000000000000000000000000000000000000000000000000000000000`
- The third argument is of a dynamic type, so its slot points to the location of the dynamic data: `0x00000000000000000000000000000000000000000000000000000000000000a0`
- Then the length of the first argument `dave`, which is 4: `0x0000000000000000000000000000000000000000000000000000000000000004`
- Then the bytes32 representation of `dave`: `0x6461766500000000000000000000000000000000000000000000000000000000`
- The length of the third argument, which is 3: `0x0000000000000000000000000000000000000000000000000000000000000003`
- Then the first value of the array, `1`: `0x0000000000000000000000000000000000000000000000000000000000000001`
- The second value of the array, `2`: `0x0000000000000000000000000000000000000000000000000000000000000002`
- The last value of the array, `3`: `0x0000000000000000000000000000000000000000000000000000000000000003`
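The two pointers in the `sam` encoding can be checked by encoding the call and reading the head words back from memory. This is a sketch with hypothetical names (`SamEncoding`, `encodeSam`, `headOffsets`); it relies on `abi.encodeWithSignature` following the ABI rules described above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract SamEncoding {
    function encodeSam() public pure returns (bytes memory data) {
        bytes memory name = "dave";
        uint256[] memory nums = new uint256[](3);
        nums[0] = 1;
        nums[1] = 2;
        nums[2] = 3;
        data = abi.encodeWithSignature("sam(bytes,bool,uint256[])", name, false, nums);
    }

    function headOffsets() external pure returns (uint256 bytesOffset, uint256 arrayOffset) {
        bytes memory data = encodeSam();
        assembly {
            // skip the 32-byte length word of `data` and the 4-byte selector
            let args := add(data, 0x24)
            bytesOffset := mload(args)            // first head word
            arrayOffset := mload(add(args, 0x40)) // third head word
        }
    }
}
```

`headOffsets` should return 0x60 and 0xa0: the head is three words long (0x60 bytes), and the `bytes` value takes two more words (length and padded data) before the array begins.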
5- Events Layout
As per the ABI standard, events are stored in log entries, which include the contract's address, a series of topics, and some arbitrary binary data. Note that the address of the contract is provided internally and needs no manual encoding.
An event has a name and a series of event parameters; indexed parameters are called topics and non-indexed parameters are called the data.
An event can have up to four topics; the first topic is the keccak256 hash of the event signature, and the rest are based on the actual event parameters.
Non-indexed parameters, or arbitrary data, are stored in memory and then passed to the log instruction as a pointer to the start of the data and the length of the data.
event Transfer(address indexed sender, address indexed receiver, uint256 amount);
function transfer(address to, uint256 amount) public returns (bool) {
    _transfer(msg.sender, to, amount);
    emit Transfer(msg.sender, to, amount);
    return true;
}
That's the `transfer` function from ERC20 in Solidity; the event can be coded in inline assembly as follows:
event Transfer(address indexed sender, address indexed receiver, uint256 amount);
function transfer(address to, uint256 amount) public returns (bool) {
    // hash of the event signature, used as the first topic
    bytes32 transferHash = keccak256("Transfer(address,address,uint256)");
    assembly {
        // amount is not indexed, so it is stored in memory as the log data
        mstore(0x00, amount)
        // log3 takes 3 topics: the signature hash, the sender, and the receiver
        // `0x00` is the memory pointer to the data
        // `0x20` is the 32-byte length of amount
        log3(0x00, 0x20, transferHash, caller(), to)
    }
    return true;
}
For indexed parameters that are not value types, the topic stores the Keccak-256 hash of an encoding that is defined as follows:
- Dynamically and statically-sized arrays as indexed parameters are encoded as the concatenation of the encoding of their elements, always padded to 32 bytes and without any length prefix.
- `bytes` and `string` encodings are just the contents without padding or length prefixes.
- `struct` encoding is the concatenation of its members, always padded to 32 bytes, even `bytes` and `string`.
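As a small illustration of the `string` rule, the sketch below emits an event with an indexed string and exposes the value that should appear as its topic. Per the ABI specification, the topic of an indexed dynamic parameter is the Keccak-256 hash of its encoding, which for a string is just its raw contents. The contract `IndexedString`, the event `Named`, and both functions are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract IndexedString {
    event Named(string indexed name, uint256 value);

    // topic 0 is the event signature hash; topic 1 is the hash of the string contents
    function emitNamed(string calldata name, uint256 value) external {
        emit Named(name, value);
    }

    // The value expected in topic 1 for a given name.
    function topicFor(string calldata name) external pure returns (bytes32) {
        return keccak256(bytes(name));
    }
}
```

Comparing topic 1 of the emitted log with the result of `topicFor` for the same input is a quick way to confirm the rule; the original string itself cannot be recovered from the log.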
Now let’s overview everything we learned so far in the access storage contract, link to the source code in GitHub: