All About Anchor Account Size
TL;DR
Smart contracts using Anchor require developers to allocate space for new accounts and specify the account size. Anchor provides guidelines for calculating the size based on the account structure, but many developers use std::mem::size_of instead, as they don't have to manually update the size when making changes to the account structure. Are they equivalent? In this blog post, we conduct a systematic comparison of the results produced by std::mem::size_of and the Anchor space reference.
While std::mem::size_of generally works well and tends to reserve more space than necessary, there are scenarios where it returns a size smaller than the serialized account needs. Developers should therefore understand the differences between the two methods before choosing one.
1. The Comparison
In Anchor, developers commonly utilize std::mem::size_of::<T>() to determine the size of the Account struct. However, this function may not produce an exact match with the size of the data stored in the Account after serialization with BorshSerialize.
The table below provides a summary of the size calculation for different data types using both methods. While most types give the same results, discrepancies may arise for certain data types, such as Vector and Enum, where std::mem::size_of::<T>() may return a smaller value.
Note: all experiments in this blog post were done on a 64-bit machine.
| Type | std::mem::size_of::<T>() | Anchor Space |
|---|---|---|
| bool | 1 | 1 |
| u8/i8 | 1 | 1 |
| u16/i16 | 2 | 2 |
| u32/i32 | 4 | 4 |
| u64/i64 | 8 | 8 |
| u128/i128 | 16 | 16 |
| [T; amount] | space(T) * amount | space(T) * amount |
| Pubkey | 32 | 32 |
| Vec<T> | 24 | 4 + space(T) * amount |
| String | 24 | 4 + string length |
| Option<T> | pad(space(T)) when T is String/Vec/Box; pad(1 + space(T)) otherwise | 1 + space(T) |
| Enum | see below | 1 + largest variant size |
Enum: size_of returns 0 for zero-variant enums, space(variant) for single-variant enums, and pad(space(max discriminant) + space(max variant)) otherwise.
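As a quick sanity check of the fixed-size rows, the sketch below compares size_of against the "Anchor Space" values from the table. The `PubkeyBytes` alias is a hypothetical stand-in for Solana's Pubkey (a 32-byte array), used so the snippet runs without external crates. The Vec and String figures it prints assume a 64-bit target.

```rust
use std::mem::size_of;

// Hypothetical stand-in for Solana's Pubkey, which wraps a 32-byte array.
type PubkeyBytes = [u8; 32];

/// Returns (type name, size_of value, Anchor space from the table)
/// for the fixed-size types.
fn fixed_size_rows() -> Vec<(&'static str, usize, usize)> {
    vec![
        ("bool", size_of::<bool>(), 1),
        ("u8", size_of::<u8>(), 1),
        ("u16", size_of::<u16>(), 2),
        ("u32", size_of::<u32>(), 4),
        ("u64", size_of::<u64>(), 8),
        ("u128", size_of::<u128>(), 16),
        ("[u16; 4]", size_of::<[u16; 4]>(), 2 * 4),
        ("Pubkey", size_of::<PubkeyBytes>(), 32),
    ]
}

fn main() {
    for (name, mem, anchor) in fixed_size_rows() {
        println!("{name:<10} size_of = {mem:>2}, Anchor space = {anchor:>2}");
    }
    // Heap-backed types: size_of only sees the (pointer, capacity, length) header.
    println!("Vec<u64>   size_of = {}", size_of::<Vec<u64>>());
    println!("String     size_of = {}", size_of::<String>());
}
```

For these fixed-size types the two columns agree, which is why size_of looks like a safe shortcut until heap-backed or tagged types enter the account.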
2. The Differences
The size_of function in Rust returns a fixed value based on the size information of the type and may not provide an accurate calculation of the memory occupied for some objects.
2.1. Vec and String
            pointer    capacity    length
          +----------+----------+----------+
          |  0x0123  |    4     |    2     |
          +----------+----------+----------+
               |
               v
Heap      +--------+--------+--------+--------+
          |  'a'   |  'b'   | uninit | uninit |
          +--------+--------+--------+--------+
- Vector: when calculating the size of a Vector using size_of, the function returns 24, since a Vector in memory comprises three fields: the pointer to the data, the capacity, and the length, each 8 bytes in size.
- String: similarly, size_of returns 24 for a String, because a String in memory consists of the same three fields: the pointer to the data, the capacity, and the length, each 8 bytes in size.
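To make the contrast concrete, here is a minimal sketch that measures the in-memory header of two vectors of different lengths and compares it with a hand-rolled, Borsh-style encoding (a u32 little-endian length prefix followed by the elements). The `encode_bytes` helper is hypothetical and only imitates what BorshSerialize does for a byte vector; it is not the borsh crate's API.

```rust
use std::mem::size_of_val;

/// Hand-rolled Borsh-style encoding of a byte vector (illustrative only):
/// a u32 little-endian length prefix followed by the elements.
fn encode_bytes(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + data.len());
    out.extend_from_slice(&(data.len() as u32).to_le_bytes());
    out.extend_from_slice(data);
    out
}

fn main() {
    let short: Vec<u8> = vec![1, 2];
    let long: Vec<u8> = vec![0; 100];

    // The in-memory header has the same size no matter how many elements exist.
    println!("header: {} vs {}", size_of_val(&short), size_of_val(&long));

    // The serialized form grows with the content: 4 + len bytes.
    println!(
        "serialized: {} vs {}",
        encode_bytes(&short).len(), // 6
        encode_bytes(&long).len()   // 104
    );

    // A String serializes the same way: 4-byte length + UTF-8 bytes.
    let name = String::from("ab");
    println!("string serialized: {}", encode_bytes(name.as_bytes()).len()); // 6
}
```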
Example
pub struct Example {
    pub example: Vec<T>,
}
When calculating the size of the Example account that contains a vector of type T, some developers may use the formula std::mem::size_of::<Example>() + (std::mem::size_of::<T>() * example.len()).
This approach is valid as long as std::mem::size_of::<T>() is not smaller than space(T), and it even over-allocates by 20 bytes: size_of::<Example>() contributes 24 bytes for the vector header, whereas after serialization with BorshSerialize the vector occupies only its length prefix (4 bytes) plus the size of the actual content.
Please note that it is necessary to account for the additional bytes required to store the length of the vector when calculating the total account size.
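The arithmetic can be checked with a short sketch. It fixes T = u64 purely for illustration, and the two helper functions are hypothetical names for the formulas discussed above. The 20-byte gap it prints assumes a 64-bit target, where the vector header is 24 bytes.

```rust
use std::mem::size_of;

// Mirrors the `Example` account above, with T = u64 chosen for illustration.
pub struct Example {
    pub example: Vec<u64>,
}

/// Space from the developer formula:
/// size_of::<Example>() + size_of::<T>() * len.
fn size_of_formula(len: usize) -> usize {
    size_of::<Example>() + size_of::<u64>() * len
}

/// Space from the Anchor reference for a Vec<u64>:
/// 4-byte length prefix + space(T) * len.
fn anchor_space(len: usize) -> usize {
    4 + 8 * len
}

fn main() {
    for len in [0, 1, 10] {
        let by_size_of = size_of_formula(len);
        let by_anchor = anchor_space(len);
        println!(
            "len = {len:>2}: size_of formula = {by_size_of:>3}, Anchor space = {by_anchor:>3}, extra = {}",
            by_size_of - by_anchor
        );
    }
}
```

The surplus is constant because it comes entirely from the header: the per-element cost is identical in both formulas when T is a plain integer.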
2.2. Option<T>
The Option<T> has two variants: None or Some. In memory, the None variant doesn't store any values but just a "tag" of 0, while the Some variant stores values together with the "tag" of 1.
However, the 0/1 "tag" is not needed if T is a Box or other smart pointer types. Since smart pointers in Rust cannot be 0, None can be represented as 0 and Some can be directly represented by the value of the pointer itself.
Therefore, when calculating the size of an Option<T> using the size_of function, the result will depend on the type T:
- If it's a Box<...>, a Vec<...> or a String, the result is the size of T after the proper alignment padding.
- Otherwise, the result is the size of T plus 1 after the alignment padding.
In comparison, after serialization, the actual space occupied is 1 + space(T), because Option uses 1 byte to represent Some or None.
Example
println!("u8 = {}", std::mem::size_of::<u8>()); // 1
println!("Option<u8> = {}", std::mem::size_of::<Option<u8>>()); // 2
println!("u16 = {}", std::mem::size_of::<u16>()); // 2
println!("Option<u16> = {}", std::mem::size_of::<Option<u16>>()); // 4
println!("u32 = {}", std::mem::size_of::<u32>()); // 4
println!("Option<u32> = {}", std::mem::size_of::<Option<u32>>()); // 8
println!("u64 = {}", std::mem::size_of::<u64>()); // 8
println!("Option<u64> = {}", std::mem::size_of::<Option<u64>>()); // 16
println!("u128 = {}", std::mem::size_of::<u128>()); // 16
println!("Option<u128> = {}", std::mem::size_of::<Option<u128>>()); // 24 (32 where u128 is 16-byte aligned, e.g. x86-64 since Rust 1.77)
println!("Box<u8> = {}", std::mem::size_of::<Box<u8>>()); // 8
println!("Option<Box<u8>> = {}", std::mem::size_of::<Option<Box<u8>>>()); // 8
println!("Box<u16> = {}", std::mem::size_of::<Box<u16>>()); // 8
println!("Option<Box<u16>> = {}", std::mem::size_of::<Option<Box<u16>>>()); // 8
println!("Box<u32> = {}", std::mem::size_of::<Box<u32>>()); // 8
println!("Option<Box<u32>> = {}", std::mem::size_of::<Option<Box<u32>>>()); // 8
println!("Box<u64> = {}", std::mem::size_of::<Box<u64>>()); // 8
println!("Option<Box<u64>> = {}", std::mem::size_of::<Option<Box<u64>>>()); // 8
println!("Box<u128> = {}", std::mem::size_of::<Box<u128>>()); // 8
println!("Option<Box<u128>> = {}", std::mem::size_of::<Option<Box<u128>>>()); // 8
println!("Vec<u8> = {}", std::mem::size_of::<Vec<u8>>()); // 24
println!("Option<Vec<u8>> = {}", std::mem::size_of::<Option<Vec<u8>>>()); // 24
println!("Vec<u16> = {}", std::mem::size_of::<Vec<u16>>()); // 24
println!("Option<Vec<u16>> = {}", std::mem::size_of::<Option<Vec<u16>>>()); // 24
println!("Vec<u32> = {}", std::mem::size_of::<Vec<u32>>()); // 24
println!("Option<Vec<u32>> = {}", std::mem::size_of::<Option<Vec<u32>>>()); // 24
println!("Vec<u64> = {}", std::mem::size_of::<Vec<u64>>()); // 24
println!("Option<Vec<u64>> = {}", std::mem::size_of::<Option<Vec<u64>>>()); // 24
println!("Vec<u128> = {}", std::mem::size_of::<Vec<u128>>()); // 24
println!("Option<Vec<u128>> = {}", std::mem::size_of::<Option<Vec<u128>>>()); // 24
println!("String = {}", std::mem::size_of::<String>()); // 24
println!("Option<String> = {}", std::mem::size_of::<Option<String>>()); // 24
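The pointer optimization is exactly where size_of can under-allocate. The sketch below uses a hypothetical `encode_option_u64` helper that imitates the Borsh layout of an Option (one tag byte, then the payload) and compares its length with size_of. It assumes, as the Borsh crate does to the best of our knowledge, that a Box<T> serializes as the boxed value itself rather than as a pointer.

```rust
use std::mem::size_of;

/// Hand-rolled Borsh-style encoding of Option<u64> (illustrative only):
/// one tag byte (0 = None, 1 = Some) followed by the payload if present.
fn encode_option_u64(value: Option<u64>) -> Vec<u8> {
    match value {
        None => vec![0],
        Some(v) => {
            let mut out = vec![1];
            out.extend_from_slice(&v.to_le_bytes());
            out
        }
    }
}

fn main() {
    // Worst-case serialized size is 1 + space(u64) = 9 bytes.
    let serialized = encode_option_u64(Some(42)).len();
    println!("serialized Some(42u64)   = {serialized}");

    // Padding makes the plain Option larger than its serialized form.
    println!("size_of Option<u64>      = {}", size_of::<Option<u64>>());

    // With a Box, the tag is folded into the pointer, so size_of reports
    // only the pointer width and falls short of the 9 serialized bytes
    // (assuming the boxed u64 is serialized by value).
    println!("size_of Option<Box<u64>> = {}", size_of::<Option<Box<u64>>>());
}
```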
2.3. Enum
Typically, when the size_of function is applied to an enum, the result equals the space of the discriminant plus the space of the largest variant, taking into account the proper memory alignment padding.
- Discriminant space. For a zero-variant enum or a single-variant enum, the discriminant is not needed, so the space is 0. For others, it's the space needed by the largest discriminant or the space specified by #[repr(inttype)].
- Variant space. For a non-data-carrying enum, it's 0. Otherwise, it's the space needed by the variant's fields.
For more details, please refer to the visualizing memory layout of Rust's data types tutorial.
Example
// zero-variant enum
pub enum Foo {}

// single-variant enum, the variant size = 0
pub enum Bar {
    A(()),
}

// single-variant enum, the variant size = 8
pub enum Baz {
    A(u64),
}

// largest discriminant = 1 (u8, 1 byte) + largest variant size = 16
pub enum Qux {
    A(u8),
    B(u128),
}

// discriminant (u16, 2 bytes) + largest variant size = 4, padded to 8
#[repr(u16)]
pub enum Fred {
    A(bool),
    B(u32) = 200,
    C(u8) = 404,
}

// discriminant (u64, 8 bytes) + largest variant size = 4, padded to 16
#[repr(u64)]
pub enum Thud {
    A(bool),
    B(u32) = 200,
    C(u8) = 404,
}

fn main() {
    println!("Foo : {}", std::mem::size_of::<Foo>()); // 0
    println!("Bar : {}", std::mem::size_of::<Bar>()); // 0
    println!("Baz : {}", std::mem::size_of::<Baz>()); // 8
    println!("Qux : {}", std::mem::size_of::<Qux>()); // 24 (32 where u128 is 16-byte aligned, e.g. x86-64 since Rust 1.77)
    println!("Fred: {}", std::mem::size_of::<Fred>()); // 8
    println!("Thud: {}", std::mem::size_of::<Thud>()); // 16
}
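Zero- and single-variant enums are the cases where size_of comes out smaller than the Anchor space, because the serialized form always spends one byte on the variant index. A minimal sketch, where `enum_space` is a hypothetical helper that encodes the "1 + largest variant size" rule from the table:

```rust
use std::mem::size_of;

// Single-variant enums: no discriminant is stored in memory.
pub enum Bar {
    A(()),
}

pub enum Baz {
    A(u64),
}

/// Anchor space for an enum: 1 byte for the variant index
/// plus the space of the largest variant.
const fn enum_space(largest_variant: usize) -> usize {
    1 + largest_variant
}

fn main() {
    // size_of is smaller than the serialized size in both cases.
    println!(
        "Bar: size_of = {}, Anchor space = {}",
        size_of::<Bar>(),
        enum_space(0)
    );
    println!(
        "Baz: size_of = {}, Anchor space = {}",
        size_of::<Baz>(),
        enum_space(8)
    );
}
```

An account sized with size_of::<Baz>() would therefore be one byte too small to hold the serialized enum.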
3. Conclusion
In Rust, the size_of function returns a fixed, compile-time value describing a type's in-memory layout. It does not measure heap-allocated contents, and it does not reflect how many bytes the object occupies after Borsh serialization.
Therefore, when calculating the size of an account, it is important to note that using the size_of function may lead to inconsistencies between the calculated size and the actual size of the account. To ensure accuracy, it is recommended to manually calculate the account size using the corresponding value from the official Anchor documentation.
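One way to keep a manual calculation maintainable is to spell it out as an associated constant next to the struct. The sketch below is plain Rust with no Anchor dependency. The `Profile` struct, its field names, and the two length caps are hypothetical, and `[u8; 32]` stands in for Pubkey. The leading 8 bytes are the account discriminator that Anchor's space reference says to add for every account.

```rust
use std::mem::size_of;

// Hypothetical account layout, used only for illustration.
pub struct Profile {
    pub authority: [u8; 32],        // stand-in for Pubkey
    pub score: u64,
    pub nickname: String,           // capped at MAX_NICKNAME_LEN bytes
    pub badges: Vec<u16>,           // capped at MAX_BADGES entries
    pub referrer: Option<[u8; 32]>, // optional Pubkey stand-in
}

impl Profile {
    pub const MAX_NICKNAME_LEN: usize = 32;
    pub const MAX_BADGES: usize = 10;

    /// Space computed field by field from the Anchor space reference.
    pub const SPACE: usize = 8          // Anchor account discriminator
        + 32                            // authority
        + 8                             // score
        + (4 + Self::MAX_NICKNAME_LEN)  // nickname: length prefix + bytes
        + (4 + 2 * Self::MAX_BADGES)    // badges: length prefix + elements
        + (1 + 32);                     // referrer: tag byte + payload
}

fn main() {
    println!("manual SPACE      = {}", Profile::SPACE);
    println!("size_of::<Profile> = {}", size_of::<Profile>());
}
```

Keeping one term per field, in declaration order, makes it easier to spot a missing update when the struct changes.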