All About Anchor Account Size

Sec3 Research Team

TL;DR

Smart contracts using Anchor require developers to allocate space for new accounts and specify the account size. Anchor provides guidelines for calculating the size based on the account structure, but many developers use std::mem::size_of instead, as they don't have to manually update the size when making changes to the account structure. Are they equivalent? In this blog post, we conduct a systematic comparison of the results produced by std::mem::size_of and the Anchor space reference.

While std::mem::size_of generally works and tends to allocate more space than necessary, there are scenarios where it returns a size smaller than the serialized account data requires. Developers should therefore understand the differences between the two methods before choosing one.

1. The Comparison

In Anchor, developers commonly use std::mem::size_of::<T>() to determine the size of the Account struct. However, this function reports the in-memory size of the type, which may not match the size of the data stored in the Account after serialization with BorshSerialize.

The table below summarizes the size calculation for different data types using both methods. While most types give the same results, discrepancies arise for certain data types, such as Vec, String, Option, and Enum, where std::mem::size_of::<T>() may return a smaller value.

Note: all experiments in this blog post were done on a 64-bit machine.

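Type         | std::mem::size_of::<T>()                  | Anchor Space
-------------+-------------------------------------------+--------------------------
bool         | 1                                         | 1
u8/i8        | 1                                         | 1
u16/i16      | 2                                         | 2
u32/i32      | 4                                         | 4
u64/i64      | 8                                         | 8
u128/i128    | 16                                        | 16
[T; amount]  | space(T) * amount                         | space(T) * amount
Pubkey       | 32                                        | 32
Vec<T>       | 24                                        | 4 + space(T) * amount
String       | 24                                        | 4 + string length
Option<T>    | pad(space(T)), when T is String/Vec/Box;  | 1 + space(T)
             | pad(1 + space(T)), otherwise              |
Enum         | see below                                 | 1 + largest variant size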

Enum: size_of returns 0 for zero-variant enums, space(variant) for single-variant enums, and pad(space(max discriminant) + space(max variant)) otherwise.
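
The fixed-size rows of the table can be checked mechanically. The following is a minimal sketch, not part of the original experiments: PubkeyBytes is a hypothetical stand-in for Solana's Pubkey (assumed here to be 32 raw bytes) so that the snippet needs no external crates, and the Anchor values are copied from the table by hand.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for Solana's `Pubkey`, assumed to be 32 raw bytes.
type PubkeyBytes = [u8; 32];

/// One row per type: (name, `size_of` result, Anchor space from the table).
fn fixed_size_rows() -> Vec<(&'static str, usize, usize)> {
    vec![
        ("bool", size_of::<bool>(), 1),
        ("u8", size_of::<u8>(), 1),
        ("u16", size_of::<u16>(), 2),
        ("u32", size_of::<u32>(), 4),
        ("u64", size_of::<u64>(), 8),
        ("u128", size_of::<u128>(), 16),
        // [T; amount] with T = u16 and amount = 4
        ("[u16; 4]", size_of::<[u16; 4]>(), 2 * 4),
        ("Pubkey", size_of::<PubkeyBytes>(), 32),
    ]
}

fn main() {
    for (name, in_memory, anchor) in fixed_size_rows() {
        println!("{name:<10} size_of = {in_memory:>2}  anchor = {anchor:>2}");
    }
}
```

For these types the two columns agree, so the interesting cases are the remaining rows: Vec, String, Option, and Enum.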

2. The Differences

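The size_of function in Rust returns a fixed value computed at compile time from the in-memory layout of the type. It does not account for data the value owns on the heap, nor for how the value is encoded after serialization, so for some objects it does not reflect the space they actually occupy.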

2.1. Vec and String

         pointer    capacity    length
       +----------+----------+----------+
       | 0x0123   |       4  |        2 |
       +----------+----------+----------+
            |
            v
Heap   +--------+--------+--------+--------+
       |    'a' |    'b' | uninit | uninit |
       +--------+--------+--------+--------+
  • Vec: size_of returns 24 for a Vec<T>, regardless of how many elements it holds, because a Vec in memory comprises three fields: the pointer to the heap data, the capacity, and the length, each 8 bytes on a 64-bit machine.

  • String: similarly, size_of returns 24 for a String, because a String in memory consists of the same three fields: the pointer to the data, the capacity, and the length, each 8 bytes.
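
To make the point concrete, the sketch below measures the in-memory size of a few values with std::mem::size_of_val. The helper name inline_size is introduced here for illustration only.

```rust
use std::mem::{size_of, size_of_val};

/// In-memory size of the value itself, excluding any heap data it owns.
fn inline_size<T>(value: &T) -> usize {
    size_of_val(value)
}

fn main() {
    let empty: Vec<u8> = Vec::new();
    let large: Vec<u8> = vec![0u8; 1_000];
    let text = String::from("hello, anchor");

    // All three report the same size: pointer + capacity + length.
    println!("empty Vec<u8>     : {}", inline_size(&empty));
    println!("1000-element Vec  : {}", inline_size(&large));
    println!("String            : {}", inline_size(&text));
    println!("3 * size_of usize : {}", 3 * size_of::<usize>());
}
```

The element count never shows up in the result, which is why size_of alone cannot size an account that stores a Vec or String.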

Example

pub struct Example {
  pub example: Vec<T>,
}

When calculating the size of the Example account, which contains a vector of type T, some developers use the formula std::mem::size_of::<Example>() + (std::mem::size_of::<T>() * example.len()).

This approach is valid as long as size_of::<T>() is at least the serialized size of T, and on a 64-bit machine it even over-allocates by 20 bytes: size_of::<Example>() contributes 24 bytes for the vector's pointer, capacity, and length, whereas after serialization with BorshSerialize the vector occupies only a 4-byte length prefix plus the size of its actual content.

In either case, the calculation must account for the 4 bytes that store the length of the vector.
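
The arithmetic can be sketched as follows, assuming T = u64 for concreteness. The serialize function is a hand-written imitation of the Borsh layout for a Vec (a little-endian u32 length followed by the elements), used here only to avoid an external crate; it is not the borsh crate itself. Anchor's account discriminator is left out of both numbers.

```rust
use std::mem::size_of;

/// The account from the example above, with T fixed to u64 (an assumption).
pub struct Example {
    pub example: Vec<u64>,
}

/// The size_of-based formula: struct size plus element size times length.
fn size_of_estimate(len: usize) -> usize {
    size_of::<Example>() + size_of::<u64>() * len
}

/// Anchor space reference for Vec<T>: 4-byte length prefix plus the elements.
fn anchor_space(len: usize) -> usize {
    4 + size_of::<u64>() * len
}

/// Hand-written Borsh-style encoding: u32 length, then each element, all LE.
fn serialize(account: &Example) -> Vec<u8> {
    let mut out = Vec::with_capacity(anchor_space(account.example.len()));
    out.extend_from_slice(&(account.example.len() as u32).to_le_bytes());
    for item in &account.example {
        out.extend_from_slice(&item.to_le_bytes());
    }
    out
}

fn main() {
    let account = Example { example: vec![1, 2, 3] };
    let len = account.example.len();
    println!("size_of estimate : {}", size_of_estimate(len));
    println!("anchor space     : {}", anchor_space(len));
    println!("serialized bytes : {}", serialize(&account).len());
}
```

The gap between the first two numbers is size_of::<Example>() minus the 4-byte prefix, which is the 20 bytes mentioned above on a 64-bit machine.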

2.2. Option<T>

Option<T> has two variants: None and Some. Conceptually, in memory the None variant stores no value, only a "tag" of 0, while the Some variant stores the value together with a "tag" of 1.

However, the 0/1 "tag" is not needed if T is a Box or another non-nullable pointer type. Since such pointers can never be 0 (null) in Rust, the compiler can represent None as 0 and Some directly as the value of the pointer itself. Vec and String benefit in the same way, because each contains a non-null pointer.

Therefore, when calculating the size of an Option<T> using the size_of function, the result will depend on the type T:

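  • If it's a Box<...>, a Vec<...> or a String, the result is the size of T after the proper alignment padding.

  • Otherwise, for types with no unused bit patterns, such as the integer types in the example below, the result is the size of T plus 1 after the alignment padding. Types that do have unused bit patterns, such as bool, need no separate tag either: size_of::<Option<bool>>() is 1.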

In comparison, after serialization, the actual space occupied is 1 + space(T), because Option uses 1 byte to represent Some or None.

Example

println!("u8 = {}",                std::mem::size_of::<u8>());                // 1
println!("Option<u8> = {}",        std::mem::size_of::<Option<u8>>());        // 2

println!("u16 = {}",               std::mem::size_of::<u16>());               // 2
println!("Option<u16> = {}",       std::mem::size_of::<Option<u16>>());       // 4

println!("u32 = {}",               std::mem::size_of::<u32>());               // 4
println!("Option<u32> = {}",       std::mem::size_of::<Option<u32>>());       // 8

println!("u64 = {}",               std::mem::size_of::<u64>());               // 8
println!("Option<u64> = {}",       std::mem::size_of::<Option<u64>>());       // 16

println!("u128 = {}",              std::mem::size_of::<u128>());              // 16
println!("Option<u128> = {}",      std::mem::size_of::<Option<u128>>());      // 24 (32 where u128 is 16-byte aligned, e.g. x86-64 with Rust 1.77+)

println!("Box<u8> = {}",           std::mem::size_of::<Box<u8>>());           // 8
println!("Option<Box<u8>> = {}",   std::mem::size_of::<Option<Box<u8>>>());   // 8

println!("Box<u16> = {}",          std::mem::size_of::<Box<u16>>());          // 8
println!("Option<Box<u16>> = {}",  std::mem::size_of::<Option<Box<u16>>>());  // 8

println!("Box<u32> = {}",          std::mem::size_of::<Box<u32>>());          // 8
println!("Option<Box<u32>> = {}",  std::mem::size_of::<Option<Box<u32>>>());  // 8

println!("Box<u64> = {}",          std::mem::size_of::<Box<u64>>());          // 8
println!("Option<Box<u64>> = {}",  std::mem::size_of::<Option<Box<u64>>>());  // 8

println!("Box<u128> = {}",         std::mem::size_of::<Box<u128>>());         // 8
println!("Option<Box<u128>> = {}", std::mem::size_of::<Option<Box<u128>>>()); // 8

println!("Vec<u8> = {}",           std::mem::size_of::<Vec<u8>>());           // 24
println!("Option<Vec<u8>> = {}",   std::mem::size_of::<Option<Vec<u8>>>());   // 24

println!("Vec<u16> = {}",          std::mem::size_of::<Vec<u16>>());          // 24
println!("Option<Vec<u16>> = {}",  std::mem::size_of::<Option<Vec<u16>>>());  // 24

println!("Vec<u32> = {}",          std::mem::size_of::<Vec<u32>>());          // 24
println!("Option<Vec<u32>> = {}",  std::mem::size_of::<Option<Vec<u32>>>());  // 24

println!("Vec<u64> = {}",          std::mem::size_of::<Vec<u64>>());          // 24
println!("Option<Vec<u64>> = {}",  std::mem::size_of::<Option<Vec<u64>>>());  // 24

println!("Vec<u128> = {}",         std::mem::size_of::<Vec<u128>>());         // 24
println!("Option<Vec<u128>> = {}", std::mem::size_of::<Option<Vec<u128>>>()); // 24

println!("String = {}",            std::mem::size_of::<String>());            // 24
println!("Option<String> = {}",    std::mem::size_of::<Option<String>>());    // 24
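
The size_of results above can be set against the serialized size. The sketch below is illustrative only: anchor_option_space restates the 1 + space(T) rule from the table, and serialize_option_u64 imitates the Borsh layout for Option<u64> by hand (one tag byte, then the little-endian value when present) rather than calling the borsh crate.

```rust
use std::mem::size_of;

/// Anchor space reference for Option<T>: 1 tag byte plus space(T).
const fn anchor_option_space(inner: usize) -> usize {
    1 + inner
}

/// Hand-written Borsh-style encoding of Option<u64>.
fn serialize_option_u64(value: Option<u64>) -> Vec<u8> {
    match value {
        // None is a single 0 tag byte.
        None => vec![0],
        // Some is a 1 tag byte followed by the value in little-endian order.
        Some(v) => {
            let mut out = vec![1];
            out.extend_from_slice(&v.to_le_bytes());
            out
        }
    }
}

fn main() {
    // size_of over-allocates here because of alignment padding.
    println!(
        "Option<u64> : size_of = {}, anchor = {}",
        size_of::<Option<u64>>(),
        anchor_option_space(size_of::<u64>())
    );
    // size_of under-allocates here because bool needs no separate tag in memory.
    println!(
        "Option<bool>: size_of = {}, anchor = {}",
        size_of::<Option<bool>>(),
        anchor_option_space(size_of::<bool>())
    );
    println!("Some(7u64) serialized: {} bytes", serialize_option_u64(Some(7)).len());
}
```

Option<bool> is the case to watch: its in-memory size is smaller than the two bytes the serialized form can need.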

2.3. Enum

Typically, when the size_of function is applied to an enum, the result equals the space of the discriminant plus the space of the largest variant, taking into account the proper memory alignment padding.

  • Discriminant space. A zero-variant or single-variant enum needs no discriminant, so this space is 0. For other enums, it's the space needed to hold the largest discriminant value, or the size of the integer type specified by #[repr(inttype)].
  • Variant space. For a non-data-carrying enum, it's 0. Otherwise, it's the space needed by the fields of the largest variant. For more details, please refer to the visualizing memory layout of Rust's data types tutorial.

Example

// zero-variant enum
pub enum Foo {}

// single-variant enum, the variant size = 0
pub enum Bar {
    A(())
}

// single-variant enum, the variant size = 8
pub enum Baz {
    A(u64)
}

// largest discriminant = 1 (u8, 1 byte) + largest variant size = 16
pub enum Qux {
    A(u8),
    B(u128)
}

// discriminant (u16, 2 bytes) + largest variant size = 4
#[repr(u16)]
pub enum Fred {
    A(bool),
    B(u32) = 200,
    C(u8) = 404
}

// discriminant (u64, 8 bytes) + largest variant size = 4
#[repr(u64)]
pub enum Thud {
    A(bool),
    B(u32) = 200,
    C(u8) = 404
}


fn main() {
    println!("Foo : {}", std::mem::size_of::<Foo>());  // 0
    println!("Bar : {}", std::mem::size_of::<Bar>());  // 0
    println!("Baz : {}", std::mem::size_of::<Baz>());  // 8
    println!("Qux : {}", std::mem::size_of::<Qux>());  // 24 (32 where u128 is 16-byte aligned)
    println!("Fred: {}", std::mem::size_of::<Fred>()); // 8
    println!("Thud: {}", std::mem::size_of::<Thud>()); // 16
}
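
For comparison with the serialized form, here is a minimal sketch. It assumes the Borsh-style layout of a one-byte variant index followed by the variant's fields, which is what the "1 + largest variant size" rule in the table reflects. Bar and Baz repeat the definitions above, Pair is a hypothetical two-variant enum added for illustration, and serialize_baz is hand-written rather than derived.

```rust
use std::mem::size_of;

#[allow(dead_code)]
pub enum Bar {
    A(()),
}

#[allow(dead_code)]
pub enum Baz {
    A(u64),
}

/// Hypothetical two-variant enum, added for this sketch only.
#[allow(dead_code)]
pub enum Pair {
    A(u8),
    B(u64),
}

/// Anchor space reference for enums: 1 byte plus the largest variant.
const fn anchor_enum_space(largest_variant: usize) -> usize {
    1 + largest_variant
}

/// Hand-written Borsh-style encoding of Baz: variant index, then the u64 in LE.
fn serialize_baz(value: &Baz) -> Vec<u8> {
    match value {
        Baz::A(inner) => {
            let mut out = vec![0u8]; // index of variant A
            out.extend_from_slice(&inner.to_le_bytes());
            out
        }
    }
}

fn main() {
    println!("Bar : size_of = {}, anchor = {}", size_of::<Bar>(), anchor_enum_space(0));
    println!("Baz : size_of = {}, anchor = {}", size_of::<Baz>(), anchor_enum_space(8));
    println!("Pair: size_of = {}, anchor = {}", size_of::<Pair>(), anchor_enum_space(8));
    println!("Baz::A(42) serialized: {} bytes", serialize_baz(&Baz::A(42)).len());
}
```

The single-variant enums are where size_of falls short: in memory they carry no discriminant, but the serialized form still spends one byte on the variant index.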

3. Conclusion

In Rust, the size_of function returns a fixed value based on the in-memory layout of a given type. It does not measure heap contents or the serialized form of the object, so it may not reflect the space the object occupies in an account.

Therefore, using size_of to calculate the size of an account may lead to inconsistencies between the calculated size and the actual size of the account: it can over-allocate (for example, Option<u64> or multi-variant enums) or under-allocate (for example, long Vec or String contents, or single-variant enums). To ensure accuracy, it is recommended to manually calculate the account size using the corresponding values from the official Anchor documentation.
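
As a closing illustration, a manual calculation might look like the sketch below. Everything in it is hypothetical: the Profile account, its fields, and the MAX_NAME_LEN cap are invented for the example, a 32-byte array stands in for Pubkey, and the serializer is hand-written in the Borsh style. The leading 8 bytes represent Anchor's default account discriminator, which is not discussed above but has to be included when allocating a real account.

```rust
use std::mem::size_of;

/// Hypothetical upper bound on the name, assumed to be enforced by the program.
pub const MAX_NAME_LEN: usize = 64;

/// Hypothetical account used only for this illustration.
pub struct Profile {
    pub authority: [u8; 32], // stand-in for Pubkey
    pub score: u64,
    pub name: String,
    pub level: Option<u8>,
}

impl Profile {
    /// Discriminator + each field, following the Anchor space column.
    pub const SPACE: usize = 8 // account discriminator (Anchor default)
        + 32                   // authority
        + 8                    // score
        + (4 + MAX_NAME_LEN)   // name: length prefix + longest content
        + (1 + 1);             // level: tag + u8
}

/// Hand-written Borsh-style encoding; the discriminator is zeroed placeholder bytes.
fn serialize(profile: &Profile) -> Vec<u8> {
    let mut out = vec![0u8; 8];
    out.extend_from_slice(&profile.authority);
    out.extend_from_slice(&profile.score.to_le_bytes());
    out.extend_from_slice(&(profile.name.len() as u32).to_le_bytes());
    out.extend_from_slice(profile.name.as_bytes());
    match profile.level {
        None => out.push(0),
        Some(level) => out.extend_from_slice(&[1, level]),
    }
    out
}

fn main() {
    let profile = Profile {
        authority: [0u8; 32],
        score: 1,
        name: "a".repeat(MAX_NAME_LEN),
        level: Some(3),
    };
    println!("manual SPACE          : {}", Profile::SPACE);
    println!("largest serialization : {}", serialize(&profile).len());
    println!("8 + size_of::<Profile>: {}", 8 + size_of::<Profile>());
}
```

With a name at the maximum length, the serialized account fills the manually computed space exactly, while the size_of-based figure comes out smaller and would not fit it.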
