Solana Internals Part 3

Sec3 Research Team

Solana recently experienced severe performance degradation due to network congestion. The TPS (number of transactions processed per second) dropped by orders of magnitude (from thousands to tens) for several hours.

Technically, the problem stems from performance bugs in Solana, in particular in the transaction processing unit (TPU). During market volatility, bots spray large volumes of duplicate spam transactions, which bogs down the TPU.

This article elaborates on the design of the TPU and highlights some intricacies.

Transactions

Screenshot of the Solana Explorer showing a Transaction History table with columns for Transaction Signature, Slot, Age, and Result. Several transactions are listed with their base-58 encoded signatures, slot numbers around 116,418,640 -- 116,441,681, ages ranging from "an hour ago" to "5 hours ago", and results showing "Success" (with one "Failed").

What Is Included In a Transaction?

When a user submits a transaction, it includes a pre-compiled representation of a sequence of instructions, called a “message”:

Transaction::new_unsigned(Message::new(
    &[bpf_loader_upgradeable::set_upgrade_authority(
        pubkey,                     // program_address
        &authority_signer.pubkey(), // current_authority_address
        new_authority.as_ref(),     // new_authority_address
    )],
    Some(&config.signers[0].pubkey()), // payer
))

The message must be signed by one or more keypairs:

let signatures: Vec<Signature> = keypairs.try_sign_message(&self.message_data())?;
for i in 0..positions.len() {
    self.signatures[positions[i]] = signatures[i];
}

The resulting signatures are also included in the transaction and, together with the message content, are sent to the Solana cluster via an RpcRequest:

let serialized_encoded: String = serialize_and_encode::<Transaction>(transaction, encoding)?;
let signature_base58_str: String = match self.send(
    RpcRequest::SendTransaction,              // request
    json!([serialized_encoded, config]),      // params
) {

The Transaction Processing Unit

impl Tpu {
    #[allow(clippy::too_many_arguments)]
    pub fn new(
        cluster_info: &Arc<ClusterInfo>,
        poh_recorder: &Arc<Mutex<PohRecorder>>,
        entry_receiver: Receiver<WorkingBankEntry>,
        retransmit_slots_receiver: RetransmitSlotsReceiver,
        sockets: TpuSockets,
        subscriptions: &Arc<RpcSubscriptions>,
        transaction_status_sender: Option<TransactionStatusSender>,
        blockstore: &Arc<Blockstore>,
        broadcast_type: &BroadcastStageType,
        exit: &Arc<AtomicBool>,
        shred_version: u16,
        vote_tracker: Arc<VoteTracker>,
        bank_forks: Arc<RwLock<BankForks>>,
        verified_vote_sender: VerifiedVoteSender,
        gossip_verified_vote_hash_sender: GossipVerifiedVoteHashSender,
        replay_vote_receiver: ReplayVoteReceiver,
        replay_vote_sender: ReplayVoteSender,
        bank_notification_sender: Option<BankNotificationSender>,
        tpu_coalesce_ms: u64,
        cluster_confirmed_slot_sender: GossipDuplicateConfirmedSlotsSender,
        cost_model: &Arc<RwLock<CostModel>>,

Upon receiving a transaction, the TPU has three main stages to process it.

  1. fetch_stage batches input from a UDP socket and sends it to stage 2.

  2. sigverify_stage verifies that the signature in the transaction is valid and sends the transaction to stage 3.

  3. banking_stage processes the verified transaction.

> These three stages run on different threads, which communicate via message passing using crossbeam_channel (a multi-producer multi-consumer channel).
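
The thread-and-channel layout described above can be sketched in a few lines. This is only an illustration, not the validator's code: it uses `std::sync::mpsc` from the standard library as a stand-in for crossbeam_channel, and the `Packet` type, its `signature_ok` flag, and `run_pipeline` are hypothetical names invented for the sketch.

```rust
use std::sync::mpsc::channel;
use std::thread;

/// Hypothetical stand-in for a network packet carrying one transaction.
#[derive(Debug, Clone, PartialEq)]
pub struct Packet {
    pub payload: Vec<u8>,
    /// Placeholder for the outcome of a real signature check.
    pub signature_ok: bool,
}

/// Runs three toy stages on separate threads and returns the payloads
/// that reached the last stage, in arrival order.
pub fn run_pipeline(input: Vec<Packet>) -> Vec<Vec<u8>> {
    // One unbounded channel between each pair of adjacent stages.
    let (packet_sender, packet_receiver) = channel::<Packet>();
    let (verified_sender, verified_receiver) = channel::<Packet>();

    // Stage 1 (fetch): forward every incoming packet to the verifier.
    let fetch = thread::spawn(move || {
        for packet in input {
            if packet_sender.send(packet).is_err() {
                break;
            }
        }
        // `packet_sender` is dropped here, which closes the first channel.
    });

    // Stage 2 (sigverify): drop packets that fail the check, forward the rest.
    let sigverify = thread::spawn(move || {
        for packet in packet_receiver {
            if packet.signature_ok && verified_sender.send(packet).is_err() {
                break;
            }
        }
    });

    // Stage 3 (banking): "process" each verified packet; here we just collect it.
    let banking = thread::spawn(move || {
        verified_receiver
            .into_iter()
            .map(|packet| packet.payload)
            .collect::<Vec<_>>()
    });

    fetch.join().expect("fetch thread panicked");
    sigverify.join().expect("sigverify thread panicked");
    banking.join().expect("banking thread panicked")
}
```

Because each stage only owns its end of a channel, a slow stage does not block the previous one from accepting more input; with unbounded channels the backlog simply grows, which is one reason spam can be costly.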

1. fetch_stage

The TPU creates a channel of unbounded capacity, (packet_sender, packet_receiver).

The fetch_stage reads the packets on the transaction sockets, and simply forwards them to the sigverify_stage using packet_sender.

2. sigverify_stage

The sigverify_stage receives the transaction packets from packet_receiver and uses TransactionSigVerifier to verify if the signature in each packet is valid.

It assumes each packet contains one transaction, and the packets are verified in parallel using all available CPU cores (and it can also be done on GPU if available).

Note that the TPU creates another channel (verified_sender, verified_receiver), and it uses verified_sender to forward the verified transactions to the next stage (banking_stage).

The verifier is of significant interest:

> It not only verifies signatures but also filters out redundant packets and discards excess packets in order to improve performance. The fixes to the recent performance degradation are applied in this component.

It contains three steps:

  • deduper: the filter that removes duplicated transactions (typically sent by bots).

  • discard_excess_packets: the filter that discards excess packets from each IP address. It groups packets by IP address and allocates max_packets evenly across addresses.

  • verify_batches: uses ed25519_dalek to verify message signatures in the packets that were not discarded in the previous steps.

let mut dedup_time: Measure = Measure::start("sigverify_dedup_time");
let dedup_fail: usize = deduper.dedup_packets(&mut batches) as usize;
dedup_time.stop();
let num_unique: usize = num_packets.saturating_sub(dedup_fail);

let mut discard_time: Measure = Measure::start("sigverify_discard_time");
if num_unique > MAX_SIGVERIFY_BATCH {
    Self::discard_excess_packets(&mut batches, MAX_SIGVERIFY_BATCH)
};
let excess_fail: usize = num_unique.saturating_sub(MAX_SIGVERIFY_BATCH);
discard_time.stop();

let mut verify_batch_time: Measure = Measure::start("sigverify_batch_time");
let batches: Vec<PacketBatch> = verifier.verify_batches(batches);
sendr.send(batches)?;
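
To make the deduper step more concrete, here is a minimal sketch of the idea, assuming a plain `HashSet` of 64-bit payload hashes. It is not the validator's deduper: `SimpleDeduper`, the simplified `Packet` type, and the choice of `DefaultHasher` are all assumptions made for illustration, and a production filter would also need to bound its memory use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified packet: raw bytes plus a discard flag.
#[derive(Debug, Clone)]
pub struct Packet {
    pub data: Vec<u8>,
    pub discard: bool,
}

/// Hypothetical deduper that remembers a hash of every payload it has seen.
#[derive(Default)]
pub struct SimpleDeduper {
    seen: HashSet<u64>,
}

impl SimpleDeduper {
    /// Marks packets whose payload hash was seen before as discarded and
    /// returns how many packets were newly marked.
    pub fn dedup_packets(&mut self, packets: &mut [Packet]) -> usize {
        let mut duplicates = 0;
        // Packets already discarded by an earlier filter are skipped.
        for packet in packets.iter_mut().filter(|p| !p.discard) {
            let mut hasher = DefaultHasher::new();
            packet.data.hash(&mut hasher);
            // `insert` returns false when the hash was already present.
            if !self.seen.insert(hasher.finish()) {
                packet.discard = true;
                duplicates += 1;
            }
        }
        duplicates
    }
}
```

The returned count plays the same role as `dedup_fail` in the excerpt above: subtracting it from the number of received packets gives the number of unique packets that still need verification.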

The discard_excess_packets function is defined as:

pub fn discard_excess_packets(batches: &mut [PacketBatch], mut max_packets: usize) {
    // Group packets by their incoming IP address.
    // (`into_group_map` is provided by the itertools crate.)
    let mut addrs = batches
        .iter_mut()
        .rev()
        .flat_map(|batch| batch.packets.iter_mut().rev())
        .filter(|packet| !packet.meta.discard())
        .map(|packet| (packet.meta.addr, packet))
        .into_group_map();

    // Allocate max_packets evenly across addresses.
    while max_packets > 0 && !addrs.is_empty() {
        let num_addrs = addrs.len();
        addrs.retain(|_, packets| {
            let cap = (max_packets + num_addrs - 1) / num_addrs;
            max_packets -= packets.len().min(cap);
            packets.truncate(packets.len().saturating_sub(cap));
            !packets.is_empty()
        });
    }

    // Discard excess packets from each address.
    for packet in addrs.into_values().flatten() {
        packet.meta.set_discard(true);
    }
}
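
A small worked example may help in following the allocation loop. The sketch below re-implements the same loop on plain data so it can run on its own; it is an illustration under stated assumptions, not the validator's code. `mark_excess` is a hypothetical name, packets are represented only by their source address, and a `BTreeMap` replaces the `HashMap` so that the iteration order (and therefore the result) is deterministic.

```rust
use std::collections::BTreeMap;
use std::net::IpAddr;

/// Given the source address of each packet, returns one flag per packet
/// that is `true` when the packet should be discarded.
pub fn mark_excess(sources: &[IpAddr], mut max_packets: usize) -> Vec<bool> {
    // Group packet indices by address. Walking in reverse puts the most
    // recently received packet first in each list, like the `.rev()` calls.
    let mut groups: BTreeMap<IpAddr, Vec<usize>> = BTreeMap::new();
    for (index, addr) in sources.iter().enumerate().rev() {
        groups.entry(*addr).or_default().push(index);
    }

    // Hand out the budget in rounds. Entries removed from the tail of a list
    // are packets that get to stay; whatever remains is excess.
    while max_packets > 0 && !groups.is_empty() {
        let num_addrs = groups.len();
        groups.retain(|_, packets| {
            let cap = (max_packets + num_addrs - 1) / num_addrs; // ceiling division
            max_packets -= packets.len().min(cap);
            packets.truncate(packets.len().saturating_sub(cap));
            !packets.is_empty()
        });
    }

    // Everything still listed exceeded its address's share.
    let mut discard = vec![false; sources.len()];
    for index in groups.into_values().flatten() {
        discard[index] = true;
    }
    discard
}
```

With five packets from one address, one packet from a second address, and a budget of four, the first round gives each address a cap of two; the second address needs only one slot, so the leftover slot goes back to the first address in the next round. The first address ends up keeping three packets and losing two, while the second address keeps its single packet.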

The ed25519_dalek::PublicKey::verify function is defined as:

impl Verifier<ed25519::Signature> for PublicKey {
    /// Verify a signature on a message with this keypair's public key.
    ///
    /// # Return
    ///
    /// Returns `Ok(())` if the signature is valid, and `Err` otherwise.
    #[allow(non_snake_case)]
    fn verify(
        &self,
        message: &[u8],
        signature: &ed25519::Signature
    ) -> Result<(), SignatureError>
    {
        let signature = InternalSignature::try_from(signature)?;

        let mut h: Sha512 = Sha512::new();
        let R: EdwardsPoint;
        let k: Scalar;
        let minus_A: EdwardsPoint = -self.1;

        h.update(signature.R.as_bytes());
        h.update(self.as_bytes());
        h.update(&message);

        k = Scalar::from_hash(h);
        R = EdwardsPoint::vartime_double_scalar_mul_basepoint(&k, &minus_A, &signature.s);

        if R.compress() == signature.R {
            Ok(())
        } else {
            Err(InternalError::VerifyError.into())
        }
    }
}

It takes a signature and a message as input, and verifies the signature with respect to the message using the key pair’s public key.
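
The check performed by the code above can be written out as a short derivation. The notation is an assumption introduced here rather than taken from the excerpt: the signature is the pair $(R, s)$, $A$ is the public key, $B$ is the Ed25519 base point, $M$ is the message, and $\ell$ is the order of the base point.

```latex
\begin{aligned}
k  &= \mathrm{SHA512}(R \,\|\, A \,\|\, M) \bmod \ell \\
R' &= [s]B - [k]A \\
\text{accept} &\iff R' = R
\end{aligned}
```

For an honestly generated signature, $R = [r]B$, $A = [a]B$ and $s = r + k a \bmod \ell$, so $[s]B - [k]A = [r]B + [ka]B - [ka]B = R$. In the code, `minus_A` supplies the $-A$ term, `vartime_double_scalar_mul_basepoint` computes $[k](-A) + [s]B$ in one call, and the final comparison is made on the compressed encoding of the point.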

Note that the ed25519_dalek::PublicKey::verify function is non-trivial and subtle, and it is not audited.

3. banking_stage

let banking_stage: BankingStage = BankingStage::new(
    cluster_info,
    poh_recorder,
    verified_receiver,
    verified_tpu_vote_packets_receiver,    // tpu_verified_vote_receiver
    verified_gossip_vote_packets_receiver, // verified_vote_receiver
    transaction_status_sender,
    replay_vote_sender,                    // gossip_vote_sender
    cost_model.clone(),

The banking_stage creates a thread which executes in a loop to process the received transactions batch by batch. The number of transactions in each batch is limited by

Builder::new()
    .name("solana-banking-stage-tx".to_string())
    .spawn(move || {
        Self::process_loop(
            &verified_receiver,
            &poh_recorder,
            &cluster_info,
            &mut recv_start,
            forward_option,
            i, // id
            batch_limit,
            transaction_status_sender,
            gossip_vote_sender,
            &data_budget,
            cost_model,
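
The batch-by-batch loop can be pictured with the following sketch. It is a simplified model, not the actual `process_loop`: it uses a standard-library channel, and the `max_batch_len` parameter and `process_batch` callback are hypothetical names that do not correspond to any argument in the excerpt above.

```rust
use std::sync::mpsc::Receiver;

/// Drains `receiver` until every sender has been dropped, handing items to
/// `process_batch` in groups of at most `max_batch_len`. Returns the number
/// of batches processed.
pub fn process_loop<T>(
    receiver: Receiver<T>,
    max_batch_len: usize,
    mut process_batch: impl FnMut(&[T]),
) -> usize {
    assert!(max_batch_len > 0, "max_batch_len must be positive");
    let mut batches = 0;
    // Block until at least one item arrives; stop once the channel is closed.
    while let Ok(first) = receiver.recv() {
        let mut batch = vec![first];
        // Top the batch up with whatever is already queued, without blocking.
        while batch.len() < max_batch_len {
            match receiver.try_recv() {
                Ok(item) => batch.push(item),
                Err(_) => break,
            }
        }
        process_batch(&batch);
        batches += 1;
    }
    batches
}
```

Capping the batch size keeps the time spent on any single batch bounded, so the loop can periodically return to other duties between batches.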

The banking_stage uses an important component called bank to load and execute transactions. The call looks like this:

bank.load_and_execute_transactions(
    batch,
    MAX_PROCESSING_AGE,                  // max_age
    transaction_status_sender.is_some(), // enable_cpi_recording
    transaction_status_sender.is_some(), // enable_log_recording
    &mut execute_timings,
);

For each transaction, the bank uses MessageProcessor to process the transaction message:

let mut process_message_time: Measure = Measure::start("process_message_time");
let process_result: Result<ProcessedMessageInfo, _> = MessageProcessor::process_message(
    &self.builtin_programs.vec, // builtin_programs
    tx.message(),
    &loaded_transaction.program_indices,
    &mut transaction_context,
    self.rent_collector.rent,

This method calls each instruction in the message over the set of loaded accounts.

For each instruction, it calls the program entrypoint and verifies that the result of the call does not violate the bank’s accounting rules.

if name.ends_with("loader_program") {
    let entrypoint =
        Self::get_entrypoint::<LoaderEntrypoint>(name, &self.loader_symbol_cache)?;
    unsafe { entrypoint(&program_id, instruction_data, invoke_context) }
} else {
    let entrypoint =
        Self::get_entrypoint::<ProgramEntrypoint>(name, &self.program_symbol_cache)?;
    unsafe {
        entrypoint(
            &program_id,
            invoke_context.get_keyed_accounts()?,
            instruction_data,
        )
    }
}
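
As an illustration of what "verifying the result of the call" can mean, the sketch below checks one plausible accounting rule: an instruction may move lamports between accounts but must not change their total. Everything here is hypothetical scaffolding for the example (the `Account` type, the toy error enum, the `Entrypoint` alias, and the two sample programs); the bank's real checks cover more than this single rule.

```rust
/// Hypothetical, minimal account: just a lamport balance.
#[derive(Debug, Clone, PartialEq)]
pub struct Account {
    pub lamports: u64,
}

/// Toy error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum InstructionError {
    UnbalancedInstruction,
}

/// Toy program entrypoint: may mutate the accounts it is given.
pub type Entrypoint = fn(&mut [Account], &[u8]) -> Result<(), InstructionError>;

fn total_lamports(accounts: &[Account]) -> u128 {
    accounts.iter().map(|a| u128::from(a.lamports)).sum()
}

/// Runs `entrypoint` on a scratch copy of the accounts and commits the
/// changes only if the lamport total is unchanged.
pub fn process_instruction(
    accounts: &mut [Account],
    data: &[u8],
    entrypoint: Entrypoint,
) -> Result<(), InstructionError> {
    let pre_total = total_lamports(accounts);
    let mut scratch = accounts.to_vec();
    entrypoint(&mut scratch, data)?;
    if total_lamports(&scratch) != pre_total {
        return Err(InstructionError::UnbalancedInstruction);
    }
    accounts.clone_from_slice(&scratch);
    Ok(())
}

/// Well-behaved sample program: moves 10 lamports from account 0 to account 1.
pub fn transfer_ten(accounts: &mut [Account], _data: &[u8]) -> Result<(), InstructionError> {
    accounts[0].lamports -= 10;
    accounts[1].lamports += 10;
    Ok(())
}

/// Misbehaving sample program: creates a lamport out of thin air.
pub fn mint_one(accounts: &mut [Account], _data: &[u8]) -> Result<(), InstructionError> {
    accounts[0].lamports += 1;
    Ok(())
}
```

Running `transfer_ten` through `process_instruction` succeeds and updates both balances, while `mint_one` is rejected and leaves the accounts untouched.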

Internally, the bank creates an InvokeContext to execute each instruction:

let mut invoke_context: InvokeContext = InvokeContext::new(
    transaction_context,
    rent,
    builtin_programs,
    Cow::Borrowed(sysvar_cache), // sysvar_cache
    log_collector,
    compute_budget,
    executors,
    feature_set,
    blockhash,
    lamports_per_signature,
    current_accounts_data_len,

Each transaction has a limited compute budget (by default 200_000 units), defined in ComputeBudget:

pub struct ComputeBudget {
    /// Number of compute units that an instruction is allowed.  Compute units
    /// are consumed by program execution, resources they use, etc...
    pub max_units: u64,
    /// Number of compute units consumed by a log_u64 call
    pub log_64_units: u64,
    /// Number of compute units consumed by a create_program_address call
    pub create_program_address_units: u64,
    /// Number of compute units consumed by an invoke call (not including
    /// the called program)
    pub invoke_units: u64,
    /// Maximum cross-program invocation depth allowed
    pub max_invoke_depth: usize,
    /// Base number of compute units consumed to call SHA256
    pub sha256_base_cost: u64,
    /// Incremental number of units consumed by SHA256 (based on bytes)
    pub sha256_byte_cost: u64,
    /// Maximum BPF to BPF call depth
    pub max_call_depth: usize,

Executing an instruction in the bank involves many complications, such as:

  • loading the specified programs
  • creating the rbpf VM to execute the BPF code
  • dealing with CPI (cross-program invocation) via syscalls
  • verifying that the called programs have not misbehaved
  • measuring the compute units consumed, etc.

We will elaborate on these details and the bank life cycle in the next article.


About sec3 (Formerly Soteria)

sec3 is a security research firm that prepares Solana projects for millions of users. sec3’s Launch Audit is a rigorous, researcher-led code examination that investigates and certifies mainnet-grade smart contracts; sec3’s continuous auditing software platform, X-ray, integrates with GitHub to progressively scan pull requests, helping projects fortify code before deployment; and sec3’s post-deployment security solution, WatchTower, ensures funds stay safe. sec3 is building technology-based scalable solutions for Web3 projects to ensure protocols stay safe as they scale.

To learn more about sec3, please visit https://www.sec3.dev
