What is the Solana Virtual Machine (SVM)?

Full Report

This article describes how a transaction is executed within the Solana Virtual Machine (SVM). Unlike the EVM, where execution means running opcodes in a VM, the term SVM refers to the entire execution pipeline. The article explains the SVM in great detail; the authors chose to analyze the codebases of the Agave and Firedancer clients rather than rely on a specification.

The SVM has several main components:

Banking Stage: Orchestrates the execution of transactions at a specific slot. It receives transactions from the Transaction Processing Unit (TPU) and enables parallel execution by checking for account overlaps among the transactions to be executed.
BPF Loader: Loads and JIT compiles an sBPF program.
sBPF VM: The sandboxed execution environment, which uses a variant of BPF. There are also system calls for leaving the sandbox.
AccountsDB: The persistent state layer where all account data, including program code, lives.

A Solana program is usually written in Rust. It exposes a single entrypoint function that receives the program ID of the program itself, an array of account information, and a byte slice of the data needed for the transaction. The solana-program crate serves as the main library for interacting with the runtime, including transferring SOL and executing syscalls.

The Rust compiler enforces all of the language's invariants and lowers the code to LLVM IR, which is then translated into eBPF for actual usage. This allows a pre-built sandbox to be used for execution within Solana. Solana originally forked eBPF for its own purposes but has since reverted to the original implementation. The Instruction Set Architecture (ISA) has 11 registers and a RISC-like design with about 100 opcodes. The VM's memory is divided into defined regions: the loaded program code, the stack, the heap, the input data passed with the transaction, and other read-only data. Execution is guarded by a verifier that checks for illegal jumps, unreachable code paths, call depth violations, and many other things.
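To make the Banking Stage's account-overlap check more concrete, here is a minimal sketch in plain Rust. It is not taken from Agave or Firedancer: the `TxAccounts` type and the `conflicts` and `schedule` functions are hypothetical names, and the rule used (two transactions conflict when one writes an account the other reads or writes) is a simplified model of the locking behavior.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified view of a transaction: only the accounts it touches.
pub struct TxAccounts {
    pub writable: HashSet<String>,
    pub readonly: HashSet<String>,
}

impl TxAccounts {
    pub fn new(writable: &[&str], readonly: &[&str]) -> Self {
        Self {
            writable: writable.iter().map(|s| s.to_string()).collect(),
            readonly: readonly.iter().map(|s| s.to_string()).collect(),
        }
    }
}

/// Assumed rule: two transactions conflict if either one writes an account
/// that the other reads or writes. Shared reads do not conflict.
pub fn conflicts(a: &TxAccounts, b: &TxAccounts) -> bool {
    a.writable
        .iter()
        .any(|k| b.writable.contains(k) || b.readonly.contains(k))
        || b.writable.iter().any(|k| a.readonly.contains(k))
}

/// Groups transaction indices into batches. Transactions inside one batch do
/// not conflict and could run in parallel; batches run one after another.
/// Each transaction is placed after the latest batch it conflicts with, so
/// conflicting transactions keep their original relative order.
pub fn schedule(txs: &[TxAccounts]) -> Vec<Vec<usize>> {
    let mut batches: Vec<Vec<usize>> = Vec::new();
    for (i, tx) in txs.iter().enumerate() {
        // Index of the latest batch holding a transaction that conflicts with this one.
        let last_conflict = batches
            .iter()
            .rposition(|batch| batch.iter().any(|&j| conflicts(tx, &txs[j])));
        let target = last_conflict.map_or(0, |b| b + 1);
        if target == batches.len() {
            batches.push(Vec::new());
        }
        batches[target].push(i);
    }
    batches
}
```

For example, two transactions that write different accounts and only read a shared one land in the same batch, while a later transaction that writes an account already written by an earlier one is pushed into a following batch.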
The runtime provides many syscalls for interacting with the world outside the sandbox, such as logging and calling other programs. The program binary uploaded by the user is simply an ELF file. It contains a bytecode section, read-only data, BSS/data, and a symbol/relocation table.

To upload the code to Solana, the special BPFLoaderProgram is used. Currently, two accounts are associated with a program: one for the actual code and another for metadata about it. First, all of the bytecode is written to an account via a combination of InitializeBuffer and Write(). Once this is finished, a separate instruction runs various checks, such as bytecode and ELF verification.

The pipeline for executing a transaction is as follows:

1. The user signs a transaction and forwards it to an RPC provider for propagation and execution.
2. The transaction is sent to the TPU on each Solana validator that received it. TPU processing includes signature verification and fetching the information needed for execution.
3. The banking stage performs the actual execution. This includes loading, JIT compilation, and execution, as described above.
4. Post-execution verification is the final check. It ensures state consistency and includes account ownership checks, lamport checks, and more.
5. All account modifications are committed to storage.

Overall, a great post that teaches the nitty-gritty details of the SVM. It explains many of the complexities of execution, which is much appreciated!
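As a rough illustration of the post-execution verification step in the pipeline above, the following sketch compares account snapshots taken before and after execution. Everything here is an assumption for illustration: `AccountState`, `VerifyError`, and `verify_post_execution` are hypothetical names, and the three checks (lamports are conserved, only the owning program debits lamports, only the owning program changes data) are a simplified model rather than the actual checks performed by any client.

```rust
/// Hypothetical snapshot of an account before or after execution.
#[derive(Clone, Debug, PartialEq)]
pub struct AccountState {
    pub key: String,
    pub owner: String,
    pub lamports: u64,
    pub data: Vec<u8>,
}

#[derive(Debug, PartialEq)]
pub enum VerifyError {
    AccountSetChanged,
    UnbalancedLamports,
    LamportsDebitedByNonOwner(String),
    DataModifiedByNonOwner(String),
}

/// Simplified post-execution checks for one program invocation.
/// `pre` and `post` must list the same accounts in the same order.
pub fn verify_post_execution(
    program_id: &str,
    pre: &[AccountState],
    post: &[AccountState],
) -> Result<(), VerifyError> {
    if pre.len() != post.len() {
        return Err(VerifyError::AccountSetChanged);
    }
    // Lamport check: execution must neither create nor destroy lamports.
    // Summing as u128 avoids overflow when adding many u64 balances.
    let pre_sum: u128 = pre.iter().map(|a| a.lamports as u128).sum();
    let post_sum: u128 = post.iter().map(|a| a.lamports as u128).sum();
    if pre_sum != post_sum {
        return Err(VerifyError::UnbalancedLamports);
    }
    for (before, after) in pre.iter().zip(post) {
        if before.key != after.key {
            return Err(VerifyError::AccountSetChanged);
        }
        let owned = before.owner == program_id;
        // Ownership checks: only the owning program may debit or rewrite an account.
        if after.lamports < before.lamports && !owned {
            return Err(VerifyError::LamportsDebitedByNonOwner(before.key.clone()));
        }
        if after.data != before.data && !owned {
            return Err(VerifyError::DataModifiedByNonOwner(before.key.clone()));
        }
    }
    Ok(())
}
```

Only if such checks pass would the modified accounts be committed to storage; a failed check means the transaction's changes are discarded.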

Analysis Summary