State Explosion and Why it Matters – Part 1




A state machine is a conceptual model that describes a system’s various states and the transitions between those states. In the context of blockchain, a state machine represents the changing states of the blockchain network as transactions are processed, and the overall system evolves.

However, one of the fundamental problems with state machines is the “state explosion problem.” With regard to blockchain, the problem arises when the size of the blockchain’s state grows rapidly due to increasing transaction volume, smart contract execution, and on-chain data storage. This growth presents significant challenges for scalability, storage requirements, and network performance, raises centralization concerns, and calls for innovative solutions to mitigate its impact.

In this series of articles, we discuss state explosion, examine how it challenges EVM chains, and explore different strategies to resolve it.

What We’ll Discuss

In part one of this series, we discuss in depth:

  • The State Trie Architecture of EVM
  • What the state is
  • What blockchain state explosion is
  • The relationship of nodes in a blockchain network and the EVM State Trie
  • How state explosion can impact the claims of blockchain technology
  • How BSC handles state explosion

State Explosion: The Problem at Hand

Blockchain technology has gained significant attention and adoption across various industries due to its decentralized and immutable nature. However, as blockchain networks scale and handle increasing transaction volumes, a critical challenge known as “state explosion” arises. This phenomenon refers to the rapid, sustained growth of the blockchain’s state, which leads to various performance and scalability concerns.

What is the State?

Before diving into the details of the state explosion problem, we need to clarify what “state” means for a blockchain. It is well known that blockchain data is stored in blocks and that, once saved, the data becomes immutable. The question is whether all saved data counts as “state,” and whether all of it is truly unchangeable.

The state is not everything stored on the chain; it is only the data currently in use. Data saved on the blockchain can be divided into two categories:

  • State — the current data in use.
  • History — the data other than the state, in simpler words, the ‘past data.’

To answer our question: history is immutable, while the state changes constantly as new transactions are processed. To better understand the impact of state explosion, it is important to understand the storage layout of the state trie of the Ethereum Virtual Machine (EVM), which is the core of several blockchains such as Ethereum and BNB Chain.
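The split between state and history can be illustrated with a small sketch. The `Transfer` type and `apply_transfer` function are hypothetical names chosen for this example, and balances are kept in a plain dictionary rather than a trie; the point is only that the state is overwritten in place while history is append-only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """A simplified transaction: move `amount` from sender to recipient."""
    sender: str
    recipient: str
    amount: int


def apply_transfer(state: dict[str, int], history: list[Transfer], tx: Transfer) -> None:
    """Overwrite balances in the state and append the transaction to history."""
    state[tx.sender] = state.get(tx.sender, 0) - tx.amount
    state[tx.recipient] = state.get(tx.recipient, 0) + tx.amount
    history.append(tx)  # history only grows; earlier entries are never edited


state = {"alice": 100, "bob": 0}   # the current data in use
history: list[Transfer] = []       # the past data

apply_transfer(state, history, Transfer("alice", "bob", 30))
apply_transfer(state, history, Transfer("bob", "alice", 5))

print(state)         # only the latest balances are kept
print(len(history))  # every past transfer is retained
```

After both transfers the state holds just two balances, whereas history holds one entry per transaction ever processed.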

Merkle Patricia Tree (MPT)

The Ethereum Virtual Machine (EVM) is a computer program that executes smart contracts on EVM-compatible blockchains such as Ethereum (where it originated), BNB Smart Chain, and Polygon. EVM chains store information about the blockchain network using a special tree called a Merkle Patricia tree (MPT).

This tree structure helps track the system’s current state and the transactions that occur. In this tree, the bottom-most (leaf) nodes store the actual data, while higher-level nodes contain hashes of their child nodes, as shown in the figure below. When data changes, the corresponding node hashes are updated all the way up to the top of the tree, so we can check whether two copies of the data are identical by comparing only the topmost (root) hash.

This tree also allows us to prove the validity of specific data without storing all the information, saving storage space and ensuring the data’s integrity. For more in-depth details, refer here.

Representation of Nodes in a Merkle Tree
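The root-hash property described above can be sketched in a few lines. This is a plain binary Merkle tree, not a full Merkle Patricia trie, and it assumes SHA-256 from the Python standard library as a stand-in for the Keccak-256 hash that Ethereum uses; `merkle_root` is a name made up for this illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for Keccak-256 to keep the sketch stdlib-only.
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash the leaves, then hash pairs of nodes upward until one root remains."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


accounts = [b"alice:100", b"bob:0", b"carol:7"]
root_before = merkle_root(accounts)

accounts[1] = b"bob:30"  # change a single leaf
root_after = merkle_root(accounts)

print(root_before != root_after)  # True: one changed leaf changes the root
```

Comparing two 32-byte roots is therefore enough to tell whether two parties hold the same data, however many leaves the tree has.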

Now that we have a brief overview of Merkle trees, let’s dive into the main objects in the EVM state storage layout. Remember, all of the storage tries use the MPT as their data storage structure.

State Trie Architecture

The state trie is the core of an EVM-based blockchain network. There are four types of tries: the world state trie, the account storage trie, the transaction trie, and the transaction receipt trie. Each is constructed as a Merkle Patricia Tree/Trie (MPT), and only its root hash (the hash of the top node of the trie) is stored in the block, for efficient use of storage.

The roots of the three main tries (the world state trie, the transaction trie, and the receipt trie) are stored in the block. The root of each account storage trie is instead stored in that account’s leaf node in the world state trie.

State Trie Architecture (Reference: https://www.lucassaldanha.com/ethereum-yellow-paper-walkthrough-2/)
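As a rough sketch of this layout, a block header can be modeled as a record that carries only three root hashes rather than the tries themselves. The `BlockHeader` class and `agree_on_state` helper are illustrative names, the roots are placeholder hashes, and SHA-256 is assumed as a stand-in for Keccak-256.

```python
import hashlib
from dataclasses import dataclass


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for Keccak-256 here.
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class BlockHeader:
    number: int
    state_root: bytes         # root hash of the world state trie
    transactions_root: bytes  # root hash of this block's transaction trie
    receipts_root: bytes      # root hash of this block's receipt trie


def agree_on_state(a: BlockHeader, b: BlockHeader) -> bool:
    """Two nodes hold the same world state if their state roots match."""
    return a.state_root == b.state_root


# Placeholder roots; a real client derives them from the actual tries.
header_node_a = BlockHeader(1, h(b"state-v1"), h(b"txs"), h(b"receipts"))
header_node_b = BlockHeader(1, h(b"state-v1"), h(b"txs"), h(b"receipts"))
header_stale = BlockHeader(1, h(b"state-v0"), h(b"txs"), h(b"receipts"))

print(agree_on_state(header_node_a, header_node_b))  # True
print(agree_on_state(header_node_a, header_stale))   # False
```

The header stays the same size no matter how large the underlying tries become; the growth is borne by the nodes that store the tries.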

World State (aka Global State Trie)

The World State Trie represents the global state of a blockchain network, encompassing current account states, balances, contract codes, and more. It plays a crucial role in determining transaction outcomes and smart contract execution.

When transactions are executed or smart contracts are created or invoked, the World State is updated accordingly, and the resulting state root is recorded in the new block. This ensures consistency among participants, allowing independent verification of transactions and contracts by the network participants.

The World State Trie is connected to the Account Storage Trie through the “storage root” field: each account is a leaf node in the World State Trie, and that leaf references the account’s own Account Storage Trie.

World State Trie and Account State

Account Storage Trie

EVM has two account types: Externally Owned Accounts (EOA) and Contract Accounts. EOAs are controlled by private keys, while Contract Accounts are smart contracts controlled by code.

The account state contains information about an EVM account, such as its balance and transaction count. Each account has its own state, and all fields except “codeHash” are mutable: once code is deployed on the EVM, it cannot be changed, and any update requires a new deployment.

For Contract Accounts on EVM, the Account Storage Trie is used to store associated data. EOAs, on the other hand, have an empty “storageRoot” field, and the “codeHash” field represents the hash of an empty string. EOAs do not have persistent storage and their state is primarily defined by their balance.

Contract Accounts rely on the Account Storage Trie to store and manipulate data. It maps 32-byte keys (storage slots) to 32-byte values, enabling flexible and structured data storage within a contract. For more details, refer here.

Account Storage Trie and Account State
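A minimal sketch of the account state may help here. It assumes the usual four fields (nonce, balance, storage root, code hash), uses SHA-256 as a stand-in for Keccak-256, and replaces the real storage trie with a hypothetical `commit_storage` helper that simply hashes the sorted slot/value pairs, so the resulting hashes will not match real chain values.

```python
import hashlib
from dataclasses import dataclass


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for Keccak-256.
    return hashlib.sha256(data).digest()


def commit_storage(storage: dict[int, int]) -> bytes:
    """Hypothetical stand-in for a storage trie root: hash of sorted 32-byte slot/value pairs."""
    packed = b""
    for slot in sorted(storage):
        packed += slot.to_bytes(32, "big") + storage[slot].to_bytes(32, "big")
    return h(packed)


EMPTY_CODE_HASH = h(b"")                  # hash of an empty string
EMPTY_STORAGE_ROOT = commit_storage({})   # commitment to empty storage


@dataclass
class Account:
    nonce: int = 0                            # transaction count
    balance: int = 0
    storage_root: bytes = EMPTY_STORAGE_ROOT  # empty for EOAs
    code_hash: bytes = EMPTY_CODE_HASH        # fixed once a contract is deployed

    def is_contract(self) -> bool:
        return self.code_hash != EMPTY_CODE_HASH


eoa = Account(nonce=3, balance=10**18)

storage = {0: 42}  # slot 0 holds the value 42
contract = Account(storage_root=commit_storage(storage), code_hash=h(b"\x60\x00"))
root_v1 = contract.storage_root

storage[0] = 43    # a contract call writes to slot 0
contract.storage_root = commit_storage(storage)

print(eoa.is_contract(), contract.is_contract())  # False True
print(root_v1 != contract.storage_root)           # True
```

Every storage write changes the account’s storage root, which in turn changes the account’s leaf in the world state and, ultimately, the world state root.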

Transaction Trie

Transactions are essential in a blockchain network and central to its transparency and security, as they are responsible for every state change in the EVM. Once a transaction is added to a block, it becomes immutable and cannot be modified. This immutability protects the integrity of account balances (the world state).

The Transaction Trie stores transaction information in an MPT for efficient retrieval and verification on an EVM chain. Each leaf node represents a unique transaction containing sender and recipient addresses, values, gas prices, etc. Leaf nodes are hashed into their parent nodes, level by level, up to the root. The root node’s hash, called “transactionsRoot,” is stored in the block’s header and commits to all transactions in the block. EVM has one transaction trie per block. For more details, refer here.

Transaction Trie and Transaction 

Transaction Receipt Trie

The Transaction Receipt Trie organizes and stores the transaction receipts of a block. Each leaf node corresponds to a unique transaction and holds its receipt data. These leaf nodes are hashed into their parent nodes to form the entire Receipt Trie. The “receiptsRoot,” stored in the block header, commits to all transaction receipts in the block.

Receipt data includes the transaction status, gas consumed, logs generated, and other metadata. The Receipt Trie lets participants verify the outcome of transaction execution and supports efficient validation and auditing with Merkle proofs on EVM-based blockchains. For more details, refer here.

Receipt Trie and Blocks
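To illustrate how a Merkle proof lets someone check a single receipt against the root in the block header without holding every receipt, here is a simplified sketch. It uses a plain binary Merkle tree instead of an MPT, SHA-256 instead of Keccak-256, and made-up receipt strings; `build_levels`, `make_proof`, and `verify` are illustrative names.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for Keccak-256.
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, from hashed leaves up to the root."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect the sibling hash at each level, noting whether it sits on the left."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path from the leaf to the root using only the proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


receipts = [b"tx0:status=1,gas=21000", b"tx1:status=0,gas=50000", b"tx2:status=1,gas=32000"]
levels = build_levels(receipts)
receipts_root = levels[-1][0]
proof = make_proof(levels, 2)

print(verify(receipts[2], proof, receipts_root))                # True
print(verify(b"tx2:status=0,gas=32000", proof, receipts_root))  # False
```

The verifier needs only the receipt, a handful of sibling hashes, and the root, which is why light clients can audit individual results without storing whole blocks.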

Importance of State Trie for Nodes on Blockchain Network

In a blockchain network, nodes play a crucial role in managing the state trie. While the goal is to allow consumer-grade devices to function as nodes, higher hardware requirements are typically necessary due to the local maintenance of the state trie. The state trie represents the current state of the blockchain network, ensuring its integrity. This section briefly explains the connection between nodes and the state trie.

  1. Storage of State Trie: Every node in a blockchain network keeps a local copy of the state trie, which serves as a reference to the current state of the entire network. This allows nodes to access and retrieve account information, validate transactions, execute smart contracts, and perform other operations.
  2. Synchronization: Nodes must synchronize their state trie with the network to maintain consistency. This involves receiving and processing blocks from other nodes, updating the state trie with the transactions they contain, and relaying blocks to other nodes. Synchronization ensures that nodes have the most recent version of the state trie and can effectively participate in the network’s consensus process.
  3. Validating Transactions: Upon receiving a new transaction, a node validates it by verifying signatures, checking inputs, and ensuring the sender’s account has enough balance. To do this, the node reads the relevant account information from the state trie and performs the necessary checks.
  4. Executing Smart Contracts: Nodes are responsible for executing smart contracts, which requires reading and updating contract data in the state trie. When a contract executes, it can change the state trie by updating storage variables, transferring value, or calling other contracts. Nodes execute these operations and update the state trie to reflect the changes.
  5. State Updates: Nodes update the state trie with new transactions and blocks on the blockchain. They manage account balances, contract storage, and maintain an accurate network state. By adhering to consensus rules and coordinating state updates, nodes ensure consistency and integrity.

The state of the blockchain is maintained by nodes within the network through the use of the state trie. This trie is a crucial component that allows nodes to confirm transactions, carry out smart contracts, and maintain the network’s security. By working together to manage the state trie, nodes ensure the secure and decentralized operation of the blockchain network.
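The validation and state-update steps from the list above can be sketched as follows. This is a deliberately reduced model: signature verification and gas accounting are omitted, the state is a dictionary instead of a trie, the nonce check is an assumption based on the account’s transaction count, and `Account`, `Tx`, `validate`, and `apply_tx` are names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Account:
    nonce: int = 0    # number of transactions already sent by this account
    balance: int = 0


@dataclass(frozen=True)
class Tx:
    sender: str
    recipient: str
    value: int
    nonce: int


def validate(state: dict[str, Account], tx: Tx) -> bool:
    """Read the sender's account from the state and run basic checks."""
    account = state.get(tx.sender)
    if account is None:
        return False
    return tx.nonce == account.nonce and 0 <= tx.value <= account.balance


def apply_tx(state: dict[str, Account], tx: Tx) -> bool:
    """Update the state only if the transaction passes validation."""
    if not validate(state, tx):
        return False
    sender = state[tx.sender]
    recipient = state.setdefault(tx.recipient, Account())  # new accounts enlarge the state
    sender.balance -= tx.value
    sender.nonce += 1
    recipient.balance += tx.value
    return True


state = {"alice": Account(nonce=0, balance=50)}

accepted = apply_tx(state, Tx("alice", "bob", 20, nonce=0))
rejected = apply_tx(state, Tx("alice", "bob", 100, nonce=1))  # exceeds alice's balance

print(accepted, rejected)  # True False
print(len(state))          # 2: the first transfer created a new account entry
```

Note that the successful transfer added a new entry to the state; this is the basic mechanism by which the state grows with usage.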

What is Blockchain State Explosion?

The state explosion problem refers to the state growing rapidly and out of control. Blockchain platforms that offer smart contract programmability, such as Ethereum and BNB Chain, face this problem because their users save all kinds of data on-chain, e.g., state data, history data, contract data, account data, and transactions. With mass adoption, this problem becomes more severe and requires attention to make sure the blockchain platform maintains its scalability and decentralization.

Impact of State Explosion

When a node participates in a blockchain network, it maintains a copy of the state trie locally. The state trie can be quite large, especially as the number of accounts and transactions increases over time. Each node needs to store and update the state trie to remain in sync with the network and perform various operations, such as transaction validation and execution of smart contracts.

With mass adoption and the state trie growing at an accelerated pace, state explosion can impact several different aspects of blockchain technology.

  1. Scalability: If the blockchain’s state data grows at an accelerating pace, it can hinder the scalability of the blockchain network. As the state explodes, processing and validating transactions in a timely manner becomes a major challenge, resulting in congestion and longer confirmation times, and therefore reducing the overall throughput of the blockchain network.
  2. Storage Requirements: As the state size increases, the storage requirements for running a full node in the blockchain network become more demanding. This can pose a barrier to entry for participants with limited resources, potentially leading to centralization as only a few well-equipped entities can afford the necessary storage capacity.
  3. Network Performance: State explosion can negatively affect the performance of the blockchain network. Larger state sizes can lead to slower block propagation times, increased resource utilization, and a higher probability of forks or conflicts during the consensus process. These factors can degrade the overall efficiency and reliability of the network.
  4. Decentralization Concerns: Unchecked state growth can introduce centralization risks. Smaller participants or nodes with limited resources may struggle to keep up with the storage and computational demands of a growing state. This could result in a concentration of power and influence within a limited number of well-funded entities, undermining the decentralized nature of the blockchain.
  5. Maintenance and Upkeep: Managing and maintaining a blockchain with an exploding state becomes more complex and resource-intensive. It requires continuous allocation of resources for storage, backup, and data management. This can be a burden for network participants and limit the accessibility of running full nodes.

How BSC handles State Explosion

BNB Smart Chain (BSC) is one of the rapidly growing EVM blockchain platforms. Offering lower gas costs, faster finality, complete EVM compatibility, smart contract programmability, and several innovative solutions to Web3 developers, it is one of the biggest competitors of Ethereum, the pioneer of EVM chains.

BSC daily transactions peaked at about 16 million on 25th Nov 2021, which is roughly 188 TPS sustained for 24 hours. No other EVM blockchain has faced such heavy traffic yet.
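The conversion between daily transactions and average TPS is simple arithmetic, sketched below with two helper names chosen for this example. Taking exactly 16 million transactions gives an average slightly below the quoted figure, so the ~188 TPS number presumably reflects a peak somewhat above the rounded 16 million.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_tps(transactions_per_day: int) -> float:
    """Average transactions per second over a full day."""
    return transactions_per_day / SECONDS_PER_DAY


def daily_transactions(tps: int) -> int:
    """Transactions processed in a day at a constant rate."""
    return tps * SECONDS_PER_DAY


print(round(average_tps(16_000_000), 1))  # 185.2
print(daily_transactions(188))            # 16243200
```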

Due to the higher volume of traffic, the storage size requirements on BSC are also growing very rapidly. As of the end of 2022, a pruned BSC full node snapshot file is approximately 1.6TB in size, compared to approximately 1TB just one year ago.

The 1.6TB storage consists mainly of two parts:

  • Block Data (~2/3): this includes the block header, block body, and receipt;
  • World State (~1/3): this includes the account state and Key/Value (KV) storage state.
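Using only the approximate figures quoted above, the breakdown and the year-over-year growth work out as follows. The variable names are illustrative, and the results are as rough as the inputs.

```python
snapshot_tb = 1.6   # pruned full node snapshot at the end of 2022
previous_tb = 1.0   # approximate size one year earlier

block_data_tb = snapshot_tb * 2 / 3   # block headers, bodies, and receipts
world_state_tb = snapshot_tb * 1 / 3  # account state and KV storage state
yearly_growth = (snapshot_tb - previous_tb) / previous_tb

print(f"block data:  {block_data_tb:.2f} TB")   # 1.07 TB
print(f"world state: {world_state_tb:.2f} TB")  # 0.53 TB
print(f"growth:      {yearly_growth:.0%}")      # 60%
```

In other words, roughly half a terabyte of the snapshot is world state that every full node must keep readily accessible, and the snapshot as a whole grew by about 60% in a single year.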

With a growing influx of transactions and smart contract deployments and interactions, storage requirements will also increase linearly, making this a problem that needs prompt attention and a forward-compatible solution: one that keeps the storage size and hardware requirements for nodes in check.

Over the years, BSC has implemented several solutions to keep this problem at bay, like sharding, pruning, layer 2 solutions, storage data structure optimizations, etc. However, BEP-206 and BEP-215 have surfaced as the latest and most suitable solutions to maintain the storage size and scalability of the BNB Chain.

BEP-206 proposes a practical solution to the problem of increasing world state storage on BSC by removing expired storage state, whereas BEP-215 aims to introduce a state revival transaction type based on BEP-206. The details of these proposals will be covered in the upcoming parts of this article series. Note that both proposals are still in progress and in the community discussion phase.

Conclusion

Over the years, blockchain technology has gained immense popularity. With this growth, however, one fundamental issue has surfaced as a prime concern: the explosion in the size of the state trie, which gives rise to the state explosion problem inherent to state machines.

Addressing the challenges posed by state explosion is crucial for ensuring the scalability, performance, and decentralization of blockchain networks. In brief, state explosion can undermine the core promises of blockchain technology, such as decentralization, high throughput, and scalability.

These challenges require innovative solutions to make sure the mass adoption of blockchain technology is maintained. In the next part of this series, we discuss the different proposals that have been suggested to mitigate the impact of state explosion and enable the widespread adoption of blockchain technology.