Sharding is a database design concept. As its name implies, sharding breaks a larger whole into smaller parts: a sharded database is split into smaller partitions, and these partitions are referred to as shards.
It is important to note that in sharding, the partitioning is done horizontally rather than vertically: rows are divided among the shards, while every shard keeps the same columns. Some data may be replicated across all shards, but each shard is designed to hold data that is accessible only through it, so that data is unique to the shard. To access and use that data, one must query the specific shard that contains it.
Sharding is employed in database architecture because it can improve the performance of a database or search engine. It does so by reducing the size of the index that each partition must maintain, so searches return results more quickly. Additionally, because different shards can be stored on different servers, sharding can benefit organizations that need to store large data sets separately, such as multinational corporations operating in different countries.
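The routing idea described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the shard count, the hash-based routing rule, and the names `shard_for` and `ShardedTable` are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Optional

NUM_SHARDS = 4  # assumed shard count, chosen only for illustration


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a row key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedTable:
    """Hypothetical horizontally partitioned table: each whole row lives in one shard."""

    def __init__(self, num_shards: int = NUM_SHARDS) -> None:
        self.num_shards = num_shards
        self.shards: List[Dict[str, dict]] = [{} for _ in range(num_shards)]

    def put(self, key: str, row: dict) -> None:
        # The row is written to the single shard that owns its key.
        self.shards[shard_for(key, self.num_shards)][key] = row

    def get(self, key: str) -> Optional[dict]:
        # Only the shard that owns the key is queried; the others are never touched.
        return self.shards[shard_for(key, self.num_shards)].get(key)


table = ShardedTable()
table.put("alice", {"country": "SG", "balance": 10})
print(shard_for("alice"), table.get("alice"))
```

Because the same hash is used for reads and writes, a lookup touches exactly one shard, which is why each shard can keep a smaller index than the unsharded table would need.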
Sharding in Distributed Ledgers
Sharding has grown in popularity within the cryptocurrency community as a result of widespread concerns over blockchain scalability issues. For instance, the Bitcoin Network processes about seven transactions per second, and Ethereum is only slightly faster, handling around 15 operations per second. These are both paltry compared to large payment processors like Visa and Mastercard.
While the Bitcoin community has dealt with its scaling issues in various ways, the Ethereum project has outlined a more streamlined approach to solving its scalability concerns. Ethereum’s approach involves switching to a Proof of Stake (PoS) consensus algorithm, which will work in tandem with a sharded database design.
How Would Sharding Work on Ethereum?
During his keynote speech at an event held at the School of Business of the Singapore University of Social Sciences, Ethereum’s co-creator Vitalik Buterin explained the concept of sharding the Ethereum ledger in a straightforward manner. In his talk, entitled “The Road Ahead,” he stated:
“Imagine that Ethereum has been split into thousands of islands. Each island can do its own thing. Each of the islands has its own unique features and everyone belonging on that island, i.e., the accounts, can interact with each other and they can freely indulge in all its features. If they want to contact with other islands, they will have to use some sort of protocol.”
Currently, on the Ethereum network, as on other blockchains, each node stores the global state. The global state refers to the account balances, contract code and storage, and all additional relevant information. Additionally, all nodes process all transactions. While this provides for a very secure ledger, it dramatically limits the extent to which the network can scale because, within this design, a blockchain is only as good as a single node on its network.
In other words, the speed of a blockchain is defined by how fast a single node is, because every node must process the same transactions as every other node.
To address this challenge, the Ethereum network plans to implement a version of sharding. Within this new design, it will not be compulsory for each node participating in the system to handle the entire history of the blockchain when attempting to add a new transaction to the ledger. Instead, a node would only need to process the data located in the shard it is assigned to.
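The scaling argument can be made concrete with some back-of-the-envelope arithmetic. The model below is an idealized sketch, not a measurement: the function names are hypothetical, the shard count in the example is arbitrary, and the assumption that a cross-shard transaction costs roughly twice the work of a local one is a simplification.

```python
def unsharded_tps(node_tps: float) -> float:
    """Every node processes every transaction, so the network is capped at one node's rate."""
    return node_tps


def sharded_tps(node_tps: float, num_shards: int, cross_shard_fraction: float = 0.0) -> float:
    """Idealized upper bound when shards process transactions in parallel.

    Assumption: a cross-shard transaction must be handled on two shards,
    so it counts as two units of work instead of one.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    if not 0.0 <= cross_shard_fraction <= 1.0:
        raise ValueError("cross_shard_fraction must be between 0 and 1")
    work_per_transaction = 1.0 + cross_shard_fraction
    return node_tps * num_shards / work_per_transaction


# 15 transactions per second is the Ethereum figure quoted earlier;
# 100 shards is an arbitrary illustrative value.
print(unsharded_tps(15))         # 15
print(sharded_tps(15, 100))      # 1500.0
print(sharded_tps(15, 100, 1.0)) # 750.0
```

The model ignores consensus overhead between the shards and the main chain, so real gains would be lower than these upper bounds.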
Both quadratic and exponential sharding will be employed. Quadratic sharding refers to a ledger that is partitioned such that there is only a single layer of shards under the main chain. In exponential sharding, by contrast, there can be shards within shards.
While the developers have yet to define what criteria they will use to partition the global states into shards, they have described how this might happen:
“For example, a sharding scheme on Ethereum might put all addresses starting with 0x00 into one shard, all addresses starting with 0x01 into another shard, etc. In the simplest form of sharding, each shard also has its own transaction history, and the effect of transactions in some shard k are limited to the state of shard k.”
Using this example, a node processing transactions for wallets in a particular shard only needs to process the information held in that shard. This reduces the work each node performs per transaction, which increases both speed and the number of transactions the network as a whole can handle.
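The quoted scheme can be sketched as follows. This is an illustration of the address-prefix idea only: the class and function names are hypothetical, the addresses are shortened for readability (real Ethereum addresses are 20 bytes), and cross-shard transfers are simply rejected here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


def shard_of(address: str) -> int:
    """Return the shard index given by the first byte of a hex address such as '0x01...'."""
    if not address.startswith("0x") or len(address) < 4:
        raise ValueError("expected a hex address like 0x01...")
    return int(address[2:4], 16)


@dataclass
class PrefixShard:
    """State and transaction history of one shard (hypothetical structure)."""
    balances: Dict[str, int] = field(default_factory=dict)
    history: List[Tuple[str, str, int]] = field(default_factory=list)


class PrefixShardedLedger:
    """Toy ledger in which the effects of a transaction stay inside shard k."""

    def __init__(self) -> None:
        self.shards: Dict[int, PrefixShard] = {}

    def _shard(self, k: int) -> PrefixShard:
        return self.shards.setdefault(k, PrefixShard())

    def deposit(self, address: str, amount: int) -> None:
        shard = self._shard(shard_of(address))
        shard.balances[address] = shard.balances.get(address, 0) + amount

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        k = shard_of(sender)
        if shard_of(recipient) != k:
            raise ValueError("cross-shard transfer needs a separate protocol")
        shard = self._shard(k)
        if amount <= 0 or shard.balances.get(sender, 0) < amount:
            raise ValueError("invalid amount or insufficient balance")
        shard.balances[sender] -= amount
        shard.balances[recipient] = shard.balances.get(recipient, 0) + amount
        shard.history.append((sender, recipient, amount))  # history is per shard


ledger = PrefixShardedLedger()
ledger.deposit("0x00aa", 10)
ledger.transfer("0x00aa", "0x00bb", 4)  # both addresses start with 0x00: same shard
print(ledger.shards[0].balances)
```

A node that follows only shard 0 never needs to see the balances or history of any other shard to validate this transfer.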
The new design will also include a smart contract on the main chain, called the sharding manager contract, which will be responsible for managing how consensus is achieved between the shards and the main chain. In addition, the new design will introduce different types of network participants. As the developers put it:
“Multiple shards are handled separately by different subsets of securing participants, aka securitors (which include notaries, proposers, miners, and validators).”
In addition to new types of network participants, the new design will also result in different kinds of nodes. The super-full node stores every collation of every shard, as well as the main chain, fully verifying everything. The top-level node processes all main chain blocks, which gives it “light client” access to all shards. The single-shard node is a type of top-level node that also fully downloads and verifies every collation on a specific shard it cares about. Lastly, the light node downloads and validates only the block headers of the main chain and accesses just the data it needs for a specific transaction.
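The four node types differ only in how much of the main chain and of each shard they verify, which can be summarized as a small lookup. The enum, the function names, and the access labels ("full", "light", "headers", "on-demand") are hypothetical shorthand for the description above, not terms from any Ethereum client.

```python
from enum import Enum
from typing import Optional


class NodeType(Enum):
    SUPER_FULL = "super-full"
    TOP_LEVEL = "top-level"
    SINGLE_SHARD = "single-shard"
    LIGHT = "light"


def main_chain_access(node: NodeType) -> str:
    """Light nodes validate only main chain block headers; all others process full blocks."""
    return "headers" if node is NodeType.LIGHT else "full"


def shard_access(node: NodeType, shard_id: int, home_shard: Optional[int] = None) -> str:
    """Return how thoroughly a node verifies the given shard."""
    if node is NodeType.SUPER_FULL:
        return "full"  # stores every collation of every shard
    if node is NodeType.SINGLE_SHARD and shard_id == home_shard:
        return "full"  # fully verifies the one shard it cares about
    if node is NodeType.LIGHT:
        return "on-demand"  # fetches only the data a specific transaction needs
    return "light"  # top-level view: light-client access to the shard


for node in NodeType:
    print(node.value, main_chain_access(node), shard_access(node, shard_id=3, home_shard=3))
```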
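While unrestricted cross-shard communication would negate the entire concept, the developers have provided for it in the cases where it is necessary. Using a receipt system, shards will be able to verify information from each other: a transaction on one shard produces a receipt, and another shard can check that receipt before acting on it.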
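A receipt-based transfer between two shards might look roughly like the sketch below. It is a simplified illustration under stated assumptions: the class names and fields are hypothetical, and the target shard checks the receipt by looking it up directly on the source shard, whereas a real design would presumably rely on a cryptographic proof instead of direct access.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Receipt:
    """Record that funds were debited on one shard for delivery to another."""
    receipt_id: int
    source_shard: int
    target_shard: int
    recipient: str
    amount: int


@dataclass
class ReceiptShard:
    shard_id: int
    balances: Dict[str, int] = field(default_factory=dict)
    issued: Dict[int, Receipt] = field(default_factory=dict)       # receipts created here
    consumed: Set[Tuple[int, int]] = field(default_factory=set)    # (source shard, receipt id)

    def send(self, sender: str, target_shard: int, recipient: str, amount: int) -> Receipt:
        """Debit the sender locally and issue a receipt for the target shard."""
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid amount or insufficient balance")
        self.balances[sender] -= amount
        receipt = Receipt(len(self.issued), self.shard_id, target_shard, recipient, amount)
        self.issued[receipt.receipt_id] = receipt
        return receipt

    def claim(self, receipt: Receipt, source: "ReceiptShard") -> None:
        """Credit the recipient once, after checking the receipt against its source shard."""
        if receipt.target_shard != self.shard_id:
            raise ValueError("receipt is addressed to another shard")
        if source.shard_id != receipt.source_shard or source.issued.get(receipt.receipt_id) != receipt:
            raise ValueError("receipt not found on source shard")
        key = (receipt.source_shard, receipt.receipt_id)
        if key in self.consumed:
            raise ValueError("receipt already claimed")
        self.consumed.add(key)
        self.balances[receipt.recipient] = self.balances.get(receipt.recipient, 0) + receipt.amount


shard_a = ReceiptShard(0, {"alice": 10})
shard_b = ReceiptShard(1)
receipt = shard_a.send("alice", 1, "bob", 7)
shard_b.claim(receipt, shard_a)
print(shard_a.balances, shard_b.balances)
```

Tracking consumed receipts on the target shard prevents the same receipt from being claimed twice, and the lookup on the source shard rejects receipts that were never issued.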
It is important to note that sharding can only be implemented within the Ethereum Network after it has moved to PoS. However, once implemented, it is expected to improve the network significantly.