- Running an archive node requires downloading 4 terabytes of data.
- Most Ethereum clients instead opt to store only the essential blockchain data.
- Ethereum's state bloat issue remains to be solved.
Archive nodes running the Ethereum blockchain have now climbed to over four terabytes in size, nearly doubling in a year.
This figure is the total amount of data a user would need to download to run an archive node, a special type of full node running in archive mode. Archive nodes store a complete record of the Ethereum blockchain, including the state of every account at every past block, unlike typical full nodes, which keep the ledger of verified transactions but prune older state.
Archive nodes are not strictly necessary, since full nodes have a copy of all transactions, but they are useful for certain tasks, such as finding out the balance of an Ethereum address at a given point in time.
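As a rough illustration of that kind of historical lookup, the sketch below builds a standard JSON-RPC `eth_getBalance` request that names a specific block rather than the latest one. It only constructs the request body and sends nothing; the helper name, the all-zero placeholder address and the block number are assumptions made for the example, not details from any particular node setup.

```python
import json


def balance_at_block_request(address: str, block_number: int, request_id: int = 1) -> str:
    """Build a JSON-RPC eth_getBalance request for a specific historical block.

    Hypothetical helper for illustration; it returns the request body as a string.
    """
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        # The second parameter selects the block, hex-encoded.
        # Passing "latest" instead would ask for the current balance.
        "params": [address, hex(block_number)],
        "id": request_id,
    }
    return json.dumps(payload)


# Placeholder address (all zeros) and an arbitrary past block number.
print(balance_at_block_request("0x" + "00" * 20, 4_000_000))
```

A node that has pruned the state for the requested block would generally be unable to answer a query like this, which is where an archive node comes in.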
The two most popular clients for running Ethereum are Parity and Geth. A Parity archive node now weighs in at 4,016 GB, while Geth’s is 3,949 GB.
Both the Parity and Geth archives have grown by around 13% since the start of 2020, a period that saw a large increase in Ethereum transactions. At this rate, Ethereum archive nodes are on track to hit 5,000 GB by the end of 2020.
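The arithmetic behind a projection like this can be sketched as a simple compound-growth extrapolation. The article does not say how much of the year had passed when the 13% figure was measured, so the elapsed fraction below is an assumed input, and the function name is hypothetical.

```python
def project_year_end_size(current_gb: float, growth_so_far: float,
                          fraction_of_year_elapsed: float) -> float:
    """Extrapolate year-end size, assuming growth keeps compounding at the same pace."""
    if not 0 < fraction_of_year_elapsed <= 1:
        raise ValueError("fraction_of_year_elapsed must be in (0, 1]")
    # Size at the start of the year, before the growth seen so far.
    start_gb = current_gb / (1 + growth_so_far)
    # Scale the growth factor observed so far up to a full year.
    annual_factor = (1 + growth_so_far) ** (1 / fraction_of_year_elapsed)
    return start_gb * annual_factor


# Assumption: one third of the year has elapsed (not stated in the article).
projected = project_year_end_size(current_gb=4016, growth_so_far=0.13,
                                  fraction_of_year_elapsed=1 / 3)
print(f"Projected year-end size: {projected:,.0f} GB")
```

The result is sensitive to the assumed elapsed fraction: the earlier in the year the 13% was measured, the steeper the implied annual growth.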
In comparison, the Bitcoin blockchain currently weighs just 271 GB, despite having been around for half a decade longer than Ethereum. Bitcoin’s blockchain stays smaller because it has a stricter limit on the number of transactions that can fit in each block, and it is typically used for standard payments rather than more complicated tasks like smart contracts.
According to Ethernodes, around 76% of Ethereum's 5,942 nodes are currently running Geth, whereas 21% are on Parity. Less than three percent of nodes run alternative clients, like Nethermind or OpenEthereum.
However, of these almost 6,000 nodes in operation, only a small fraction run in archive mode. Most instead operate as simple full nodes with pruning enabled to speed up syncing. Full nodes only need to sync around 308 GB of data to get up to speed with the current state of the Ethereum blockchain, while a warp node only needs to download a snapshot of 30,000 blocks to get in sync.
Part of the reason archive nodes are less common is the technical burden of running one. After all, not everybody has 4 TB of free space lying around to host a copy of the entire Ethereum blockchain.
It’s also time-consuming and difficult. It took Eric Wall, CIO at Arcane Assets, 35 days to sync an Ethereum full node from scratch. And that’s only 200 GB, about five percent of an archive node.
Ethereum Syncing Diary - Day 25
This was a triumph
... Blocks synced: 8,536,062/8,536,062
I'm making a note here: HUGE SUCCESS
... Blockchain size: 195 GB
It's hard to overstate--
... # blocks synced last 24h: 59k (9 days worth)
--my satisfaction pic.twitter.com/As1jHa0fVb
— Eric Wall IS RIGHT (@ercwl) September 12, 2019
So, a solution to Ethereum's state bloat problem is needed now more than ever. It looks like we’re gonna need a bigger bloat.