Ethereum

‧

Dec 16, 2025

Getting Ethereum Ready for GigaGas

author

Senior QA Engineer at Nethermind, co-author of EIP-7935, focused on quality assurance and performance testing for Ethereum infrastructure.

co-author

Ben Adams

Marek Moraczynski

Carlos Bermudez

credits

Marius van der Wijden

TL;DR

We introduced a new benchmark that uses identical workloads, making results stable and easy to reproduce.
Nethermind leads in both real-mainnet and high-pressure merged-block tests. The gap widens under heavier loads, where execution bottlenecks become more visible.
With small blocks, Reth was slightly slower than Geth and Besu. With bigger blocks, however, its performance degraded visibly.
This benchmark can also be used by L2 teams to make network-specific optimizations and perform tests on top of their state instead of the mainnet state.

Current Approach

When deciding on potential gas limit increases, client developers have to think like an attacker. They need to figure out which worst-case scenario an adversary could target and how clients can be improved to handle it.

Since Ethereum Mainnet is rarely attacked, it is also important to measure client improvements in the average case, under real mainnet load. Client teams usually do this by syncing two versions and checking for significant differences between them.

However, this approach is flawed. Sync times are inherently variable: clients sync to different heads, have different sets of peers, and so on. This lack of repeatability and the high variance lead to misleading conclusions and hide real performance improvements or regressions that matter in practice.

A New Benchmark

At Nethermind, we developed a new benchmarking framework that removes the randomness of live-network testing by replaying the exact same mainnet blocks for every client. Instead of relying on whatever the network may produce on any given day, we use a fixed set of historical blocks that remain unchanged.

Each run starts from the same mainnet snapshot, on the same hardware, with no 12-second gaps between blocks. This ensures that every client receives identical input data under identical execution conditions, enabling fair comparisons of results.

Nethermind’s new, open-source, client-agnostic framework enables testing on a real mainnet database, rather than an empty or artificial state. Given that clients can behave differently when running against a database full of real mainnet state, this framework reveals flaws and improvements that benchmarks running only on an empty state cannot properly capture.

It also introduces a faster feedback cycle: run → reset → improve → run again, always against the same workload. This makes it possible to iterate quickly on potential improvements.

Since the workload remains constant, the tests are fully repeatable. You can make a performance improvement today and measure its real impact tomorrow with confidence.
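
Because both runs execute exactly the same blocks, their timings can be compared block by block rather than only in aggregate. The following is a minimal sketch of such a paired comparison, not part of the framework itself: the function names and the sample timings are hypothetical and only illustrate the idea.

```python
from typing import List


def per_block_speedup(baseline_s: List[float], candidate_s: List[float]) -> List[float]:
    """Speedup of the candidate build over the baseline for each replayed block.

    Both lists hold execution times in seconds for the SAME blocks, in the
    same order, which is only meaningful because the workload is fixed.
    """
    if len(baseline_s) != len(candidate_s):
        raise ValueError("both runs must cover the same blocks")
    return [b / c for b, c in zip(baseline_s, candidate_s)]


def total_time_speedup(baseline_s: List[float], candidate_s: List[float]) -> float:
    """Speedup measured over the whole run (total time before / total time after)."""
    if len(baseline_s) != len(candidate_s):
        raise ValueError("both runs must cover the same blocks")
    return sum(baseline_s) / sum(candidate_s)


if __name__ == "__main__":
    # Hypothetical execution times for three blocks, before and after a change.
    baseline = [0.040, 0.050, 0.030]
    candidate = [0.020, 0.025, 0.030]
    print(per_block_speedup(baseline, candidate))
    print(total_time_speedup(baseline, candidate))
```

A per-block view like this shows whether a change helps every block or only some of them, which an aggregate figure alone would hide.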

Two Types of Controlled Tests

The benchmark supports two modes that capture different aspects of execution performance. Both run blocks back-to-back with no pauses, which shortens each run and removes the amortization effects that idle time between blocks would otherwise allow. Processing blocks in a continuous stream also exposes how clients handle garbage collection and state writes without any “downtime.”
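
The back-to-back idea can be sketched as a simple loop that submits the next payload as soon as the previous one finishes and records how long each one took. This is an illustrative sketch only: `replay_back_to_back`, the `execute` callback, and the injectable `clock` are hypothetical names, and the wiring that would actually deliver a payload to a client is an assumption left out here.

```python
import time
from typing import Callable, Iterable, List, TypeVar

P = TypeVar("P")


def replay_back_to_back(
    payloads: Iterable[P],
    execute: Callable[[P], None],
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Execute payloads one after another with no pause in between.

    `execute` is a placeholder for whatever submits a block to the client
    under test and waits for it to be processed. Returns the execution
    time of each payload in seconds, in replay order.
    """
    durations: List[float] = []
    for payload in payloads:
        start = clock()
        execute(payload)  # blocks until the client has processed the payload
        durations.append(clock() - start)
    return durations


if __name__ == "__main__":
    # Stand-in workload: "executing" a payload just sleeps for a moment.
    print(replay_back_to_back([0.01, 0.02], time.sleep))
```

Passing the clock as a parameter is a design choice for this sketch: it lets the loop be checked with a fake clock, without depending on wall-clock time.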

1. Real Mainnet Payloads

This mode replays historical mainnet blocks exactly as they occurred onchain.

*Figure 1. Gas consumption in the real mainnet payloads run, averaging 18.3 MGas per block with peaks of up to 36 MGas.*

2. Merged Mainnet Payloads

This mode merges multiple consecutive mainnet blocks into a single, synthetic “super-block”. It preserves the exact transaction order from the original block range, but delivers the transactions to the client as one extremely large block. This allows us to stress-test execution clients under conditions far beyond today’s mainnet limits, simulating future scenarios with significantly higher block gas limits or the kinds of large batches commonly produced on Layer 2 rollups. By doing so, it helps reveal performance bottlenecks, edge cases, and potential scalability issues in a controlled and reproducible way.
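
The ordering guarantee can be illustrated with a small sketch. It is not the framework's actual merging code: the `Block` and `MergedBlock` types and the `merge_blocks` function are hypothetical, transactions are represented as opaque strings, and everything a real payload would also need (for example recomputed header fields) is deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    number: int
    transactions: List[str]  # opaque transaction identifiers, in block order


@dataclass
class MergedBlock:
    first_block: int
    last_block: int
    transactions: List[str]


def merge_blocks(blocks: List[Block]) -> MergedBlock:
    """Concatenate the transactions of consecutive blocks, keeping chain order."""
    if not blocks:
        raise ValueError("need at least one block to merge")
    ordered = sorted(blocks, key=lambda b: b.number)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.number != prev.number + 1:
            raise ValueError("blocks must form a consecutive range")
    # Block order first, then each block's own transaction order.
    txs = [tx for block in ordered for tx in block.transactions]
    return MergedBlock(ordered[0].number, ordered[-1].number, txs)


if __name__ == "__main__":
    merged = merge_blocks([Block(1, ["a", "b"]), Block(2, ["c"]), Block(3, [])])
    print(merged)
```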

*Figure 2. Gas consumption in the merged mainnet payloads averages 1.1 GGas per block, with a maximum of 2 GGas. One hundred blocks were merged into a single block.*

Benchmark Results

Running the same workload under the same conditions makes performance differences clear. Below are the results for both test types: real mainnet payloads and merged payloads.

Test methodology

Block replay and load generation are handled with k6, with results visualized in Grafana. We used modest hardware, an OVH Advance-2 class machine with over 2 TB of snapshot storage and standard CPU/RAM, to align with EIP-7870 and home-staker viability.

Using modest hardware avoids skewed results and keeps the benchmark representative of solo and home Ethereum operators today.

For the test, we selected blocks from the range 22,360,000 to 22,370,000.

Throughput (GGas/s) is used as the primary execution performance metric. It is calculated as the total gas consumed by a block, expressed in gigagas (GGas), divided by the block execution time in seconds:

\[ \text{Throughput (GGas/s)} = \frac{\text{Block Gas Used (GGas)}}{\text{Block Execution Time (s)}} \]
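
As a worked example of the formula, the sketch below converts a block's gas used into GGas and divides by the execution time. The function name and the execution times in the example are hypothetical and chosen only to make the arithmetic easy to follow; the block sizes match the averages quoted for the two test modes.

```python
def throughput_ggas_per_s(gas_used: int, execution_time_s: float) -> float:
    """Throughput in GGas/s: block gas used (in GGas) / execution time (in s)."""
    if execution_time_s <= 0:
        raise ValueError("execution time must be positive")
    return (gas_used / 1e9) / execution_time_s


if __name__ == "__main__":
    # An 18.3 MGas block executed in a hypothetical 30 ms:
    # 0.0183 GGas / 0.030 s = 0.61 GGas/s (610 MGas/s).
    print(throughput_ggas_per_s(18_300_000, 0.030))

    # A 1.1 GGas merged block executed in a hypothetical 2 s:
    # 1.1 GGas / 2 s = 0.55 GGas/s.
    print(throughput_ggas_per_s(1_100_000_000, 2.0))
```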

Real Mainnet Payload Results

*Figure 3. Real mainnet blocks run showing block-by-block throughput across clients.*

Nethermind executes the real mainnet blocks at a mean of 697 MGas/s, which is significantly faster than the other clients under identical replay conditions, and maintains stable throughput throughout the block range.

This reflects the focus we have placed on performance over the last few years.

Merged Payload Results

This test merges 100 consecutive real mainnet blocks into a single execution payload, with an average block size of 1.1 GGas. It pushes clients far beyond typical mainnet block sizes and reveals deeper bottlenecks under extreme load.

*Figure 4. 100 merged mainnet blocks run showing block-by-block throughput across clients.*

Even when processing massive merged blocks, Nethermind continues to lead, with throughput between 2x and 10x that of the other clients. Compared to the non-merged test, Besu and Geth improved their average processing performance, while Reth, interestingly, slowed down by about 2x. These results can point client teams toward adopting some of the improvements we have made in Nethermind.

EDIT: The issue has been reported to Reth and the root cause has been identified. Tests will be rerun once improvements to the tooling are implemented.

Possible Next Steps

Gather data on how clients perform with different slot times instead of running one block immediately after another, since gaps between blocks give more time for amortization effects.
Test with a state much bigger than that of Ethereum mainnet.
Add the Erigon client (it was not compatible because the snapshot was taken with an older version of Erigon 3).
Add the Ethrex client (no database snapshot is available at https://ethpandaops.io/data/snapshots/ for the specific block).
Test on the most recent blocks and on larger ranges to capture more diverse activity.

Author

Kamil Chodoła

Senior QA Engineer at Nethermind, co-author of EIP-7935, focused on quality assurance and performance testing for Ethereum infrastructure.