Releasing NOMT v1.0-preview

Today, I’m happy to announce the preview release of NOMT 1.0. For the last year, we at Thrum have been building NOMT as a custom blockchain state database solution combining the throughput of Solana with the interoperability of a merkle tree.

NOMT is resilient to crashes and drive failures of various kinds through the use of standard database design techniques: write-ahead logging and shadow paging. We have built a custom simulation and testing tool, codenamed “torture”, which has been used to ensure reliability in an enormous combination of workloads in the presence of random crashes.

Our performance target for NOMT was to achieve 25,000 transfers per second with a database of 1 billion items on low-cost (< $1200) consumer hardware, and I am pleased to announce we have cleared this benchmark. For smaller databases with fewer than 128 million entries, NOMT can achieve over 50,000 transfers per second.

NOMT is written in Rust and has a thread-safe API, useful for parallel VMs, and includes optional in-memory overlays for handling unfinalized blocks.

We expect NOMT to scale well with the rapid advancement of NVMe SSDs as a technology. SSDs are getting faster at an incredible rate, with new technologies such as PCIe 5.0 (a higher-throughput communication bus) and 5th generation controllers such as the Phison E28 on the horizon.

The intended use-case for NOMT is as a foundational building block for high-throughput blockchain nodes: for use in SDKs and custom node implementations. It is unopinionated on data formats and supports values up to gigabytes in size. The first intended user is the Sovereign SDK and this work has been supported with a grant by Sovereign Labs.

Benchmarks and Methodology

We performed benchmarks on a machine with the following specifications:

The total hardware cost for this setup comes out to less than $1200, including the motherboard and power supply.

Our benchmark scenario emulates a simple value transfer from a random existing account to a random target account. We tested three scenarios: transferring to already-existing accounts, transferring to 50% new accounts, and transferring to 100% new accounts. Each transfer performs two random reads, one for each account, and then writes the updates. These scenarios are implemented within our custom benchmarking tool, benchtop.
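The shape of this workload can be sketched in a few lines of Rust. This is a minimal illustration, not benchtop's code or NOMT's API: the `State` map, `transfer`, `run_workload`, and the small `XorShift64` generator are hypothetical names introduced here, and an in-memory `HashMap` stands in for the database.

```rust
use std::collections::HashMap;

/// Tiny xorshift64 generator so the sketch needs no external crates.
/// The seed must be nonzero. Only meant to illustrate random access.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical in-memory stand-in for the state database: account id -> balance.
type State = HashMap<u64, u64>;

/// One transfer: two reads (sender and receiver), then two writes.
/// Returns false when the transfer is rejected and nothing is written.
fn transfer(state: &mut State, from: u64, to: u64, amount: u64) -> bool {
    let from_balance = state.get(&from).copied().unwrap_or(0);
    let to_balance = state.get(&to).copied().unwrap_or(0); // 0 for a fresh account
    if from == to || from_balance < amount {
        return false;
    }
    state.insert(from, from_balance - amount);
    state.insert(to, to_balance + amount);
    true
}

/// Runs `count` transfers of one unit each. Roughly `fresh_percent` percent of
/// receivers are new accounts (ids at or above `existing`); the rest are
/// random existing accounts. Returns the number of transfers applied.
fn run_workload(state: &mut State, existing: u64, count: u64, fresh_percent: u64, seed: u64) -> u64 {
    let mut rng = XorShift64(seed);
    let mut next_fresh = existing;
    let mut applied = 0;
    for _ in 0..count {
        let from = rng.next_u64() % existing;
        let to = if rng.next_u64() % 100 < fresh_percent {
            next_fresh += 1;
            next_fresh - 1
        } else {
            rng.next_u64() % existing
        };
        if transfer(state, from, to, 1) {
            applied += 1;
        }
    }
    applied
}

fn main() {
    let existing = 1_000;
    let mut state: State = (0..existing).map(|id| (id, 1_000)).collect();
    let applied = run_workload(&mut state, existing, 10_000, 50, 42);
    println!("applied {applied} transfers, {} accounts", state.len());
}
```

Setting `fresh_percent` to 0, 50, or 100 corresponds to the three scenarios described above.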

In each benchmark, we generated a merkle witness, gathering all the merkle nodes necessary to prove the reads and writes.
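To make "gathering all the merkle nodes necessary" concrete, here is a minimal sketch of a witness for a single leaf in a plain binary merkle tree. It is a generic illustration rather than NOMT's tree layout or API: the function names are hypothetical, the tree is assumed to have a power-of-two number of leaves, and `DefaultHasher` is a non-cryptographic stand-in used only to avoid external dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in 64-bit hash of two child nodes. NOT cryptographic: a real tree
/// would use a cryptographic hash function.
fn hash_pair(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (left, right).hash(&mut h);
    h.finish()
}

/// Stand-in hash of a leaf value (same caveat as `hash_pair`).
fn hash_leaf(value: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

/// Builds every level of a binary merkle tree, leaves first, root last.
/// Assumes the number of leaves is a power of two.
fn build_levels(leaves: &[u64]) -> Vec<Vec<u64>> {
    assert!(leaves.len().is_power_of_two());
    let mut levels = vec![leaves.to_vec()];
    while levels.last().unwrap().len() > 1 {
        let next: Vec<u64> = levels
            .last()
            .unwrap()
            .chunks(2)
            .map(|pair| hash_pair(pair[0], pair[1]))
            .collect();
        levels.push(next);
    }
    levels
}

/// The witness for one leaf: the sibling hash at each level, bottom to top.
fn witness(levels: &[Vec<u64>], mut index: usize) -> Vec<u64> {
    let mut siblings = Vec::new();
    for level in &levels[..levels.len() - 1] {
        siblings.push(level[index ^ 1]); // the other child of the same parent
        index /= 2;
    }
    siblings
}

/// Recomputes the root from a leaf hash and its witness.
fn root_from_witness(leaf: u64, mut index: usize, siblings: &[u64]) -> u64 {
    let mut node = leaf;
    for &sibling in siblings {
        node = if index % 2 == 0 {
            hash_pair(node, sibling)
        } else {
            hash_pair(sibling, node)
        };
        index /= 2;
    }
    node
}

fn main() {
    let leaves: Vec<u64> = (0u8..8).map(|i| hash_leaf(&[i])).collect();
    let levels = build_levels(&leaves);
    let root = levels.last().unwrap()[0];
    let proof = witness(&levels, 5);
    assert_eq!(root_from_witness(leaves[5], 5, &proof), root);
    println!("witness has {} sibling hashes", proof.len());
}
```

A verifier holding only the root can check a read by recomputing the root from the leaf and its siblings. A witness covering many reads and writes would share the nodes that several paths have in common.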

This is a “worst-case” benchmark in two key ways. First, most blockchain usage follows a power-law distribution, where something like 80% of the state reads are on 20% of the state. Fully random accesses are the absolute worst case for a blockchain, but are needed to serve the long tail of global usage rather than just the concentrated activity of power users. Second, generating merkle proofs that include all read data adds overhead which could be skipped on a normal validating full node.
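The difference between the two access patterns can be illustrated with a short sketch. It is not part of the benchmark: `uniform_key`, `skewed_key`, and `hot_fraction` are hypothetical helpers, the 80/20 split is the rule of thumb quoted above, hard-coded, and `keys` is assumed to be at least 5.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// The seed must be nonzero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Uniform access: every key is equally likely (the benchmark's worst case).
fn uniform_key(rng: &mut XorShift64, keys: u64) -> u64 {
    rng.next_u64() % keys
}

/// Skewed access: roughly 80% of reads land on the first 20% of keys.
fn skewed_key(rng: &mut XorShift64, keys: u64) -> u64 {
    let hot_keys = keys / 5;
    if rng.next_u64() % 100 < 80 {
        rng.next_u64() % hot_keys
    } else {
        hot_keys + rng.next_u64() % (keys - hot_keys)
    }
}

/// Fraction of `samples` reads that hit the hot 20% of the key space.
fn hot_fraction(pick: fn(&mut XorShift64, u64) -> u64, keys: u64, samples: u64, seed: u64) -> f64 {
    let mut rng = XorShift64(seed);
    let hot_keys = keys / 5;
    let hits = (0..samples)
        .filter(|_| pick(&mut rng, keys) < hot_keys)
        .count();
    hits as f64 / samples as f64
}

fn main() {
    let keys = 1_000_000;
    println!("uniform: {:.2}", hot_fraction(uniform_key, keys, 100_000, 42));
    println!("skewed:  {:.2}", hot_fraction(skewed_key, keys, 100_000, 42));
}
```

Under the skewed pattern, most reads touch a small hot set that is more likely to be cached already, whereas uniform access spreads reads evenly across the whole key space.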

We ran each test for 5 minutes to appropriately measure the steady-state performance of the drive.

Because random access is the worst-case scenario for blockchains, throughput under a power-law workload is expected to be higher than the figures below.

Random Transfers: 0% fresh (updates only)

| Drive | 67M items | 128M items | 1B items |
| --- | --- | --- | --- |
| Corsair MP600 Pro LPX (4TB) | 55.7k TPS | 49.8k TPS | 27.1k TPS |
| Samsung 990 Pro (2TB) | 37.8k TPS | 33.6k TPS | 22.7k TPS |

Random Transfers: 50% fresh

| Drive | 67M items | 128M items | 1B items |
| --- | --- | --- | --- |
| Corsair MP600 Pro LPX (4TB) | 55.3k TPS | 48.1k TPS | 21.6k TPS |
| Samsung 990 Pro (2TB) | 35.1k TPS | 31.2k TPS | 20.9k TPS |

Random Transfers: 100% fresh

| Drive | 67M items | 128M items | 1B items |
| --- | --- | --- | --- |
| Corsair MP600 Pro LPX (4TB) | 53.0k TPS | 42.1k TPS | 20.1k TPS |
| Samsung 990 Pro (2TB) | 33.9k TPS | 29.0k TPS | 18.9k TPS |