
Storage Benchmark Methodology

Testing Setup

  • Storage benchmark machine configuration
    • M.2 NVMe SSDs are always installed in the M2_1 slot. M2_1 has 4 PCIe 5.0 lanes connected directly to the CPU.
    • SATA devices are always connected to the SATA S_4 connector. M.2 SATA devices are connected via an M.2 SATA to 2.5" SATA adapter.
  • Operating System: Ubuntu 24.04.1 LTS
  • All tests are run with fio 3.38.

Background

Modern consumer Solid State Drives (SSDs) are, as a general rule, made up of two storage regions: a higher-performance region and a lower-performing region. A major goal of this approach is to hide the shortcomings of the slower region by servicing I/O requests with the high-performance region as necessary. Data is often transferred between the two regions in the background so that the high-performance region is available for new requests.

This strategy allows SSDs to achieve higher capacities at lower prices. If done well, the end-user may not see any noticeable drawbacks in most usage.

The hybrid design does have some downsides, however. It is generally impossible to know how an SSD will perform in real usage based on the specs provided by a manufacturer. Almost all consumer SSDs will specify read or write speeds "up to" a given data rate. Those numbers reflect the best performance seen from the high-performance region of the SSD.

Usually, important details like the size of the high-performance region or the performance characteristics of the slower region are not provided. How the drive manages the cooperation between the two regions can have a big impact on user experience.

These details matter. An SSD with an unusually slow lower-performing region or a small high-performance region can perform very poorly in some reasonable and common scenarios. For example, we've seen some SSDs that perform a large file copy slower than a traditional spinning platter hard drive (HDD).

HDDs often have a high-performance cache as well, but their caches are much smaller. Unlike SSD NAND caches, HDD caches are quickly exhausted by intense I/O.

What We Test And Why

With all of this in mind, we've chosen our benchmarks with the goal of exercising the current two-part design of consumer SSDs. We also have chosen similar benchmarks for HDDs.

There are many parameters that can be varied when benchmarking storage. They include:

  • Test duration
  • I/O type (read vs write)
  • I/O pattern (sequential vs random)
  • Queue depth (how many I/O operations are outstanding at once)
  • Block size (the amount of data involved in each operation)
  • Disk state
    • How full the drive is
    • How much idle time the drive has had to recover from previous operations
    • What areas of the drive were last operated upon

Large amounts of performance data can be generated and collected from a single selection of these parameters. It is important to run the tests repeatedly, as performance can vary between repeated runs of the exact same test. Combining multiple runs while varying just a few of the parameters can result in swaths of data and hundreds of graphs. Comparing one disk to another for each combination of parameters can be interesting and illustrative, but it is also time intensive.

We've picked a handful of the more reasonable combinations of these parameters with the goal of providing benchmark results that can be compared between drives and give a meaningful, high-level view of the kind of performance an end-user will see.

Where We Get Our Hardware

We purchase drives from online retailers. For newly released products this means we may not have benchmark results until retail availability. No manufacturer samples have been benchmarked to date. If we are unable to acquire hardware through retail channels and a sample is offered, we may include it by clearly indicating it is a sample in the benchmark results. After we are able to purchase the drive from retail channels, we will retest and submit updated benchmarks.

SSD Benchmarks

Full Disk Write

The first benchmark starts with an empty disk and sequentially writes to fill it entirely. A queue depth of 32 is used with a block size of 1MB. This exercises both performance regions of an SSD. The "Full Disk Write Throughput" for the operation is reported. Additionally, the "Full Disk Write Throughput (Lowest 10 Seconds)" is reported. This is the lowest throughput rate seen during any 10 second window as the drive is filled, and is generally a reflection of how well the slower-performing SSD region handles intensive writes.
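
For illustration, the sketch below shows one way such a run could be driven, and how a lowest 10 second window can be extracted from fio's one-second bandwidth log. The device path is a placeholder and the exact fio options in our harness may differ; this is a minimal sketch, not our production tooling.

    #!/usr/bin/env python3
    """Sketch: full-disk sequential write at QD32/1MB, then find the lowest
    10-second throughput window from fio's 1-second bandwidth log.
    Destructive; the device path is a placeholder."""
    import csv
    import subprocess

    DEVICE = "/dev/nvme0n1"   # hypothetical target device
    LOG_PREFIX = "full_write"

    # Full sequential write of the device: QD32, 1MB blocks, direct I/O,
    # logging average bandwidth once per second.
    subprocess.run([
        "fio", "--name=full-disk-write", f"--filename={DEVICE}",
        "--rw=write", "--bs=1M", "--iodepth=32", "--ioengine=libaio",
        "--direct=1", f"--write_bw_log={LOG_PREFIX}", "--log_avg_msec=1000",
    ], check=True)

    # fio writes <prefix>_bw.1.log with lines: time_ms, KiB/s, direction, blocksize, offset
    with open(f"{LOG_PREFIX}_bw.1.log") as f:
        samples = [int(row[1]) for row in csv.reader(f)]  # KiB/s, one sample per second

    # Lowest average throughput over any 10 consecutive 1-second samples.
    window = 10
    lowest = min(sum(samples[i:i + window]) / window
                 for i in range(len(samples) - window + 1))
    print(f"Lowest {window}s window: {lowest / 1024:.1f} MiB/s")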

60 second I/O operations on first 50% of disk

For these benchmarks, we exercise the disk for 60 seconds at a time, using typical block sizes of 1MB for sequential I/O and 4KB for random I/O. We run the benchmarks at queue depths of 1, 2, 4, 8, 16, 32, and 64.

Before running these benchmarks, we fill the first 50% of the disk. We then perform the benchmarks for sequential and random reads.
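
As a concrete illustration of what a single 60 second run looks like, the sketch below performs a random read test at one queue depth, restricted to the first 50% of the device via fio's percentage size option, and pulls the bandwidth out of fio's JSON output. The device path and job name are placeholders, and our actual job definitions may differ.

    #!/usr/bin/env python3
    """Sketch: a single 60-second random read run restricted to the first 50%
    of a device, reporting bandwidth from fio's JSON output. Placeholder device."""
    import json
    import subprocess

    DEVICE = "/dev/nvme0n1"        # hypothetical target device
    QUEUE_DEPTH = 32               # one of the depths from the sweep above

    result = subprocess.run([
        "fio", f"--name=randread-qd{QUEUE_DEPTH}", f"--filename={DEVICE}",
        "--rw=randread", "--bs=4k", f"--iodepth={QUEUE_DEPTH}",
        "--ioengine=libaio", "--direct=1",
        "--size=50%",              # restrict the I/O range to the first 50% of the device
        "--runtime=60", "--time_based",
        "--output-format=json",
    ], check=True, capture_output=True, text=True)

    job = json.loads(result.stdout)["jobs"][0]
    print(f"Random read, QD{QUEUE_DEPTH}: "
          f"{job['read']['bw'] / 1024:.0f} MiB/s, {job['read']['iops']:.0f} IOPS")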

After the read benchmarks are complete, we move on to benchmarks for sequential and random writes restricted to the first 50% of the disk. Before each write benchmark, we reset the disk to fresh out-of-box (FOB) state. Without this important step, previous write benchmark runs would affect the performance of the current run.

The sequential tests perform their I/O on a randomly chosen sequential 1GB aligned chunk of the disk. A new 1GB aligned chunk is chosen when the I/O on the previous chunk is complete. This is designed to represent fragmentation that will typically be seen in a larger file.
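
The chunk-selection idea can be sketched as follows: pick a random 1GB-aligned offset within the first half of the device, read that chunk sequentially, and repeat until the 60 seconds are up. Launching fio once per chunk, as done here, is purely illustrative; the device path is a placeholder.

    #!/usr/bin/env python3
    """Sketch: sequential reads over randomly chosen, 1GB-aligned chunks
    within the first 50% of the device. Illustration of the chunking idea only."""
    import os
    import random
    import subprocess
    import time

    DEVICE = "/dev/nvme0n1"                    # hypothetical target device
    CHUNK = 1 << 30                            # 1GB chunk size and alignment

    # Determine the device size and how many 1GB chunks fit in its first half.
    fd = os.open(DEVICE, os.O_RDONLY)
    disk_bytes = os.lseek(fd, 0, os.SEEK_END)
    os.close(fd)
    num_chunks = (disk_bytes // 2) // CHUNK

    deadline = time.monotonic() + 60           # 60 second test duration
    while time.monotonic() < deadline:
        # Pick a random 1GB-aligned chunk and read it sequentially at QD32.
        offset = random.randrange(num_chunks) * CHUNK
        subprocess.run([
            "fio", "--name=seq-chunk", f"--filename={DEVICE}",
            "--rw=read", "--bs=1M", "--iodepth=32",
            "--ioengine=libaio", "--direct=1",
            f"--offset={offset}", f"--size={CHUNK}",
        ], check=True, stdout=subprocess.DEVNULL)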

More details on SSD methodology

Devices where state affects performance (e.g. most SSDs) have their state reset to fresh out-of-box (FOB) via "NVMe format -s (secure)" or "ATA SECURITY ERASE" commands. We refer to such a reset as an "FOB Reset". There are some devices, like Optane drives, where state does not affect performance. For these devices, FOB Resets are skipped.
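
For context, a minimal sketch of how these resets can be issued on Linux is shown below, assuming the common nvme-cli and hdparm tools. The device paths and the temporary ATA security password are placeholders, and the exact tooling in our harness may differ.

    #!/usr/bin/env python3
    """Sketch: FOB Reset of an NVMe or SATA device using common Linux tools
    (nvme-cli, hdparm). Destructive; device paths and password are placeholders."""
    import subprocess

    def fob_reset_nvme(device: str) -> None:
        # NVMe Format with Secure Erase Settings = 1 (erase all user data).
        subprocess.run(["nvme", "format", device, "--ses=1"], check=True)

    def fob_reset_ata(device: str, password: str = "fobreset") -> None:
        # ATA SECURITY ERASE requires setting a temporary user password first;
        # the erase itself clears the password again.
        subprocess.run(["hdparm", "--user-master", "u",
                        "--security-set-pass", password, device], check=True)
        subprocess.run(["hdparm", "--user-master", "u",
                        "--security-erase", password, device], check=True)

    if __name__ == "__main__":
        fob_reset_nvme("/dev/nvme0n1")   # hypothetical NVMe device
        fob_reset_ata("/dev/sdb")        # hypothetical SATA device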

The entire benchmark process for an SSD consists of the following steps (a sketch of the per-queue-depth runs follows the list):

  1. FOB Reset and 3 minute rest. This step is skipped if the drive is brand new.
  2. Full sequential write of disk at Queue Depth 32, then 10 minute rest.
  3. FOB Reset of SSD, then 3 minute rest.
  4. Fill disk to 50% with sequential writes at Queue Depth 32, then 10 minute rest.
  5. For each queue depth, a 60 second duration sequential read restricted to the first 50% of disk, followed by 1 minute rest.
  6. For each queue depth, a 60 second duration random read restricted to the first 50% of disk, followed by 1 minute rest.
  7. For each queue depth, an FOB Reset and 3 minute rest, then a 60 second duration sequential write restricted to the first 50% of disk.
  8. For each queue depth, an FOB Reset and 3 minute rest, then a 60 second duration random write restricted to the first 50% of disk.
  9. Repeat steps 1-8 two more times.
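
To make the per-queue-depth portion of the process concrete, here is a compact sketch of steps 5 through 8. The helper names, device path, rest durations, and the use of an NVMe format for the FOB Reset are illustrative placeholders rather than our exact harness.

    #!/usr/bin/env python3
    """Sketch: steps 5-8 of the SSD process -- 60-second runs at each queue
    depth, with FOB Resets and rests before the write runs. Placeholders only."""
    import subprocess
    import time

    DEVICE = "/dev/nvme0n1"                      # hypothetical target device
    QUEUE_DEPTHS = [1, 2, 4, 8, 16, 32, 64]

    def run_60s(rw: str, bs: str, qd: int) -> None:
        """One 60-second fio run restricted to the first 50% of the device."""
        subprocess.run([
            "fio", f"--name={rw}-qd{qd}", f"--filename={DEVICE}",
            f"--rw={rw}", f"--bs={bs}", f"--iodepth={qd}",
            "--ioengine=libaio", "--direct=1", "--size=50%",
            "--runtime=60", "--time_based",
        ], check=True)

    def fob_reset() -> None:
        # NVMe secure-erase format; a SATA device would use ATA SECURITY ERASE.
        subprocess.run(["nvme", "format", DEVICE, "--ses=1"], check=True)

    # Steps 5 and 6: reads at each queue depth, 1 minute rest after each run.
    for qd in QUEUE_DEPTHS:
        run_60s("read", "1M", qd)
        time.sleep(60)
    for qd in QUEUE_DEPTHS:
        run_60s("randread", "4k", qd)
        time.sleep(60)

    # Steps 7 and 8: FOB Reset and 3 minute rest before every write run.
    for qd in QUEUE_DEPTHS:
        fob_reset()
        time.sleep(180)
        run_60s("write", "1M", qd)
    for qd in QUEUE_DEPTHS:
        fob_reset()
        time.sleep(180)
        run_60s("randwrite", "4k", qd)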

HDD Benchmarks

Our HDD Benchmarks are similar to our SSD benchmarks, with some changes to account for differences between the two types of storage.

First, a full sequential write of a large HDD can take prohibitively long. A 20TB HDD that can sustain writes at 200MB/s will take over a day to fill (20,000,000 MB ÷ 200 MB/s = 100,000 seconds, roughly 28 hours). In general, a full drive write on an HDD has more predictable performance characteristics than one on an SSD, and the long duration of the write gives ample opportunity for any interesting anomalies to appear. As such, the full sequential write is only performed once on an HDD.

Next, there are two different HDD recording technologies that require different testing strategies: CMR (conventional magnetic recording) and SMR (shingled magnetic recording).

CMR HDDs do not track used sectors or have a concept of how "full" they are. Because of this, we do not perform step 4 or FOB Resets on CMR HDDs.

SMR HDDs do track used sectors and do have a concept of how "full" they are, so filling and FOB Reset steps must be done. However, FOB Reset via the same mechanism used for SSDs can take hours on an HDD. Performing an FOB Reset prior to each sequential write duration run at each queue depth would take unreasonably long. Because of this, when benchmarking SMR HDDs, we use TRIM commands instead of a full FOB Reset in steps 7 and 8.
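
The TRIM step could be issued with a standard Linux utility such as blkdiscard, as in the sketch below. The choice of tool and the device path are assumptions for illustration; the actual tooling in our harness may differ.

    #!/usr/bin/env python3
    """Sketch: issuing TRIM to an SMR HDD before a write run, using blkdiscard
    to discard the whole device or a byte range. Placeholder device path."""
    import subprocess

    DEVICE = "/dev/sdb"    # hypothetical SMR HDD

    def trim_whole_device(device: str) -> None:
        # Discard every sector; on a TRIM-capable SMR drive this tells the
        # firmware the LBAs no longer hold valid data.
        subprocess.run(["blkdiscard", device], check=True)

    def trim_range(device: str, offset: int, length: int) -> None:
        # Discard only a byte range, e.g. the first 50% of the device.
        subprocess.run(["blkdiscard", "--offset", str(offset),
                        "--length", str(length), device], check=True)

    if __name__ == "__main__":
        trim_whole_device(DEVICE)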

More information on benchmarking SMR HDDs can be found in our blog post.

Protocol-Specific Differences

For NVMe devices, we monitor temperature during each benchmark by running smartctl -A on the device every half second. This does impose a small load on the drive, as it must spend some time servicing these requests.
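
A minimal version of this polling loop might look like the sketch below, which shells out to smartctl every half second and records the reported temperature. The device path, log file name, and output parsing (assuming smartctl's usual "Temperature: NN Celsius" line for NVMe) are assumptions, not our exact monitor.

    #!/usr/bin/env python3
    """Sketch: poll smartctl -A every half second during a benchmark and log
    the NVMe temperature. Placeholder device path and log file."""
    import re
    import subprocess
    import time

    DEVICE = "/dev/nvme0n1"            # hypothetical NVMe device
    LOGFILE = "temperature.csv"        # hypothetical output file

    with open(LOGFILE, "w") as log:
        log.write("elapsed_s,celsius\n")
        start = time.monotonic()
        while True:                    # in practice, stopped when the benchmark ends
            out = subprocess.run(["smartctl", "-A", DEVICE],
                                 capture_output=True, text=True).stdout
            match = re.search(r"Temperature:\s+(\d+)\s+Celsius", out)
            if match:
                log.write(f"{time.monotonic() - start:.1f},{match.group(1)}\n")
                log.flush()
            time.sleep(0.5)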

For ATA devices, we do not monitor temperature because the overhead is higher, especially with HDDs. Running smartctl can result in performing disk reads, leading to significant penalties on HDD benchmarks that perform sequential storage access.