Rest in Peace, Optane
Intel’s Optane memory modules were announced with a lot of fanfare in 2015, and were recently discontinued, in 2022, with similar fanfare. It was a sad day for me, a lover of abstraction-breaking technologies, but it was foreseeable and understandable.
At the time of Optane’s launch, a lot of us were excited about the idea of having a new storage tier, sitting between DRAM and flash. It was announced as having DRAM endurance and speed with the persistence and size of flash. It was a futuristic memory technology, but the technology of the future met the full force of Wright’s Law.
Wright’s Law
Each doubling of cumulative production volume corresponds to a roughly 20% decrease in unit cost.
This is the simple statement of Wright’s Law. It was originally formulated by Theodore Wright in the 1930s, based on his study of airplane manufacturing. Since then, it has been verified to hold in many different industries, although the cost reduction per doubling varies between roughly 10% and 25% depending on the industry. Despite that variation, the history of semiconductor technology has actually followed Wright’s Law more closely than Moore’s Law: the density increases predicted by Moore’s Law tend to speed up and slow down with demand growth, exactly as Wright’s Law predicts.
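In its usual learning-curve form (standard notation, with symbols of my choosing rather than the article’s), the law says that unit cost falls as a power law in cumulative volume:

$$C(N) = C_1 \cdot N^{\log_2(1 - r)}$$

where $N$ is the number of units produced so far, $C_1$ is the cost of the first unit, and $r$ is the learning rate. With $r = 0.20$, the exponent is $\log_2(0.8) \approx -0.32$, so every doubling of $N$ multiplies unit cost by exactly $0.8$.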
During the last 7 years, DRAM and flash have both experienced massive increases in density and production volume, and in turn, Wright’s Law has given us cheaper, faster, bigger devices.
DRAM Production 2015-2022
The lifetime of DDR4 memory roughly matches the lifetime of Optane. DDR4 was introduced a few years before 2015, and became cost-effective compared to DDR3 around 2015-2016. Today, DDR5 RAM and the CPUs that can use it have only just reached the market. Over this time, RAM price per bit has roughly halved, and global RAM production has grown by a factor of about 10-15. Meanwhile, competition between AMD and Intel has roughly doubled the number of memory channels per CPU. Thus, a server in 2022 holds about 4x as much RAM as a server in 2015, and memory cost as a share of system cost has roughly doubled. Memory bandwidth has increased by a factor of about 3: the product of twice as many memory channels and roughly 50% faster transfer rates per channel.
This means that the capacity argument for Optane (“a system with Optane DIMMs can hold more memory”) has weakened over time, while the performance gap between Optane and DDR memory has widened. A two-socket server populated with RDIMMs can now hold several TB of memory, so the value proposition of Optane for memory capacity has been all but eliminated.
Flash is a Spectrum
Flash memory comes in four flavors, based on the number of bits stored in each cell; each extra bit per cell doubles the number of charge levels the cell must distinguish, which is why density goes up while speed and endurance go down. Single-level cell (SLC) flash stores one bit per cell. It has the lowest density, but the longest endurance and the highest speeds. MLC (multi-level cell) holds two bits per cell, and is worse on speed and endurance than SLC, but better than TLC (triple-level cell). Most SSDs today use TLC flash: it offers a good balance between price and endurance, and most workloads are read-heavy, so there isn’t a lot of stress on the memory arrays. QLC (quad-level cell) has recently been introduced to the market as a capacity SSD technology: QLC has even lower endurance and speed than TLC, but substantially increases capacity. PLC (penta-level cell) is on the horizon, promising to continue the trend.
On the back of this distinction, we can see a natural tiering of flash: TLC or QLC for capacity (in place of disk for all but the largest datasets), and SLC for caching and write-heavy workloads. This makes SLC flash a direct competitor to Optane, with the advantage of being much cheaper per bit without being much slower.
Flash Production 2015-2022
After Optane’s launch, flash memory technology advanced quickly. In 2015, NVMe drives had been out for about a year or two, and the drives and the controller chips that ran them were still working out the kinks. Understandably, Intel also built NVMe drives out of Optane, offering much better read and write latencies than flash-based NVMe drives. Not only did Optane offer faster access than flash, but it had the advantage of being a simpler, more reliable form of memory that didn’t need complicated controller algorithms.
As time went on, companies quickly improved the speed, reliability, and density of each flash technology. In 2015, TLC was roughly in the position QLC occupies today: mostly for read-only and read-heavy workloads. Now, TLC is the workhorse flash technology for general-purpose drives, as its speed and endurance have improved. SSD controller chips have also improved over time, letting them work faster and make better use of the flash chips on an SSD. Lots of money goes into R&D work for flash, and it shows.
In the last 7 years, flash has grown to be the dominant storage technology: going from costing almost 20x more per bit than hard drives to just 5x more. In addition, SSD latency has dropped by a factor of 2 as controller chips have gotten more powerful and their algorithms better. Flash bandwidth has increased to the point where a single SSD can saturate a PCIe gen 4 x4 link. A modern SLC SSD can be found with read latencies under 30 microseconds, and even TLC SSDs can get close to 100 microseconds.
Flash has advanced so much over the last 7 years that if the trend continues, I may be eulogizing the magnetic hard drive in 2030. Flash is eating the storage world for good reason: it can offer high speeds and high capacity, and while it had some rough edges, we have learned how to work with it very well.
Optane and Wright’s Law
Clearly, as time has advanced, Optane’s position in the memory hierarchy has been crushed between ever-larger DDR4 DRAM on one side and ever-faster flash on the other. Both of those technologies have seen much more innovation and R&D over the last 7 years than Optane, and it shows. In retrospect, Intel was fighting an uphill battle if they wanted to treat Optane simply as a point on the memory hierarchy. To make the project successful, they needed it to be a unique value-add.
The Value Proposition of Optane
After the dust settled, Optane modules ended up having only a small price difference from DRAM modules of the same size, while being almost as slow as SLC flash and having only 3-5x the endurance of flash memory. The Optane NVMe drives were faster than SLC flash, but not by much. The only value-add left was the abstraction difference from normal memory: Optane offered persistence on the memory bus.
If you have worked on a database or a storage system, you will know how valuable this can be: lots of code in high-throughput storage systems exists solely to make sure that transactions persist as quickly as possible. If the ordinary memory writes involved in a transaction also made it persistent, you could conceivably delete a ton of code. You could even run a small database on an `mmap`-ed Optane file, and persistence would be given to you for free. Oh wait… a lot of databases `mmap` their backing files already.
Persistence on the Memory Abstraction
It turns out that what developers mostly would have wanted from Optane, they already got from BSD and Linux in the form of `mmap`. `mmap`-ing a file lets you treat the contents of the file as though they were in memory, leaving the filesystem and the page cache to handle the rest. The parts of the file you have accessed recently stay in memory, while other parts are fetched from disk on demand: your access triggers a page fault, which in turn triggers a read from the filesystem. Writes to the file are handled in the background. This is not the fastest way to access files, but it is a very convenient one, and it works well enough for many high-performance key-value databases like LevelDB, LMDB, SQLite, QuestDB, RavenDB, and more.
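To make the pattern concrete, here is a minimal sketch in C (the file name and the five-byte write are made up for illustration, and error handling is kept terse):

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.db", O_RDWR);   /* hypothetical backing file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole file: reads page-fault data in from the filesystem
       on demand, and writes just dirty pages for later write-back. */
    char *p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    memcpy(p, "hello", 5);              /* assumes the file is >= 5 bytes */
    msync(p, st.st_size, MS_SYNC);      /* force write-back for durability */

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```

The `msync` call is the moral equivalent of the persistence code Optane promised to delete: without it, the kernel writes dirty pages back whenever it pleases.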
`mmap` had one more secret weapon: the filesystem. If the media backing the filesystem wore out, the filesystem (and the layers underneath it) could detect and correct issues. Optane had no such protection.
When Intel thought they were enabling a whole new class of high-performance databases by offering persistence on the memory bus, they were really offering a performance boost over persistent databases using `mmap`. It might not even have been a performance boost: Optane slowed down your median read and write operations in exchange for avoiding those page faults, so if your working set was small or mostly fit inside RAM, `mmap` and NVMe flash were actually faster.
Persistent Memory and Caches
It turns out that offering persistence on the memory bus runs into abstraction problems with current CPU memory controllers. Memory controllers and cache hierarchies are designed around the assumption that memory is dumb and volatile, so they feel free to delay writes indefinitely to save bandwidth.
The initial solution that Intel came up with was an instruction, `CLFLUSH`, that flushes a cache line out to memory. However, `CLFLUSH` executions are strongly ordered with respect to each other and to ordinary stores, so a sequence of flushes effectively executes serially, stalling the pipeline along with the cache hierarchy. Worse, flushing a cache line invalidates it, so if you wanted to read the value back after writing it, you would incur a cache miss. `CLFLUSH` was later supplemented by `CLFLUSHOPT` (a weakly ordered flush) and `CLWB` (a write-back that leaves the line valid in the cache), which can flush and write back cache lines without incurring the same performance penalties.
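As a rough sketch of how this instruction suite gets used (a hypothetical helper of my own, not code from Intel; compile with `-mclwb`/`-mclflushopt` on GCC or Clang to enable the newer instructions):

```c
#include <immintrin.h>  /* _mm_clflush, _mm_clflushopt, _mm_clwb, _mm_sfence */
#include <stdint.h>

/* Hypothetical persist helper: store a value, then push its cache line
   toward memory so that persistent media observes the write. */
static void persist_u64(uint64_t *p, uint64_t v) {
    *p = v;
#if defined(__CLWB__)
    _mm_clwb(p);        /* write the line back, keep it valid in the cache */
#elif defined(__CLFLUSHOPT__)
    _mm_clflushopt(p);  /* weakly ordered flush; the line is invalidated */
#else
    _mm_clflush(p);     /* strongly ordered flush; serializes with stores */
#endif
    _mm_sfence();       /* order the flush before subsequent stores */
}
```

On a `CLWB` machine the value stays cached, so reading it back right after persisting it doesn’t miss; with plain `CLFLUSH` you pay both the ordering stall and the cache miss.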
However, when Optane memory started to come out, `CLFLUSH` was the only instruction from this suite that was available, meaning that initial performance tests suffered from both the speed difference between Optane and DRAM and the substantial overhead of the `CLFLUSH` instruction. Intel would probably have had a much easier time selling Optane if their core architecture had been ready.
Alternative Methods of Persistence
Between 2015 and 2022, several companies also offered a different kind of persistent DIMM. Instead of using a special type of memory, this kind of persistent DIMM used normal DRAM, but added some flash memory and a small controller circuit on the back that saved the contents of the DRAM to flash whenever the power started to dip.
To make sure that the contents of the DIMM were safe, these persistent DIMMs had a supercapacitor or a small lithium-ion battery (usually kept in a 2.5-inch drive bay and connected to the DIMM through a cable) that kept the DIMM powered while the rest of the system went down. On power-up, the DIMM would restore the memory contents.
These were later standardized by JEDEC as “NVDIMM-N” modules (Non-Volatile Dual In-line Memory Module, NAND-flash-backed).
Still, these alternatives have some problems: they are a lot thicker than normal memory DIMMs, and they can’t offer the capacities that Optane modules could. However, they don’t have the endurance problems that Optane and flash do, since their flash is written only when power fails, and they operate at the same speed as the rest of the system’s memory.
Thank You, Optane
So many technologies have become successful around Optane and the promises it held. Unfortunately, Optane was not one of them.
SSDs using SLC flash offer blazing-fast performance with a block abstraction, and enterprises and database developers learned to take advantage of the differences between small, fast SLC drives and larger, slower TLC and QLC drives. Some SSD manufacturers also ran with the idea of a caching drive and competed by adding SLC caching to their TLC SSDs; most high-end consumer SSDs today do this for you.
For the few customers who needed persistence on the memory bus, the JEDEC NVDIMM standards emerged, with the flagship NVDIMM-N modules allowing you to have a DRAM module that was both persistent and fast, and an additional standard to cover future persistent memory technologies. Intel’s new instructions allow users to take advantage of these new modules, adding a fundamental capability to CPUs.
Optane has helped us learn to build computing systems that take advantage of the spectrum of “legacy” storage and memory technologies. I, for one, am sad to see it go, but happy that I won’t miss it.