Solid-state storage is being adopted in the enterprise but not at the pace that many observers originally predicted. While there are a variety of reasons for this, one that can’t be ignored is the cost. Solid-state storage
What is MLC flash?
Solid-state flash drives are made up of memory cells. With single-level cell (SLC) memory, each cell stores one bit of data. Multi-level cell (MLC) flash, on the other hand, stores two bits in each cell, effectively doubling the capacity of the flash storage. However, this method means that MLC storage has lower performance and reliability than SLC memory. MLC flash is also likely to wear out faster than SLC, since flash storage can handle only so many write cycles per cell. As a result, MLC flash storage has largely been relegated to consumer devices such as laptops and phones. However, significant improvements have been made both in the understanding of MLC and in the technology that surrounds it, and some suppliers are now proposing its use in the enterprise.
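The capacity trade-off can be illustrated with a little arithmetic. The sketch below assumes the common definition in which an SLC cell holds one bit and an MLC cell holds two; the cell count and function names are hypothetical and chosen only for illustration. It also shows that a two-bit cell has to distinguish four charge levels rather than two, which is one reason MLC is harder to read and write reliably.

```python
def capacity_bits(num_cells: int, bits_per_cell: int) -> int:
    """Raw capacity, in bits, of a flash array with the given cell count."""
    return num_cells * bits_per_cell


def charge_levels(bits_per_cell: int) -> int:
    """Number of distinct charge levels a cell must hold to encode its bits."""
    return 2 ** bits_per_cell


CELLS = 1_000_000  # hypothetical cell count, for illustration only

slc_bits = capacity_bits(CELLS, bits_per_cell=1)
mlc_bits = capacity_bits(CELLS, bits_per_cell=2)

print(mlc_bits / slc_bits)                  # 2.0 (same cells, twice the capacity)
print(charge_levels(1), charge_levels(2))   # 2 4
```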
Protecting MLC flash storage
While it’s true that MLC will fail more often than SLC flash, advances in the intelligence around MLC and in how it’s protected are changing the dynamics. First, MLC production processes have improved; some suppliers now offer EMLC (enterprise MLC) rated for as many as six times the write cycles of standard MLC (30,000 vs. 5,000). Second, the process of writing data to the cells has improved so that no single cell of the flash disk becomes hot; wear leveling makes sure that writes are spread evenly across the available cells. Third, most if not all EMLC systems have spare, unreported capacity so that if a cell does wear out, its data can be written to a spare cell. Beyond these special considerations around the flash memory itself, it’s also important to remember that in many cases this memory will be installed in an enterprise-class storage system, so technologies like RAID and mirroring can be used to provide further protection from failure.
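To make the wear-leveling idea concrete, here is a minimal sketch, not any vendor's actual algorithm. It assumes a simple policy of always writing to the least-worn block, and it uses a hypothetical eight-block device. The cycle ratings are the 30,000 and 5,000 figures quoted above; everything else is an illustrative assumption.

```python
EMLC_CYCLES = 30_000  # EMLC write-cycle rating quoted in the article
MLC_CYCLES = 5_000    # standard MLC write-cycle rating quoted in the article


def write_without_leveling(num_blocks: int, num_writes: int) -> list[int]:
    """Worst case: every write lands on block 0, creating one 'hot' block."""
    wear = [0] * num_blocks
    wear[0] = num_writes
    return wear


def write_with_leveling(num_blocks: int, num_writes: int) -> list[int]:
    """Assumed policy: each write goes to the block with the fewest cycles so far."""
    wear = [0] * num_blocks
    for _ in range(num_writes):
        target = min(range(num_blocks), key=wear.__getitem__)
        wear[target] += 1
    return wear


hot = write_without_leveling(num_blocks=8, num_writes=40_000)
leveled = write_with_leveling(num_blocks=8, num_writes=40_000)

print(EMLC_CYCLES // MLC_CYCLES)   # 6
print(max(hot), max(leveled))      # 40000 5000
```

In this toy run, the same 40,000 writes push the hot block past even the EMLC rating, while the leveled device keeps every block at 5,000 cycles, within the standard MLC rating.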
MLC flash memory performance issues
Now let’s talk about the performance concerns around MLC. While MLC is not as fast as SLC, it is faster than a single 15,000 rpm drive in both read and write operations. Many data centers are looking for a measurable but cost-appropriate performance boost. For them, SLC flash may be overkill, whereas MLC may be just right.
The big limiter in the performance of MLC, or of any flash-based solid-state drive (SSD), comes when the drive reaches what’s called steady state. This is when the drive has been completely filled for the first time and there are no more empty cells to put data into. From that point forward, any new data that the flash controller needs to write must go to cells that no longer hold in-use data. The not-in-use data is first erased, which in the flash world means the cells are reset to their blank state, and then the new data is written to the cell. Obviously, this two-step erase-then-write cycle takes time; factor in the parity writing in a RAID algorithm, and the performance gets worse. This write cycle can also deliver erratic and unpredictable I/O performance, especially when the system is busy with a lot of write traffic and is near capacity.
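The cost of steady state can be sketched with a toy latency model. The timing constants below are hypothetical placeholders, not measurements of any real drive. The only assumption carried over from the text is that a write to a drive with no empty cells pays for an erase before it can program new data.

```python
PROGRAM_US = 900   # hypothetical time to program (write) a cell, in microseconds
ERASE_US = 3_000   # hypothetical time to erase a cell first, in microseconds


def write_cost_us(free_cells: int) -> int:
    """Cost of one write: program only if an empty cell exists, else erase + program."""
    if free_cells > 0:
        return PROGRAM_US
    return ERASE_US + PROGRAM_US


def total_write_time_us(num_writes: int, free_cells: int) -> int:
    """Total time for a burst of writes, consuming empty cells as they are used."""
    total = 0
    for _ in range(num_writes):
        total += write_cost_us(free_cells)
        if free_cells > 0:
            free_cells -= 1
    return total


fresh = total_write_time_us(100, free_cells=100)  # new drive, plenty of empty cells
steady = total_write_time_us(100, free_cells=0)   # steady state, none left

print(fresh, steady)  # 90000 390000
```

With these made-up numbers, the same 100 writes take more than four times as long in steady state. The real ratio depends entirely on the device.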
To combat this problem, most flash controllers can now perform what is called garbage collection. During idle times, the flash controller scans the drive for cells holding data that the operating system has marked as removable (typically through a delete command) and performs the erase ahead of time. Garbage collection is more important in MLC- or EMLC-based systems because they are slower at processing a write cycle (more data per cell), so having those cells cleaned out ahead of time is critical. Another technique that storage systems use is keeping some flash memory in reserve as unallocated capacity. For example, if 20% is left in reserve, in most cases the erase-then-write cycle would not have to be performed while data is being written to the drive. The flash controller will use the spare cells instead.
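Both techniques can be sketched in a small simulation. This is a toy model under stated assumptions, not how any particular controller works: the `FlashDrive` class, its methods, and the 100-cell size are hypothetical, and cells are tracked only as counts of erased, in-use, and stale (deleted but not yet erased) cells. The 20% reserve matches the example in the text.

```python
class FlashDrive:
    """Toy model of a drive with hidden reserve capacity (all names hypothetical)."""

    def __init__(self, total_cells: int, reserve_percent: int = 20):
        reserved = total_cells * reserve_percent // 100
        self.user_capacity = total_cells - reserved  # what the host may fill
        self.erased = total_cells                    # cells ready to accept data
        self.in_use = 0                              # cells holding live data
        self.stale = 0                               # deleted but not yet erased
        self.slow_writes = 0                         # writes that had to erase first

    def write(self) -> None:
        if self.in_use >= self.user_capacity:
            raise RuntimeError("drive is full")
        if self.erased == 0:
            # No clean cell left: erase a stale one now, on the write path.
            self.stale -= 1
            self.erased += 1
            self.slow_writes += 1
        self.erased -= 1
        self.in_use += 1

    def delete(self) -> None:
        if self.in_use == 0:
            raise RuntimeError("nothing to delete")
        self.in_use -= 1
        self.stale += 1

    def garbage_collect(self) -> int:
        """Idle-time cleanup: erase every stale cell ahead of the next write."""
        reclaimed = self.stale
        self.erased += reclaimed
        self.stale = 0
        return reclaimed


def slow_writes_after_refill(gc_first: bool) -> int:
    drive = FlashDrive(total_cells=100, reserve_percent=20)
    for _ in range(80):   # fill the user-visible capacity
        drive.write()
    for _ in range(30):   # the host deletes some data
        drive.delete()
    if gc_first:
        drive.garbage_collect()
    for _ in range(30):   # the host writes new data
        drive.write()
    return drive.slow_writes


print(slow_writes_after_refill(gc_first=False))  # 10
print(slow_writes_after_refill(gc_first=True))   # 0
```

In this run the 20-cell reserve absorbs the first 20 new writes on its own. Only once it is exhausted do writes stall on an erase, and running garbage collection during idle time removes those stalls entirely.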
MLC flash makes sense for customers who have applications where performance needs to be improved but not to an extreme level. There is enough technology and redundancy now surrounding these systems that they can be implemented with confidence into many environments without the risk of data loss.
This was first published in March 2011