NVMe Performance Uncovers New Chokepoints.

By Hubert Yoshida posted 08-30-2019 05:16

The storage market is eagerly awaiting the arrival of NVMe storage controllers. NVMe is the connection protocol that was written specifically to replace the SCSI storage protocol and take advantage of SSDs' speed, parallelism, and lower latency. The command set is leaner, and it supports a nearly unlimited queue depth that takes advantage of the parallel nature of flash drives (a maximum queue depth of 64K for up to 64K separate queues).
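To put those queue limits in perspective, here is a minimal arithmetic sketch (not driver code) contrasting the single command queue of legacy SCSI with NVMe's multi-queue model; the SCSI depth of 254 is a typical single-queue figure, used here only for illustration:

```python
# Conceptual sketch: upper bound on commands in flight for each protocol.
SCSI_QUEUE_DEPTH = 254            # typical single SCSI command queue
NVME_MAX_QUEUES = 64 * 1024       # NVMe: up to 64K I/O queues
NVME_QUEUE_DEPTH = 64 * 1024      # NVMe: up to 64K commands per queue

def max_outstanding(num_queues: int, depth: int) -> int:
    """Maximum commands that can be outstanding at once."""
    return num_queues * depth

scsi_limit = max_outstanding(1, SCSI_QUEUE_DEPTH)
nvme_limit = max_outstanding(NVME_MAX_QUEUES, NVME_QUEUE_DEPTH)

print(f"SCSI outstanding commands: {scsi_limit}")       # 254
print(f"NVMe outstanding commands: {nvme_limit:,}")     # 4,294,967,296
```

The roughly four billion outstanding commands NVMe allows is why the bottleneck shifts from the wire protocol to the controller itself, as discussed below.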

While a few vendors have delivered NVMe storage controllers, the performance improvements have not been that impressive compared to SCSI- or SAS (Serial Attached SCSI)-attached SSD storage controllers. The lower latencies expected of NVMe over SCSI have not been fully realized, because NVMe's lower latencies are exposing storage controller bottlenecks. This was the subject of several presentations at the recent Flash Memory Summit in Santa Clara. Marc Staimer of Dragon Slayer Consulting also posted a blog about this last June.

This was not a surprise to Hitachi engineers. I blogged about this last year when the hype for NVMe was at its height. Since the standards for NVMe were still being finalized, our engineers worked first on removing the bottlenecks in the storage controller, introducing the Hitachi Storage Virtualization Operating System RF (SVOS RF), which runs on all our VSP storage controllers, from small midrange to large enterprise storage arrays. The letters RF were added with this release to stand for Resilient Flash, an indication that this release was optimized for flash.

Here is an excerpt from that blog post to show what we did to prepare for NVMe.

The latest version of Storage Virtualization Operating System RF (SVOS RF) was designed specifically to combine QoS with a flash-aware I/O stack to eliminate I/O latency and processing overhead. WWN-, port-, and LUN-level QoS provides throughput and transaction controls to eliminate the cascading effects of noisy neighbors, which is crucial when multiple NVMe hosts are vying for storage resources. For low-latency flash, the SVOS RF priority handling feature bypasses cache staging and avoids cache slot allocation latency for 3x read throughput and 65% lower response time. We have also increased compute efficiency, enabling us to deliver up to 71% more IOPS per core. This is important today and in the future because it allows us to free up CPU resources for other purposes, like high-speed media. Dedupe and compression overheads have been greatly reduced by SVOS RF (which allows us to run up to 240% faster while data reduction is active) and hardware-assist features. Adaptive Data Reduction (ADR) with artificial intelligence (AI) can detect, in real time, sequential data streams and data migration or copy requests that can be handled more effectively inline. Alternatively, random data writes to cells that are undergoing frequent changes will be handled in a post-process manner to avoid thrashing the controllers. Without getting into too much technical detail, suffice it to say that the controller has a lot to do with overall performance, and more will be required when NVMe is implemented. The good news is that we've done a lot of the necessary design work within SVOS to optimize the data services engine in VSP for NVMe workloads.
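The inline vs. post-process decision described above can be sketched as a simple heuristic. This is a hypothetical illustration only; the `WriteStream` fields, the `rewrite_threshold` value, and the decision rule are my assumptions, not the actual (proprietary) SVOS RF ADR logic:

```python
# Hypothetical sketch of an inline vs. post-process data reduction
# decision. The real SVOS RF Adaptive Data Reduction logic is
# proprietary; this only illustrates the idea described in the text.
from dataclasses import dataclass

@dataclass
class WriteStream:
    lba_start: int        # starting logical block address
    length: int           # blocks in this write
    rewrite_rate: float   # fraction of these blocks rewritten recently
    sequential: bool      # detected as part of a sequential stream?

def reduction_mode(w: WriteStream, rewrite_threshold: float = 0.2) -> str:
    """Decide where dedupe/compression runs for a write.

    Sequential streams (migrations, copies) are reduced inline;
    hot, frequently rewritten random blocks are deferred to
    post-process so the controller does not thrash re-reducing them.
    """
    if w.sequential:
        return "inline"
    if w.rewrite_rate > rewrite_threshold:
        return "post-process"
    return "inline"

print(reduction_mode(WriteStream(0, 4096, 0.0, sequential=True)))   # inline
print(reduction_mode(WriteStream(512, 8, 0.6, sequential=False)))   # post-process
```

The design point is that the choice is made per write stream, in real time, rather than applying one reduction mode to the whole array.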

These upgrades to SVOS RF increased our overall performance so that we compared very favorably with the NVMe vendors at that time, even though we were still using SAS-attached SSDs and FMDs. These comparisons are shown in my referenced blog post and in an August 6, 2018, Gartner Critical Capabilities for Solid State Arrays report. We were able to deliver comparable NVMe performance without sacrificing any of the high reliability, scalability, and enterprise services that distinguish our VSP storage from NVMe vendors like Kaminario and Pure Storage. Since the cut-off date for this Gartner report was before our latest announcement of VSP arrays, the comparisons did not take into account our latest hardware upgrades.

Not all applications will need the performance and latency of NVMe SSDs. Since NVMe SSDs will be more expensive than SAS/SATA SSDs, there is an opportunity to save costs by providing a mix of SAS/SATA SSDs while delivering the same performance as competitive NVMe storage vendors.

In my previous blog post, I stated that Hitachi product management leadership had confirmed that our VSP core storage will include NVMe in 2019. Stay tuned and see what we can deliver with NVMe.