Last year Marc Staimer published a blog on how NVMe performance challenges expose CPU chokepoints. In it, Marc pointed out that NVMe systems show diminishing marginal performance gains as hardware is added, because storage software consumes ever more CPU resources. The general industry consensus is that scaling capacity and performance with more NVMe and NVMe-oF just requires more hardware, but the rub is that these systems offer noticeably diminishing returns: the hardware footprint grows much faster than the performance gains.
According to Marc, the root cause of this NVMe performance challenge isn't hardware. It's storage software that wasn't designed for CPU efficiency, especially under the new demands of NVMe. NVMe removes many of the bottlenecks associated with SAS and SATA, but solving one bottleneck puts more pressure on other parts of the workflow, creating new bottlenecks elsewhere. Storage controllers have also taken on many new functions in recent years, such as deduplication, compression, snapshots, clones, replication, tiering, and error detection and correction, which are as important as performance. Many of these features are CPU intensive, and every cycle the storage software consumes is a cycle that isn't available for storage I/O to the high-performance SSDs.
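To make the CPU cost concrete, here is a minimal sketch of why a feature like inline deduplication is CPU intensive: the controller must hash every incoming block before deciding whether to store it. This is an illustrative example only, not Hitachi's implementation; the function name and block layout are hypothetical.

```python
import hashlib

def dedupe(blocks):
    """Inline deduplication sketch: store only blocks whose content
    hash has not been seen before. Hashing every incoming block is
    the CPU-intensive step that competes with storage I/O for cycles."""
    seen = {}      # content hash -> index of the first stored copy
    stored = []    # unique block payloads actually written to media
    refs = []      # per-logical-block pointer into `stored`
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in seen:
            seen[digest] = len(stored)
            stored.append(block)
        refs.append(seen[digest])
    return stored, refs

# Three logical 4 KiB blocks, two of them identical,
# map onto only two physical blocks.
blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096]
stored, refs = dedupe(blocks)
```

Every write path now includes a SHA-256 over the full block, which is exactly the kind of work that steals cycles from I/O unless the software is written for CPU efficiency.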
When disk drives, with their mechanical latencies, were the performance bottleneck, storage software did not need to be efficient. Now solid-state drives and NVMe have pushed the bottleneck to the CPUs. Fixing this requires rearchitecting the storage controller and rewriting the storage software to be more efficient. Hitachi Vantara did exactly this before releasing our NVMe storage systems, rewriting our Storage Virtualization Operating System (SVOS), and the results are most obvious in competitive comparisons with comparable NVMe storage systems.
Here is a hardware comparison of five midrange NVMe storage systems, showing the number of controllers, cores, cache memory, and maximum number of NVMe drives. Also shown are the maximum IOPS, IOPS per core, and minimum latency in microseconds.
The Storage Virtualization Operating System allows the VSP E990 to deliver the highest IOPS in its class with the fewest processor cores and the smallest cache. While competitors have access to similar multicore processors and use the same NVMe low-latency command set and multiple queues per device for fast concurrent I/O, the E990 is differentiated by the efficiency of its SVOS software. With only two controllers, 56 cores, and 1 TiB of cache memory, the VSP E990 has the lowest latency and double the maximum IOPS of the other vendors. Fewer cores and less memory also reduce power, cooling, and cost.
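The IOPS-per-core figure in the comparison is simply throughput normalized by core count, which is a useful way to isolate software efficiency from hardware footprint. The sketch below uses hypothetical numbers for illustration, not the actual figures from the comparison table.

```python
def iops_per_core(max_iops: int, cores: int) -> float:
    """Normalize maximum throughput by core count so systems with
    different hardware footprints can be compared on software efficiency."""
    return max_iops / cores

# Hypothetical illustration only -- not the actual comparison numbers:
# a 56-core system delivering 5.8M IOPS vs. a 96-core system at 3.0M IOPS.
system_a = iops_per_core(5_800_000, 56)   # roughly 103,571 IOPS per core
system_b = iops_per_core(3_000_000, 96)   # 31,250 IOPS per core
```

By this metric, a system with fewer cores can still lead the class if its storage software wastes fewer cycles per I/O.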
In addition to optimizing NVMe performance, SVOS has been optimized for other storage functions, such as in-controller deduplication and compression. This enables us to offer a Total Efficiency Guarantee of up to 7:1 in capacity savings. To meet ever-growing performance and capacity needs, data centers are leveraging flash with data reduction technologies. But many organizations are not seeing the benefits they hoped for, as performance and/or data reduction does not live up to the hype. That’s because general-purpose flash storage systems are not designed to handle data reduction and large I/O workloads concurrently.
If you are unable to store the capacity that we guaranteed, Hitachi will deliver additional storage to fulfill this total efficiency guarantee. This guarantee is part of Hitachi’s Flash Assurance Program, which provides flash investment protection underpinned by an industry-leading 100% data availability guarantee.
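The mechanics of such a capacity-savings guarantee can be sketched as simple arithmetic: effective capacity is raw capacity multiplied by the reduction ratio, and the shortfall is whatever extra raw capacity is needed to reach the guaranteed effective capacity. This is a hypothetical illustration of the idea, not the actual terms or calculation method of Hitachi's program.

```python
def guarantee_shortfall(raw_tib: float, achieved_ratio: float,
                        guaranteed_ratio: float = 7.0) -> float:
    """Hypothetical sketch of a capacity-savings guarantee: if the
    achieved data reduction falls short of the guaranteed ratio, return
    the extra raw capacity (TiB) needed to close the effective-capacity
    gap at the achieved ratio. Returns 0.0 when the guarantee is met."""
    achieved_effective = raw_tib * achieved_ratio
    guaranteed_effective = raw_tib * guaranteed_ratio
    gap = max(0.0, guaranteed_effective - achieved_effective)
    return gap / achieved_ratio

# 100 TiB raw at only 5:1 leaves a 200 TiB effective gap against a 7:1
# guarantee, requiring 40 TiB of additional raw capacity at 5:1.
extra = guarantee_shortfall(100.0, 5.0)
```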
For more information on the NVMe performance advantages of the VSP E990, please see this blog by Colin Gallagher, Vice President, Infrastructure Solutions Product Marketing at Hitachi Vantara.