NVMe (Non-Volatile Memory Express) is a standard designed to fully leverage the performance benefits of non-volatile memory in all types of computing and data storage. NVMe’s key benefits include direct access to the CPU over PCIe, lightweight commands, and highly scalable parallelism, all of which lead to lower application latency and insanely fast IO. There has been a lot of hype in the press about NVMe and how it can put your IT performance into the equivalent of Tesla’s “Ludicrous mode”, but I would like to discuss some “real world” considerations: where NVMe can offer great benefits, and where I would shine a caution light on areas that you might not have considered. NVMe is still in its infancy as far as production deployment goes, and its acceptance and adoption are being driven by a need for speed. In fact, Hitachi Vantara recently introduced NVMe caching in its hyperconverged Unified Compute Platform (UCP). This is a first step toward accelerating workloads by using NVMe in the caching layer.
Parallel vs Serial IO execution - The NVMe Super Highway
So where are the storage bottlenecks, and why can’t SAS and SATA keep up with today’s high-performance flash media? The answer is that both SAS and SATA were designed for rotating media, long before flash was developed. Consequently, these command sets have become the traffic jam on the IO highway. NVMe is a standard based on Peripheral Component Interconnect Express (PCIe), and it is built to take advantage of today’s massively parallel architectures. Think of NVMe as a Tesla Model S capable of achieving 155mph in 29 seconds, stifled by old infrastructure (SAS/SATA) and a 55mph speed limit. All that capability is wasted.
So what is driving the need for this type of high performance in IT modernization? For one, Software-Defined Storage (SDS) is a rapidly growing technology that allows for the policy-based provisioning and management of data storage independent of the underlying physical storage. With datacenter modernization at the core of IT planning these days, new technologies such as SDS are offering tremendous benefits in data consolidation and agility. As far as ROI and economic benefits go, SDS’s ability to be hardware agnostic, scale seamlessly, and deliver simplified management is a total victory for IT. So what is the Achilles heel for SDS and its promise to consume all traditional and modern workloads? Quite frankly, SDS has been limited by the performance constraints of traditional architectures. Consequently, many SDS deployments are limited to applications that can tolerate the latency caused by the aforementioned bottlenecks.
Traditional Workload: High-Performance OLTP and Database Applications
Traditional OLTP and database workloads are the heartbeat of the enterprise. I have witnessed customers’ SDS deployments fail because of latency between storage and the application, even when flash media was used. Sure, the SDS platform, network, and compute were blazing fast, but the weak link was the SAS storage interface. Another problem is that any type of virtualization or abstraction layer used to host the SDS instance on a server is going to consume more performance than running that service on bare metal. In an SDS environment, highly transactional applications will require additional IOPS to keep the latency added by the virtualization layer in check and deliver the best quality of service to the field. At the underlying storage level, traditional SAS and SATA constrain flash performance. The bottom line is that NVMe inherently provides much greater bandwidth than traditional SAS or SATA. In addition, NVMe at the media level supports roughly 64,000 queues, each of which can itself hold up to roughly 64,000 commands, compared to a single queue of about 254 commands for SAS and a single queue of 32 commands for SATA. This type of performance and parallelism can enable high-performance OLTP and get the most out of the latest flash media. So the prospect is that more traditional high-performance OLTP workloads can be migrated to an SDS environment enabled by NVMe.
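To put the queue numbers in perspective, here is a minimal back-of-the-envelope sketch using Little's Law (throughput ceiling = commands in flight / latency per command). It treats the SAS and SATA figures quoted above as the maximum number of commands that can be in flight at once, and the 100 microsecond per-command latency is a hypothetical value chosen only for illustration, not a measurement of any device.

```python
# Little's Law sketch: IOPS ceiling = commands in flight / latency per command.
# LATENCY_US is a hypothetical figure for illustration, not a benchmark result.

def iops_ceiling(outstanding_commands: int, latency_us: int) -> int:
    """Upper bound on IOPS when at most `outstanding_commands` are in flight
    and each command takes `latency_us` microseconds to complete."""
    if outstanding_commands <= 0 or latency_us <= 0:
        raise ValueError("both arguments must be positive")
    # 1,000,000 microseconds per second; integer math avoids float rounding.
    return outstanding_commands * 1_000_000 // latency_us


LATENCY_US = 100  # assumed per-command latency
IN_FLIGHT_LIMITS = {"SATA": 32, "SAS": 254}  # figures quoted in the text

if __name__ == "__main__":
    for name, limit in IN_FLIGHT_LIMITS.items():
        print(f"{name}: at most {iops_ceiling(limit, LATENCY_US):,} IOPS")
```

Under these assumptions the interface itself caps what the flash media can deliver. With NVMe's far larger limit on commands in flight, the ceiling moves away from the interface and onto the media, controllers, and software stack.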
Caveat Emptor – The Data Services Tax
The new world of rack-scale flash, SDS, and hyperconverged infrastructure offerings promises loftier performance levels, but there are speed bumps to be considered. This is especially true when considering the migration of mission-critical OLTP applications to a software-defined environment. The fact of the matter is that data services (compression, encryption, RAID, etc.) and data protection (snapshots, replication, and deduplication) reduce IOPS. So be cautious when considering a vendor’s IOPS specification, because in most cases the numbers are for unprotected and un-reduced data. In fact, data services can impact IOPS and response times to the extent that AFAs with NVMe will not perform much better than SCSI-based AFAs.
The good news is that NVMe performance and parallelism should provide plenty of horsepower (IOPS) to enable you to move high-performance workloads into an SDS environment. The bad news is that you will need your hardware architecture to be more powerful and correctly designed to perform data services and protection faster than ever before (e.g. more IO received per second = more deduplication processes that must occur every second). You also need to consider whether or not your SDS application, compute, network, and storage are designed to take full advantage of NVMe’s parallelism. And remember that a solution is only as fast as its weakest link, which for practical purposes could be your traditional network infrastructure. If you opt for NVMe on the back-end (between storage controllers and media) but do not consider how to implement NVMe on the front-end (between storage and host/application), you may just be pushing your performance bottleneck to another point of IO contention, and you won’t get any significant improvement.
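The two cautions above (the data services tax and the weakest link) can be sketched with a toy model. Everything here is an assumption made for illustration: the overhead fractions, the stage names, and the IOPS figures are hypothetical, and real data services do not necessarily combine multiplicatively. Treat it as a way to frame questions for a proof of concept, not as a sizing tool.

```python
# Toy model of the "data services tax" and the "weakest link" effect.
# All numbers passed in are hypothetical; measure your own in a proof of concept.

def effective_iops(raw_iops: float, service_overheads: dict[str, float]) -> float:
    """Apply each service's fractional IOPS cost to the raw figure.
    Assumes (simplistically) that the overheads compound multiplicatively."""
    remaining = raw_iops
    for name, fraction in service_overheads.items():
        if not 0.0 <= fraction < 1.0:
            raise ValueError(f"overhead for {name} must be in [0, 1)")
        remaining *= 1.0 - fraction
    return remaining


def bottleneck(stage_limits: dict[str, float]) -> tuple[str, float]:
    """Return the slowest stage and its limit: end-to-end throughput
    cannot exceed the limit of the weakest link in the IO path."""
    slowest = min(stage_limits, key=stage_limits.get)
    return slowest, stage_limits[slowest]


if __name__ == "__main__":
    # A vendor's unprotected, un-reduced spec-sheet number (hypothetical).
    raw = 1_000_000
    taxed = effective_iops(raw, {"deduplication": 0.5})
    print(f"IOPS after data services: {taxed:,.0f}")

    # NVMe media behind a slower front-end network (hypothetical limits).
    stage, limit = bottleneck(
        {"media": taxed, "controller": 600_000, "network": 250_000}
    )
    print(f"Bottleneck: {stage} at {limit:,.0f} IOPS")
```

In this made-up example, faster media changes nothing for the application, because the front-end network is the limiting stage.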
Modern Workload: Analytics at the Edge
It seems as though “Analytics” has replaced “Cloud” as the IT modernization initiative du jour. This is not just hype, as the ability to leverage data to understand customers and processes is leading to profitable business outcomes never before possible. I remember the hype just a few years ago surrounding Hadoop and batch analytics in the core data center and in the cloud. It was only a matter of time before we decided that the best place to produce timely results and actions from analytics is at the edge. The ability to deploy powerful compute in small packages makes analytics at the edge (close to where the data is collected) a reality. The fundamental benefit is the network latency saved by having the compute function at the edge. In the analytics architectures of a few years ago, data would travel via a network, or telemetry-to-network, and then to the cloud. That data would be analyzed and the outcome delivered back the same way it arrived. Edge analytics cuts out the trip across the network and saves a significant chunk of time. This is the key to enabling time-sensitive decisions like an autonomous vehicle avoiding a collision in near real-time. Using NVMe/PCIe, data can be sent directly to a processor at the edge to deliver the fastest possible outcomes. NVMe enables processing latency to be reduced to microseconds and possibly nanoseconds. This might make you a little more comfortable about taking your hands off the wheel and letting the autonomous car do the driving…
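The time saved at the edge is simple arithmetic, sketched below. The latency figures are hypothetical placeholders (a 20 ms one-way trip to the cloud, a 10 microsecond local transfer, 500 microseconds of processing); substitute your own measurements.

```python
# Latency budget sketch: cloud round trip vs. processing at the edge.
# All figures are hypothetical and expressed in microseconds.

def time_to_outcome_us(one_way_transfer_us: int, processing_us: int) -> int:
    """Data travels to the compute, is processed, and the outcome
    returns the same way it arrived."""
    return 2 * one_way_transfer_us + processing_us


if __name__ == "__main__":
    cloud = time_to_outcome_us(one_way_transfer_us=20_000, processing_us=500)
    edge = time_to_outcome_us(one_way_transfer_us=10, processing_us=500)
    print(f"cloud: {cloud:,} us, edge: {edge:,} us, saved: {cloud - edge:,} us")
```

With these assumed numbers, nearly all of the time to outcome in the cloud case is spent on the network rather than on analysis, which is the part edge analytics removes.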
The Take Away
My advice to IT consumers is to approach any new technology with an open mind and a teaspoon of doubt. Don’t get caught up in hype and specs. Move at a modernization pace that is comfortable and within the timelines of your organization. Your business outcomes should map to solutions and not the other way around. “When in doubt, proof it out”: make sure your modernization vendor is truly a partner. They should be willing to demonstrate a working proof of concept, especially when it comes to mission-critical application support. Enjoy the new technology; it’s a great time to be in IT!