AI and ML Require Changes in Storage Infrastructure

By Hubert Yoshida posted 01-28-2020 18:29


According to ResearchAndMarkets.com, the global AI-powered storage market is expected to grow from USD 10.4 billion in 2019 to USD 34.5 billion by 2024, a CAGR of 27.1%. Businesses are increasingly using data assets to sharpen their competitiveness and drive greater revenue, and part of this strategy is to use machine learning and AI tools and technologies. The market growth will be driven by new storage requirements that differentiate AI storage from traditional storage infrastructure.
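As a quick sanity check on the quoted figures, the growth rate can be recomputed from the start and end values. This is a minimal sketch; the function name `cagr` is only illustrative, and the five-year span is an assumption taken from the 2019 and 2024 dates in the quote.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.271 means 27.1%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above: USD 10.4 billion (2019) to USD 34.5 billion (2024).
rate = cagr(10.4, 34.5, 2024 - 2019)
print(f"Implied CAGR: {rate:.1%}")
```

The implied rate agrees with the quoted 27.1% to one decimal place.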

The following are some of the storage requirements that differentiate AI storage:

Vast Storage Scalability: AI workloads are very different from generic workloads. AI and machine learning workloads require huge amounts of storage to extract, transform, and load input data; do exploratory data analysis to see what data is relevant; create test data; build and train the models; retrain the models to reflect changing data patterns, inferences, and new discoveries; and keep them running. Data does not lose value, so it must be retained for as long as there is perceived value, and regulatory requirements like GDPR may require retention of data in a form that explains the decisions made by AI. Data protection and data dispersion will also create copies of data. All of this generates orders of magnitude more data than the statistical analysis of a big data repository. Scalability requires not only the ability to scale capacity into the tens of petabytes, but also the ability to scale connectivity to thousands of servers. Scalability also includes the ability to minimize the footprint and cost of storage through intelligent use of deduplication, compression, virtualization, tiering, archiving, indexing, cataloging, and shredding.

The Hitachi VSP 5000 family provides unmatched scalability in performance and connectivity. The VSP 5000 comes in all-flash and hybrid configurations that start at 3.8 TB and can scale to 69 PB of raw capacity. It can also extend capacity by virtualizing other storage systems and connecting to the cloud. Connectivity scales from 32 x 32 Gb/s to 192 x 32 Gb/s FC, 192 x 16 Gb/s FICON, or 96 x 10 Gb/s iSCSI. The VSP 5000 can scale up, scale out, and scale deep through cloud and virtualization of external storage systems, and it comes with a suite of AI tools to minimize the footprint and cost of scalability. The VSP 5000 systems offer up to a 7:1 data reduction ratio with deduplication and compression and can automate tiering between NVMe flash, SAS flash, HDDs, external storage, or cloud.
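To put the quoted maximums in perspective, the arithmetic below multiplies out the port counts and applies the best-case reduction ratio to the raw capacity. It is a rough sketch: the names are illustrative, the bandwidth is nominal line rate rather than measured throughput, and the capacity figure ignores RAID, sparing, and formatting overhead and assumes the "up to" 7:1 ratio is actually achieved.

```python
# Maximum port configurations quoted above: (port count, Gb/s per port).
PORT_CONFIGS = {"FC": (192, 32), "FICON": (192, 16), "iSCSI": (96, 10)}


def aggregate_gbps(ports: int, gbps_per_port: int) -> int:
    """Nominal aggregate line rate in Gb/s (not measured throughput)."""
    return ports * gbps_per_port


def effective_capacity_pb(raw_pb: float, reduction_ratio: float) -> float:
    """Best-case logical capacity; ignores RAID and formatting overhead."""
    return raw_pb * reduction_ratio


for protocol, (ports, speed) in PORT_CONFIGS.items():
    print(f"{protocol}: {aggregate_gbps(ports, speed)} Gb/s nominal")

print(f"Upper bound: {effective_capacity_pb(69, 7):.0f} PB logical")
```

Real workloads will land below these ceilings, since reduction ratios depend heavily on how compressible and repetitive the data is.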

Extreme Storage Performance: AI and ML algorithms are computationally demanding, so the storage infrastructure must be accessible to a large number of compute servers over high-bandwidth, low-latency connections to the data. While the data ingestion and data engineering phases may be sequential, the bulk of the work in training the models and doing the analysis is highly random. AI and ML data is stored with metadata, which the algorithms rapidly sort through to determine which data is relevant to the problem being solved. A lot of the data may be cold until it is suddenly called into action. This requires multiple tiers of storage with different performance and cost, with metadata on the highest-performance tier and less active data on the lower-performance, lower-cost tiers.
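The tiering idea described above can be sketched as a simple placement rule. This is an illustration under stated assumptions, not how any particular array decides placement: the tier names, the `Extent` type, and the access thresholds are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical tiers, ordered fastest (most expensive) to slowest.
TIERS = ["nvme_flash", "sas_flash", "hdd"]


@dataclass
class Extent:
    name: str
    is_metadata: bool
    accesses_last_day: int


def choose_tier(extent: Extent, hot: int = 100, warm: int = 10) -> str:
    """Pin metadata to the fastest tier; place data by recent access count."""
    if extent.is_metadata or extent.accesses_last_day >= hot:
        return TIERS[0]
    if extent.accesses_last_day >= warm:
        return TIERS[1]
    return TIERS[2]


index = Extent("feature-index", is_metadata=True, accesses_last_day=0)
archive = Extent("2018-sensor-logs", is_metadata=False, accesses_last_day=0)
print(choose_tier(index), choose_tier(archive))

# Cold data "suddenly called into action" is promoted on re-evaluation.
archive.accesses_last_day = 500
print(choose_tier(archive))
```

A production system would also weigh migration cost and capacity limits per tier; the sketch only shows the metadata-first, activity-based ordering.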

The VSP 5000 series leads all other storage systems in performance with 21,000,000 IOPS and latencies of less than a millisecond. Latency is the most important performance characteristic of today’s storage systems. NVMe has given a big boost to reducing latency by capitalizing on parallel, low-latency paths to the media. However, NVMe has exposed other bottlenecks in the storage I/O path. The VSP storage controller and operating software were redesigned to optimize the performance of NVMe, and the performance numbers bear out the result. Offloading I/O functions to FPGAs and an internal switch fabric not only improves performance for NVMe-attached storage but also improves performance elsewhere in the storage system. In addition to NVMe, the VSP incorporates intelligent tiering to ensure that the right level of performance is available when it is needed across NVMe flash, SAS flash, HDD, and external storage. The deduplication method uses machine learning models to optimize deduplication block size and uses either in-line or post-process dedupe to get as much deduplication as possible while reducing the performance impact of the dedupe processing.
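To see why deduplication block size is worth optimizing, the sketch below measures the reduction ratio of the same data at several block sizes. It assumes fixed-size blocks fingerprinted with SHA-256, which is a common textbook approach chosen for illustration and not a description of the VSP's method; the sample data is synthetic.

```python
import hashlib


def dedupe_ratio(data: bytes, block_size: int) -> float:
    """Logical bytes divided by bytes kept after fixed-size block dedupe."""
    unique_blocks = {}
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Keep one copy per distinct fingerprint and remember its size.
        unique_blocks.setdefault(hashlib.sha256(block).digest(), len(block))
    return len(data) / sum(unique_blocks.values())


# Synthetic sample: one 4 KiB pattern repeated 64 times, then one odd block.
sample = (b"A" * 4096) * 64 + b"B" * 4096

for size in (4096, 8192, 16384):
    print(f"{size:>6} B blocks -> {dedupe_ratio(sample, size):.2f}:1")
```

On this sample, smaller blocks yield a higher ratio, but they also mean more fingerprints to compute and track, which is the trade-off a block-size optimizer has to balance.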

Intelligent Storage Systems: The requirements that AI and ML place on enterprise storage infrastructure have become more demanding. It is no longer enough to deploy capacity and install faster drives, or to buy an all-flash array and assume that flash solves all your problems. An AI or ML application is not a single type of processing, like transaction processing in a DBMS application. There are many phases with different processing requirements, from data ingestion, data engineering, data discovery and visualization, and model development and training to model deployment and retention, all with various performance, tiering, and protocol requirements. Intelligent storage systems that use AI and ML are needed to continuously learn and adapt to their infrastructure environment to better manage and serve data.

In addition to ensuring that service level agreements (SLAs) for uptime and data protection are met, Hitachi Vantara provides intelligent storage with the real-time monitoring needed to ensure systems and software are optimized individually and as a holistic unit, including application, hypervisor, server, network, and storage. AI and ML are used to automatically adjust for peak performance and uptime as well as maximum return on investment (ROI). With AI, leaders see the opportunity to focus more time on strategic initiatives while the data center monitors itself. In effect, this allows an organization to begin implementing an autonomous data center that intelligently predicts and prescribes changes that increase uptime and operational efficiency for AI and ML workloads while meeting SLAs. Designing AI and ML capabilities is a major focus for Hitachi engineering teams. Across every area of the company, development teams are researching how to develop AI to deliver smarter, broader insights. In the data center, Hitachi’s AI and ML efforts are currently focused on Hitachi Ops Center Automator and Hitachi Ops Center Advisor. These offerings are designed to predict, prescribe, or execute actions based on two core approaches to machine learning: decision tree learning and association rule learning.
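For readers unfamiliar with association rule learning, the sketch below computes the two standard metrics, support and confidence, for one candidate rule. The incident data and event names are invented for illustration, and nothing here reflects how the Ops Center products are implemented.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    total = len(transactions)
    with_antecedent = sum(1 for t in transactions if antecedent <= t)
    with_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = with_both / total if total else 0.0
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence


# Hypothetical incident records: the set of events seen in each incident.
incidents = [
    {"high_latency", "cache_saturation", "port_congestion"},
    {"high_latency", "cache_saturation"},
    {"high_latency", "port_congestion"},
    {"cache_saturation"},
    {"high_latency", "cache_saturation", "drive_rebuild"},
]

support, confidence = rule_metrics(
    incidents, {"cache_saturation"}, {"high_latency"}
)
print(f"support={support:.2f} confidence={confidence:.2f}")
```

A rule with high support and confidence suggests that one observed condition tends to accompany another, which is the kind of pattern a monitoring tool could use to predict a problem or recommend an action.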

For more information on the VSP 5000 series, see the press release and the following link
