
Next Generation Demands for the Global StorageSphere

By Hubert Yoshida posted 09-09-2020 22:26

  
Despite estimates from economic research centers such as the Mercatus Center at George Mason University of a 25% decline in U.S. GDP for 2020, International Data Corporation (IDC) expects the worldwide installed base of storage capacity to continue to expand. Even with the dramatic decline in worldwide IT spending caused by the COVID-19 pandemic, IDC's May 2020 market forecast on the Worldwide Global StorageSphere predicts continued growth: the installed base of storage capacity worldwide will grow by 6.8 zettabytes this year, an increase of 16.6% over 2019, and over the 2019-2024 forecast period IDC expects the installed base to achieve a compound annual growth rate (CAGR) of 17.8%.


The Mercatus study acknowledges that we are still in the midst of the COVID-19 crisis and that it will be a while before traditional economic indicators can gauge the true magnitude of the economic slowdown and the impact of mitigating measures like a partial economic shutdown and social distancing. The growth of data, however, and the need to store that data, seem to continue unabated.

The impact varies by industry and country. Industries and economies will remain in business, and remain productive, roughly in proportion to their degree of digitalization, because at least that portion of their workforce can continue working from home and deliver services that do not depend as heavily on in-person interactions.

There are many indicators that the pandemic has accelerated the move to digitalization as more of us work remotely and look for new ways to connect with customers and with technology-oriented services. Digitalization companies have fueled unprecedented growth in the stock market, and digitalization is also fueling the need to store and analyze more data.

Digitalization is about extracting the value of data at scale and transforming it into business value. While this drives the need for more data capacity, it is really driving the need for new solutions that deliver intelligent and dynamic data services to power users' journeys to digital transformation for years to come. The demand for scale and performance has never been more intense. Organizations are witnessing a transition in both technology and business models that gives IT organizations tremendous agility in how they will operate in the future. Businesses have more flexibility in how they purchase and deploy software, including numerous public cloud sources offering subscriptions for elastic compute and storage. With the advent of IoT and sophisticated business analytics, organizations are prioritizing scale, performance, hybrid cloud capabilities, and investment protection. Traditional block and file storage will not be able to meet these new demands for data. The only way to meet the demands of exabyte- and zettabyte-scale data is through object storage, and even object storage will need to step up its game to the next generation.

Hitachi Content Platform for Cloud Scale (HCP for Cloud Scale) is a next-generation object store that builds on insights gained from Hitachi Content Platform (HCP). Launched more than 15 years ago, HCP has established itself as a trusted S3-compatible object store and experienced tremendous market success; analysts continue to rank it a leader in head-to-head product evaluations. HCP for Cloud Scale is shaped by our experience as well as customer feedback from thousands of enterprise professionals who manage their companies' data storage needs.


Next-generation object storage must integrate new technologies that provide S3, Hybrid Workflows, Containers and Microservices, Advanced Metadata Management, and Storage-Side Compute.

S3: The acceptance of public cloud, and specifically the success of Amazon's AWS S3 API, has made S3 the de facto standard for I/O to object storage, with adoption by virtually all object storage vendors. While HCP was an early adopter of S3, HCP for Cloud Scale provides the scale and features to build a data lake architecture around an S3-compatible object store to augment or replace HDFS. Its architecture scales in the two dimensions necessary for success: (1) scale-out capacity, to accommodate petabytes of historical data that can now grow independently of compute, and (2) scale-out performance, to accommodate hundreds of compute servers that need the object store to deliver data fast enough to avoid bottlenecks.
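Because HCP for Cloud Scale presents an S3-compatible API, standard S3 SDKs can typically be pointed at it. The following is a minimal sketch of that data lake access pattern using Python and boto3; the endpoint URL, bucket name, object keys, and credentials are hypothetical placeholders, not product defaults.

```python
# Minimal sketch: pointing a standard S3 SDK (boto3) at an S3-compatible
# object store. The endpoint URL, bucket name, and credentials below are
# hypothetical placeholders, not product defaults.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://hcp-cloud-scale.example.com",  # hypothetical endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Land raw historical data in a "data lake" bucket, independent of compute.
s3.put_object(
    Bucket="datalake-raw",
    Key="events/2020/09/09/clickstream-0001.json",
    Body=b'{"user": "u123", "action": "view", "ts": "2020-09-09T22:26:00Z"}',
)

# Analytics workers on separate compute nodes read the same namespace in parallel.
for page in s3.get_paginator("list_objects_v2").paginate(
    Bucket="datalake-raw", Prefix="events/2020/09/"
):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])
```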

Hybrid Workflows: A hybrid workflow is more than just simple data tiering; it is about leveraging the abundance of cloud-based anything-as-a-service (XaaS) to achieve desired outcomes at the best cost. Instead of purchasing a software title and using on-premises infrastructure to perform a particular operation, the service may be "leased" in a public cloud as needed. Furthermore, instead of leaving data in public cloud storage, which results in ever-growing monthly storage expenses, a hybrid workflow intelligently transfers only the necessary data between on-premises and public cloud storage and then deletes the unnecessary data to minimize storage and egress costs. HCP for Cloud Scale offers a variety of built-in features that help users take advantage of the abundance of on-demand compute and storage service options that are available, such as AWS Lambda (Functions-aaS), Amazon Transcribe (Transcription-aaS), Amazon Translate (Translation-aaS) and Amazon Elastic Transcoder (Transcoding-aaS). HCP for Cloud Scale federates any S3-compatible bucket, whether an on-premises storage target or an AWS S3 bucket in the cloud, and uses it as backend storage. This capability is more powerful than tiering and provides the flexibility to choose public cloud storage for initial data placement.
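As an illustration of the hybrid pattern described above, the sketch below stages a single object in a public cloud bucket, leases Amazon Transcribe to do the heavy lifting, and then deletes the staged copy so no recurring cloud storage charges accumulate. It is a simplified example using boto3 against AWS directly; the bucket names, file, and job name are hypothetical, and error handling is omitted.

```python
# Minimal hybrid-workflow sketch: stage one object in a public cloud bucket,
# lease a cloud service (Amazon Transcribe) for the heavy lifting, then delete
# the staged copy to avoid recurring storage costs.
# Bucket names, keys, and the job name are hypothetical placeholders.
import time
import boto3

s3 = boto3.client("s3")
transcribe = boto3.client("transcribe")

# 1. Transfer only the data the cloud service needs.
s3.upload_file("interview.wav", "staging-bucket", "audio/interview.wav")

# 2. "Lease" transcription as a service instead of running it on-premises.
transcribe.start_transcription_job(
    TranscriptionJobName="interview-2020-09-09",
    Media={"MediaFileUri": "s3://staging-bucket/audio/interview.wav"},
    MediaFormat="wav",
    LanguageCode="en-US",
    OutputBucketName="results-bucket",
)

# 3. Wait for completion, then remove the staged source object.
while True:
    job = transcribe.get_transcription_job(TranscriptionJobName="interview-2020-09-09")
    if job["TranscriptionJob"]["TranscriptionJobStatus"] in ("COMPLETED", "FAILED"):
        break
    time.sleep(30)

s3.delete_object(Bucket="staging-bucket", Key="audio/interview.wav")
```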

Containers and Microservices: The latest variable in the software-defined storage conversation is the emergence of container orchestration services as an alternative to whole-machine reservations. Like their hypervisor cousins, container technologies abstract hardware but consume far less overhead to operate. Instead of requiring an entire operating system (~2 GB) for each virtual machine, containers offer a thin runtime environment (~20 MB). HCP for Cloud Scale combines container technology with a carefully constructed design across the system, including the use of optimized data structures, algorithms, and messaging protocols. Virtually all services follow a publish/subscribe architecture to facilitate loose coupling between container services. When one container service has a task to do, it can publish the work on a queue. Scaling is achieved by spawning multiple receiver services that compete for the work, dividing it and linearly increasing the rate at which a task backlog is executed. HCP for Cloud Scale incorporates micro-virtualization, where literally hundreds of containers may cohabit an operating system and share host resources. All interactions with the operating system kernel are read-only, and all storage mountpoints are 100% private to each container. The software can be deployed on virtually any enterprise-class Linux distribution.
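The competing-consumers behavior described above can be illustrated with a small, self-contained sketch: one publisher places a backlog of tasks on a queue, and several receivers compete to drain it, so adding receivers increases throughput. This is a conceptual illustration of the pattern only, not HCP for Cloud Scale's internal messaging implementation.

```python
# Conceptual sketch of the publish/subscribe, competing-consumers pattern:
# one service publishes work to a queue, and multiple receiver instances
# compete to drain the backlog. Illustrative only; not HCP internals.
import queue
import threading

work_queue: "queue.Queue[str]" = queue.Queue()

def receiver(worker_id: int) -> None:
    # Each receiver competes for the next task; adding receivers scales throughput.
    while True:
        try:
            task = work_queue.get(timeout=1)
        except queue.Empty:
            return
        print(f"worker {worker_id} processed {task}")
        work_queue.task_done()

# Publisher: a service posts its task backlog to the queue.
for i in range(100):
    work_queue.put(f"object-chunk-{i}")

# Spawn several receiver services to divide the backlog.
workers = [threading.Thread(target=receiver, args=(n,)) for n in range(4)]
for w in workers:
    w.start()
work_queue.join()
```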

Advanced Metadata Management: Next-generation object stores must provide integrated index and search technologies that enable fast isolation of object sets aligned to a policy, access protection beyond simple ACLs, and easier ways for different users to view the same data. This should be possible whether the applications and storage are located entirely on-premises, in the cloud, or in a hybrid deployment connecting multiple clouds. HCP for Cloud Scale provides built-in data management to govern data. Given policy guidance, it can make decisions and take actions to manage that data directly, without the need for an external data management application that might complicate and slow data access for the sake of data governance.
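One concrete way applications attach searchable, policy-relevant metadata through the S3 interface is object tagging. The sketch below shows the standard S3 tagging calls via boto3; the endpoint, bucket, key, and tag vocabulary are hypothetical, and whether a given S3-compatible store exposes these particular metadata features depends on its compatibility surface.

```python
# Minimal sketch: attaching metadata tags through the S3 API so a policy
# engine or search service can later isolate matching object sets.
# Endpoint, bucket, key, and tag names are hypothetical placeholders.
import boto3

s3 = boto3.client("s3", endpoint_url="https://hcp-cloud-scale.example.com")

# Annotate an object with governance-relevant metadata.
s3.put_object_tagging(
    Bucket="records",
    Key="claims/2020/claim-8841.pdf",
    Tagging={
        "TagSet": [
            {"Key": "classification", "Value": "pii"},
            {"Key": "retention", "Value": "7y"},
            {"Key": "department", "Value": "claims"},
        ]
    },
)

# A policy job can later read the tags and decide what action to take.
tags = s3.get_object_tagging(Bucket="records", Key="claims/2020/claim-8841.pdf")
print({t["Key"]: t["Value"] for t in tags["TagSet"]})
```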

Storage-Side Compute: Modern use cases demand elastic compute capabilities. Consider the compute load triggered by the S3 Select API, which can scan semi-structured objects and return only the sections matching SQL search criteria. When deployed on-premises, HCP for Cloud Scale taps into elastic compute resources using scalable microservices to process the data. It can also access off-premises compute resources using the queuing notification built into the product. In both cases the processing resources are invoked only when needed, making the infrastructure available to other activities during downtime. Organizations can routinely set up workflows to mold raw data to match a schema using simple information like date formats, currency, or variable castings. Other curation activities include transforming (anonymizing, cleansing, or curating data) and classifying (annotating data with metadata, organizational relationships, or customer or application identification). All such work requires a strong storage-side computing resource.
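The S3 Select call mentioned above looks roughly like the following boto3 sketch, which asks the storage side to scan a CSV object and stream back only the rows matching a SQL predicate. The endpoint, bucket, key, and column names are hypothetical placeholders.

```python
# Minimal S3 Select sketch: the storage side filters a semi-structured (CSV)
# object and returns only rows matching a SQL predicate, so only the results
# cross the network. Endpoint, bucket, key, and columns are hypothetical.
import boto3

s3 = boto3.client("s3", endpoint_url="https://hcp-cloud-scale.example.com")

response = s3.select_object_content(
    Bucket="datalake-raw",
    Key="sales/2020/q3.csv",
    ExpressionType="SQL",
    Expression=(
        "SELECT s.region, s.amount FROM S3Object s "
        "WHERE CAST(s.amount AS FLOAT) > 10000"
    ),
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
    OutputSerialization={"CSV": {}},
)

# The response is an event stream; Records events carry the matching rows.
for event in response["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode("utf-8"), end="")
```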

Summary

HCP for Cloud Scale is a software-defined, S3-compatible, next-generation object storage solution with built-in data security and protection, advanced metadata-based intelligence, and modern hybrid cloud workflow capabilities. It has been designed from the ground up to address the growing requirements and broader benefits of software-defined storage. Its innovative microservices design enables high performance and massive scalability to support hundreds of nodes and billions of objects, and it eliminates the classic database and network bottlenecks that plague distributed system designs. With its hardware-agnostic architecture, HCP for Cloud Scale can be deployed with ease on servers and cloud platforms, including Hitachi Unified Compute Platform configurations and generic "white box" servers capable of running Docker and a compatible Linux distribution.


#Hu'sPlace
#Blog