Hu Yoshida

Continuous Cloud Infrastructure: Taking Storage Virtualization to the Next Dimension

Blog post by Hu Yoshida, Apr 22, 2014

 


Today, Hitachi Data Systems announces the next step in storage virtualization, taking it to the next dimension to provide a continuous cloud infrastructure.

When Hitachi announced Virtual Storage Platform (VSP) with its modular architecture, we marketed it as 3D storage with the ability to scale up with the addition of cache modules and Intel processors, scale out with front and back end director boards, and scale deep through virtualization of external storage.

Last week I blogged about the progression of Hitachi’s storage virtualization from internal infrastructure virtualization of cache partitions and host port connectivity, to virtualization of external storage arrays, to the virtualization of LUNs or volumes. What is unique about Hitachi storage virtualization is that all of the virtualization is done with software within the VSP and HUS VM controllers, which enables these dimensions of virtualization (internal, external, and volume) to leverage each other for agility, automation and availability. Unlike other approaches to storage virtualization that use software in an external appliance, the software running within the internal controllers does not add another level of management complexity and can be managed through the same management interface that manages the storage array. The purpose of that blog post was to pose the question: what is the next step in storage virtualization?

The next step in virtualization must go to another dimension, and that dimension is to extend virtualization across storage controllers and across systems. We need to take what was essentially vertical virtualization, from the storage controller down through the external storage that it virtualized, to horizontal virtualization across multiple systems. This concept is very much like server virtualization: hypervisors like VMware and Hyper-V can create virtual machines and move them across multiple physical servers for load balancing and availability.

Horizontal storage virtualization enables us to provide more than just load balancing and availability. It enables the creation of a global active device (or global virtual volume) that spans multiple controllers, so that an application can continue to access the volume if one controller should fail. This gives us true active/active availability with zero recovery point objective (RPO) and nearly zero recovery time objective (RTO). An entire system can be replaced with no outages, which also enables the holy grail of non-disruptive migration to next-generation technology. This provides the base for a continuous cloud infrastructure for high-performance, scale-up business requirements.

Up to now, business continuity has depended on synchronous and asynchronous replication, where data is replicated from a primary system to a secondary system located some geographic distance away for recovery purposes. Synchronous replication requires each write to be acknowledged by the secondary before the next write is sent, and the added round-trip latency limits the distance between sites, typically to metro distances on the order of 100 kilometers. Asynchronous replication allows data to be replicated over any distance, with a bookkeeping method to keep the data in sequence and to determine which data packets were lost in transit between the storage controllers when a failure occurs at the primary site.
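To make the contrast concrete, here is a minimal Python sketch of the two write paths. The class and method names are purely illustrative assumptions, not part of any Hitachi API: the synchronous path waits for the remote copy before acknowledging the host, while the asynchronous path acknowledges immediately and journals writes with sequence numbers for ordered delivery and gap detection.

```python
# Illustrative sketch only: hypothetical classes, not a real Hitachi API.
import queue


class Volume:
    """Trivial stand-in for a storage volume."""
    def __init__(self):
        self.blocks = {}

    def write(self, block, data):
        self.blocks[block] = data


class SyncReplicator:
    """Synchronous: the remote copy must complete before the host gets
    write-complete, so round-trip latency limits the distance between sites."""
    def __init__(self, local, remote):
        self.local, self.remote = local, remote

    def write(self, block, data):
        self.local.write(block, data)
        self.remote.write(block, data)   # wait for the remote copy...
        return "write-complete"          # ...then acknowledge the host


class AsyncReplicator:
    """Asynchronous: the host is acknowledged immediately; writes are
    journaled with sequence numbers so the secondary can apply them in
    order and detect what was lost if the primary fails."""
    def __init__(self, local, remote):
        self.local, self.remote = local, remote
        self.seq = 0
        self.journal = queue.Queue()     # bookkeeping of in-flight writes

    def write(self, block, data):
        self.local.write(block, data)
        self.seq += 1
        self.journal.put((self.seq, block, data))
        return "write-complete"          # acknowledged before replication

    def drain(self):
        """Background task: ship journaled writes to the remote site in order."""
        while not self.journal.empty():
            _, block, data = self.journal.get()
            self.remote.write(block, data)
```

The drain step is where the transmission delay lives: anything still sitting in the journal when the primary fails is the data that recovery has to account for.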

So how does the continuous availability of a global active device that spans multiple systems differ from synchronous and asynchronous replication? The virtualization of a volume across storage systems is similar to synchronous replication in terms of distance, since a write to this global virtual volume goes to both controllers in synchronous mode. The difference is that with synchronous replication the application only talks to the primary system’s storage controller, and the primary controller does the synchronous replication to the secondary system’s controller. When a failure occurs at the primary site, the application must be restarted at the secondary site and transaction consistency must be re-established. While the data at the secondary site is time-consistent with the primary site at the moment of the failure, the I/Os associated with a transaction may not have completed. The application must roll back the logs and back out the transactions whose I/Os have not completed before rolling the logs forward with consistent transactions to resume processing. This means that the recovery point and recovery time objectives will not be zero.
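As a rough illustration of why those objectives are non-zero, the sketch below shows the restart sequence at the secondary site: back out updates from transactions that never committed, then roll the committed ones forward before resuming. The names are hypothetical (it reuses the toy Volume from the previous sketch) and this is not an actual database or Hitachi recovery utility.

```python
# Hypothetical recovery sketch: log rollback and roll-forward only, reusing the
# toy Volume class above. A real log would restore before-images on rollback.
from collections import namedtuple

LogRecord = namedtuple("LogRecord", "txn kind block data")


def restart_at_secondary(log_records, volume):
    """Restart at the secondary site after a primary failure: back out updates
    from transactions that never committed, then roll the committed ones
    forward. The backed-out work is why RPO and RTO are greater than zero."""
    committed = {r.txn for r in log_records if r.kind == "commit"}

    # Roll back: discard updates that belong to incomplete transactions.
    for record in reversed(log_records):
        if record.kind == "update" and record.txn not in committed:
            volume.blocks.pop(record.block, None)

    # Roll forward: reapply updates from transactions known to be consistent.
    for record in log_records:
        if record.kind == "update" and record.txn in committed:
            volume.blocks[record.block] = record.data
```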

Global volume virtualization across storage systems maintains transaction consistency and masks the failure of a controller. Whichever controller receives a write request will ensure that the other controller receives the write in its cache before responding with a write complete. Since both storage systems are active on the same global virtual volume, when one of the storage systems goes down the application can continue to run on the remaining storage system with no downtime or loss of any I/Os. The only RTO delay may be an alternate-path retry when a storage system fails. When the failed system comes back online, it can be added back to the virtual volume without any disruption.

While global virtualization of a volume provides continuous availability in case of a storage system failure, it does not protect against the corruption or deletion of data by a user, or the outage of a server or an entire data center. Full protection and recovery still requires enterprise-level snapshots, clones, and synchronous and asynchronous replication in addition to global volume virtualization.

While global volume virtualization may be done with an appliance that sits in the network between the application and the storage systems, this adds a level of complexity and does not provide the integration with enterprise data protection features that is provided by the Hitachi VSP G1000 and Storage Virtualization Operating System (SVOS). Global volume virtualization across storage systems separated by distance provides the next dimension in storage virtualization and contributes to a continuous cloud infrastructure.
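The write path described above can be sketched as follows. Again, the class names are assumptions and this is not SVOS internals, just the shape of the behavior: whichever controller receives the write mirrors it into the peer’s cache before acknowledging, and the host’s multipath retry on the surviving controller is the only RTO delay the application sees.

```python
# Hypothetical sketch of an active/active global volume write path.
class Controller:
    def __init__(self, name):
        self.name, self.cache, self.alive = name, {}, True

    def cache_write(self, block, data):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.cache[block] = data


class GlobalActiveVolume:
    """Both controllers accept writes to the same virtual volume; a write is
    acknowledged only after it is in both caches, so either system can fail
    without losing acknowledged I/Os."""
    def __init__(self, ctrl_a, ctrl_b):
        self.controllers = [ctrl_a, ctrl_b]

    def write(self, receiving, block, data):
        receiving.cache_write(block, data)
        peer = next(c for c in self.controllers if c is not receiving)
        if peer.alive:
            peer.cache_write(block, data)    # mirror before write-complete
        return "write-complete"


def host_write(volume, paths, block, data):
    # Host multipath behavior: retry on the alternate path if one fails,
    # which is the only RTO delay the application sees.
    for controller in paths:
        try:
            return volume.write(controller, block, data)
        except ConnectionError:
            continue                         # alternate-path retry
    raise IOError("no path to the global virtual volume")


if __name__ == "__main__":
    a, b = Controller("ctrl-A"), Controller("ctrl-B")
    gad = GlobalActiveVolume(a, b)
    a.alive = False                          # simulate a controller failure
    print(host_write(gad, [a, b], block=7, data=b"payload"))  # succeeds via ctrl-B
```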

Today’s announcement of a Continuous Cloud infrastructure consists of four parts:

  • VSP G1000 – The hardware is a unified storage platform that can provide approximately 4 million random read IOPS for 8 KB blocks or more than 1.2 million NFS operations per second for file. This is about 4 times the IOPS performance of the VSP while improving power efficiency by more than 10%.
  • SVOS – Storage Virtualization Operating System is the foundation for VSP G1000 storage virtualization and the software that extends virtualization beyond a single controller to enable global virtualization across storage control units for zero-RPO recovery and non-disruptive migration. SVOS will be the operating system for core Hitachi infrastructure platforms in the future.
  • Command Suite v8 – Integrated management for server, file, and block infrastructure that includes new SVOS capabilities and will include workflows such as nondisruptive migration, supporting the HDS services-delivered capability available today. It also adds improved application awareness and provisioning templates, as well as a simplified and streamlined GUI and common REST API interfaces.
  • Unified Compute Platform and UCP Director 3.5 – UCP Director adds support for SVOS and the VSP G1000, with enhanced server profile provisioning and disaster recovery support. The UCP for VMware with Cisco UCS will also support SVOS and the VSP G1000.

I will be going into more detail about this announcement in the next few weeks. In the meantime, here are some links to more information.

See the HDS NewsHub for updates
