Introduction to Virtual Storage Platform 5000 Architecture

By Charles Lofton posted 11-09-2020 23:12

  

INTRODUCTION

The VSP 5000 is Hitachi’s newest enterprise platform, featuring industry-leading performance and availability. The VSP 5000 scales up, scales out, and scales deep. It features a single, flash-optimized SVOS image operating on as many as 240 processor cores, sharing a global cache of up to 6 TiB. VSP 5000 controller blades are linked together by a highly reliable PCIe switched network, featuring Hitachi Accelerated Fabric (HAF) interconnect engines with hardware-assisted direct memory access (DMA). The VSP 5000 cache architecture has been streamlined, permitting read response times as low as 70 microseconds. Improvements in reliability and serviceability allow the VSP 5000 to claim an industry-leading 99.999999% availability. In this blog post, we’ll take a brief look at the highlights of the VSP 5000 architecture.
 

SCALE UP, SCALE OUT, SCALE DEEP

The entry-level VSP 5100 offers up to 4.2 million IOPS from two controllers (40 cores), with 99.9999% availability. The VSP 5100 can be non-disruptively scaled up to the VSP 5500-2N (four controllers, 80 cores), which offers industry-leading 99.999999% availability. The VSP 5500-2N scales out to the VSP 5500-4N and VSP 5500-6N, as summarized in Figure 1. The VSP 5500-6N is capable of up to 21 million IOPS. Finally, all VSP 5000 models can scale deep by virtualizing external storage.

Figure 1. The VSP 5000 series offers flexible configuration options with industry-leading performance and availability.
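
To put the two availability figures in perspective, the short sketch below converts an availability percentage into the expected unavailable time per year. It is plain arithmetic on the percentages quoted above; the function name and the 365-day year are assumptions made for illustration.

```python
# Assumption: a 365-day year (leap years ignored).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Expected unavailable time per year implied by an availability percentage."""
    return (1.0 - availability_percent / 100.0) * SECONDS_PER_YEAR


# Availability figures quoted in the text for the two model families.
for model, availability in [("VSP 5100", 99.9999), ("VSP 5500", 99.999999)]:
    seconds = downtime_seconds_per_year(availability)
    print(f"{model}: {availability}% -> {seconds:.2f} s/year")
```

Under these assumptions, 99.9999% corresponds to roughly half a minute of downtime per year, and 99.999999% to roughly a third of a second.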

SINGLE SVOS IMAGE, GLOBAL CACHE

Like previous Hitachi enterprise products, all VSP 5000 processors run a single Storage Virtualization Operating System (SVOS) image, and share a global cache. Dedicated cache boards have been eliminated, with cache now distributed across individual controllers for fast memory access and better resiliency. Yet cache remains a global resource, accessible by each controller over Hitachi Accelerated Fabric via hardware-assisted DMA. The DMA hardware assist is done in FPGAs (Field Programmable Gate Arrays) in each controller’s HAF interface. The new DMA implementation reduces CPU overhead and improves performance for inter-controller transfers.  Figure 2 shows how the VSP 5000 components (a four-node system in this example) are connected to each other. Each board (CHB, DKB, HAF) is connected to a controller via 8 x PCIe Gen 3 lanes, and thus has 16 GB/s of available bandwidth (8 GB/s send and 8 GB/s receive). Each controller therefore has up to 64 GB/s of front end bandwidth (provided by four CHBs), 32 GB/s of back end bandwidth (two DKBs) and 32 GB/s of interconnect bandwidth (two HAF engines). 

Figure 2. A four-node, eight-CTL example of high-level VSP 5000 architecture.
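
The per-controller bandwidth figures above follow directly from the per-board number. The sketch below reproduces that arithmetic; the 8 GB/s per direction and the board counts come from the text, while the variable and function names are only illustrative.

```python
# From the text: 8 x PCIe Gen 3 lanes give about 8 GB/s in each direction.
GB_PER_S_PER_DIRECTION = 8
DIRECTIONS = 2  # send and receive

# Bandwidth available to a single board (CHB, DKB, or HAF engine).
BOARD_BANDWIDTH_GB_S = GB_PER_S_PER_DIRECTION * DIRECTIONS  # 16 GB/s

# Maximum board counts per controller, as stated in the text.
BOARDS_PER_CONTROLLER = {
    "front end (CHB)": 4,
    "back end (DKB)": 2,
    "interconnect (HAF)": 2,
}


def controller_bandwidth(boards_per_controller: dict) -> dict:
    """Aggregate bandwidth per controller for each board type, in GB/s."""
    return {
        name: count * BOARD_BANDWIDTH_GB_S
        for name, count in boards_per_controller.items()
    }


for name, gb_s in controller_bandwidth(BOARDS_PER_CONTROLLER).items():
    print(f"{name}: {gb_s} GB/s")
```

These are theoretical link bandwidths, not measured throughput.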

HARDWARE COMPONENTS

Figure 2 shows all of the key VSP 5000 components, including the HAF interconnect engines and PCIe switches, the channel boards (CHBs), the controllers (CTLs), the disk adapters (DKBs), and drive boxes (DBs). The VSP 5000 controller board is powered by two 10-core Intel CPUs operating at 2.2 GHz. The two CPUs function as a single 20-core multi-processing unit (MPU) per controller. Each controller has eight DIMM slots into which 32 GB or 64 GB DDR4 (2133 MHz) DIMMs may be installed, for a maximum of 512 GB cache per controller. Up to four 4-port 8/16/32 Gb Fibre Channel CHBs may be installed per controller. iSCSI and FICON CHBs are also available. Two DKBs per controller must be installed on systems having internal drives. Each SAS DKB has 2 ports that connect to drive boxes over 4 x 12 Gbps links per port.
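
The system-wide maximums quoted in the introduction (240 cores, 6 TiB of global cache) can be cross-checked against these per-controller figures. The sketch below does that arithmetic. It assumes that DIMM capacities are binary units (so 512 GB per controller is treated as 512 GiB) and that a six-node system has twelve controllers, extrapolating from the four-node, eight-controller example in Figure 2; the function name is hypothetical.

```python
CORES_PER_CONTROLLER = 20    # two 10-core CPUs, from the text
DIMM_SLOTS_PER_CONTROLLER = 8
MAX_DIMM_GIB = 64            # assumption: DIMM "GB" is read as GiB


def max_cores_and_cache(controllers: int) -> tuple:
    """Return (total cores, maximum global cache in TiB) for a controller count."""
    cores = controllers * CORES_PER_CONTROLLER
    cache_gib = controllers * DIMM_SLOTS_PER_CONTROLLER * MAX_DIMM_GIB
    return cores, cache_gib / 1024


# Assumption: two controllers per node, so a six-node system has twelve.
for controllers in (4, 8, 12):
    cores, cache_tib = max_cores_and_cache(controllers)
    print(f"{controllers} controllers: {cores} cores, up to {cache_tib} TiB cache")
```

With twelve controllers this gives 240 cores and 6 TiB, matching the figures in the introduction.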


FRONT END CONFIGURATION

The default mode for VSP 5000 FC ports is target-only. Target-only mode supports a command queue depth of 2,048 per port, for compatibility with VSP G1500. VSP 5000 also offers an optional bi-directional port mode, under which a port can simultaneously function as a target and initiator, with each function having a queue depth of 1,024 (see Figure 3). The highest-performing VSP 5000 front end configuration would use “100% straight” access, in which LUNs are always accessed on a CHB port connected to the controller that owns the LUN. Addressing a LUN on the non-owning controller (known as “front end cross” I/O) incurs a small additional overhead on each command. However, our testing shows that front end cross I/O does not have a significant performance impact under normal operating conditions (up to about 70% controller busy). Configuring for round-robin multi-path I/O is recommended for most cases because it results in a good balance between performance (on average 50% front end straight I/O) and ease of use for configuration and load balancing (not having to align a primary path to the owning controller for each LUN).

Figure 3. VSP 5000 Bi-Directional Port Functionality.
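
The 50% straight figure for round-robin multi-pathing can be illustrated with a toy model. The sketch below assumes a host with paths split evenly between two controllers, one of which owns the LUN; the controller names, path list, and function are hypothetical, and a real multi-path driver is considerably more involved.

```python
from itertools import cycle, islice


def straight_fraction(path_controllers, owning_controller, io_count):
    """Fraction of I/Os that arrive on the owning controller under round-robin.

    path_controllers lists, for each host path, the controller it lands on.
    """
    issued = islice(cycle(path_controllers), io_count)
    straight = sum(1 for controller in issued if controller == owning_controller)
    return straight / io_count


# Hypothetical layout: four paths, split evenly across two controllers.
paths = ["CTL0", "CTL1", "CTL0", "CTL1"]
print(straight_fraction(paths, owning_controller="CTL0", io_count=1000))

# A "100% straight" layout: every path lands on the owning controller.
print(straight_fraction(["CTL0", "CTL0"], owning_controller="CTL0", io_count=1000))
```

In this model the straight fraction is simply the share of paths that terminate on the owning controller, which is why an even split across two controllers averages out to 50%.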

BACK END CONFIGURATION

For optimal performance, the first level of drive boxes connected to the DKBs must include four SAS expanders. As shown in Figure 4, having four SAS expanders allows any type of drive box to be installed in the second and subsequent levels. With the configuration shown, any of the four VSP 5000 controllers in the controller box (CBX) pair can access any drive in the CBX pair without performance degradation due to inter-CTL overhead.

Figure 4. Either an SBX (4 x DBS2) or an FBX (4 x DBF3) Must Be Installed in Level 1.


In VSP 5500 systems having more than one CBX pair (that is, 4N and 6N systems), it is possible that back end cross I/O will occur.  Back end cross I/O occurs when the drive being accessed is in a different CBX pair from the controller that owns the target LUN. Back end cross I/O requires extra HAF/ISW transfers, and therefore may degrade performance. Fortunately, there is a new flash-optimized Hitachi Dynamic Provisioning (HDP) algorithm that will reduce the frequency of back end cross I/O to the point where it has little or no effect on performance. Briefly, the new system allocates a LUN’s pages in the same CBX pair as the controller that owns it, as shown in Figure 5. The new algorithm only applies to flash drives, because balanced page allocation is more important than locality of reference for spinning disk.

Figure 5. HDP Flash-Optimized Page Allocation.
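
The placement idea can be sketched in a few lines. This is not Hitachi's implementation: it is a minimal illustration that assumes a pool tracks free pages per CBX pair, prefers the owning controller's CBX pair for flash, and otherwise picks the pair with the most free pages. The fallback when the local pair is full, and all names used here, are assumptions.

```python
def choose_cbx_pair(free_pages, owner_pair, is_flash):
    """Pick the CBX pair from which to allocate the next page.

    free_pages: dict mapping CBX pair id -> number of free pages
    owner_pair: CBX pair of the controller that owns the LUN
    is_flash:   True for flash drives, False for spinning disk
    """
    candidates = {pair: free for pair, free in free_pages.items() if free > 0}
    if not candidates:
        raise RuntimeError("pool has no free pages")

    # Flash: keep the page local to the owning controller when possible,
    # so that reads and writes avoid back end cross I/O.
    if is_flash and owner_pair in candidates:
        return owner_pair

    # Spinning disk (or local pair full): balance by choosing the pair with
    # the most free pages; ties go to the lowest pair id.
    return max(sorted(candidates), key=lambda pair: candidates[pair])


free = {0: 10, 1: 100}
print(choose_cbx_pair(free, owner_pair=0, is_flash=True))   # local pair
print(choose_cbx_pair(free, owner_pair=0, is_flash=False))  # emptiest pair
```

The two calls show the trade-off described above: locality wins for flash, while balanced allocation wins for spinning disk.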


As mentioned earlier, each SAS DKB has 2 ports that connect to drive boxes over 4 x 12 Gbps links per port. This yields a total of 64 x 12 Gbps SAS links per CBX pair. Each CBX pair therefore has the same number of back end paths as a single-chassis VSP G1500 with the high performance back end option. Because of the increase from 6 Gbps to 12 Gbps links, theoretical SAS bandwidth per CBX pair is twice that of the VSP G1500 single chassis (768 Gbps vs. 384 Gbps). To maintain enterprise-level performance and availability, the VSP 5000 also requires fixed slot assignments for building parity groups, as shown in Figure 6. RAID levels 14D+2P, 6D+2P, 7D+1P, 3D+1P, and 2D+2D are supported.

Figure 6. An example of the parity group slot assignment system for highest performance and availability.
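
The SAS link count and bandwidth comparison quoted in this section can be reproduced from the component counts given earlier: four controllers per CBX pair, two DKBs per controller, two ports per DKB, and four links per port. The sketch below is only that arithmetic; the names are illustrative.

```python
CONTROLLERS_PER_CBX_PAIR = 4
DKBS_PER_CONTROLLER = 2
PORTS_PER_DKB = 2
LINKS_PER_PORT = 4

# Total SAS links in one CBX pair.
SAS_LINKS_PER_CBX_PAIR = (
    CONTROLLERS_PER_CBX_PAIR * DKBS_PER_CONTROLLER * PORTS_PER_DKB * LINKS_PER_PORT
)


def theoretical_sas_gbps(links: int, gbps_per_link: int) -> int:
    """Theoretical aggregate SAS bandwidth in Gbps."""
    return links * gbps_per_link


vsp_5000_gbps = theoretical_sas_gbps(SAS_LINKS_PER_CBX_PAIR, 12)
# The text states a single-chassis VSP G1500 has the same link count at 6 Gbps.
vsp_g1500_gbps = theoretical_sas_gbps(SAS_LINKS_PER_CBX_PAIR, 6)

print(f"SAS links per CBX pair: {SAS_LINKS_PER_CBX_PAIR}")
print(f"VSP 5000 CBX pair: {vsp_5000_gbps} Gbps, VSP G1500 chassis: {vsp_g1500_gbps} Gbps")
```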



RESILIENCY ENHANCEMENTS

Resiliency enhancements allow the VSP 5000 to claim an industry-leading 99.999999% availability. These improvements include:
  • An 80% reduction in rebuild time for flash drives.
  • A more robust cache directory system that allows latency-sensitive host applications to stay up by avoiding write-through mode.
  • Reserved areas on separate power boundaries that allow quick recovery of a redundant copy of shared memory when a hardware failure or power outage occurs.
  • Resilient HAF interconnect featuring four independent PCIe switches. Up to seven X-path cables can fail without risk of taking the system down.
  • 100% access to all drives in a CBX pair even when up to three DKBs fail.


PERFORMANCE ENHANCEMENTS

Significant performance enhancements in the VSP 5000 include:
  • A streamlined cache directory system that eliminates redundant directory updates and improves response time.
  • An increase to twenty-four dedupe stores and twenty-four DSDVOLs per pool for improved multiplicity of I/O when deduplication is enabled.
  • Smaller access size for adaptive data reduction (ADR) metadata reduces overhead.
  • ADR access range for good metadata hit rate expanded to ~2 PB per CBX pair.
  • Inter-controller DMA transfers and data integrity checks offloaded to HAF to reduce CPU overhead.
  • Support for NVMe drives allows extremely low latency with up to 5X higher cache miss IOPS per drive.
  • Support for ADR in an HDT pool enables capacity savings + higher performance by accessing the busiest pages without ADR overhead.


MAINFRAME SUPPORT

The VSP 5000 supports mainframe architecture, just as all previous generations of Hitachi enterprise storage have done. There are two notable differences in how the VSP 5000 handles mainframe I/O vs. earlier enterprise products. First, command routing is done by the Hyper-Transfer (HTP) protocol chip on the FICON CHB instead of the VSP G1500 LR ASIC. Second, CKD-FBA format conversion is now offloaded to the HAF interconnect engine. Because all mainframe I/O must go through the HAF for format conversion, there will be no performance difference between front end straight and front end cross I/O on mainframe.

We’ve briefly reviewed the highlights of VSP 5000 architecture, including improvements in performance, scalability and resiliency. For additional details, see the VSP 5000 home page.

Comments

12-17-2020 19:10

Many thanks...yes this article can be shared with customers and partners.

12-17-2020 06:26

This is very good article. Can this be shared to customers and partners?