Data Protection


Global-Active Device Best Practices

By Jonathan De La Torre posted 09-09-2024 18:14

  

This blog describes the key considerations and best practices for setting up global-active device effectively.

Introduction

Global-active device (GAD) is a data mirroring technology that enables high availability and disaster recovery for storage systems, allowing you to maintain synchronous copies of data at remote sites. To ensure optimal performance and resilience, it is crucial to follow best practices when configuring global-active device.


What is Global-Active Device?

Global-active device is a technology designed to ensure high availability and disaster recovery in storage systems through real-time, synchronous data mirroring between storage systems in different geographic locations. A typical global-active device configuration consists of storage systems, paired volumes, a consistency group, a quorum disk, a virtual storage machine, paths and ports, multi-path software, and cluster software. GAD offers many benefits, such as continuous server I/O when a failure prevents access to a data volume, server failover and failback without storage impact, and load balancing through the migration of virtual storage machines without storage impact. These capabilities are critical for businesses that require continuous data availability and the ability to recover quickly from unexpected failures.

Terms to Know

·         Main Control Unit (MCU) – The storage system typically located at the primary site that contains the primary volumes.

·         Remote Control Unit (RCU) – The storage system typically located at the remote site that contains the secondary volumes.

·         Pair volumes – Global-active device pair volumes consist of a volume in the primary storage system and a volume in the secondary storage system.

·         Consistency group – A group that consists of multiple global-active device pairs.

·         Quorum disk – A volume from a third storage system, on-premises iSCSI-attached server or virtual machine, or cloud-based iSCSI virtual machine that is used to determine the global-active device behavior when a storage system or path failure occurs.

·         Virtual Storage Machine – A special resource group, typically created in the secondary storage system, which has the same model and serial number as the primary storage system. A virtual storage machine allows servers to treat global-active device volumes from the two storage systems as coming from the same source.

·         Multi-path software – Software on the server used to set up redundant paths to volumes and distribute host workloads evenly across the data paths.

·         Remote connections – Physical paths, which can be Fibre Channel or iSCSI, used for replicating data between the primary storage system and secondary storage system.

Best Practices

Pair Volumes

When creating pairs using the paircreate command, specify the copy pace using the -c argument. For minimal host impact, use a slower initial copy pace (for example, -c 3). For higher initial copy performance (with host impact), use a higher copy pace (for example, -c 15). Note that setting the copy speed too fast can overload paths between storage systems.
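As a sketch, pair creation with an explicit copy pace might look like the following CCI command; the device group name and quorum disk ID are placeholders that depend on your HORCM configuration:

```shell
# Create a GAD pair from the primary side (-vl) with quorum disk ID 0,
# fence level "never", and a slow initial copy pace (-c 3) to limit host impact.
# "oraHA" is a hypothetical device group defined in the HORCM files.
paircreate -g oraHA -f never -vl -jq 0 -c 3
```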

When setting the maximum number of initial copy activities on the Virtual Storage Platform (VSP) storage system, note that the default value is 64, with a range of 1 to 512. Increase this number for higher throughput (useful only when there are more than 64 pairs); however, to prevent quorum disk blocking and GAD pair suspension, avoid exceeding 200.

Consistency Groups

To enable batch operations for multiple pairs simultaneously, register GAD pairs to consistency groups. This ensures that in the event of a failure, GAD pairs are suspended together, maintaining consistency.
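As an illustrative sketch, the -fg option of paircreate can register GAD pairs to a consistency group at creation time; the device group name, quorum disk ID, and consistency group ID below are placeholders:

```shell
# Create GAD pairs in device group "oraHA" and register them to
# consistency group 2 (-fg never 2), using quorum disk ID 0.
paircreate -g oraHA -fg never 2 -vl -jq 0 -c 3
```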

Storage Area Network (SAN) Configuration

Maintain two independent SANs (SAN A and SAN B) and manage them separately, without interconnections, to ensure performance and resilience. Ensure all servers have at least two Host Bus Adapters (HBAs), one for each SAN; the number of HBAs varies based on application requirements. Additionally, when planning SAN configurations across geographically dispersed sites, consider the implications of distance on network performance. While it is technically possible to configure replication paths up to 500 km, for optimal performance and minimal latency, keep the physical distance between replicated systems within 200 km when using cross paths. Any distance beyond this can significantly impact data transfer speeds and system responsiveness, particularly in high-availability scenarios.

Resource Distribution Best Practices

Distribute volumes evenly between controllers (CTLs), with at least one volume per Microprocessor Unit (MPU) core. Ensure that host paths and remote connections are equally distributed between CTLs, balancing hardware resources between storage systems. To process incoming data sufficiently, the storage system that is the replication target must have hardware resources equal to or greater than those of the replication source.

Multi-Path Configuration Best Practices

Ensure that the number of paths is equal to or less than the number of HBAs, using single initiator port to single target port mapping. For GAD, use four paths. Implement Asymmetric Logical Unit Access (ALUA) to avoid sending traffic down non-optimized paths while keeping those paths available for immediate use if needed. Use the same protocol for host and remote connection paths, except in specified cases.
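As a sketch, ALUA reporting can be enabled per volume with raidcom; the LDEV ID below is a placeholder, and the exact options may vary by storage model and microcode version:

```shell
# Enable ALUA reporting on a GAD volume (LDEV ID is hypothetical),
# so hosts can distinguish optimized from non-optimized paths.
raidcom modify ldev -ldev_id 0x2001 -alua enable
```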

In cases where different protocols are necessary because of hardware limitations or specific network requirements, ensure that each path is configured optimally according to the capabilities of the connected devices. Additionally, set the command timeout time on the host connections to be longer than the timeout time on the remote connections.

Quorum Disk Configuration

To avoid single points of failure, place the quorum disk in a different location than the primary and secondary storage systems. Services like Global-Active Device Cloud Quorum from Amazon Web Services (AWS) or Azure are preferred for cost savings. A single quorum disk can manage up to 48K GAD pairs; however, do not exceed 4K pairs.

Ensure a latency of less than 100 ms when selecting services like Global-Active Device Cloud Quorum from AWS or Azure. This is crucial to meet the performance standards necessary for optimal system functionality.

Remote Port Configuration

To protect mission-critical data in your storage system from unauthorized access, secure the volume by enabling port security in the UI or through CCI with the raidcom modify port -port <PORT ID> -security_switch y command.

If you want to use iSCSI for remote connections, note that only 10 Gbps iSCSI is supported for the remote connections.

Virtual Storage Machine

Configure the Virtual Storage Machine (VSM) on the secondary storage system using the same model and serial number as the primary storage system.
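As a hedged example, a VSM is typically created by adding a resource group whose virtual type carries the primary system's serial number and model; the resource group name, serial number, and model code below are placeholders for your environment:

```shell
# On the secondary storage system, create a resource group that presents
# the primary system's serial number (411111) and model code (M8).
raidcom add resource -resource_name HAGroup1 -virtual_type 411111 M8
```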

Summary

By following these best practices for configuring global-active device, you can ensure optimal performance, resilience, and high availability for your storage systems. Properly distributing resources, configuring replication paths, and strategically placing the quorum disk are key factors in building a robust GAD setup. Regular monitoring and maintenance are essential to keep your storage environment running smoothly.

In addition to following these general best practices, you must properly size a global-active device solution based on your I/O requirements and your VSP storage system configuration. You can do this using the CPK sizing tool and by reviewing the performance test results from Global Product and Solutions Enablement (GPSE). For more details, contact your Hitachi Vantara representative.
