ShadowImage Best Practices

By Dang Luong posted 08-05-2024 17:58

  

This blog provides a brief introduction and a set of best practices for one of the core copy program products on the Hitachi Virtual Storage Platform (VSP): ShadowImage (SI).

Introduction

Hitachi ShadowImage uses local mirroring technology that you can use to create and maintain full copies of data volumes within a storage system.

Using ShadowImage volume copies (for example, backups, secondary host applications, data mining, and testing) allows you to continue working without stopping host application input/output (I/O) on the production volume.

A typical configuration consists of a storage system, a host connected to the storage system, the ShadowImage software, a primary or source volume (P-VOL), secondary or target volumes (S-VOLs), and interface tools for operating ShadowImage, such as Ops Center Protector, Command Control Interface (CCI), or REST API.

Figure 1 shows a typical configuration.

Figure 1: Typical ShadowImage Configuration


Volume Pairs

A volume pair consists of a P-VOL and an S-VOL. A P-VOL can be paired with up to three layer-1 (L1) S-VOLs. Because S-VOLs are updated asynchronously, the P-VOL and its S-VOLs may not be identical, except immediately after a split.

If a pair is split, any further updates to the P-VOL are not reflected in the S-VOL.

Splitting or deleting a pair allows host access to the S-VOL.

Figure 2 shows an example.

Figure 2: ShadowImage Environment


Cascaded Pairs

Cascaded pairs are volume pairs created in the first and second layers. A pair that is made up of an L1 S-VOL and a layer-2 (L2) S-VOL is referred to as an L2 pair. You can pair each ShadowImage L1 S-VOL with up to two L2 S-VOLs, for a maximum of nine S-VOLs (L1 and L2 combined) per P-VOL.

Figure 3 shows the structure of cascaded pairs.

Figure 3: ShadowImage Cascaded Pairs
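The nine-volume limit follows directly from the per-layer limits given above. As a quick arithmetic check (the constant names are illustrative only, not product parameters):

```python
# Limits as stated in the text above; the names are illustrative only.
MAX_L1_SVOLS_PER_PVOL = 3   # L1 S-VOLs that one P-VOL can be paired with
MAX_L2_SVOLS_PER_L1 = 2     # L2 S-VOLs that one L1 S-VOL can be paired with

max_l2 = MAX_L1_SVOLS_PER_PVOL * MAX_L2_SVOLS_PER_L1   # 3 * 2 = 6
max_total = MAX_L1_SVOLS_PER_PVOL + max_l2             # 3 + 6 = 9

print(f"L1 S-VOLs: {MAX_L1_SVOLS_PER_PVOL}, L2 S-VOLs: {max_l2}, total: {max_total}")
```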


Initial Copy Workflow

During Initial Copy, the pair status changes as follows:

  1. SMPL: The volumes are not paired. Creating the copy pair starts the Initial Copy.
  2. COPY(PD)/COPY: The storage system copies the P-VOL data to the S-VOL.
  3. PAIR: The Initial Copy is complete, and the volumes are in PAIR status.

The P-VOL continues to receive updates from the host during the Initial Copy.

Figure 4: Initial Copy Workflow
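The status progression during Initial Copy can be pictured as a small state machine. The following is a minimal sketch for illustration only. The class and function names are hypothetical, and it models just the forward path described above, not the full set of ShadowImage pair statuses.

```python
from enum import Enum


class PairStatus(Enum):
    SMPL = "SMPL"          # volumes are not paired
    COPY_PD = "COPY(PD)"   # Initial Copy in progress
    PAIR = "PAIR"          # Initial Copy complete


# Forward transitions during pair creation only (illustrative subset).
INITIAL_COPY_TRANSITIONS = {
    PairStatus.SMPL: PairStatus.COPY_PD,
    PairStatus.COPY_PD: PairStatus.PAIR,
}


def next_status(status: PairStatus) -> PairStatus:
    """Return the status that follows `status` during Initial Copy."""
    if status not in INITIAL_COPY_TRANSITIONS:
        raise ValueError(f"{status.value} has no successor in the Initial Copy workflow")
    return INITIAL_COPY_TRANSITIONS[status]


status = PairStatus.SMPL
path = [status.value]
while status in INITIAL_COPY_TRANSITIONS:
    status = next_status(status)
    path.append(status.value)

print(" -> ".join(path))   # SMPL -> COPY(PD) -> PAIR
```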


Update Copy Workflow

Update copy asynchronously copies new data (differential data) from the P-VOL to the S-VOL.

The storage system process while updating a copy is as follows:

  1. After write I/O operations occur to a P-VOL, the storage system starts the update copy operation.
  2. The storage system marks I/O to the P-VOL in PAIR status as differential data and stores the location of the data in bitmaps for transfer to the S-VOL.
  3. The timing of the update copy operation is based on the amount of differential data that accumulates and the elapsed time since the previous update.

Figure 5 shows an example of the update copy operation.

Figure 5: Update Copy Workflow
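To make the idea of bitmap-based differential tracking more concrete, here is a toy model of the update copy steps. It is a conceptual sketch only: the class name, the per-block granularity, and the explicit update_copy() call are assumptions for illustration and do not represent how the storage system schedules or tracks copies internally.

```python
class ToyPair:
    """Toy model of a pair in PAIR status; not the actual storage system logic."""

    def __init__(self, num_blocks: int) -> None:
        self.pvol = [0] * num_blocks
        self.svol = [0] * num_blocks
        self.dirty = [False] * num_blocks   # differential bitmap, one flag per block

    def host_write(self, block: int, value: int) -> None:
        # The host write lands on the P-VOL; the S-VOL is not touched yet.
        self.pvol[block] = value
        self.dirty[block] = True

    def update_copy(self) -> int:
        # Copy only the blocks marked in the bitmap, then clear the marks.
        copied = 0
        for block, is_dirty in enumerate(self.dirty):
            if is_dirty:
                self.svol[block] = self.pvol[block]
                self.dirty[block] = False
                copied += 1
        return copied


pair = ToyPair(num_blocks=8)
pair.host_write(2, 42)
pair.host_write(5, 7)

in_sync_before = pair.pvol == pair.svol   # S-VOL lags behind the P-VOL
copied = pair.update_copy()               # only the two changed blocks are copied
in_sync_after = pair.pvol == pair.svol

print(in_sync_before, copied, in_sync_after)
```

The model also shows why a pair in PAIR status is not guaranteed to be identical at any given moment: between a host write and the next update copy, the S-VOL still holds the old data.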


Best Practices

·  Volume Capacity: The P-VOL and S-VOL must be the same size in blocks. If the capacity is displayed in GB or TB, a small difference between the P-VOL and S-VOL capacity might not be visible. To check the capacity in blocks, run the raidcom command raidcom get ldev -ldev_id <ldev#> and look for the “VOL_Capacity(BLK)” field in the output.
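If you check many volumes, the block-count comparison can be scripted. The following is a minimal sketch that extracts the VOL_Capacity(BLK) value from saved raidcom get ldev output and compares two volumes. The sample text is abbreviated and illustrative, and the "FIELD : value" line layout is an assumption, so verify it against the output of your own system.

```python
import re


def capacity_blocks(ldev_output: str) -> int:
    """Extract the VOL_Capacity(BLK) value from `raidcom get ldev` output text."""
    # Assumes a "VOL_Capacity(BLK) : <number>" line; adjust if your output differs.
    match = re.search(r"^VOL_Capacity\(BLK\)\s*:\s*(\d+)\s*$", ldev_output, re.MULTILINE)
    if match is None:
        raise ValueError("VOL_Capacity(BLK) not found in output")
    return int(match.group(1))


# Illustrative, abbreviated sample text; real output contains many more fields.
pvol_output = "LDEV : 256\nVOL_Capacity(BLK) : 2097152\n"
svol_output = "LDEV : 257\nVOL_Capacity(BLK) : 2097150\n"

pvol_blocks = capacity_blocks(pvol_output)
svol_blocks = capacity_blocks(svol_output)
same_size = pvol_blocks == svol_blocks

print(f"P-VOL: {pvol_blocks} blocks, S-VOL: {svol_blocks} blocks, same size: {same_size}")
```

In this sample, the two volumes differ by only two blocks, which is the kind of difference that disappears when capacity is rounded to GB.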

·  ShadowImage replicates data asynchronously. Even in Pair state, there is no guarantee that the P-VOL and S-VOL are identical. The only time when the P-VOL and S-VOL are 100% identical is immediately after a pair split.

·  Performing pair tasks, such as creating, splitting, and resynchronizing ShadowImage pairs, can affect host server I/O performance on the storage system.

·  To avoid the performance impact and minimize the risk, follow these guidelines:

·   Assign S-VOLs and P-VOLs to different pools in case of failure. Ensure that enough drives are used to provision the P-VOL and S-VOL pools to provide the required performance capability.

·   ShadowImage can create high levels of internal activity in your storage system. Ensure that the configuration is appropriate for the internal and host workloads. Items that can help are additional drives, cache adapters, cache, BEDs, and MPUs.

·   To perform copy operations for multiple pairs in the same pool, perform the operation for one pair at a time.

·  For better performance, assign the P-VOL and S-VOL to the same MPU before creating the pair. Otherwise, the storage system must move the S-VOL under the same MPU as the P-VOL, which adds overhead to the pairing operation.

·  Quick Split and Quick Resync can be used together to reduce downtime to nearly zero. Differential data between the P-VOL and S-VOL is copied in the background. Quick Split is the default split mode, and Normal Resync is the default resync mode.

·  An alternative to Quick Split is Steady Split. This copies all the differential data from the P-VOL to the S-VOL before making the S-VOL accessible.

·   The syntax to perform Quick Split with CCI is: pairsplit -fq quick

·   The syntax to perform Steady Split with CCI is: pairsplit -fq normal

Note: There are other criteria that determine whether Quick Split or Steady Split is performed. For more details, see https://docs.hitachivantara.com/r/en-us/svos/10.2.x/mk-23vsp1b014/planning-shadowimage/quick-split-and-steady-split-performance-planning.
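When split operations are scripted, it can help to build the command line in one place so that the split mode is always explicit. The following is a minimal sketch that only assembles the argument list and does not run it. The -g <group> argument is common CCI usage but is not shown in this article, the group name SI_GRP01 is hypothetical, and the sketch assumes the HORCM instance and ShadowImage mode are already set in your environment.

```python
import shlex
from typing import List

# Values accepted by -fq according to the syntax above; "normal" requests a Steady Split.
SPLIT_MODES = ("quick", "normal")


def pairsplit_command(group: str, mode: str) -> List[str]:
    """Build (but do not run) a pairsplit command line with an explicit split mode."""
    if mode not in SPLIT_MODES:
        raise ValueError(f"mode must be one of {SPLIT_MODES}, got {mode!r}")
    return ["pairsplit", "-g", group, "-fq", mode]


# "SI_GRP01" is a hypothetical group name used for illustration.
quick_cmd = pairsplit_command("SI_GRP01", "quick")
steady_cmd = pairsplit_command("SI_GRP01", "normal")

print(shlex.join(quick_cmd))
print(shlex.join(steady_cmd))
```

The resulting list can be passed to subprocess.run() on a host where CCI is installed and configured.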

·  Split operations can cause a decrease in host IOPS and an increase in host response time. This happens when the application sends an update write to an area in the P-VOL that has not been reflected in the S-VOL. The storage system holds on to the write I/O longer than normal, copies the to-be-changed data from the P-VOL to the S-VOL, and then completes the write I/O. It is best to perform Split at a less busy time and/or with fewer ShadowImage pairs.

Alternatively, enable local replica option 2, which tells the storage system to prioritize host I/O over copy operations. The tradeoff is that copy operations take longer to complete.
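The extra write latency after a split, described above, can be pictured with a toy model. This is a conceptual sketch only: the class name, the block-level granularity, and the use of None for stale S-VOL data are assumptions for illustration, not the actual storage system implementation.

```python
class ToySplitPair:
    """Toy model of a pair right after a Quick Split; not the real implementation."""

    def __init__(self, pvol, pending):
        self.pvol = list(pvol)
        self.pending = set(pending)   # blocks not yet reflected in the S-VOL
        # None stands for stale S-VOL data that the background copy has not refreshed.
        self.svol = [None if i in self.pending else v for i, v in enumerate(self.pvol)]
        self.delayed_writes = 0

    def host_write(self, block, value):
        if block in self.pending:
            # Preserve the split-time data on the S-VOL before overwriting the P-VOL.
            self.svol[block] = self.pvol[block]
            self.pending.discard(block)
            self.delayed_writes += 1   # this write waited for the extra copy
        self.pvol[block] = value

    def background_copy(self):
        # Copy whatever is still outstanding from the split-time image.
        for block in sorted(self.pending):
            self.svol[block] = self.pvol[block]
        self.pending.clear()


split_image = ["a", "b", "c", "d"]
pair = ToySplitPair(split_image, pending={1, 3})

pair.host_write(1, "B")   # block 1 is pending, so this write is delayed
pair.host_write(0, "A")   # block 0 is already on the S-VOL, so no delay
pair.background_copy()

print(pair.svol, pair.pvol, pair.delayed_writes)
```

In the model, the S-VOL ends up holding the split-time image even though the host kept writing to the P-VOL. Only writes that hit not-yet-copied blocks pay the extra cost, which is why splitting at a less busy time reduces the impact.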

·  Host response time in Pair state is typically higher than in Simplex or Split state because ShadowImage asynchronously copies changed data from the P-VOL to the S-VOL. In Pair state, dirty data in the S-VOL accumulates. As a result, the storage system schedules more destage operations to prevent cache write pending from increasing. This delays the response to incoming host I/O.

In Split state, data is not copied from P-VOL to S-VOL, so there is no degradation to host performance.

In a situation where you must stay in Pair state for a long time and want to decrease or eliminate the performance hit from the background copy jobs, you can set local replica options 20, 21, or 22. They work as follows:

·   Local replica 20: This results in half of the current copy pace.

·   Local replica 21: This results in one-fourth of the current copy pace.

·   Local replica 22: This disables background copy.

Note:

·   These options are only effective in Pair state.

·   Because local replica 22 disables background copy, the differential data between P-VOL and S-VOL can become very large. As a result, performing Split could take a long time to complete.

·  The raidcom command-line tool can be used to toggle local replica options. The syntax is as follows:

·   To enable local replica 20: raidcom modify local_replica_opt -opt_type open -set_system_opt 20

·   To disable local replica 20: raidcom modify local_replica_opt -opt_type open -reset_system_opt 20

·   To query the current enabled options: raidcom get local_replica_opt -opt_type open
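If you toggle these options from a script, a small helper can keep the option numbers and their effects in one place. The following is a minimal sketch that only builds the raidcom argument list, following the syntax shown above. The function name and the restriction to the options discussed in this article are illustrative choices, not part of raidcom.

```python
import shlex
from typing import List

# Options discussed in this article, with the effect described for each.
KNOWN_OPTIONS = {
    2: "prioritize host I/O over copy operations",
    20: "half of the current copy pace",
    21: "one-fourth of the current copy pace",
    22: "background copy disabled",
}


def local_replica_opt_command(option: int, enable: bool) -> List[str]:
    """Build (but do not run) the raidcom command that sets or resets an option."""
    if option not in KNOWN_OPTIONS:
        raise ValueError(f"option {option} is not covered by this sketch")
    flag = "-set_system_opt" if enable else "-reset_system_opt"
    return ["raidcom", "modify", "local_replica_opt",
            "-opt_type", "open", flag, str(option)]


enable_cmd = local_replica_opt_command(20, enable=True)
disable_cmd = local_replica_opt_command(20, enable=False)

print(f"# option 20: {KNOWN_OPTIONS[20]}")
print(shlex.join(enable_cmd))
print(shlex.join(disable_cmd))
```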

Summary

In addition to the general best practices, you must properly size a ShadowImage solution based on your I/O requirements and your VSP storage system and configuration. This can be done by using the CPK sizing tool and reviewing the performance test results from Global Product and Solutions Enablement (GPSE). For more details, contact your Hitachi Vantara representative.
