Introduction
Universal Replicator (UR) provides a solution for recovering processing operations when a disaster affects a data center. In a Universal Replicator implementation, a secondary storage system is located at a remote site, away from the primary storage system at the main data center, and the data on the primary volumes (P-VOLs) at the primary site is asynchronously copied to the secondary volumes (S-VOLs) at the remote site. Figure 1 shows an example.
With each host update to a P-VOL, Universal Replicator creates journal data containing the updated records together with metadata. The remote storage system retrieves this journal data asynchronously and uses it to replicate the P-VOL update on the S-VOL; the metadata ensures consistency between the primary and secondary volumes.
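For illustration, a UR pair is typically created with CCI (Command Control Interface), which ties the P-VOL and S-VOL to their journals. The following is a minimal sketch; the device group name oraUR, journal IDs, and instance number are hypothetical and assume HORCM instances are already configured:

# Create a UR pair: -f async selects asynchronous (UR) replication, and
# -jp/-js assign the primary and secondary journal IDs (placeholders).
paircreate -g oraUR -f async -vl -jp 0 -js 0 -IH0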
The main advantage of Universal Replicator asynchronous replication is the ability to replicate data over very long distances without impacting the performance of the I/O operations at the primary site.
Best Practices
This section describes recommendations and best practices for designing an optimal UR solution.
Configuring various elements of a Universal Replicator solution requires an understanding of the business requirements and production system. The following are the key focus areas for designing a highly resilient Universal Replicator system and producing optimal performance from the storage system.
- Designing UR Data Paths: The most important factor that determines the ability of a UR system to transfer data in a timely manner.
- Sizing Journal Volume: Journal volumes must be sized to meet all possible data scenarios based on your business requirements.
- Multiple Journal Groups: When the number of journal groups increases, more read journal (RDJNL) commands are sent, and initial copy throughput improves.
- Journal Group and Volume Ownership: Journal groups and data volumes must be distributed across available MPUs to achieve a balanced load in the storage system.
- Advanced System Settings: Initial Copy throughput at a medium copy pace can be significantly increased with Advanced System Settings 17 and 18.
- Journal Settings: Apply appropriate journal settings, such as copy pace, flow control, and cache mode, based on business requirements.
- Distance Considerations: UR imposes no distance restriction, and because UR is asynchronous, host response time is not affected by distance.
UR Path Design
For better performance, when mapping LUNs to the target ports where host I/O will be performed, use a 100% straight mapping approach: the front-end ports and the P-VOLs must belong to the same controller. However, this does not apply if you are configuring paths for host high availability.
UR uses bidirectional logical paths between the two storage systems:
- Control Path (MCU -> RCU): Used for pair operation commands.
- Data Path (RCU -> MCU): Used for the Read JNL (RDJNL) command and JNL transfer.
Figure 2: UR Remote Paths
Control paths use minimal line bandwidth because they are only used during pair operations. In contrast, data paths handle RDJNL commands issued from the RCU to the MCU, and the responses, including JNL data, are transferred back from the MCU to the RCU using these paths. Consequently, data paths consume more line bandwidth than control paths.
In a performance-critical environment, more paths from the RCU to the MCU are required to efficiently use the available line bandwidth. However, if a failure reverses the roles of the MCU and RCU, performance can suffer when the remote paths were designed for only one direction.
Therefore, when high performance is required, increase the number of remote paths from the RCU to the MCU, and configure the same number of paths in the reverse direction so that performance is maintained if the roles are reversed.
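As a hedged sketch, remote paths can be defined with the CCI raidcom command; the serial number, model ID, path group ID, and port names below are placeholders to adapt to your environment:

# Register the remote storage system and the first remote path
# (serial number 512345, model ID R900, and path group ID 0 are hypothetical).
raidcom add rcu -cu_free 512345 R900 0 -mcu_port CL1-A -rcu_port CL1-B
# Add another path to the same path group for bandwidth and redundancy.
raidcom add rcu_path -cu_free 512345 R900 0 -mcu_port CL2-A -rcu_port CL2-B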
Note: Not all storage interfaces can be used to create UR remote paths. For the latest information, see the Product Compatibility Guide.
Sizing Journal Volume
In a UR solution, the journals remain relatively empty when the data path keeps up with the inflow of updated data to the secondary site. However, there can be times when the network between the sites is down or becomes unstable. It is important to size the journals properly to meet your Recovery Point Objective (RPO). RPO is the maximum period of data loss, measured in time, that the operation can tolerate after a failure or disaster occurs.
To correctly size the journals, you must determine the following:
- The amount of changed data that the applications generate.
- The maximum amount of time that the journals can accumulate updated data. This information depends on the desired recovery point objective.
Journal volume capacity is determined by considering worst-case scenarios, such as a link failure between the two data centers. Table 1 is an example of calculating the required journal volume capacity using an 8 KB I/O size, a 50% write ratio, and a two-hour RPO.
Table 1: Journal Volume Inputs
| Parameters | Input |
|---|---|
| RPO | 2 hours |
| Write length of required performance | 8 KB |
| Required IOPS | 20,000 |
| Read-write ratio | 1:1 (write = 0.5) |
| Write-workload | 78 MB/s (20,000 x 8 x 0.5 ÷ 1024) |
| Write-workload over the RPO (2 hours) | 78 x 60 x 60 x 2 = 561,600 MB |
| Basic journal volume size | 561,600 MB (548 GB) |
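The arithmetic in Table 1 can also be scripted. The following is a minimal sketch that reproduces the calculation with the table's inputs; note that it keeps the unrounded 78.125 MB/s, so the result differs slightly from the rounded figures above:

# Journal sizing arithmetic from Table 1 (inputs are examples).
awk -v iops=20000 -v io_kb=8 -v write_ratio=0.5 -v rpo_hours=2 'BEGIN {
    mbps = iops * io_kb * write_ratio / 1024   # write workload in MB/s
    cap  = mbps * rpo_hours * 3600             # MB to buffer over the RPO window
    printf "write workload: %.1f MB/s\n", mbps
    printf "journal capacity: %.0f MB (%.0f GB)\n", cap, cap / 1024
}'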
Note: A more efficient system design divides the required journal capacity into multiple journal groups, each associated with a fixed number of P-VOLs, as described in the following section.
Multiple Journal Groups
More journal groups deliver higher UR Initial Copy and Resync throughput because more copy operations are active simultaneously.
In an environment with a transfer speed of 256 MB/s or higher, 32 RDJNL commands are sent simultaneously per journal group from the RCU to the MCU, regardless of the number of pairs in the journal group. Each RDJNL command can copy up to 1 MB of data from the MCU to the RCU. As the number of journal groups increases, more RDJNL commands are sent. Consequently, the throughput for Initial Copy and Resync, as well as host I/O performance, improves.
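To see why more journal groups raise throughput, consider the ceiling implied by 32 outstanding RDJNL commands of up to 1 MB each per journal group: at most 32 MB can be in flight per round trip. The following back-of-the-envelope sketch (the 80 ms RTT and 800 MB/s target are assumed examples, not vendor figures) estimates the per-journal-group ceiling and the minimum number of journal groups for a target throughput:

# Bandwidth-delay ceiling per journal group: 32 x 1 MB in flight per RTT.
awk -v rtt_ms=80 -v target_mbps=800 'BEGIN {
    ceiling = 32 / (rtt_ms / 1000)   # MB/s per journal group
    printf "per-journal-group ceiling: %.0f MB/s\n", ceiling
    printf "journal groups needed for %d MB/s: %.1f\n", target_mbps, target_mbps / ceiling
}'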
Note: There is an optimal number of journal groups for each storage system, beyond which the throughput begins to degrade. Contact your Hitachi Vantara representative for internal performance test details.
Journal Group and Volume Ownership
UR enhances performance with multiple journal groups. A key factor is that the processing performance per journal group depends on how the load is distributed among different MPUs. To optimize UR throughput in the storage system, it is advisable to distribute UR pairs across multiple JNLGs and assign each JNLG to a different MPU.
Figure 3 shows an example of the journal and volume assignments to achieve a balanced load in VSP One Block 20 with eight journal groups and 512 pairs.
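As a hedged CCI sketch, MP ownership can be assigned when journals and volumes are created; the journal IDs, LDEV IDs, pool ID, and MP IDs below are placeholders, and the exact option names should be verified in the CCI reference for your model:

# Create journal groups owned by different MP units (IDs are hypothetical).
raidcom add journal -journal_id 0 -ldev_id 1000 -mp_blade_id 0
raidcom add journal -journal_id 1 -ldev_id 1001 -mp_blade_id 1
# Data volumes can likewise be balanced across MP units at creation time.
raidcom add ldev -pool 0 -ldev_id 2000 -capacity 100g -mp_blade_id 0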
Advanced System Settings
Initial Copy throughput can be significantly increased with Advanced System Settings 17 and 18, which work by increasing the frequency of journal data creation. Advanced System Setting 17 increases the frequency by one level, while Advanced System Setting 18 increases it by two levels. To use either setting, enable it on both storage systems and then start the Initial Copy with the copy pace set to Medium (from the journal group mirror options), as shown in the sketch after this paragraph. These settings correspond to system option modes 600 and 601 on Hitachi Virtual Storage Platform G1x00 (VSP G1x00), VSP F1500, and previous models.
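For example, after enabling Advanced System Setting 17 or 18 on both storage systems, the Medium copy pace could be applied to a journal group as follows (the journal ID is hypothetical):

# Set the mirror copy pace to Medium (copy_size 3) before starting Initial Copy.
raidcom modify journal -journal_id 0 -copy_size 3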
Note: If the storage system's resource utilization reaches saturation, Advanced System Settings 17 and 18 may have no visible effect or may even reduce the total throughput of the copy operation. These two settings are recommended only when storage resources are not overutilized.
Journal Settings
UR offers three copy pace options for the Initial Copy: Low, Medium, and High. During the Initial Copy, whether host I/O can be applied to the P-VOL depends on the copy pace:
- With the High copy pace, write I/O is not recommended. If write I/O is issued at the High copy pace, host I/O performance degrades substantially, and pairs can be suspended.
- A better option is the Medium copy pace, where write I/O is accepted without much degradation of Initial Copy performance.
For Initial Copy or Resync operations, if the most important criteria are host I/O performance and latency:
- Set the JNLG copy pace to Low (1 or 2) or Medium (3).
For example: raidcom modify journal -copy_size 3.
- Start Initial Copy on fewer pairs at the same time.
For Initial Copy or Resync operations, if the goal is to complete Initial Copy as quickly as possible:
- Set the JNLG copy pace to a value of 4 or higher, up to 15.
For example: raidcom modify journal -copy_size 15.
If you have a large number of pairs, you can improve Initial Copy or Resync throughput by increasing the copy multiplicity: set the maximum initial copy activity up to 128 (the default is 64). Ensure that the same number of pairs from all journal groups are active simultaneously during the Initial Copy or Resync. This parameter can be changed with the following command: raidcom modify remote_replica_opt -opt_type ur -copy_activity 128.
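A hedged operational sketch combining this setting with progress monitoring (the device group name and instance number are hypothetical):

# Raise the maximum Initial Copy multiplicity from the default of 64 to 128.
raidcom modify remote_replica_opt -opt_type ur -copy_activity 128
# Monitor per-pair copy progress (-fc adds the copy percentage column).
pairdisplay -g oraUR -fcx -IH0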
The Inflow Control journal group option restricts the flow of updates to the journal volume. When this option is enabled, the inflow of write data can be slowed or stopped if there is insufficient space in the master JNL-VOL. Set this option to No to ensure that host I/O is not affected. The downside is that when the write data and metadata stored in the primary storage system journal volume reach its total capacity, the related journal group volume pairs are suspended.
The default setting for Cache Mode is Enable, which allows updated data to be reflected quickly in cache without going through the journal volume on the RCU. However, if the write-pending rate on the RCU reaches 50%, the journal data is destaged to the journal volume and the operation is the same as when the option is set to Disable.
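These journal options can also be set through CCI. The following is a hedged sketch; the journal ID is a placeholder, and the -cache_mode and -data_overflow_watch option names should be verified against the CCI reference for your model:

# Keep Cache Mode enabled (the default) on the restore journal group.
raidcom modify journal -journal_id 0 -cache_mode y
# Data overflow watch (seconds) applies when inflow control is enabled.
raidcom modify journal -journal_id 0 -data_overflow_watch 60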
Distance Considerations
In UR, data from host write operations on the P-VOLs is copied asynchronously to the remote storage system. Therefore, host response time is not affected by distance. However, beyond a certain distance (approximately 5,000 km), maximum throughput can degrade because of delays in RDJNL command issuance, resulting in early journal buildup.
Note: Read-only workloads are not impacted, regardless of distance, because read I/Os are serviced by the local storage system and no write updates need to be replicated.
The following iSCSI settings on the remote paths can be configured to improve performance:
- Maximum Window Size: 1024 KB
- Ethernet MTU Size: 9000 Bytes
Note: If copy performance decreases because of increased round-trip time (RTT) on long-distance connections, the degradation can be avoided by dividing journal groups and thereby increasing the number of concurrent RDJNLs.
Summary
In addition to following these best practices, you must properly size a Universal Replicator solution based on your I/O requirements and your VSP storage system configuration. This can be done by using the CPK sizing tool and reviewing performance test results from Global Product and Solutions Enablement (GPSE). For more details, contact your Hitachi Vantara representative.