Hitachi offers Hitachi Infrastructure Analytics Advisor (HIAA), software that delivers visualization, intelligence and automation to optimize infrastructure health while quickly identifying and troubleshooting performance issues.
In my first post, I introduced a use case for identifying and resolving a performance bottleneck in a Hitachi Unified Computing Platform CI (UCP CI) environment.
HIAA – HDCA SaaS edition is now available. This is a new consumption model for HIAA: Hitachi provides an HIAA instance running on the public cloud, and HIAA – HDCA SaaS edition offers the same user experience as the on-premises version of HIAA.
HIAA – HDCA SaaS edition removes the installation process. Hitachi provides a configured HIAA server instance on the public cloud. Customers can start managing their systems through the HIAA GUI as soon as they receive the access information. Customers do not have to prepare an HIAA server.
2. Maintenance Free
The availability of HIAA – HDCA SaaS edition is guaranteed. Hitachi monitors the HIAA instance and keeps it running. Thanks to this service, customers can focus on managing their own systems.
3. Subscription model (SaaS)
In HIAA – HDCA SaaS edition, the subscription price is determined by the amount of performance information the customer wants to upload. Thanks to this pricing model, customers can start using the solution at a lower cost than with a full upfront payment model.
HIAA – HDCA SaaS edition is a better solution for UCP customers who prefer low-touch infrastructure management, because customers do not need to look after the HIAA – HDCA SaaS edition instance. Hitachi provides the management service for it.
HIAA – HDCA SaaS edition provides a server-less model to customers. This means customers can run a higher density of user applications and VMs.
This is the system overview. The basic components, the Probe server and the HIAA server, are the same as in the on-premises version of HIAA, but the HIAA server moves to the public cloud. Most customers protect their systems with a firewall.
1. Network configuration
a. The Probe server communicates with the HIAA server over HTTPS only; no other protocols are required. Please ask your network administrators to allow outbound connections on HTTPS port 443, for example by changing the firewall settings or by going through an HTTPS proxy.
b. On the public cloud side, a Web Application Firewall (WAF) allows connections from the user network. Hitachi has to register the public IP address of the customer network, so please contact your Hitachi representative.
2. Access information from Hitachi
Hitachi provides the HIAA – HDCA SaaS edition access information to customers.
I am going to introduce an example configuration in which HIAA – HDCA SaaS edition manages UCP CI in an on-premises data center.
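Before asking for a firewall change, it can help to check whether the Probe server can already open an outbound TCP connection on port 443. The following is a minimal Python sketch and is not part of HIAA. It only tests a direct connection, so it does not exercise an HTTPS proxy, and the function name is my own.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only checks that the port is reachable. It does not validate
    TLS certificates or HIAA credentials, and it does not go through
    an HTTPS proxy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

For example, `can_connect("hdca.example.com")` would be run on the Probe server, where `hdca.example.com` is a placeholder for the address Hitachi supplies in the access information.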
1. Configuration in on-premises data center
The installation and configuration steps on the on-premises side are the same as before. (Reference: https://community.hds.com/community/products-and-solutions/vmware/blog/2017/12/29/part2-end-to-end-infrastructure-performance-analysis-of-hitachi-unified-compute-platform-ci-simplified-with-hitachi-infrastructure-analytics-advisor-hiaa)
The only difference is that the data upload destination is switched to the HDCA server, which is one of the components of HIAA – HDCA SaaS edition. The on-premises Probe server has to connect to HDCA: log in to the Probe server, then enter the HDCA server information.
After a few hours, configuration and performance data will appear in the GUI.
2. HIAA – HDCA SaaS edition configuration
As I mentioned before, customers can start using HIAA – HDCA SaaS edition immediately. Here, I recommend creating a “Consumer Group”. This is a useful feature for identifying pods later.
1. Generally, administrators use vCenter to observe system health. If an unexpected workload arises, the administrator will notice that a noisy neighbor issue has happened. In this situation, the administrator checks system performance in the vCenter performance monitor, for example CPU, datastore and so on.
2. On the HIAA GUI dashboard, there is an alert on the storage system.
3. Breaking down the storage system, there are alerts on Cache and Parity Groups. The administrator has to find the root cause of the high CWP (cache write pending) rate.
4. At this point, HIAA provides its bottleneck analytics feature. The BasePoint is the point from which the administrator starts to investigate. Here we start from Cache because we need to check why CWP is so high.
5. HIAA shows the suspects behind the higher workload, that is, which volumes have a very high workload.
6. As another approach, HIAA can show how busy the Parity Groups are. Here, Parity Group utilization exceeds 80%, and Parity Groups 04-03 and 04-04 are busy at this time. In this use case, to keep up with the user's workload, the administrator decides to install more HDDs to enhance drive performance.
7. Which HDP pool should receive the additional HDDs? HIAA can show where the Parity Groups should be installed.
8. After the HDDs are installed, Auto Rebalance starts distributing the workload evenly. Auto Rebalance keeps working for a while; after that, system performance improves and users experience the enhanced performance.
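The 80% check in step 6 can be expressed as a simple filter. The sketch below is only an illustration in Python: HIAA performs this analysis in its GUI, and the utilization figures here are invented sample values, not measurements from the lab.

```python
def busy_parity_groups(utilization, threshold=80.0):
    """Return (parity group, percent) pairs above the threshold, busiest first."""
    over = [(pg, pct) for pg, pct in utilization.items() if pct > threshold]
    return sorted(over, key=lambda item: item[1], reverse=True)


# Hypothetical utilization samples in percent; real values come from HIAA.
samples = {"04-01": 35.0, "04-02": 52.5, "04-03": 91.0, "04-04": 86.0}
print(busy_parity_groups(samples))
# prints [('04-03', 91.0), ('04-04', 86.0)]
```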
In this article, I introduced HIAA – HDCA SaaS edition, which runs on the public cloud. Hitachi prepares it for customers so that they can start using it instantly. This lowers both the OPEX and the CAPEX of performance management for customers. Meanwhile, HIAA – HDCA SaaS edition provides the same user experience as the on-premises model.
We recently rolled out some software upgrades for our VMware ecosystem integrations and certifications, which are generally available now or will be within the next 45 days. The specific product updates this month are for the Storage (VASA) Provider, the vRealize Operations (vROPS) Management Pack and the Storage Plugin for vCenter. This is in addition to updates to our flagship UCP Advisor software, which provides provisioning and lifecycle management for infrastructure and resources from within vCenter. As always, the intent is to continue to empower IT roles who leverage the VMware vCenter/vRealize management stack for operations and automation with native integrated access to services, capabilities and data from Hitachi infrastructure. I've outlined some of the high-level advancements in the respective integrations below.
Note, all Hitachi integrations are now posted on VMware marketplace for customers/partners to download.
As a refresher, the Hitachi VASA provider software integration enables storage capability aware policy based management for VMFS/VVol while also enabling a VVol deployment.
One of the major new features in the 3.5 release is automating QoS or similar actions when storage policies are changed on a VM or datastore, in order to bring it into compliance with the new policy. We focused our initial policy compliance efforts on our Hitachi Active Flash and data tiering (HDT) pooling technology, which are used quite frequently in VMware environments. A pool consists of multiple tiers from both internal and external storage (an example might be FMD and SSD within a Hitachi VSP, with the third tier being a virtualized external third-party flash array from Pure, EMC, etc.). This pooling technology automatically moves or pins data blocks between tiers based on data access rules, but there are cases where finer-grained control for application owners and VM admins is beneficial, whether it's related to an expected change in application usage behavior or to cost controls.
With this release of VASA Provider, when an administrator applies a new policy to a VM/VMDK or datastore, the Hitachi VASA Provider will initiate storage changes to bring that object into compliance. One example of this is tiered data placement within our pool. If certain VMs/VMDKs or datastores are set with the "Tier 3 Latency and Tier 3 IOPS" policy capability within vCenter, the system will automatically move those blocks to that lower tier, freeing up higher tiers for net new applications or for that high-performing database that is growing in size. Similarly, an application that now requires both "Tier 1 Latency and Tier 1 IOPS" simply has that policy applied by the VM admin (or app owner), and the Hitachi VASA software will invoke actions to pin/promote all of its blocks to the highest performing tier. We have made this capability available for both VMFS (taking advantage of custom tiering policies) and VVol datastores. VMware VVol will obviously allow finer-grained control at the VM/VMDK level given its object-based implementation. This was the additional motivation to make a level of Hitachi infrastructure resource control accessible to application owners (not just VM admins) through API/vRO/vRA catalog services.
This release also includes support for vSphere 6.7, environments with multiple SSO domains and configurations without an external service processor (SVP). It is also worth noting that the latest additions to the Hitachi VSP all-flash storage platform, powered by SVOS RF, now officially support up to an 8X increase in the number of vSphere Virtual Volumes (VVols). So whether you are using Hitachi storage, UCP converged or a Hitachi enterprise cloud environment, download and try the free virtual appliance and take advantage of it.
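To make the policy-to-action idea concrete, here is a minimal Python sketch of the two cases described above. It is not the VASA Provider's implementation: the `Action` names and the `placement_action` function are hypothetical, and the fallback for any other tier combination is my own assumption, since the text only describes the Tier 1 and Tier 3 cases.

```python
from enum import Enum


class Action(Enum):
    PIN_TO_TOP_TIER = "pin/promote all blocks to the highest performing tier"
    MOVE_TO_LOWER_TIER = "move blocks to the lower tier"
    AUTO_TIERING = "leave placement to automatic tiering"


def placement_action(latency_tier: int, iops_tier: int) -> Action:
    """Map a policy's latency and IOPS tier capabilities to a placement action."""
    if latency_tier == 1 and iops_tier == 1:
        return Action.PIN_TO_TOP_TIER
    if latency_tier == 3 and iops_tier == 3:
        return Action.MOVE_TO_LOWER_TIER
    # Assumption: any other combination is left to the pool's own tiering rules.
    return Action.AUTO_TIERING
```

With this sketch, `placement_action(3, 3)` returns `Action.MOVE_TO_LOWER_TIER` and `placement_action(1, 1)` returns `Action.PIN_TO_TOP_TIER`.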
In this updated release of the Hitachi Storage Management Pack, we have introduced a new storage troubleshooting dashboard to identify and remediate potential risks and issues quickly within vROPS. This dashboard walks operations teams through 12 key questions and metrics that, based on past support experience, help get to a quick resolution of potential VM-storage root cause issues. It covers key health areas such as cache write pending levels, I/O port utilization, latency and storage processor busy metrics, all within one dashboard for a fully correlated view of the selected timeline.
We have also made improvements to the capacity savings dashboard so that customers can easily visualize the space savings from deduplication and compression deployed in their all-flash Hitachi storage and UCP converged systems. Administrators can now visually see answers to questions about the deduplication ratio, data reduction savings and the capacity trend if current space efficiency rates continue.
This release also supports vROPS 6.7 and VMware’s vRealize Suite Lifecycle Manager (vRSLCM) for automated management pack updates. This will allow customers that leverage vRSLCM to automatically receive notification of partner management pack updates and have those updates downloaded directly to their system from VMware's (VSX) marketplace.
While UCP Advisor is morphing into the flagship management integration for all Hitachi infrastructure, we continue to evolve the storage plugin for vCenter. This release includes substantially improved datastore provisioning times on our recently announced next-generation VSP storage platforms using the latest API integration. It also fully supports provisioning against deduplication and compression based storage pools.
From a VMware certification point of view:-
Stay tuned as we continue to evolve these and other VMware ecosystem integrations.
Software is often improved through new versions, and of course Hitachi provides version upgrades for its software.
I decided to install HIAA 3.2 in place of 3.1. HIAA provides scripts for hassle-free upgrading. Let me give you an overview of upgrading from version 3.1 to 3.2 in my case.
(Please follow the procedure provided in the User Guide when you upgrade your system.)
Upgrade installers are provided for both Windows and Linux environments. I am going to introduce the workflow for a Linux host.
There are four steps in the upgrade process.
You need to obtain the software media. Then, the files should be copied to the hosts. I copied the files using scp, but you can use any method to copy them.
Type these commands on the laptop.
# scp -r /mount-point/ANALYTICS root@HDCA_host:/root/
# scp -r /mount-point/DCAPROBE root@probe_host:/root/
Don't forget to back up the settings of your environment. Please refer to the user guide for how to stop the HIAA/HDCA/Probe server services.
To back up the HIAA server, the backupsystem command is available. This command copies the configuration files. You also need to copy some other files on the HDCA server and the Probe server. The user guide lists which files should be copied.
Here we go. It is time to run the upgrade scripts.
Before running the upgrade, you must save the certificate file. Please follow the detailed instructions in the user guide.
Then, run the upgrade script.
# cd /root/ANALYTICS (the destination directory where you put the files in section 1)
# ./analytics_install.sh VUP
Wait until a message appears on the screen.
Before running the upgrade, you must save the certificate file. Please follow the detailed instructions in the user guide.
Then, run the upgrade script.
# cd /root/DCAPROBE (the destination directory where you put the files in section 1)
# ./dcaprobe_install.sh VUP
Wait until a message appears on the screen.
After finishing the whole upgrade process, you can access the HIAA website. Before you access it, please remember to clear the browser cache.
These are all the steps that I performed. If you perform an upgrade, please read the user guide carefully and follow its steps.
For your reference, here are some links.
Thank you for your time. See you in the next article.
Hi, it’s time for Part 2. I will show you what I did in the setup procedure and the use case.
I set up the configuration in our lab. I am going to describe some workflows for HIAA, particularly with UCP CI.
The HIAA Probe communicates with the VSP G600 command device to retrieve storage information, including configuration and performance statistics. This requires a Fibre Channel connection between the VSP G600 and the server running the HIAA Probe server VM.
The HIAA Probe cannot communicate with a Brocade SAN switch directly. If you would like to check Brocade switch performance, you need Brocade Network Advisor (BNA) to retrieve the information.
There are two options for installing HIAA. One is to deploy an OVA virtual machine image. The other is to run the installer on a host.
I chose the first one (OVA file deployment). I created three VMs as shown in Fig1. VM1 is for the HDCA/HIAA server. VM2 is for the Probe server. VM3 runs Windows and is the host OS on which BNA is installed.
Further detailed HIAA installation instructions are shown below.
After installing HIAA, I added the target probes. A probe is a kind of module that retrieves information from target machines. Probes for Hitachi storage, BNA and vCenter are available.
If you don’t see any information in HIAA and there are no errors, I recommend waiting an hour. It may take time because the retrieved information is handed over across three servers (Probe – HDCA – HIAA).
When components are registered in HIAA, HIAA recognizes them as just individual resources. “Consumer” is a great feature for grouping resources. I made a Consumer that groups the UCP CI resources. This gives the administrator a clear view of the UCP CI converged system.
I will show you an example performance analytics use case for UCP CI with HIAA.
1. Finding a performance problem in vCenter
In most cases, the administrator of a virtualized environment manages the infrastructure with vCenter. Let's say the administrator found excessively high latency for storage (fc2) reported in vCenter. (Fig4)
Fig4. Storage view in vCenter (Latency)
2. Problem Analytics using HIAA
From this step, we hand over to HIAA. Log in to HIAA and check the Dashboard.
Fig5. HIAA Dashboard
A "Critical" alert comes up in the Dashboard. Then, jump into the E2E view.
Fig6. E2E View
This E2E view shows the relationship from VM to LDEV. In this screen, HIAA shows that the Storage Processor and LDEV are busy. The HIAA E2E view shows that the VMs and the host server are fine. These VMs use an LDEV and MP in the G600 that are marked as Critical.
Let's go back to vCenter, then open the VM Monitor tab. We can see the performance of the disk mounted on the VM. The VM's disk performance actually dropped (for example, VM fc-02). Meanwhile, another VM, fc-18, started write I/O to its disk. I would like to improve performance, but I also want to keep the I/O of both fc-02 and fc-18.
Fig8. Disk of VM fc-18 performance (from VM performance view)
Let’s drill down to find the bottleneck.
3. Bottleneck investigation
First, I checked storage performance. In the screen below (Fig9), only MPU-10 is busy. This issue must be caused by uneven MP unit assignment. The workload of MPU-10 is above the critical line (red line), while the other MPUs are idle. In this example, MPU-10 is the primary bottleneck.
Fig9. Sparkline view
4. Performance improvement
Next, what can we do to resolve the overload on MPU-10? The primary option is to offload workload to the other MPUs.
I have to change the MPU assignment configuration. This operation distributes the workload to the other MPUs.
Then, we need to identify which LDEVs should be moved. The candidate LDEVs are shown below (Fig10). Click the busy MP, and HIAA shows the LDEVs related to that MPU.
Fig10. Relationship LDEV with MPU
I distributed the MP unit assignment across all MP units. Then all MPUs started working evenly. (Fig11)
Fig11. After resolving MPU-10 overload
Finally, the MP overload was resolved.
Fig12. All resources are fine
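The redistribution I did by hand can be thought of as a small balancing problem. The Python sketch below illustrates one common heuristic (heaviest LDEV first, always onto the least-loaded MPU). It is not what HIAA or the storage system runs, and the LDEV IDs, load figures and MPU names other than MPU-10 are invented for the example.

```python
import heapq


def rebalance(ldev_load, mpus):
    """Greedy assignment: heaviest LDEV first, always to the least-loaded MPU.

    Returns (assignment, totals), where assignment maps each MPU to its
    LDEVs and totals maps each MPU to the summed load of those LDEVs.
    """
    heap = [(0.0, mpu) for mpu in mpus]
    heapq.heapify(heap)
    assignment = {mpu: [] for mpu in mpus}
    by_load = sorted(ldev_load.items(), key=lambda kv: kv[1], reverse=True)
    for ldev, load in by_load:
        total, mpu = heapq.heappop(heap)      # least-loaded MPU so far
        assignment[mpu].append(ldev)
        heapq.heappush(heap, (total + load, mpu))
    totals = {m: sum(ldev_load[l] for l in ldevs) for m, ldevs in assignment.items()}
    return assignment, totals


# Hypothetical per-LDEV load (percent of one MPU) and MPU names.
loads = {"00:10": 30, "00:11": 25, "00:12": 20, "00:13": 15, "00:14": 10, "00:15": 5}
assignment, totals = rebalance(loads, ["MPU-10", "MPU-11", "MPU-20", "MPU-21"])
```

In this sample, all six LDEVs on a single MPU would add up to 105, whereas after the greedy pass no MPU carries more than 30.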
I introduced the value of combining HIAA and UCP CI. In the use case section, I showed you one example of resolving a performance issue in a UCP CI environment.
I hope you enjoy the HIAA and UCP CI solution. Thank you for taking the time to read.
Please refer to the further information below.
Videos on YouTube:
"Detecting Performance Bottlenecks using E2E view in Hitachi Infrastructure Analytics Advisor"
"Dynamic Threshold Storage Resource Monitoring With Performance Analytics, Using HIAA"
"Using HIAA to Analyze a Performance Bottleneck in Shared Infrastructure"
"Detecting Performance Bottlenecks Using Sparkline View"
Hitachi Vantara has launched the new converged infrastructure Hitachi Unified Computing Platform CI (UCP CI). Today, I would like to introduce the performance analysis solution with UCP CI.
Hitachi Infrastructure Analytics Advisor (HIAA) delivers visualization, intelligence and automation to optimize infrastructure health while quickly identifying and troubleshooting performance issues. UCP CI is an optimized and scalable converged infrastructure platform. In this series of posts, we will cover use cases of what can be done with HIAA and UCP CI together.
Fig1 shows an example of an end-to-end (E2E) map, which shows the topology from a specific running VM through the connected switch to the storage LUN in use.
Fig1: HIAA E2E View
In this series of posts, we will cover:
Hitachi Infrastructure Analytics Advisor (HIAA) includes the tools to properly monitor and analyze performance statistics from the application through its entire data path to the shared storage resources. Generally, converged infrastructure like UCP CI provides easy management to customers. At the same time, converged infrastructure conceals the details of the infrastructure, which makes troubleshooting difficult.
The features of HIAA provide solutions to these pain points.
Some of the key features include:
Also, the HIAA team has posted great videos on YouTube. Check them out!
Detecting Performance Bottlenecks using E2E view in Hitachi Infrastructure Analytics Advisor
Dynamic Threshold Storage Resource Monitoring With Performance Analytics, Using HIAA
Using HIAA to Analyze a Performance Bottleneck in Shared Infrastructure
Detecting Performance Bottlenecks Using Sparkline View
Analyze Configuration Changes in Your Infrastructure to Solve Performance Problems
(Updated on 10/19) HIAA v3.2 is now available. HIAA v3.2 supports integration with Hitachi Storage Management Pack for VMware vRealize Operations(vROPS) v1.7. Thanks to this integration, vROPS retrieves storage performance, capacity and related health metrics from HIAA. Note, Hitachi Tuning Manager is no longer supported and management pack is available from VMware marketplace.
Hitachi Vantara launched UCP CI in September 2017. This is a new series of Hitachi converged infrastructure. The UCP CI architecture consists of Intel-based rackmount servers, Hitachi storage and switches.
UCP CI Components Overview:
The combination of HIAA with UCP CI provides many benefits to customers running a UCP CI virtualized environment.
UCP Advisor is the management software sold with UCP CI that simplifies configuration and management of the UCP CI converged infrastructure.
HIAA provides an additional value for customers with the ability to monitor, analyze and troubleshoot system performance issues by showing an end-to-end topology and system-wide relationship of hardware and software components.
In addition, HIAA can show detailed performance statistics for the entire UCP CI stack, ranging from storage and SAN to the hypervisor (VMware).
The dashboards and charts are extremely helpful for absorbing large amounts of performance related information in an organized and simplified manner.
This is an overview of the configuration built in our Solution Lab.
Fig2: Configuration Overview
We can provide a free version of HIAA for customer trials. It has no functional limitations, but it expires 90 days after installation. If you are interested in the trial license, please contact the HIAA PM D-List or the author (Koji Watanabe).
Also, you can obtain a 120-day trial version of BNA from the Brocade website.
Today, I have introduced the value of combining HIAA and UCP CI. Together, these two products provide low-touch infrastructure and easy performance analysis.
I will show you "Performance troubleshooting using HIAA" in the second post of this series. Stay tuned!
Last week I was out in Las Vegas at VMworld 2017 - An incredible event for both VMware and for us at Hitachi! At a high level, VMware clearly demonstrated that not only is Private Cloud accelerating, but Hybrid Cloud is now a reality and the future rests on cross-cloud services tied to network and security virtualization.
Beyond the hype (after all this is Las Vegas...) it's clear that both the Private and Public Cloud are maturing quite quickly and that enterprise clients are looking to accelerate from Strategy to Execution. While some initial thoughts around the cloud centered on cost savings, it's clear today that the real gains come from the Agility associated with Private/Hybrid Cloud. Being able to "Run any application, in any cloud on any device" gives enterprises the opportunity to build and run their applications across a wide variety of infrastructure, platform and consumption models, driving increased flexibility and more rapid innovation. Most importantly, it gives enterprises the flexibility to develop applications on a variable cost basis, with the option to bring them back in-house should business requirements change.
For more thoughts on the VMworld show and my personal reflections on the future please visit "The Clouds are Clearing...VMworld 2017 Reflections and Predictions"
I'd also encourage you to read my colleague Bob Madaio's thoughts "A (mostly) Grown-up Take on VMworld"
So what does it mean to Hitachi? Well, the maturation of Private and Hybrid Cloud is exciting because it enables us, at Hitachi, to enhance the depth of the relationships with our clients. Specifically, as it relates to VMware and cloud adoption we leveraged the show to demonstrate 3 key offerings:
The vision of cloud agility is finally coming to life and Hitachi is excited to be at the forefront of solutions that accelerate deployment.
With the recent announcement of our VMware Cloud Foundation (VCF) powered UCP RS system to deliver a hybrid cloud reality (check Dinesh's blog here for details), one of the interesting questions from early prospects is a request for advice or guidance on how others are managing a hybrid private environment that consists of a traditional VMFS environment (and lately VVol) as they bring VMware vSAN based architectures into their environments. The outcome they want is a pool of resources, accessible to the various lines of business or application teams, that offers different characteristics while giving those consumers some level of intuitive control over where their assets will run, so that they can meet their intended SLAs.
Here are some Amazon EBS storage options to give a perspective on why this will be important in your VMware powered private hybrid cloud designs. Each separate EBS volume can be configured as EBS General Purpose (SSD), Provisioned IOPS (SSD), Throughput Optimized (HDD), or Cold (HDD) as needed. Amazon has stated that some of the best price/performance balanced workloads on EC2 take advantage of different volume types on a single EC2 instance. For example, they mention seeing Cassandra using General Purpose (SSD) volumes for data but Throughput Optimized (HDD) volumes for logs, or Hadoop using General Purpose (SSD) volumes for both data and logs. This level of differentiation is a first step in providing tiers of service to consumers of cloud resources.
Source: AWS Storage Options
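As a small illustration of that per-disk differentiation, the Cassandra and Hadoop examples can be written down as a lookup table. This Python sketch is my own: the `gp2`, `io1`, `st1` and `sc1` identifiers are the EBS volume type names for the four options as I understand them at the time of writing, and the structure and function name are hypothetical.

```python
# EBS volume type identifiers for the four options named above (assumed names).
EBS_TYPES = {
    "General Purpose (SSD)": "gp2",
    "Provisioned IOPS (SSD)": "io1",
    "Throughput Optimized (HDD)": "st1",
    "Cold (HDD)": "sc1",
}

# Per-application disk roles, following the Cassandra and Hadoop examples.
VOLUME_PLAN = {
    "cassandra": {"data": "General Purpose (SSD)", "logs": "Throughput Optimized (HDD)"},
    "hadoop": {"data": "General Purpose (SSD)", "logs": "General Purpose (SSD)"},
}


def volume_types(app):
    """Return the volume type identifier chosen for each disk role of an app."""
    return {role: EBS_TYPES[option] for role, option in VOLUME_PLAN[app].items()}


print(volume_types("cassandra"))
# prints {'data': 'gp2', 'logs': 'st1'}
```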
But again, performance is just one layer. There are many characteristics when it comes to SLAs. Take the "availability" characteristic. As you may know, because an EBS volume is created in a particular availability zone, the volume will be unavailable in other availability zones if the original availability zone itself becomes unavailable. Resources aren't replicated across regions unless you do so specifically. Again, that might be an important characteristic for an app service being rolled out. (To be fair to AWS, they recommend creating snapshots, as snapshots of volumes are available across all of the availability zones within a region.)
This is an area that I've put some cycles into with the team when we defined the requirements for the latest release of our Hitachi VASA Provider (VP), version 3.4, to operationally enhance the right consumption of resources for vSAN, VMFS and/or VVol. Based on the VVol/SPBM program, we took advantage of some of the storage container concepts and the latest tagging capabilities in vSphere 6.x to provide a better experience. With the latest Hitachi VP software, VMFS datastores (which may be adding datastore resources to an existing VCF based vSAN deployment or to a separate traditional VMFS environment) will be automatically tagged in vCenter with their specific SLA, including cost characteristics. Click to enlarge the GIF below to get a perspective of how the new VP WebUI (and API) provides the facility to assign capabilities to infrastructure resources, including automated vCenter tagging of VMFS datastores, while allowing vSAN datastores to be similarly tagged with appropriate category capabilities. The end result is a much more intuitive description of the resource capabilities available across vSAN, VMFS and VVol.
With this automated tagging of capabilities on existing and new datastores, vSphere policies can now be much richer and more descriptive to consumers. Click to enlarge the animated GIF below as it rolls through a typical vSphere policy, in this case a policy describing "Tier 1 Performance and DR Availability" with rulesets for VMFS, VVol and vSAN within the same policy. In my lab environment, this policy, with its Tier 1 performance, Tier 2 availability and lowest cost capability, found matching storage on all three entities, allowing the consumer to pick one of the choices.
The VMFS datastore highlighted below was configured to provide the highest level of availability and performance (a GAD multi-datacenter active-active replicated LDEV using accelerated flash on F1500 with data at rest encryption), and the VP software automatically tagged the corresponding datastore with the following capabilities: Tier 1 availability and performance, encryption, and cost between 750 and 1000 units. This datastore would be a match when app owners or admins selected the "Tier 1 Performance, Encrypted and Active-Active availability" policy, which in my lab environment ruled out vSAN and VVol as potential targets.
Take the Apache Cassandra application example from Amazon, which I wanted to deploy on the VCF powered UCP RS system. During provisioning, I assigned the appropriate policy, expressed in terms an application owner can understand, to each of the disks: the high performance, lower capacity data disks for the Cassandra VM landed on the vSAN datastore, while the log disk, 10x the size, landed on the iSCSI VMFS datastore. I didn't consume unnecessary storage from my all-flash vSAN, as the VMFS datastore (and VVol datastore) was a suitable match for the characteristics of the log data in this example. There is so much more that can be exploited when you consider that these capabilities can easily be extended and expressed for other infrastructure resources.
In summary, when it comes to provisioning resources, whether it's from the vSphere Client or vRealize Automation with its SPBM awareness, these richer policies are selectable to ensure appropriate resources are chosen at the VM level or indeed the VMDK level. Taking a leaf out of Amazon's trees in EC2, this is the type of resource variability and ease of consumption needed to run a sustainable cloud environment meeting diverse needs across many application services as you update and modernize your infrastructure.
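The matching of tagged datastores against a policy can be pictured as a simple filter over capabilities. The Python sketch below is only a model of that idea, not how vCenter SPBM or the VP software evaluates rules. The datastore names, the vSAN and VVol capability values, and the rule that a lower tier number satisfies a higher-numbered requirement are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Datastore:
    name: str
    kind: str                # "VMFS", "VVol" or "vSAN"
    performance_tier: int    # 1 is the best tier
    availability_tier: int   # 1 is the best tier
    encrypted: bool
    cost_units: int


@dataclass(frozen=True)
class Policy:
    performance_tier: int
    availability_tier: int
    encrypted: bool = False
    max_cost_units: Optional[int] = None


def matches(ds: Datastore, policy: Policy) -> bool:
    """True if the datastore's tagged capabilities satisfy every policy rule."""
    if ds.performance_tier > policy.performance_tier:
        return False
    if ds.availability_tier > policy.availability_tier:
        return False
    if policy.encrypted and not ds.encrypted:
        return False
    if policy.max_cost_units is not None and ds.cost_units > policy.max_cost_units:
        return False
    return True


def candidates(datastores, policy):
    return [ds.name for ds in datastores if matches(ds, policy)]


# Hypothetical lab inventory; only the VMFS entry mirrors the tags described above.
LAB = [
    Datastore("vmfs-gad", "VMFS", 1, 1, True, 900),
    Datastore("vsan-01", "vSAN", 1, 2, False, 500),
    Datastore("vvol-01", "VVol", 2, 2, False, 300),
]
TIER1_ENCRYPTED_AA = Policy(performance_tier=1, availability_tier=1, encrypted=True)
print(candidates(LAB, TIER1_ENCRYPTED_AA))
# prints ['vmfs-gad']
```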
Check out the live demonstration of VCF powered UCP RS and Hitachi VASA (VP) Software at #VMworld 2017
I've recently moved from Horizontal Platforms to Vertical Solutions and I feel it might be a good time to revisit one of my old posts (Can we please stop telling Digital Enterprises to “act like a startup”?) and look at how this applies to one of my core customer segments: Retail banking. Specifically, let's look at how their business differs from the Fintech startups and how to apply the three Digital Innovation practices (Infrastructure Modernization, Digital Workplace and Business Insight):
Practice 1: Infrastructure Modernization - "How can I run my apps more efficiently and deliver innovation faster?"
Retail banking needs to provide a full portfolio of services to its customers and not all of these are profitable. By comparison, Fintech startups can choose to offer just the profitable services (e.g. Payments). In order to stay in business these banks are forced to think about their applications in two categories:
There are good reasons to ensure strong isolation between these two parts of the bank. The legacy systems are just not designed for the unpredictable workloads and volume of read/query activity associated with digital banking. Mode 1 workloads are typically protected by perimeter security, whereas Mode 2 workloads face off to a variety of end user devices, third party systems and external threats. The digital systems will therefore implement micro-segmentation and a variety of techniques to guard against DDoS, for example.
But there is another element that is often missed when rethinking the platform to support Bimodal Banking: the Data Integration layer. Both of these sides of the bank still need to fit into a joined-up multi-channel strategy and provide a seamless experience to the customer. Both sides of the bank will form part of the customer 360 / KYC picture that the bank needs to implement. Furthermore, the data in Mode 1 systems is often fragmented, and these systems need to be insulated from unpredictable workloads and threats, so Mode 2 systems will typically implement a separate caching layer or operational data store. The Data Bridge between Mode 1 and Mode 2 systems is therefore a key success criterion that will determine how rapidly the bank can deliver new experiences to its customer base. We therefore see this as a key part of the Digital Innovation Platform.
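One simple form of such a caching layer is a read-through cache with a time-to-live, so that repeated digital-channel reads do not each reach the Mode 1 system. The Python sketch below is a generic illustration under my own assumptions, not a description of any bank's or Hitachi's implementation. The class name, the loader callback and the TTL value are all hypothetical.

```python
import time


class ReadThroughCache:
    """Serve repeated reads from memory and reload from the backend after a TTL."""

    def __init__(self, loader, ttl_seconds=30.0, clock=time.monotonic):
        self._loader = loader        # function that reads from the Mode 1 system
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the behavior can be tested
        self._entries = {}           # key -> (time loaded, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # fresh hit, backend is not touched
        value = self._loader(key)    # miss or stale entry, go to the backend once
        self._entries[key] = (now, value)
        return value
```

A Mode 2 service would wrap its call to the legacy system in the `loader` function. The trade-off is that reads can be up to `ttl_seconds` stale, which has to be acceptable for the data in question.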
...In the next part of this blog I will look at the next two practices: Digital Workplace and Business Insight
Ok, time for part 2. I'm back on and connected after a few days zip-lining and mountain biking through redwood trees and train tracks in northern California. As I was contemplating part 2, the biking time reminded me that the end game of infrastructure automation software is not too different. You want to spend the best quality bike time on the downhill, adrenaline-inducing, sweeping single track through the trees rather than on the mundane paved path to the mountain. In other words, let infrastructure work for you with automation, rather than tediously working the infrastructure yourself, to get better ROI from your quality time.
In Part 1 of this series, I started to peel back some of the well known UCP Advisor features that our customers are using when deploying our infrastructure automation software while sharing some of the updates we made in the most recent UCP Advisor v1.2 release. In this blog, I want to touch on aspects of networking mgmt, day 0 + day 90 administration and cool integrated data protection features.
So on to networking. I covered automating all the aspects around deploying storage datastores and compute ESXi hosts in the previous post, and I wanted to complete the 3rd leg, the important networking management aspect. From an IP networking perspective, I believe the two key aspects are VLAN management and topology views. When you update the VLANs on your distributed virtual switches, UCP Advisor provides an automated facility to synchronize VLANs to the top of rack and/or spine switches that make up your networking fabric. It also provides connectivity information so you can quickly determine the physical infrastructure connectivity topology between ESXi hosts and IP infrastructure. You can visualize some of this by clicking on the animated GIF below. Of course, firmware upgrade management, which I'll chat about in part 3, is included for the networking switches.
Circling back to day 0 type operations from an administration perspective, most environments do or will end up with multiple appliances, whether it's 30 satellite offices each with local needs or a datacenter with multiple UCP appliance pods for application, security and/or multi-tenancy requirements. UCP Advisor has a distributed model to manage multiple appliances from a single vCenter, including enhanced linked mode configurations (vSphere 6.5 is newly supported in the 1.2 release). Each appliance or logical configuration has a dedicated control VM appliance (a small Win2k16-based CVM), so scalability is limited only by the maximum number of ESXi hosts vCenter can manage, 1000 at last check. Each appliance or logical system can be quickly on-boarded using a CSV configuration that describes the appliance, while new infrastructure elements (e.g. adding a new chassis of compute on day 89) can be on-boarded using the UI. The administration tab also covers aspects like setting the schedule for automated backup of infrastructure config components, specifically the IP network and FC device configurations.
Speaking of data protection, UCP Advisor always provides integrated VM and datastore level operational backup and recovery capabilities when HDID software and its V2I component is recognized as being deployed. This is accessible through the data management services tab. With data protection moving to a snap and replicate model vs traditional backup to meet both scalability and fast self-service recovery, I think this is an important inclusion. The ability to have every newly deployed VM automatically protected, and to do full or granular recovery of VM data at the drop of a hat, is key when your users need it, especially if it's a multi-TB VM and time is money... The GIF visual shows you some aspects of this, and there are more details on the VMware protection options in a previous blog I wrote a while back. For vSAN based UCP HC, HDID offers VADP based backup as well.
In part 3, I'll free wheel home and close out by covering automated firmware management, physical workflow capabilities for bare metal support or custom infrastructure needs, and some of the vRO and Powershell integrations that are available to further automate your cloud deployment with HDS UCP and UCP Advisor. Feel free to drop a comment or question on any aspect, or on what you would like me to cover in more detail.
We recently rolled out the latest release of UCP Advisor, v1.2, our flagship infrastructure automation software for converged, hyper-converged and standalone storage. In a previous blog, I included a longish voice-over video which rolled through the various features, but I thought I would take the opportunity to peel back the features in shorter bites while also referencing the latest value features introduced in version 1.2.
An essential element in converged automation is simplifying the operations and deployment of ESXi hosts, datastores and virtual to physical VLAN synchronization actions. These entities are what UCP Advisor calls virtual/logical resources. <Click animated GIF for visual>
Take the all-important datastore management, which traditionally involves multiple admin groups and many days for completion of service tickets. UCP Advisor provides an intuitive interface and workflows for VMFS/NFS datastore creation. It hides all the creation complexities (validating FC zoning across multiple SAN switches, checking that the WWPNs of the ESXi host(s) are in the active zone and storage host groups, performing storage LUN creation/masking and finally attaching to the ESXi cluster) behind a single-click operation. Provisioning now takes under a minute. With the v1.2 release, we now provide full end-to-end workflow support for iSCSI and NFS datastores as well.
But we have taken this a step further and also generate unique vCenter tagging of the storage capabilities of the just created VMFS datastore(s) using associated HDS VASA Provider software (v3.4). Now the characteristics of that datastore (performance, availability, cost, encryption etc. etc.) are tagged and available to vSphere administrators to exploit in vCenter policy based management framework for provisioning operations whether from vCenter or higher level cloud automation. The vCenter tags also enable admins to quickly find all related objects, for example all datastores that match Tier 1 IOPS Performance + provide data at rest encryption. Pretty cool SPBM for VMFS. <Click animated GIF for full visual>
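To make the tag lookup idea concrete, here is a minimal sketch in plain Python. It does not talk to vCenter or the VASA Provider at all: the datastore names, the tag strings and the `find_datastores` helper are all made up for illustration, and the only thing it shows is the "match every required tag" logic behind a query like Tier 1 IOPS plus encryption.

```python
# Hypothetical inventory: datastore name -> set of capability tags.
# All names and tag strings here are invented for illustration.
DATASTORE_TAGS = {
    "ds-gold-01": {"Tier1_IOPS", "Encryption"},
    "ds-gold-02": {"Tier1_IOPS"},
    "ds-bronze-01": {"Tier3_IOPS", "Encryption"},
}


def find_datastores(inventory, required_tags):
    """Return the sorted names of datastores that carry every required tag."""
    required = set(required_tags)
    return sorted(name for name, tags in inventory.items() if required <= tags)


matches = find_datastores(DATASTORE_TAGS, ["Tier1_IOPS", "Encryption"])
print(matches)
```

Only the datastore carrying both tags comes back; a datastore with just one of them is filtered out.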
As referenced earlier, UCP Advisor supports vSAN based hyperconverged systems like UCP HC (updated support in v1.2 for vSAN 6.6), converged infrastructure like UCP 2000 that uses compute and external storage, and a mode called Logical UCP which can manage flexible configurations including standalone storage. For vSAN based UCP HC, UCP Advisor provides visibility into the health and capacity of the vSAN compute nodes' respective SSD/HDD(s) that form the cache and capacity tiers of vSAN datastores, and also visibility into non-allocated devices. It also provides access to compute inventory, topology and operations such as boot order, power and LID operations and, most importantly, firmware management, which I'll cover in a subsequent blog in this series. <Click on animated GIF for full visual>
Speaking of ESXi compute nodes, UCP Advisor can also deploy new or non-allocated ESXi compute nodes into ESXi clusters running on UCP 2000. It will surface un-allocated compute nodes in the UCP 2000 config (which are SAN Boot ESXi nodes), check and update the firmware of the node(s) to match the cluster, and verify that the WWPNs on the new host are correctly configured in active SAN zones. After deployment, it will ensure all existing VMFS and NFS datastores in the cluster are available and presented to the new node(s). Again, this dramatically reduces the time to use for new compute resources added into the environment, providing the turnaround times now expected in the age of public computing.
<Click on animated GIF for full visual>
In part 2 of this series, I will cover aspects of networking mgmt, on-boarding administration, topology views and integrated data protection, with more to follow in part 3.
I recently published a short video blog, Let's hear it - Introduction to UCP Advisor, which introduced new converged and hyper-converged infrastructure automation and delivery software from HDS. There was some great feedback but, as expected, folks were asking for more technical details and an opportunity to see the product in action. With that, here is a 20+ min video I put together which walks through the product, including compute, storage, network, data protection and advanced infrastructure management capabilities. As mentioned in the previous blog, the intent is to put infrastructure tasks within the reach of the efficient fingertips of administrators to enable them to accelerate and manage the delivery of VM based application services on that dynamic infrastructure.
Reminder: You can view video on YouTube by selecting icon on the bottom but ensure the quality settings are set to 720P to view it if it starts looking blurry.
<updated Video based on version 1.2 released in June 2017, here is link to 1.2 related blog>
Traditional agent-based backup and recovery solutions can dramatically impact the security, performance and total cost of ownership of virtualized environments. As organizations expand their use of virtualization and hyper-converged infrastructure like VMware vSAN, they need to closely examine whether their data protection strategy supports efficient, fast, secure backups that won’t tax storage, network, budget, or computing resources. As data grows, the need for more frequent data protection and a variety of other challenges have forced administrators to look for alternatives to traditional backups.
Initially, most backup administrators chose to back up virtual machines by deploying backup agents to each individual virtual machine. Ultimately, however, this approach proved to be inefficient at best. As virtual machines proliferated, managing large numbers of backup agents became challenging. Never mind the fact that, at the time, many backup products were licensed on a per-agent basis. Resource contention also became a huge issue since running multiple, parallel virtual machine backups can exert a significant load on a host server and the underlying storage. Traditional backup and recovery strategies are not adequate to deliver the kind of granular recovery demanded by today’s businesses. Point solutions only further complicate matters, by not safeguarding against local or site failures, while increasing licensing, training and management costs.
Hitachi Data Instance Director (HDID) is the solution to protect Hitachi Unified Compute Platform HC V240 (UCP HC V240) in a hyper-converged infrastructure. The solution focuses on the VMware vStorage API for Data Protection (VMware VADP) backup option for software-defined storage. Data Instance Director protects a VMware vSphere environment as a 4-node chassis data solution with options for replicating data to outside the chassis.
Hitachi Data Instance Director provides business-defined data protection so you can modernize, simplify and unify your operational recovery, disaster recovery, and long-term retention operations. HDID provides storage-based protection of the VMware vSphere environment.
Data Instance Director with VMware vStorage API for Data Protection provides the following:
Figure shows the high-level infrastructure for this solution
Below are the Use cases and results
Use Case 1 — Measure the backup-window and storage usage for the VMware VADP backup using Hitachi Data Instance Director on a VMware vSAN datastore.
Deploy the DB VMDKs of the eight virtual machines evenly on two VMware ESXi hosts with VMware vSAN datastores. The workload runs for 36 hours during the backup test. Take the measurement with the quiesce option both enabled and disabled. The test consists of an initial full backup followed by a later incremental backup.
Initial Full backup
Backup time : 52 Min
Storage used : 1920 GB
Incremental Backup with Quiesce ON
Backup time : 4 Min 15 Sec
Storage used : 35.02 GB
Incremental Backup with Quiesce OFF
Backup time : 2 Min 25 Sec
Storage used : 34.9 GB
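As a rough way to compare the three runs above, the short sketch below derives a GB-per-minute figure and each run's share of the initial full backup. It uses only the times and sizes listed above. Treating "storage used" as a stand-in for data written is my assumption, so read the rates as indicative rather than as measured throughput.

```python
def minutes(mins, secs=0):
    """Convert a minutes/seconds pair into fractional minutes."""
    return mins + secs / 60


# (backup time in minutes, storage used in GB), taken from the results above.
RUNS = {
    "initial full": (minutes(52), 1920.0),
    "incremental, quiesce on": (minutes(4, 15), 35.02),
    "incremental, quiesce off": (minutes(2, 25), 34.9),
}


def rate_gb_per_min(duration_min, stored_gb):
    """Storage used per minute of backup time (an assumed proxy for throughput)."""
    return stored_gb / duration_min


full_min, full_gb = RUNS["initial full"]
for name, (duration, stored) in RUNS.items():
    print(
        f"{name}: {rate_gb_per_min(duration, stored):.1f} GB/min, "
        f"{duration / full_min:.1%} of full backup time, "
        f"{stored / full_gb:.1%} of full backup storage"
    )
```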
Use Case 2 — Create a cloned virtual machine from the Hitachi Data Instance Director backup
Restore a virtual machine after taking a Hitachi Data Instance Director backup. Measure the timestamp of the restore operation.
Restore backup with HDID
Restore time : 22 Min 15 Sec
Storage used : 213 GB
With Hitachi Data Instance Director, you can achieve broader data protection options on the VMware virtualized environment. With VMware VADP CBT, the backup window for the incremental backup was relatively short and optimized.
Please Click Here to get tech note Protect Hitachi Unified Compute Platform HC with VMware vSphere and Hitachi Data Instance Director
You’ll recall from my last blog that I volunteered to give a presentation to an organisation in London and found myself having signed up to deliver an evening lecture to the Institute of Engineering and Technology on the subject of cloud. I had managed to pull some material together and coerce a colleague into sharing some of the load by applying an equal degree of vagueness in the description!
So we had a story, I had a willing partner to help share the challenge, we had overcome the anticipation and were as ready as we could be! The Presentation was polished during the day in between client meetings and we headed to the venue for the evening event.
The building where the lecture was to be given wasn’t intimidating at all, nor was all the signage hanging in the entrance hall in anticipation.
As if the pressure couldn’t have been any greater the venue we were to be using to give our talk was none other than the Alan Turing Lecture Theatre, named after arguably the founding father of modern computing. The registered attendees numbered 100-150, there was to be tea and coffee on arrival followed by a drinks and nibbles reception afterwards with the night concluding around 9PM.
We quickly set up, dumped our bags and then headed to the nearest watering hole for a sherbet and lemonade as a steadier in preparation for the event! On our return we kicked off and were introduced on stage by the event organiser. Surprisingly (for me) the audience seemed to be very aware of Cloud technologies and the Cloud field in general, so I was sincerely hoping they would be able to get something out of the event.
Sylvain and I delivered our presentation, which was well received. The audience listened intently and made notes. We covered the HEC value proposition, the key differences between Public and Private Cloud and the fact that our HEC solution offers the public cloud consumption experience of self service and pay per use with the security / latency benefits of retaining IT on premise in a client's data centre. We covered our SLA driven approach to selling, our pricing being more competitive than a Public Cloud alternative and having a holistic solution to address a changing market.
Following the presentation, we took some fantastic questions from the audience, which were very balanced and somewhat different to what we had heard before due to the diversity of the audience. People were very keen to understand our IoT story as well as our approach to things like machine learning algorithms. The questions would have continued beyond the allotted time, but the organisers stopped them to allow us to retire to the drinks reception.
Now the event was over we could relax and managed to meet many of the members and people from the audience. The feedback was good and they enjoyed the lively debate, some areas of particular interest were what our views were on edge based data analytics and machine learning integration with Cloud IT. I found these discussions to be very enlightening hearing opinions on the industry from outsiders who have a different (and often very well informed) perspective on what we are doing.
I managed to team up with a small group including a Dutchman involved in 3D printing of industrial wind turbine blades (who kindly liberated a bottle of wine for us from the main table) and a retired gentleman who was very well read on the subjects of cloud computing following a 60 year career in IT. I avoided the fact that I was born half way through his career but I think I got away with it.
Although I started this as a “never volunteer for anything” story, that’s not how I look back on the experience. Often we choose to do things squarely inside our comfort zone, but it’s very fulfilling to step outside it now and again. We also tend to stick, socially and professionally, to the circles of our peers or of customers looking to buy what we have to offer. I found it particularly enlightening to hear the opinions of people with a really diverse set of backgrounds, whom I would never come into contact with ordinarily. So in conclusion I’d say: take the time to do things you wouldn’t ordinarily do and hear from people you wouldn’t ordinarily expect to speak to. You’ll be pleased you did.
With memories of Sapper Featherstone, British Army - Royal Engineers circa 1946
All right, this is the technical part, describing how to build the blueprint and what to configure in NSX to make it work as described in the overview. Let's get started, shall we?
First things first. I have to create a list of requirements in order to master all the challenges such a micro DMZ concept brings. Let's see what we need:
OK - that should be it. In this part I will focus on the NSX config in the blueprint and the designer, assuming everything else is just fine and has been pre-configured and installed by our fine consulting folks. Just like a customer, I am eager to use it - not to install it.
OK, I decided to start with the very important and yet super complex NSX integration...
Alright, you got me there, it is actually not that complex to integrate
First I created some NSX Security Tags. These can be used to identify VMs and run actions based on the tags found. They are also a smart way of dynamically adding VMs to security groups in NSX. In order to use them in the HEC blueprint canvas, the Tags need to already exist in NSX.
OK got it, but where do you create these Tags in the first place?
Well, this is done in the NSX management in vCenter. To create custom security tags, follow these steps:
OK, I created the tags "HEC_DB" and "HEC_Web" and am ready for action. These tags are now useable on VMs for advanced processing.
Also, I created two security groups:
To create those, go to Networking and Security and click on Service Composer in the left hand side menu.
These security groups are later used to apply the firewall rules onto. The Tags will be used to assign the VMs to their respective security group (DB VM to DbServer, WEB VM to WebServer), after the VM deployment.
This means you are now able to enforce firewall rules to VMs where you might not even know the IP address nor their subnet mask just by putting the VMs in NSX security groups.
Welcome, to the power of the Service Composer in NSX!
After the security groups have been created we have to set up the rules of engagement, ahem I mean, the rules for communication between the WEB server and the DB server. Since the WEB server is exposed to the internet, we do not want to have him chatty chatting to the DB server as he wishes. Therefore the communication between these two servers (WEB to DB) has to be limited as much as possible in order to keep the security high! These sophisticated firewall rules are set in so-called Security Policies.
We can create a new Security Policy by just clicking on the Security Policies tab and selecting the Create Security Policy icon.
Now you can specify rules for interaction between Security Groups on NSX or even from external sources (like the internet) to Security Groups. In our case, we want the following rules to apply for a secure configuration:
Voilà: That should be it. Now VMs in the DB security group will only allow VMs in the WEB security group access via the MySQL port. All other access is blocked. For the WEB servers, we are even stricter: from the perimeter firewall (aka: the internet), only HTTP and HTTPS will be let through to the WEB server. The only other server outside of the DMZ the WEB server can reach is the DB server, and that communication is only possible via the MySQL port to initiate DB queries.
You might wonder how to enforce all of this without specifying a single subnet or IP address? Well that is solved by the Security Tags. As soon as the VMs are assigned to the right policies in the Service Composer, the rules will be enforced on them, automagically!
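If it helps to see the logic spelled out, here is a minimal Python model of the policy described above. It is only a sketch, not NSX configuration and not an NSX API: the `Rule` class, `TAG_TO_GROUP` and `is_allowed` are names I made up, and the port numbers are the well-known defaults (3306 for MySQL, 80 for HTTP, 443 for HTTPS), which is an assumption about this particular setup. The point is that rules refer to security groups, VMs land in groups via their tags, and anything not explicitly allowed is denied.

```python
from dataclasses import dataclass

# Well-known default ports; the actual deployment may use different ones.
MYSQL, HTTP, HTTPS = 3306, 80, 443

# Hypothetical mapping from NSX security tag to security group.
TAG_TO_GROUP = {"HEC_DB": "DbServer", "HEC_Web": "WebServer"}


@dataclass(frozen=True)
class Rule:
    source: str       # security group name, or "internet"
    destination: str  # security group name
    port: int


# Explicit allow rules; everything else is dropped.
ALLOW_RULES = {
    Rule("internet", "WebServer", HTTP),
    Rule("internet", "WebServer", HTTPS),
    Rule("WebServer", "DbServer", MYSQL),
}


def group_of(tag):
    """Resolve a VM's security tag to its security group."""
    return TAG_TO_GROUP[tag]


def is_allowed(source, destination, port):
    """Default deny: traffic passes only if an explicit allow rule matches."""
    return Rule(source, destination, port) in ALLOW_RULES


web = group_of("HEC_Web")
db = group_of("HEC_DB")
print(is_allowed(web, db, MYSQL))         # WEB server querying the DB
print(is_allowed("internet", db, MYSQL))  # internet trying to reach the DB
print(is_allowed("internet", web, HTTPS)) # internet reaching the WEB server
```

Notice that no IP address or subnet appears anywhere in the rules; group membership is the only thing that matters.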
Assuming everything else is just fine and has been configured correctly, we can now start building the actual application. So let's get started with the design. Given that I have already created some installable components, so-called Application Blueprints, I can start dragging and dropping my way to a versatile multi-tier web application.
I decided to have a DB server and a WEB server (shocking - isn't it?). In the design canvas I dragged the DB components, such as the MySQL installation as well as the FST_Industries_DB component, onto the DB server.
To do this, simply drag and drop the packages onto the VMs. The FST_Industries_DB component customises the DB to set up a table space and makes some other minor edits to prepare the DB server for use by the WEB server.
After doing that, I dragged Apache, PHP and the FST_Industries_Web component onto the WEB server.
Besides installing all the software assets, the FST_Industries_Web component then creates an on-demand web site which accesses the DB server via its fully qualified domain name (FQDN). HEC will now install these packages on the specified VMs. It is important to know that all this data (IP addresses, domain names, DB names, etc.) is passed on as dynamic variables during the install. Otherwise it would be fairly complex to install anything on demand.
After the actual service design is done, we need to ensure that the VMs are tagged to auto assign them into the respective security groups in NSX. Therefore you can drag the Tags directly into the canvas.
Just drop it somewhere; for the sake of a clean graphic I put it on top of each of the VMs. By clicking on the dragged-in security tag, the actual tag value can be assigned. You will see a list of possible NSX security tags; pick HEC_DB for one and HEC_Web for the other - done.
Now, the tags need to be formally assigned to each of the VMs. This is done by clicking on the VM in the canvas and selecting the Security tab. In there you will see both tags available, just tick the one which applies:
You might wonder why both tags are always displayed in the security settings for the VM. This is because a VM can have multiple security tags - all tags dragged into the canvas will be shown. In our case it is important to avoid ticking both tags on the same VM, as this might shake up our well thought through security concept (however, it is easy to spot and fix).
Last but not least, both VMs need to be placed in an NSX network. For the DB VM, this network ("virtual wire" in NSX slang) needs to be set as an internal and protected network, since other DB servers might possibly run in there as well.
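A quick sanity check for the double-select problem could look like the sketch below. To be clear, this is not an HEC or NSX feature: `find_tag_problems` and the VM names are hypothetical, and the dictionary simply stands in for whatever is ticked on each VM's Security tab.

```python
# Hypothetical blueprint state: VM name -> security tags ticked on its Security tab.
def find_tag_problems(vm_tags):
    """Return {vm: message} for every VM that does not carry exactly one tag."""
    problems = {}
    for vm, tags in vm_tags.items():
        if len(tags) != 1:
            problems[vm] = f"expected exactly 1 tag, found {len(tags)}"
    return problems


good = {"db-vm": {"HEC_DB"}, "web-vm": {"HEC_Web"}}
bad = {"db-vm": {"HEC_DB", "HEC_Web"}, "web-vm": {"HEC_Web"}}

print(find_tag_problems(good))
print(find_tag_problems(bad))
```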
For the WEB server, we want to create the DMZ on demand. That means this network is not pre-existent at the time of deployment.
To accomplish this, we need to define two different types of networks in HEC:
Do not get over excited by the term "External" in this case; it refers to all networks that already exist before a service is deployed. The "Routed" network is different: it is a purely logical construct which only comes to life at the time of deployment. It will be configured to form smaller networks, and the newly created VMs are then placed into them.
Therefore its configuration might be a bit confusing in the first place. To configure the network profiles in HEC, go to Infrastructure -> Reservations -> Network Profiles and click on New to select either External or Routed.
The External one has to be pre-existing, which means it has to be defined in NSX before it can be added to HEC.
This means you have to create a new virtual wire in NSX prior to the selection in HEC.
The Routed one is more difficult, this is why I think it might be worth going over its options quickly. In the form you will see the following fields:
Provide a valid name: DMZ_OnDemand
Description: DMZ network, created on demand each time for every deployment
External Network profile: Transport*
Subnet mask: 255.255.192.0**
Range subnet mask: 255.255.255.240***
Base IP: 172.30.50.1
OK, here we are in the networking nirvana. What does all this mean? Let me explain the asterisks real quick:
*: The transport network for your DLR. This is configured during NSX setup for external network access. Describing how to do this would be too much detail for this blog post. In our case, it is named "Transport", but you can also name it Bob, Jon, or Fritzifratzi if that works better for your use case.
**: This is the subnet mask, defining how many devices we want to put into the micro DMZs. In this case it is a /18 subnet mask, which gives us "only" 16,382 addresses. You could also go for a /16 which would give you 65,534 or a /14 for a whopping 262,142 addresses. But be careful, all these addresses are pre-calculated by HEC, which can be quite CPU intense if you choose big ranges.
***: The subnet mask for the different small network areas. Basically it creates the "micro" networks within the given subnet mask (255.255.192.0), using the /28 subnet mask (255.255.255.240) to create nets with 14 useable addresses each.
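If you want to double-check the address counts quoted above, Python's standard `ipaddress` module will do the arithmetic. This is just a small sketch; the helper names `prefix_length` and `usable_hosts` are mine, and the "minus two" accounts for the network and broadcast addresses of each subnet.

```python
import ipaddress


def prefix_length(netmask):
    """Convert a dotted netmask such as 255.255.192.0 into a prefix length."""
    return ipaddress.ip_network(f"0.0.0.0/{netmask}").prefixlen


def usable_hosts(prefix):
    """Addresses in an IPv4 subnet, minus the network and broadcast addresses."""
    return 2 ** (32 - prefix) - 2


for mask in ("255.255.192.0", "255.255.255.240"):
    prefix = prefix_length(mask)
    print(f"{mask} is a /{prefix} with {usable_hosts(prefix):,} usable addresses")

for prefix in (16, 14):
    print(f"/{prefix} gives {usable_hosts(prefix):,} usable addresses")
```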
This means HEC will now go ahead and create as many small subnets as possible using the provided big /18 (255.255.192.0) subnet mask. In my case it will create network chunks looking like this:
Now you might wonder why there are small gaps between these address spaces. That is because only the 14 useable addresses are shown. For example, the first address is 172.30.50.1, the network address would be 172.30.50.0 and the broadcast address would be 172.30.50.15. So the entire network is actually 172.30.50.0 - 172.30.50.15. But given how networks work, the network address and the broadcast address can't be used for servers, leaving a total of 14 useable addresses. It is important to understand that principle in order to make the network chunks big enough for the number of servers to be placed in them.
If all these network calculations, slicing and subnetting are creating the father of all headaches, don't give up! There are quite nice websites which do all the calculations mentioned here for you. One of these sites can be found here:
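The slicing itself can be reproduced with the standard `ipaddress` module, using the base IP and the two masks from the form above. This is a sketch of the subnet arithmetic only: how HEC enumerates or numbers its chunks internally is not something I can show here, so starting the listing at the chunk that contains the base IP is my assumption.

```python
import ipaddress

BASE_IP = "172.30.50.1"     # Base IP from the routed network profile
BIG_MASK = "255.255.192.0"  # the /18 subnet mask
CHUNK_PREFIX = 28           # range subnet mask 255.255.255.240

# The /18 network that contains the base IP, sliced into /28 chunks.
big_net = ipaddress.ip_interface(f"{BASE_IP}/{BIG_MASK}").network
chunks = list(big_net.subnets(new_prefix=CHUNK_PREFIX))


def describe(chunk):
    """Return (network, first usable, last usable, broadcast) as strings."""
    hosts = list(chunk.hosts())
    return (
        str(chunk.network_address),
        str(hosts[0]),
        str(hosts[-1]),
        str(chunk.broadcast_address),
    )


# Assumption: begin at the chunk holding the base IP and show the next few.
base = ipaddress.ip_address(BASE_IP)
start = next(i for i, chunk in enumerate(chunks) if base in chunk)
for chunk in chunks[start:start + 3]:
    net, first, last, bcast = describe(chunk)
    print(f"{first} - {last}  (network {net}, broadcast {bcast})")
```

The gaps between consecutive useable ranges are exactly the broadcast address of one chunk and the network address of the next.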
Good, after all this hard work of clicking and brain twisting network mask calculations the setup is finally done.
We configured security tags and had them automatically assigned to the right VMs. Firewall rules ensure that only the allowed protocols pass from one security group to another.
The VMs and their software get installed by HEC. Once the tags are assigned and the VMs are installed, one VM is placed in a static network and the other in a routed network. The routed network will be sliced by a subnet algorithm to allow only 14 devices per slice, so each WEB server will have its own DMZ.
After all that has been configured by HEC, the NSX security kicks in and our freshly deployed application will work as intended and only let MySQL queries reach the DB server. Also, HTTP / HTTPS requests from the internet can only reach our WEB server running in its very own "private" DMZ. All of this is created for each and every new application being deployed.
Wow, after all this clicking and configuring and calculating we do have a quite comprehensive blueprint, not only setting up a full service with a single mouse click, but also providing enterprise grade IT security for each and every deployment.
Not only through the firewall and security capabilities of NSX, but also through the flexible and purpose ready design of a micro DMZ per WEB server per service. This is an achievement which would be fairly difficult to reach without the capable technologies introduced by HEC.
If you want to see all this running, stay tuned for the next article in this series showing all of this working in our HEC Solution Centre environment which is located in the Netherlands in a wonderful small town called Zaltbommel...