Table of Contents
- Key Concepts and Terminology
- Traditional versus Modern Data Architectures
- The Active Data Lake
- Understanding KVM
- Support for OpenStack
- Hitachi Data Systems – Patents
- Hardware Specifications
- HSP Management API Support
- Additional Resources
When it comes to analyzing large amounts of data, there are many different solutions in the marketplace with a wide variety of approaches to analyzing that data. New concepts and new approaches emerge on a regular basis, and can be mystifying.
One thing is certain: being able to create meaningful insight from large amounts of data that may be arriving in a variety of formats requires a different approach to infrastructure than what you find in a traditional data center today.
At Hitachi Data Systems, with our extensive experience in storage and computing innovation, backed by thousands of patents, we’ve developed the Hyper Scale-Out Platform (HSP) to deliver a scalable, reliable system that enables businesses to start analyzing vast quantities of data faster than ever, regardless of where that data is coming from and what format it arrives in. With a new platform and a new approach comes new terminology and a lot of questions. We developed the HSP CodeX to provide some introduction and insight into how we enable fast, scalable analysis of large data sets, and help demystify the concepts and technology behind HSP.
The Active Elastic Data Lake
Our premise is that customers need a modern data architecture which supports a blended data lake. The ideal data lake will be elastic of course, but also be active in the sense that applications will automatically run where the data is physically located. A scale-out distributed file system enables the elastic part, and a hyper-converged appliance enables the active part. Integration with Pentaho PDI provides the toolset to help blend data together so that you can extract analytical business insight.
Key Concepts and Terminology
With so many vendors and solutions in the marketplace today, you encounter many different definitions of hyper-convergence. To understand hyper-convergence, it helps to understand converged infrastructure. A converged infrastructure bundles compute, networking, storage and server virtualization resources and combines those elements with an integrated management software for “single pane of glass” management. The result is a mix of hardware that may come from different vendors, but is tested to work well together and often comes from a single vendor to simplify support.
Hyper-convergence takes the concept of converged infrastructure and makes it modular. Each module or node added to a hyper-converged infrastructure contains its own compute, storage, networking, and virtualization components. Because each node contains all of these elements, you can scale out quickly by adding additional nodes. Each node adds additional compute, storage, networking and server virtualization resources, so performance and capacity increase with each node.
Figure 1 - Converged Infrastructure - Storage, Compute, and Networking are bundled but separate
Figure 2 - Hyper-converged infrastructure - Storage, Compute, and Networking are integrated into a single 2U physical node
In a traditional infrastructure or a converged infrastructure, storage, compute, networking, and server virtualization are separate from one another and each can be scaled up independently. For example, you can add storage capacity by adding storage shelves and do so without adding additional networking or compute resources.
With a hyper-converged infrastructure like Hitachi’s Hyper Scale-Out Platform (HSP), each node you add increases your compute, networking, storage, and server virtualization resources, allowing you to scale out incrementally and linearly as your needs grow.
Distributed File System
For managing large amounts of data on a hyper-converged scale-out platform, a distributed file system is a critical component. A distributed file system like Hadoop’s HDFS allows you to use the storage of each node as part of a very large, scalable pool of storage distributed across all multiple nodes. A distributed file system allows you to work with very large sets of data and do so with high resilience, insuring that the loss of a disk or an individual node doesn’t impact your ability to recover. With a distributed file system, your pool of data can grow as large as you need by adding nodes.
HSP uses a distributed file system called eScaleFS that can be used for both Hadoop and POSIX/NFS access, allowing other Hadoop components like Hbase, Hive, Sqoop and Spark, as well as third-party application to take advantage of the storage and compute resources of HSP.
When spinning up applications that use a large amount of data, being able to bring all the resources online and group them together in a cluster and manage them as a unit. HSP allows you to do this with the Hitachi Advanced Analytics Services Manager (HAASM), provisioning a cluster in under a minute and performing end-to-end unified monitoring of the cluster.
HSP uses a similar three-copy principle to Hadoop’s to help ensure data resilience in the event of the need for a failover. Each node stored data in 64MB chunks on three separate nodes. As data streams in, it lands on a node. While the first node is receiving data, it begins forwarding a copy of the data to a second node, which in turn begins forwarding a copy of the data to a third node. This process is referred to as the data pipeline. Only one replica of data can be stored on a single node, and in a multi-rack configuration, no more than two copies of data are stored within a single rack.
Figure 3 - Data pipeline
(Click image to enlarge)
As a result, HSP is fully fault tolerant with no single point of failure. If you lose a disk in a node, your data is rebuilt on other disks. If an entire node is lost, data is rebuilt on another node. When there are enough racks, if an entire rack is lost, data is rebuilt on other racks.
Virtualization abstracts the operating system from being tied directly to the hardware it’s running on. Through the use of hypervisors, you can run multiple servers on a single hardware server, each with its own memory, storage, computing, and network resources. This has allowed organizations to save on hardware costs, data center space, and energy costs by consolidating many standalone servers on to a single server managed by the hypervisor.
However, this hasn’t been without its own risks. With many virtual machine (VM) guest operating systems running on a single VM host, a failure of the VM host can take out many servers at once, crippling many business-critical applications. To help offset the risk, makers of virtualization software introduced high availability features that allow VM guests to be moved to other locations in the infrastructure, sometimes automatically, if there’s a failure on a VM host or storage.
Hadoop is a framework for storage and processing of large scale data sets, often referred to as “big data”, on commodity hardware. There are many components in the Hadoop ecosystem, though the two primary components are the Hadoop Distributed File System (HDFS) and MapReduce parallel processing framework.
One of the key features of HDFS is that it stores very large files across a distributed network of hosts, and does so by replicating the data among hosts. By default, HDFS makes three copies of all the data, on the premise that with three copies, it’s very likely that there will always be at least one copy available, even in the event of a failure.
MapReduce handles the actual processing of the data stored in HDFS, using server resources in the Hadoop cluster as needed, performing processing of the data in parallel, and manages communications between the nodes in the Hadoop cluster.
Because of its flexibility and its ability to store and process very large amounts of unstructured data quickly, and with a high degree of fault tolerance, Hadoop has become the go-to for organizations that need to analyze and extract meaning from large amounts of data that may come streaming in from a wide variety of sources and in a wide variety of formats.
When we talk about analytics and big data together, we’re talking about the process of examining and extracting meaning from very large sets of data to reveal information and impart knowledge that we haven’t been able to see before. This can include customer preferences, emerging trends in the marketplace, new sales opportunities and ways to make a business more dynamic and efficient.
The data used for analytics can come from a wide variety of sources and data formats, including both structured data like the data commonly stored in SQL databases and unstructured or semi-structured data, which can come from emails, documents, social media postings, web pages, and streaming data from devices. An analytics system gathers these different types of data together and provides a way for people to bring these disparate data sources together, examine them, and come away with new insight that was not previously possible to obtain.
A data lake is a repository for large amounts of data in its native format. HSP uses data lakes to build a data-centric system. With data lakes, data is ingested directly into storage and retained there. As applications need to analyze and process the data, they’re spun up and can access the data directly from the data lake. By taking this approach, HSP reduces the amount of time it takes to get results from your data sets because data is ingested only once.
A software defined infrastructure is really the ultimate commoditization of the elements of a datacenter. Storage, networking, compute, and virtualization moves from hardware into software that can manage all these components on commodity hardware without administrative intervention. Resources are allocated as needed according to the rules of the software-defined infrastructure.
Traditional versus Modern Data Architectures
The Legacy Tax
Legacy systems that rely on traditional data center architectures are becoming increasingly expensive and inefficient to maintain. Applications are aging, and require high licensing costs, maintenance agreements, and other fees just to keep them operational. And they often require specialized knowledge, which adds administrative costs. Together, the costs of keeping these legacy systems running is referred to as the “legacy tax”, and the high cost of the legacy tax can stifle IT innovation.
One of the ways that Hadoop implementations provide resiliency is by storing 3 copies of data, each on different nodes. This provides resiliency, but can introduce a performance bottleneck through the use of a Name Node. When a client requests a file in a generic Hadoop implementation, the request is sent to the name node, which looks up the location of the data and returns it to the client. In turn, the client then requests the data from the node or nodes holding the data. Because of this design, the Name Node can introduce latency, as client requests may have to wait for the Name Node to be available to service the request.
(click image to enlarge)
For fetching data from other nodes in the same cluster, HSP provides a 40GbE Ethernet private backplane. External connectivity to the network is provided via a 10GbE connection.
In a Hadoop cluster, each Hadoop node manages its own memory, which helps enable easy scale out. For example, if you have an application that needs to look at 100 images to determine if the images are good, one single node will look through the images serially. If you scale the application out to run on 10 nodes, each node will look at 10 images, and each of those nodes will adjust their own memory needs accordingly.
In addition to memory management features, HSP stores logs in NVRAM. This helps to accelerate write commits, because the NVRAM is battery-backed. All memory is local, and memory is not aggregated, helping to avoid problems with memory bandwidth.
The Importance of a Distributed File System
A distributed file system is a crucial component in scale out systems, like Hadoop and HSP. Hadoop’s HDFS is widely used, but Hitachi Data Systems developed its own file system, Enterprise Scalable File System (eScaleFS) to provide improvements over HDFS.
Hitachi made some advances on Hadoop’s distributed file system, HDFS, and these are found in our own distributed file system, eScaleFS. Why, when HDFS is used widely for so many Hadoop implementations, would we introduce our own file system? We have a number of improvements over HDFS.
One of the key features of eScaleFS is that it can be used both for Hadoop and POSIX/NFS access, allowing other Hadoop components – Hbase, Hive, Sqoop, Spark and others – as well as third-party applications to take advantage of the storage and compute of HSP.
In addition, eScaleFS avoids the Name Node bottleneck that generic Hadoop implementations can have. In a generic Hadoop implementation, clients request the location of data from the Name Node, which returns the location of the data to the client. This can become a performance bottleneck, and there is only minimal resiliency in the form of a backup Name Node.
eScaleFS improves on this by allowing each node in the cluster to independently and deterministically compute the location of the data. eScaleFS uses the CRUSH algorithm allowing data to be fully distributed, self-balanced, and self-configurable. Nodes an access data independently and directly without having to go through a specific node, eliminating the Name Node bottleneck and improving performance both for file serving and Hadoop workloads.
Figure 5 - Data request in eScaleFS -- All nodes can locate data
(click image to enlarge)
eScaleFS also has several other improvements over HDFS. Hitachi uses a version of the PAXOS algorithm which allows notes to store and report on their status to all other nodes, increasing availability, resilience and self-healing. As new nodes are added or if a node goes offline, the system will automatically reconfigure and rebalance itself.
With eScaleFS, all the nodes in a cluster use a deterministic has, so each individual node knows where all the data is stored without needing a name node. Each node had the entire map of data, and data is evenly distributed among the nodes in the cluster. This helps eliminate the single point of contention that can occur in Hadoop architectures that rely on the name node.
Because eScaleFS includes a POSIX interface, it can appear as a /ext device to the operating system. This allows you to ingest data in parallel using NFS. External clients can mount the file system via NFS and move data, while local VMs can access data within the cluster via NFS. Because each node has an external network port, data can be ingested or exported in parallel for very high throughput.
The table below shows a quick comparison of distributed file systems:
Distributed Autonomous Systems
When we take a look at how systems have traditionally functioned, the data center has been populated with systems made up of discrete components that each performed a specific task. Servers perform computing tasks, SAN and NAS perform storage tasks, and switches and routers performed networking tasks. Going further, a specific server may be responsible for fulfilling a certain role. If that server has a failure, the role it fulfills or the application it hosts may be unavailable. Administrator intervention is often required before a system can be brought back online. In addition, individual systems can become performance bottlenecks, again requiring administrator intervention or additional hardware to restore performance to adequate levels.
Distributed autonomous systems attempt to address some of the problems of traditional system architectures by distributing roles and responsibilities among all the nodes within the system. Being distributed helps to eliminate single points of failure (SPOF), because other nodes within the system can continue performing the roles in the event of a failure. Being autonomous reduces the need for administrator intervention when a failure occurs. This is often referred to as “self-healing”, because the entire system is aware of the hardware failure and the management software can handle failures and keep the system running normally without administrator intervention. Distributed autonomous systems, then, provide a level of resiliency and automation not found in traditional systems. And because roles are distributed among all the nodes in the system, performance scales linearly. As more performance is needed, new nodes can be added, and the system will automatically adjust to deliver better performance.
The Value of Flash Drives
When it comes to enterprise storage, flash brings several things to the table. Certainly, the characteristic many people are most familiar with is speed. Flash delivers much higher throughput than traditional disk drives, reducing the time needed to ingest and analyze very large sets of data.
The Active Data Lake
The active data lake is a concept in which large amounts of data in a variety of sources and formats can be ingested directly into storage, forming the data late. Applications that need to analyze the data are spun up on VMs where the data is located, “sipping” the data they need from the data lake.
With an active data lake, the data is persistent while the VMs that run the applications are ephemeral; they are spun up and spun down in order to meet the demand. You may have 3 VMs running an application when demand for analysis is low, and you may increase up to 10 VMs when demand is higher, spinning them back down when demand drops again. As the VMs come and go, the data remains in the same storage location, providing fast access to applications that need to analyze the data in the lake.
Figure 6 - The Data Lake
HSP includes two major architecture components, the eScale Distributed File System and the Distributed Appliance Management System.
Figure 7 - eScale Distributed File System
Figure 8 - Distributed Appliance Management System
HSP is a hyper-converged solution made up of individual nodes, each with their own compute, storage, networking, and virtualization components. Each node runs a custom Hitachi KVM hypervisor and the Hitachi Advanced Analytics Services Manager, which handles the provisioning, monitoring, securing and troubleshooting of data analytics solutions.
Each HSP node is a 2U devices, with dual Intel Xeon 12-core CPUs, 192GB memory, 8GB NMVRAM, and supports both Near-line SAS drives (NL-SAS) or an all-flash storage configuration. Networking for each node is made up of two 10GbE front-end and two 40GbE back-end networking adapters. HSP supports up to 20 nodes per rack, and a total of up to 100 (as tested) nodes. You start with a minimum of five nodes and can scale out in one node increments.
(click images to enlarge)
HSP uses a built-in System Manager to manage HSP VMs, hardware resources, and clusters. Because HSP is a software-defined solution, new features and functionality can be introduced through software updates.
HSP Clusters are made up of individual nodes. Each node has local components for compute, storage, networking, and server virtualization that get added to the cluster. The CRUSH algorithm is used to allow the data in the data lake to be full distributed across all members of the cluster, with self-balancing and self-configuration. The eScaleFS distributed file system implements a version of the PAXOS algorithm for high availability, resiliency, and self-healing features.
Figure 9 - HSP Cluster Components
Distributed Storage Fabric
Figure 10 - Individual nodes make up an HSP cluster
Workflows with HSP
Traditionally, converged solutions designed for handling large amounts of data ingest data into durable enterprise-class storage, then as analysis is needed, a copy of the data needed is shipped off to an object store, so that the application can analyze the data. This results duplication of data and extra time spent processing data prior to analysis, as well as adding cost for all the additional resources needed.
Figure 11 - Traditional Data Workflow
With HSP, we provide a data-centric platform. Unstructured data streams in and is stored in a durable, enterprise-class scale-out NAS. When you need to analyze data, you spin up new VMs on HSP and can perform real-time streaming analysis with apps such as STORM and Spark, as well as periodic batch analysis using an app such as Hadoop. With HSP, the data remains in the data lake, and applications analyze it in place, reducing the time and cost associated with making extra copies and moving data for analysis.
Figure 12 - Data Workflow with HSP
HSP provides a fully fault tolerant architecture. If you lose a disk, you can rebuild on other disks. If you lose a node, you can rebuild on other nodes. And with enough racks, you can recover from the loss of an entire rack by rebuilding on another rack using copies of the data. As a result, HSP gives you a highly resilient cluster for analyzing very large amounts of data.
Metadata is at the core of any intelligent system and is even more critical for any filesystem or storage array. In terms of DSF, there are a few key components that are critical for its success: it has to be right 100% of the time (known as “strictly consistent”), it has to be scalable, and it has to perform at massive scale. DSF utilizes a “ringlike” structure as a key-value store which stores essential metadata as well as other platform data (e.g., stats, etc.). In order to ensure metadata availability and redundancy a RF is utilized among an odd amount of nodes (e.g., 3, 5, etc.). Upon a metadata write or update, the row is written to a node in the ring and then replicated to n number of peers (where n is dependent on cluster size). A majority of nodes must agree before anything is committed, which is enforced using the PAXOS algorithm. This ensures strict consistency for all data and metadata stored as part of the platform.
Data Path Resiliency
Reliability and resiliency are key concepts within HSP. The system is designed to handle in an elegant and non-disruptive manner. Because it’s a distributed system, HSP is built to handle component and service failures.
With HSP each cluster has an IP address, each node has an individual IP address, and nodes can take over the floating IP address of a dead node. Load balancing with HSP can be done via round robin, or for organizations that use an orchestration tool, you can use your own orchestrator, which helps administrators start managing HSP without having to learn new tools. In addition, you can also perform orchestration with Hadoop’s own orchestrator, YARN.
HSP provides the same data locality that you get with Hadoop, but with HSP you’re not limited to just Hadoop applications. You can spin up VMs where your data is located to begin analysis faster and prevent the problems associated with moving very large data sets across the network.
Kernel-based Virtual Machine (or KVM) is a full virtualization solution on x86 hardware. It contains virtualization extensions and consists of a kernel module that provides the core virtualization infrastructure as well as a processor-specific module. KVM provides the server virtualization features of HSP.
One of the key features of KVM that makes it valuable for analytics is that processes can create virtual machines. Each of the VMs created can have memory, virtual CPUs (vCPUs), and in-kernel device models.
More information on KVM can be found here: http://www.linux-kvm.org/page/Main_Page
HSP Physical Node Architecture
(click image to enlarge)
Figure 14 - Node Architecture
Support for OpenStack
What is OpenStack?
OpenStack lets users deploy virtual machines and other instances which handle different tasks for managing a cloud environment on the fly. It makes horizontal scaling easy, which means that tasks which benefit from running concurrently can easily serve more or less users on the fly by just spinning up more instances. For example, a mobile application which needs to communicate with a remote server might be able to divide the work of communicating with each user across many different instances, all communicating with one another but scaling quickly and easily as the application gains more users.
Supported OpenStack APIs
HSP supports several important OpenStack APIs, such as Glance, Nova and Swift.
Glance provides discovery, registration and delivery services for disk and server images. It has the ability to copy or snapshot a server image and then to store it promptly. Stored images then can be used as templates to get new servers up and running quickly, and can also be used to store and catalog unlimited backups. Virtual-machine images can be stored in various locations, including simple filesystems and object-storage systems such as OpenStack Object Storage (Swift).
Swift is an object storage system designed to store files, videos, analytics data, web content, backups, images, virtual machine snapshots and other unstructured data at large scale with high availability and 12 nines of durability. Swift can be configured with as few as two nodes with a handful of disk drives and scale to thousands of machines providing hundreds of Petabytes of storage distributed in geographically distant regions. Swift is designed from the ground up to scale horizontally without any single point of failure.
Nova is a compute service used for hosting and managing cloud computing systems. It is a component based architecture enabling quicker additions of new features. It is fault tolerant and recoverable. It is built on a messaging architecture and all of its components can typically be run on several servers. This architecture allows the components to communicate through a message queue. Deferred objects are used to avoid blocking while a component waits in the message queue for a response.
More information on OpenStack can be found here: https://www.openstack.org/
Hitachi Data Systems – Patents
General IT Patent information: http://www.hitachi.co.jp/products/it/unified/patent_en/
Three patents specific to the Hitachi Hyper Scale-Out Platform:
Decentralized distributed computing system (U.S. Patent 9,110,719)
Independent data integrity and redundancy recovery in a storage system (U.S. Patent 9,021,296)
Configuring a virtual machine (U.S. Patent 9,069,784)
Product Model: Hitachi Hyper Scale-Out Platform 400 (per node)
Product SKU: SSG-SOP400-SN0001A
Form factor (height)
2U (rack units)
Redundant 920W platinum power supplies; 220V-30A, L6-30P connectors
Two Intel Xeon E5-2620 2.40GHz 15M cache processors (E5v3)
192GB RAM, with additional 8GB NVRAM
Hard disk drive or HDD (data)
Twelve 3.5” 6TB SAS 7,200rpm HDD (72TB Raw)
HDD (for application or operating system)
Two 2.5” 300GB SAS 10,000rpm HDD
Network (back end)
40GbE dual-port QSFP+ NIC
Network (front end)
10GbE single-port SFP+ NIC
Operating temperature range
10 – 35 degree C
8 – 90% noncondensing
HSP Management API Support
HSP provides REST APIs for management of HSP resources, application development, and integration with existing infrastructure applications like automatic provisioning and billing systems.
Cluster Management API
With the HSP cluster management API, you can manage the following resources:
- IP Addresses
- File systems
- File system shares
VM Management API
With the HSP VM management API, you can manage the following resources:
- VM templates
- VM instances
For additional information, please contact your account manager or sales rep to get the API guide.
Q. What are the key values of Hitachi’s Hyper Scale-Out Platform (HSP)?
A: HSP delivers several important values to organizations seeking to make faster, better business decisions using large amounts of data, including:
- Shared Nothing Design means that there is no single point of failure (SPOF) or single point of contention
- Distributed data means large data sets can be ingested and accessed faster
- Triple copy of data delivers availability and resiliency across the entire architecture
- Data location API keeps data near compute resources, lowering latency
- In-place POSIX data for better workflow
- POSIX ingestion allows standard POSIX compliant data ingestion in addition to Hadoop support
Q: Why does HSP use the Hitachi eScaleFS and not Hadoop HDFS file system?
A: eScaleFS provides some improvements over HDFS. See The Importance of a Distributed File System above for more detail.
Q. How are non-Hadoop related applications supported on HSP?
A: There are three types of scenarios for HSP applications as shown below:
Type of Application
1. HDS Certified and Supported
HDS has certified this application on the HSP platform. HDS will provide full support. Customers can call HDS directly. Example Pentaho, Hortonworks HDP.
2. HDS Certified and Vendor
HDS has certified the 3rd party application on the HSP platform. However, the vendor/partner will provide support for the application.
Example Pointcross, iQser.
(See the following 3 scenarios below)
HDS has not certified the application on HSP.
3.1 Not-Certified, but Operates-on HSP
Since our system utilizes KVM for OS virtualization we’re confident that many applications will run. In particular, for simple applications, which are not on our supported list, we will confirm that the VM OS boots/runs, but the rest is up to you including API integrations, data protection, application bring-up, etc. Examples include but aren’t limited to: Single Node Postgres, single node MongoDB, Simple File Servers, Simple Web Servers, etc. While we’re not creating a white list of applications per se, we’d love to hear from you if you get your application running on HSP through the HDS Community.
3.2 Work with us on Certification and Support
If you have an application stack you’d like certified and supported we’d love to hear from you. Since these efforts can be quite significant we need to be prepared for the workload and discuss all options. Please work with your local Account Manager to engage appropriately.
3.3 Not-Certified and Not-Supported
For certain applications, e.g. Distributed Sharded MongoDB, they aren’t yet certified or supported on our platform, and if you talk to our support teams they will tell you that you will run this at your own risk as we cannot assure tuning, data protection and so on.
Q: Which Apache components are supported by HSP and Hortonworks?
A: The table below shows which Apache packages are currently supported by HSP, Hortonworks Enterprise and Hortonworks Enterprise Plus. Please note, the term “Hadoop” has come to refer not just the base module consisting of HDFS, YARN and MapReduce, but the eco-system as well.
Hortonworks HDP 2.1.5
Apache Hadoop (HDFS, YARN, MapReduce)
Hortonworks HDP integrates a set of Apache components as a bundled product. Hortonworks HDP currently has two flavors: HDP Enterprise and HDP Enterprise Plus. The Enterprise Plus as shown above includes more Apache components.
Besides the additional components, Enterprise Plus support is priced at a higher level.
HSP supports the Apache components shown above. Based on customer requirements, HSP will support additional components.
Q: What are the key HSP system components?
A: See the HSP-400 Specification Table.
- Product Web Site: https://www.hds.com/products/converged-infrastructure/hyper-scale-out-platform.html?WT.ac=us_mg_pro_hsop
- HSP Data Lake Overview Video: https://www.youtube.com/watch?v=3ohBDkJBJGs
- HSP Pentaho Video: https://www.youtube.com/watch?v=YGX0477ZmUk
- HSP Hadoop Video: https://www.youtube.com/watch?v=hgdxu3lrkhk
- HSP OpenStack Video: https://www.youtube.com/watch?v=3WievaO65wE
- HSP 3D Demo Video: https://www.hds.com/assets/documents/hsp-connect-dwnld.zip