
Expanding HDFS storage to Object Store with S3A


In this blog post I’ll explore how Hadoop applications can quickly and easily address storage in Hitachi object storage through the Hadoop S3A interface. See my previous blog posts in this series.

Getting started with object storage access from Hadoop

Hadoop has long supported the ability to interact with object storage systems via the S3A interface or its predecessors, S3N and S3. There is growing interest amongst Hadoop users and administrators in leveraging object storage to: (a) recover HDFS capacity by offloading files from HDFS via S3A, or (b) use it as primary storage for raw data and other intermediate data sets where HDFS performance is not critical. Storing data in Hitachi Content Platform (HCP) provides distinct advantages over a filesystem because object storage scales to billions of objects without performance degradation and provides a lower total cost of ownership. Storing HDFS data on HCP via S3A also allows applications to leverage HCP’s best-in-class compliance capabilities and to share data with other applications by directly leveraging HCP’s RESTful APIs. With the increase of less frequently accessed or “cold” data in HDFS, the desire to offload data from Hadoop to object storage continues to grow.


The S3A interface allows Hadoop applications to read and write data directly to object storage with HDFS-style syntax by simply addressing the bucket like so: s3a://bucket/path/to/data. Applications can use an s3a://bucket/path URI to address data stored in an S3 bucket directly, just as they would normally reference an HDFS path.


This functionality provides native Hadoop applications with the ability to expand the storage they can interface with beyond local HDFS storage to HCP’s S3 API. To address HCP as Hadoop storage, I simply need to configure the access key, secret key, and S3A endpoint as properties in the Hadoop core-site.xml. This can be done either through Ambari/Cloudera Manager, or by editing $HADOOP_HOME/etc/hadoop/core-site.xml on each DataNode.
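As an illustrative sketch (the property names are the standard Hadoop S3A client settings; the endpoint and key values are placeholders for your own HCP tenant), the relevant core-site.xml entries might look like:

```xml
<!-- Illustrative core-site.xml entries; endpoint and keys are placeholders -->
<property>
  <name>fs.s3a.endpoint</name>
  <value>tenant.hcp.example.com</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
<property>
  <name>fs.s3a.connection.ssl.enabled</name>
  <value>true</value>
</property>
```

With these properties in place, commands such as `hadoop fs -ls s3a://bucket/path` behave just like their HDFS equivalents.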
Below is a demo video depicting the configuration of HCP in HDFS, along with a few examples of interacting with HCP from Hadoop via S3A:


Looking Ahead

The Hitachi Content Solutions Engineering (CSE) team is currently building a solution which will provide the capability to seamlessly offload data from HDFS to HCP with complete transparency and with zero change to applications. As I demonstrated in this post, existing Hadoop functionality supports the expansion of Hadoop storage onto HCP by having end users and applications leverage the S3A interface. Unfortunately, leveraging S3A requires significant application change, and most application owners would prefer not to change application configurations and workflows to accomplish this. Customers have expressed their concerns and the CSE team is developing a solution that will leverage existing Apache HDFS storage management capabilities with new technology that will allow HDFS to tier directly to HCP.

The AWS Java SDK does not natively support Active Directory authentication, but it is flexible enough that with a little bit of coding you can use your AD credentials with HCP over the HS3 gateway.


Attached is a working code example that uses Active Directory credentials to interface with HCP using the AWS Java SDK. This is not intended to be a general S3 programming example (for that, see HCP S3 Code Sample); it is strictly intended to demonstrate how to use AD with HCP and the AWS Java SDK. It is aimed at an audience that is already familiar with AWS Java SDK programming.


In order for this to work, you will need to be on HCP version 8.0 or higher. You cannot create a bucket (namespace) with the AD user, so you will need to create the namespace by other means. To give the AD user privileges in the namespace, you must assign data access permissions to an AD group to which the user belongs. Setting the AD user to be the owner of the bucket does not provide any data access privileges, but it is required if you wish to see the buckets you own in an endpoint listing.
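The attached Java sample is not reproduced here, but the core idea can be sketched. HCP’s HS3 gateway accepts S3 credentials derived from an account’s username and password: the access key is the Base64-encoded username and the secret key is the hex MD5 digest of the password. Below is a hedged Python illustration of that transformation (whether it applies unchanged to AD accounts is an assumption based on this post; consult the attached sample for the authoritative Java version):

```python
import base64
import hashlib

def derive_hs3_credentials(username: str, password: str) -> tuple:
    """Derive HCP HS3-style S3 credentials from a username/password pair.

    Access key: Base64-encoded username.
    Secret key: lowercase hex MD5 digest of the password.
    """
    access_key = base64.b64encode(username.encode("utf-8")).decode("ascii")
    secret_key = hashlib.md5(password.encode("utf-8")).hexdigest()
    return access_key, secret_key

# Example with a hypothetical user.
ak, sk = derive_hs3_credentials("myuser", "password")
print(ak)  # bXl1c2Vy
print(sk)  # 5f4dcc3b5aa765d61d8327deb882cf99
```

The derived pair is then handed to the AWS SDK as an ordinary access-key/secret-key credential.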


COSBench is an open source project originally developed by Intel to test the performance of cloud object stores (OBS) in a quick and scalable way. COSBench is currently the industry-standard OBS benchmarking tool, and potential Hitachi Content Platform (HCP) customers rely on COSBench results when choosing between object storage vendors. Unfortunately, until now COSBench was not able to show HCP’s true performance capabilities. This was primarily a result of two factors. First, COSBench would only write to a single directory in the S3 bucket. This created a bottleneck on HCP, since HCP relies on a directory structure to properly distribute metadata workloads among its available resources. Second, COSBench had no built-in mechanism to evenly distribute work across multiple nodes in a cluster. Consequently, if COSBench were used to test HCP without a load balancer, there would be an imbalance of work among nodes, potentially resulting in connection exhaustion on busy nodes while other nodes remained idle. The end result was that in a competitive situation, COSBench was putting HCP at a disadvantage and not giving potential customers an accurate representation of HCP’s performance capabilities.

Fortunately, COSBench is designed to be extensible, allowing plugins to be built that add support for different object storage products regardless of object protocol or authentication method. The Content Solutions Engineering team has delivered an engineering-qualified COSBench storage adaptor for HCP. This adaptor leverages all of the strengths of COSBench while eliminating the specific deficiencies that produced misleading performance results for HCP. Requests are always distributed evenly across all HCP nodes, and objects are evenly divided among 64,000 directories.
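The adaptor’s actual distribution logic is not published here, but the technique of spreading objects evenly over 64,000 directories can be sketched. In this hedged illustration (the hash choice and directory naming scheme are assumptions, not the adaptor’s real code), a stable hash maps each object name to a fixed directory so writes and subsequent reads agree:

```python
import hashlib

NUM_DIRS = 64_000

def directory_for(object_name: str) -> str:
    """Map an object name deterministically to one of NUM_DIRS directories.

    A stable hash (not Python's per-process randomized hash()) keeps the
    mapping consistent across drivers, so a read locates the object where
    the write placed it.
    """
    digest = hashlib.md5(object_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % NUM_DIRS
    return "dir{:05d}".format(index)

# The same object name always resolves to the same directory.
print(directory_for("cos_12345"))
```

Distributing the key space this way keeps any one directory from becoming the metadata hotspot described above.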

The COSBench adaptor for HCP also introduces a number of configurable options to better emulate customer workloads and tune performance. Our release introduces four authentication options: HCP authentication, anonymous, AWS Signature Version 2, and AWS Signature Version 4. You may choose whether to use the default 64K directory distribution method or to instead write to a single folder. And finally, you may choose to disable "use_expect_continue", an option that can result in significant improvements in small-object performance.

Using the new COSBench adaptor for HCP, you can finally engage in "apples to apples" comparisons of HCP with other OBS products. You can be confident that the results will be fair, and will show HCP in an accurate light.




The following instructions for installing COSBench are specific to Linux servers. Additional setup may be required in order to run on Windows/Mac OS X.


  1. Make sure the following programs are installed on each server that is going to be running in your COSBench environment: cURL, Java 1.7 (or newer), netcat. The “http_proxy” environment variable also needs to be unset.
  2. On each server download the tarball and extract the contents exactly as shown below:

    # mkdir /opt/cosbench

    # cd /opt/cosbench

    # wget \

    # tar -xvf v0.4.3.0.tar

    # rm -f v0.4.3.0.tar

    # ln -s v0.4.3.0 cos

    # cd /opt/cosbench/cos/

    These steps extract COSBench and link the COSBench home folder to the /opt/cosbench/cos directory. This is important as other tools in the ecosystem like the COSBench Wrapper depend on the COSBench files to be in this location.
  3. If COSBench is being run in a clustered environment, edit the controller.conf file found within the conf subdirectory. The number of drivers must be increased to match the number of drivers being used, and an entry must be added for each driver:

[root@COSBench]# cat conf/controller.conf

[controller]

drivers = 2

log_level = INFO

log_file = log/system.log

archive_dir = archive


[driver1]

name = driver1

url = http://<driver1_ip>:18088/driver


[driver2]

name = driver2

url = http://<driver2_ip>:18088/driver

  4. On each driver, run the start-driver.sh script, then start the controller by running the start-controller.sh script on the server designated to be the controller.
  5. At this point you should be able to view the COSBench GUI at <controller_ip_address>:19088/controller
    In the GUI, you may see a red indicator next to each driver that is not on the controller node; this is okay and expected:


  6. To stop each running Java process, run stop-driver.sh or stop-controller.sh, depending on which kind of process is running on each server.


Defining Workloads

COSBench is operated by submitting a series of XML configuration files called ‘workloads’ that specify how to interact with the cloud object store. Each workload file defines which specific storage to use, and various configuration values that are often specific to this particular storage adaptor. Following the storage definition, a ‘workflow’ is defined that is made up of one or many ‘workstages’. Each workstage defines the object level operations that are to be done such as the type (write, read, delete), the container(s) to write to, how many files to write, and how many workers to use.

To use the HCP storage adaptor, the storage type is simply set to “hcp” and the config values are specified. The required configuration parameters are the endpoint and the tenant-level username and password (unless anonymous authentication is being used). The configurable HCP storage adaptor settings cover:

  • The tenant endpoint connection
  • The tenant username with appropriate permissions
  • The password corresponding to the tenant username
  • Either http or https, to specify whether or not to use SSL
  • The maximum number of retries a client will attempt on error
  • The number of milliseconds allowed before timing out on a request
  • The number of milliseconds allowed before a socket is closed
  • The type of authentication to use; valid values are v4, v2, hcp, and anon
  • Whether or not to optimize the directory structure (setting this to false results in a flat directory structure)
  • Whether or not to disable certificate checking for SSL connections
  • Whether or not each client should wait for the server’s 100-continue response before beginning to send data

The following XML displays a simple workload that will write 10,000 4 KB objects to the namespace ‘ns1’ and then read them back. If the namespace ‘ns1’ does not already exist for the tenant endpoint, it will be created. The workload uses V4 authentication over HTTPS with SSL certificate checks disabled. This example is also available in the git repo as well as in the conf folder of the deployed COSBench server.


<?xml version="1.0" encoding="UTF-8" ?>
<workload name="hcps3-sample" description="sample benchmark for s3">

  <storage type="hcp" config="endpoint=<tenant>.<cluster>.<domain>.com;username=<tenantUser>;password=<password>;signe_version=v4;protocol=https;disable_cert_check=true" />

  <workflow>

    <workstage name="create-bucket">
      <work type="init" workers="1" config="cprefix=ns;containers=r(1,1)" />
    </workstage>

    <workstage name="create-10k-4kb-files">
      <work name="main" type="normal" interval="10" division="object" rampup="0" rampdown="0" workers="50" totalOps="10000">
        <operation type="write" config="cprefix=ns; containers=c(1); oprefix=cos_; objects=r(1,10000); sizes=c(4)KB" />
      </work>
    </workstage>

    <workstage name="read-10k-4kb-files">
      <work name="main" type="normal" interval="10" division="object" rampup="0" rampdown="0" workers="50" totalOps="10000">
        <operation type="read" config="cprefix=ns; containers=c(1); oprefix=cos_; objects=r(1,10000)" />
      </work>
    </workstage>

  </workflow>
</workload>

These XML configuration files may seem complicated at first, but there are additional examples as well as a more in-depth explanation in the COSBench User guide under Section 4. Workstages are run in the order defined in the workflow.


Submitting Workloads

Workloads are submitted either through the cli.sh bash script with the submit parameter:

[root@COSBench]# ./cli.sh submit conf/hcp-config-sample.xml

Accepted with ID: w4


or through the main GUI:


Once the workload has been submitted, we can see which workload COSBench is currently working on in the UI. Selecting the workload will provide more information about which specific workstage is being run, current snapshots, and driver status. If there is an active workload being run, then any additional submissions will be queued up to run after the active workload completes.

Configuring HCP for COSBench

The tenant-level endpoint that is supplied to the HCP storage adaptor must have MAPI enabled, and the tenant-level user must have Administrator permissions. This simple configuration alone will work with the default values of the HCP storage adaptor. COSBench is able to create a bucket through the tenant endpoint provided, and this will use the namespace defaults specified for the tenant.

If anonymous authentication is to be used then the namespace must be pre-created and configured with appropriate protocol settings. The XML configuration for the workflow should then use the configuration values “cprefix” and “containers” to point to the namespace that was created with the appropriate protocols. If a range of only one value is specified for the “containers” configuration value then this will simply be appended to the “cprefix” value. For example, using a “cprefix” of “myNamespace” and a “containers” value of “r(2,2)” will have the operation point to the namespace “myNamespace2”.



Logging can be turned up for the controller in the controller.conf file and for each driver in the driver.conf file, both found in the conf subdirectory. The controllers and drivers must be restarted for these changes to take effect. The logs, which contain the most useful information for overall system troubleshooting, can be found in the log subdirectory. Logs for specific runs of a workload can be found in the mission subdirectory and are useful if an individual stage of a workload is failing.
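As a sketch (DEBUG is an assumed level name, mirroring the INFO default that ships in controller.conf), raising the controller’s verbosity is a one-line change in conf/controller.conf before restart:

```ini
; conf/controller.conf fragment - raise log verbosity (assumed level name)
[controller]
log_level = DEBUG
```

The same key can be changed in each driver’s conf/driver.conf.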

COSBench Results

To view the results for a workload, click on the “view details” link for that workload in the main controller UI. If you do not see your workload here (this happens when the COSBench controller has been restarted), click the “load archived workloads” link to load all of the previous workloads that COSBench is able to find.

Once within the specific workload view, there are various links that are able to be drilled into including the specific stage details and detailed performance statistics. Also, in this same view, there is an option to download the specific workload XML that has been run.

The raw data is stored as a series of snapshots that include Op-Count, Byte-Count, Average Response Time, Average Processing Time, Throughput, Bandwidth, and Success Ratio. The snapshot time-frame is configurable via the “interval” parameter for each workstage in the submitted XML workload. All of this data can be exported as a downloadable CSV file, or found as a CSV within the archive subdirectory of COSBench on the controller server.
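As a hedged sketch of post-processing those exports (the header names below are assumptions modeled on the metric names above, not the exact column labels COSBench emits), summarizing a snapshot CSV might look like:

```python
import csv
import io

# Hypothetical snapshot export; real COSBench CSV headers may differ.
SNAPSHOT_CSV = """Timestamp,Op-Count,Byte-Count,Throughput,Bandwidth,Success-Ratio
10,5021,20566016,502.1,2056601.6,100.0
20,5112,20938752,511.2,2093875.2,100.0
"""

def average(rows, column):
    """Average a numeric column across all snapshot rows."""
    values = [float(r[column]) for r in rows]
    return sum(values) / len(values)

rows = list(csv.DictReader(io.StringIO(SNAPSHOT_CSV)))
print(round(average(rows, "Throughput"), 2))  # mean ops/s across snapshots
```

The same pattern works against the per-workload CSVs found in the archive subdirectory.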


The plan going forward is to maintain COSBench within the Hitachi Vantara open source GitHub space. COSBench will also continue to be used for internal testing and benchmarking, and any additional fixes will be pushed back upstream to this origin. If there are any questions or feature requests, please don’t hesitate to either email me or leave a comment below.


On October 3rd, 2018, MapR announced the general availability of MapR version 6.1. This release offers much-improved storage management capabilities that now include the ability to tier cool, cold, or frozen data to the Hitachi Content Platform (HCP). As we’ve discussed in previous blog posts, our customers are asking for solutions to seamlessly offload their cold and frozen HDFS and MapR-FS data. We are confident that MapR 6.1 delivers on these requirements, as the release was beta tested on HCP by an important customer of both MapR and Hitachi.

HCP provides tremendous value as a virtually limitless scale-out cold storage pool for the MapR environment. Data that is infrequently accessed will be seamlessly tiered from your mission-critical MapR storage infrastructure into economical HCP S3 buckets. Once your data is tiered to HCP, it will be protected by HCP’s best-in-class durability and availability. And best of all, your data will continue to be accessible to all your applications at the original MapR-FS URI. There is no need to update data paths, as this is completely transparent at the application layer. Of course, there is additional latency when recalling data from HCP, but MapR will cache any recently recalled blocks, so the performance hit is only on the first access. Subsequent access to the recalled blocks will be as fast as any other locally stored data. By leveraging Hitachi’s best-in-class object storage with MapR Data Tiering, the enterprise can now right-size their MapR clusters and stop buying high-end compute nodes to accommodate a ceaseless accumulation of cold data.


To learn more about these capabilities, please refer to our blog post Critical Capabilities for Hadoop Offload to HCP.

How it Works

MapR has a concept of Volumes, which are logical units used to organize data and manage performance. A Volume allows you to apply policies to a set of files, directories, and sub-volumes. The new data tiering functionality provides more control over the data by tiering cold data (at the block level) to more economical remote storage targets like HCP. Data tiering is controlled with rules that can be customized based on the user, group, file size, and last modified time. One or more of these rules comprise a Storage Policy. Volumes can be assigned a Storage Policy and a remote target which is backed by HCP. A schedule is then configured to determine how often the rules of the storage policy are applied to the volume. This workflow allows aging data to be automatically and seamlessly tiered to HCP. When data is tiered, the blocks are moved to the lower-cost storage, and a file stub is kept on primary storage. This enables applications to seamlessly access data which has been tiered. The Content Solutions Engineering team validated this functionality with MapR 6.1 and HCP, and we implemented the workflow as described above.


Configuring HCP as a remote target and implementing seamless data tiering is a fairly simple and well-documented procedure. First, the HCP administrator configures an HCP namespace with the S3 protocol enabled and a data access user who must also be the namespace owner. From the MapR Control System a remote target is then configured pointing to the HCP namespace. Next, a MapR volume can be created, data tiering can be enabled, and the newly created HCP target can be selected. A storage policy needs to be selected or created to determine when data is eligible to tier. Once a user has selected a storage policy, the user can either initiate offloading manually or select a schedule. By assigning a schedule to the volume, you can control how often eligible data will be offloaded to the HCP.



  • Data tiering is only supported for file data; offloading table and stream data is not supported at this time.
  • Tiering is only supported on newly created volumes and cannot be enabled on existing volumes. Data on existing volumes must be copied to new tiering-enabled volumes to take advantage of data tiering.
  • Tiering for an individual volume is managed by a single MapR MAST Gateway, which runs on a single MapR node. To achieve high offload throughput, it will be important to build a volume hierarchy according to MapR’s best practices to balance your data among several volumes and maximize cluster performance and data availability.


With the MapR 6.1 release, MapR is taking the lead in acknowledging and addressing the problem of cold data bloat in big data clusters. MapR customers now have an option to address this problem other than adding more nodes. Not only is HCP a great choice for your MapR remote storage target, it has also been tested in this capacity and is currently in production behind a multi-petabyte MapR cluster at a very large customer. The Content Solutions Engineering team will be working with the MapR alliances team to provide official certification of this solution. There is a great opportunity here for customers and Hitachi Vantara. Let's start talking about the pain caused by big data, and how Hitachi Vantara can help. As always, your feedback is appreciated in the comment section.