A couple of weeks ago I had the privilege of visiting Mesosphere up in the city (San Francisco) for a few days. They were offering their newly created training course on DC/OS, and since I wasn't terribly familiar with container orchestration, Apache Mesos, or the surrounding DC/OS technologies, I jumped at the chance to attend.
What I discovered was a very powerful technology that would allow us to quickly start building Big Data solutions on top of HSP (or even UCP, which, by the way, is already being done in the Denver lab). There’s also a bigger picture here: by using HSP and Mesosphere DC/OS we can properly orchestrate, schedule and manage containers, which ultimately allows us to build distributed apps across HSP (or UCP) and run stateful and/or stateless workloads.
Now, I say ultimately as if this were somehow a future possibility. The reality is, Hitachi is already using this technology! How many people within Hitachi or outside of Hitachi know of Hitachi Content Intelligence (HCI)? If you know of HCI, do you know what the underlying architecture is? Yeah, you guessed it: Apache Mesos and Marathon, the same technology that underpins Mesosphere DC/OS! Now here’s the mind-blowing thing that will probably put me on the HCI engineering hit list… we could, theoretically, take HCI and distribute it on DC/OS running on top of HSP. What other apps can we distribute across DC/OS and HSP (or UCP)?
Image 1: Marathon performing container orchestration for HCI
Image 2: Apache Mesos for HCI
So with HCI, Apache Mesos and Marathon in mind, let’s take a step back and see what we can do with DC/OS on HSP.
As shown in the image below, it was a simple matter of spinning up a handful of CentOS 7.2 VMs on HSP and then deploying DC/OS from a bootstrap server. (https://dcos.io/docs/1.7/administration/installing/custom/)
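For anyone curious what the custom install looks like, the bootstrap server holds a `genconf/config.yaml` describing the cluster. A minimal sketch along these lines (the IPs, cluster name, and bootstrap URL here are placeholders, not my actual lab values):

```yaml
# genconf/config.yaml on the bootstrap server (placeholder values)
bootstrap_url: http://10.0.0.10:8080   # agents pull the install bundle from here
cluster_name: hsp-dcos
master_discovery: static
master_list:
  - 10.0.0.11
  - 10.0.0.12
  - 10.0.0.13
resolvers:
  - 8.8.8.8
```

From there you run the `dcos_generate_config.sh` installer against that config, serve the generated files from the bootstrap node, and run the resulting install script on each CentOS VM with its master or agent role. The linked DC/OS 1.7 docs walk through the exact steps.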
Image 3: CentOS 7.2 VMs for DC/OS
Once DC/OS was installed, I was able to quickly spin up Cassandra and HDFS.
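With the DC/OS CLI configured against the cluster, spinning those up is essentially a one-liner each from the package Universe (package names per the 1.7-era Universe; treat this as a sketch rather than a verified recipe):

```
dcos package install cassandra
dcos package install hdfs
dcos package list
```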
Image 4: Frameworks Marathon, Cassandra and HDFS all healthy and running
Next, I wanted to spin up an instance of Spark for our super quick Big Data solution, but it turned out that Spark wouldn’t install properly. Well, that’s a drag…
Image 5: Installable frameworks including Spark.
Being fairly new to this, I wasn’t quite able to figure out why Spark wouldn’t install, so I skipped it and decided to install a Hadoop Docker container instead. All I had to do was search Docker Hub for Hadoop, find a container from SequenceIQ, tell Marathon to create a new application, and point it at the Hadoop Docker image. Voila! I have the beginnings of our Big Data solution on HSP.
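Under the hood, "tell Marathon to create a new application" just means handing Marathon an application definition in JSON. A minimal sketch of what that definition might look like for the SequenceIQ image (the image tag, app id, and resource sizes here are illustrative, not tuned values):

```python
import json

# Minimal Marathon app definition for the sequenceiq/hadoop-docker image.
# The id, cpus/mem sizing, and image tag are illustrative placeholders.
app = {
    "id": "/hadoop",
    "cpus": 1.0,
    "mem": 2048,
    "instances": 1,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "sequenceiq/hadoop-docker:2.7.1",
            # Host networking keeps Hadoop's many service ports simple
            "network": "HOST",
        },
    },
}

# Save the definition; it can then be POSTed to Marathon's /v2/apps endpoint
# (or pasted into the Marathon UI's JSON mode).
with open("hadoop.json", "w") as f:
    json.dump(app, f, indent=2)

print(json.dumps(app, indent=2))
```

The same JSON is what the Marathon web UI builds for you behind the scenes when you fill in the new-application form.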
Image 6: Hadoop running in a docker container
Image 7: Resource usage of our starter Big Data solution on HSP and DC/OS.