David Merrill

Driving a new TCO target for IoT Data Lake Nodes

Blog Post created by David Merrill on Nov 9, 2016

In my last blog, I outlined the need to build, deploy and manage IoT Data Lake nodes that are 80-90% lower cost (not lower price) than traditional IT server and storage infrastructure. I will now present how this can be done.
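To make the 80-90% target concrete, the arithmetic can be sketched as below. This is only an illustration: the baseline figure and the function names are hypothetical placeholders, not measured values or anything from a customer study.

```python
def target_unit_cost(baseline_cost, reduction):
    """Return the unit cost left after a fractional reduction (0.0 to 1.0)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_cost * (1.0 - reduction)


def target_range(baseline_cost, low=0.80, high=0.90):
    """Return (ceiling, floor) unit costs for an 80-90% lower-cost target."""
    ceiling = target_unit_cost(baseline_cost, low)   # least aggressive target
    floor = target_unit_cost(baseline_cost, high)    # most aggressive target
    return ceiling, floor


if __name__ == "__main__":
    # Hypothetical baseline: total cost per TB per year on traditional
    # server and storage infrastructure. Substitute your own TCO figure.
    baseline = 1000.0
    ceiling, floor = target_range(baseline)
    print(f"Target unit cost: between {floor:.2f} and {ceiling:.2f} per TB per year")
```

The point of the exercise is that the target applies to total cost (power, labor, protection, software and so on), so the baseline should be a full TCO figure rather than a purchase price.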


First, we need to understand that processing and storage requirements in the data lake are very different from what we typically see in the data center:

  • Transient data, state-less
  • Short shelf-life of the data, from a few minutes to a few days.
  • High volume, small objects, log files or M2M elements
  • Low processing throughput is usually OK
  • Virtual machine nodes may only have 1-2 vCPUs, but can have very large storage pools (1-20 TB each has been observed)
  • Commercial hypervisors and operating systems are not necessary
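The short shelf-life in the list above is a large part of why expensive protection is unnecessary: data simply ages out. A minimal sketch of that idea follows, assuming each object carries a creation time and a time-to-live. The `StoredObject` type and both function names are hypothetical, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    key: str
    created_at: float   # seconds since the epoch
    ttl_seconds: float  # shelf-life: a few minutes up to a few days


def is_expired(obj, now):
    """True once the object's age has reached its shelf-life."""
    return now - obj.created_at >= obj.ttl_seconds


def purge_expired(objects, now):
    """Return only the objects that are still within their shelf-life."""
    return [obj for obj in objects if not is_expired(obj, now)]


if __name__ == "__main__":
    now = 100_000.0
    objects = [
        StoredObject("sensor-log-a", created_at=now - 600, ttl_seconds=300),      # 5 min TTL, 10 min old
        StoredObject("m2m-event-b", created_at=now - 3600, ttl_seconds=172_800),  # 2 day TTL, 1 hour old
    ]
    for obj in purge_expired(objects, now):
        print("keeping", obj.key)
```

In practice the purge would run on a schedule against the node's own storage pool; the sketch keeps everything in memory to stay self-contained.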


Now, how do we build very low-cost nodes and storage for the IoT Data Lake? Two things are needed:

1. New systems and solutions are required of your IoT vendors

2. New operational and behavioral considerations are needed from the IT department


The table below summarizes various options, and what is required of the vendors and IT departments:


Moving to COTS hardware, adopting open source software, replacing RAID, not protecting the data: these are all radical ideas and concepts, needed in order to meet a new total unit cost point for IoT nodes.


My next and final blog on IoT Data Lake economics will review some customer case study data, and the cross-over points of capacity and cost that drive us to look for new ways to reduce the total unit cost of IoT platforms.