Why Big Data Projects Fail

Blog post by Bob Plumridge, Sep 15, 2016

“Everyone talks about it, nobody really knows how to do it.”


Luckily, things have moved on since Geoffrey Moore's famous quote about Big Data, but there's still an awful lot of confusion and frustration when it comes to analytics.

This is not a unique experience. Companies struggle to exploit Big Data partly because they don't know how to overcome the technical challenges, and partly because they don't know how to approach Big Data analytics in the first place.


The most common problem is data complexity. Often this is self-inflicted: when starting out with Big Data analytics, companies try to "boil the ocean." IT teams become overwhelmed and the task proves impossible.


It’s true, data analytics can deliver important business insights, but it’s not a solution for every corporate problem or opportunity.


Complexity can also be a symptom of another problem: companies struggling to extract data from a hotchpotch of legacy technologies. The reality is that many companies will be tied to legacy technologies for years to come, so they need to find a way to work within this context.


Another source of trouble is setting the wrong, or poorly planned, business objectives. This can result in people asking the wrong questions and interrogating non-traditional data sets through traditional means.


Take Google Flu Trends, an initiative launched by Google to predict flu epidemics. It made the mistake of asking: “When will the next flu epidemic hit North America?”

When the data was analysed, it emerged that Google Flu Trends had missed the 2009 US epidemic and consistently over-predicted flu prevalence. The initiative was abandoned in 2015.


An academic later speculated that if the researchers had asked "what do the frequency and number of Google search terms tell us?" the project might have proved more successful.




Moving mountains with simplicity


The renowned American poet, Henry Wadsworth Longfellow, once wrote: “In character, in manner, in style, in all things, the supreme excellence is simplicity”.


Too often, people associate simplicity with a lack of ambition and accomplishment. In fact, it’s the key to unlocking a great deal of power in business.  Steve Jobs once said you can move mountains with ‘simple’.


Over the years, technology has progressed by getting simpler rather than more complex. However, this doesn’t mean the back-end isn’t complicated. Rather, a huge amount of work goes into creating an intuitive user experience.


Consider Microsoft Word. Every time you type, transistors switch on and off and voltages change throughout the computer and its storage media. You only see the document, but a lot of technical wizardry is happening in the background.

Cooling the ocean with an abstraction layer


Extracting meaningful value from data depends on three disciplines: data engineering, business knowledge and data visualisation. To achieve all three, you either need a team of superhumans who can code in their sleep, have a nose for business, possess expansive knowledge of their industry and adjacent industries, are supreme mathematical geniuses, and have excellent management and communication skills.


Or, you have technology that can abstract all these challenges and create a platform layer which does most of the computations in the background. This is where Pentaho’s data engineering, preparation, and analytics platform performs some very powerful, deft manoeuvres.


But there is a caveat. Even if you eschew complexity and embrace a simplified data platform, you still need data-savvy people. These data scientists won't have to train for three years to memorise the finer points of Hadoop, but they will need to understand Big Data challenges.


Pentaho provides the method and points businesses in the right direction, but they still need to uncover what questions to ask, and what kind of answers to expect. In a previous blog, I explored how businesses can equip themselves with the right skills for the job.


Big data successes


While Big Data projects may stall or fail for any of the above reasons, we are starting to see more succeed and transform businesses, mainly thanks to huge strides in stripping out front-end complexity through abstraction-layer technology.


Let's take the Financial Industry Regulatory Authority, Inc. (FINRA), a private self-regulatory organization (SRO) and the largest independent regulator for all US-based securities firms. Since using Pentaho, the financial watchdog has been able to find the right 'needles' in its growing data 'haystack'. Analysts can now access any data in FINRA's multi-petabyte data lake to identify trading violations in an automated fashion, making the process 10 to 100 times faster: a difference of seconds versus hours. As a result, FINRA achieved simplicity and more control over its data. According to this article, FINRA ordered brokerages to return an estimated $96.2 million in funds they obtained through misconduct during 2015, nearly three times the 2014 total.


Similarly, through Pentaho, Caterpillar Marine Asset Intelligence demonstrated to one of its customers with a fleet of eight ships that shutting a tugboat's engine down when idling for extended periods would save $2 million a year in wasted fuel across the fleet.


Big Data projects don’t have to confound and confuse. They can bring breakthrough lightbulb moments, provided they’re grounded in simplicity. Let the technology do the difficult stuff – in all else, keep it simple.


If you want to find out more about Big Data, then a good starting point would be my webinar on the power of digital transformation, available online now.