How the Right Data Fabric Strategies Can Help You Take Full Advantage of Your Data Lake

By Evan Cropper posted 05-01-2020 20:14


Do you ever wonder why it's so difficult to capitalize on the full potential of your data lake? It's because the pressure that big data puts on a monolithic data lake can break it in many different ways. For example, massive capacity will break the economics of linear Hadoop scaling. This happens because companies want to keep data for longer periods of time even though only the most recent data are getting processed. In the end, you buy more compute than you need and use expensive Hadoop memory because you haven't implemented the right data fabric strategy.


The right data fabric strategy, deployed over time, eases the integration of data and the complexity of processing it. It optimizes the ecosystem for CapEx and OpEx. And it helps align IT to business needs without asking the business for any additional resources. 


One excellent data fabric strategy is abstraction, for both storage and execution. An example of this for storage is an abstraction layer that allows Hadoop to think the data is in Hadoop, while it's actually in a more cost-effective and scalable data object store. By employing this strategy in their data fabric, Hitachi Vantara reduced storage expense to a third of comparable Hadoop storage.
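
To make the storage idea concrete, here is a minimal sketch of an abstraction layer that keeps one namespace over two tiers. It is an illustration under stated assumptions, not Hitachi Vantara's implementation: `InMemoryStore`, `TieredStore`, and the tier names are hypothetical, and plain dictionaries stand in for a real Hadoop cluster and a real object store.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a real backend (Hadoop cluster or object store)."""
    name: str
    blobs: dict = field(default_factory=dict)

    def put(self, path, data):
        self.blobs[path] = data

    def get(self, path):
        return self.blobs[path]

    def delete(self, path):
        del self.blobs[path]


@dataclass
class TieredStore:
    """One namespace over a hot tier and a cheaper cold tier (illustrative only)."""
    hot: InMemoryStore
    cold: InMemoryStore

    def write(self, path, data):
        # New data always lands on the hot tier.
        self.hot.put(path, data)

    def read(self, path):
        # Callers never name a tier; the layer resolves where the data lives.
        if path in self.hot.blobs:
            return self.hot.get(path)
        return self.cold.get(path)

    def archive(self, path):
        # Move a blob to the cold tier while keeping its path unchanged.
        self.cold.put(path, self.hot.get(path))
        self.hot.delete(path)

    def tier_of(self, path):
        # Report which tier currently holds the path.
        if path in self.hot.blobs:
            return self.hot.name
        if path in self.cold.blobs:
            return self.cold.name
        raise KeyError(path)


store = TieredStore(hot=InMemoryStore("hadoop"), cold=InMemoryStore("object-store"))
store.write("/logs/2019/01.csv", b"old data")
store.write("/logs/2020/05.csv", b"recent data")
store.archive("/logs/2019/01.csv")  # older data moves to the cheaper tier
```

After the `archive` call, reading `/logs/2019/01.csv` still works with the same path, even though the bytes now sit in the cheaper tier. That unchanged path is the point of the abstraction: the processing side does not need to know the data moved.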


Another example of abstraction in a data fabric is adaptive execution. If you create a Java function, you should be able to reuse that function in different engines like Spark and MapReduce, without having to rewrite the code. Similarly, instead of creating a transformation for every data source and business rule, you should be able to use catalogs and self-describing metadata to make the ETL processes dynamic. By utilizing these strategies, Hitachi Vantara was able to consolidate 900 ETL jobs down to just 12.
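
As a rough sketch of what metadata-driven ETL can look like, the example below uses one generic transformation that reads its rules from a catalog instead of one hand-written job per source. Everything here is an assumption made for illustration: the `CATALOG` layout, the source names, the column names, and the `transform` function are hypothetical and are not taken from any Hitachi Vantara product.

```python
# Hypothetical catalog: each entry describes a source, so no job is hard-coded for it.
CATALOG = {
    "orders": {
        "rename": {"ord_id": "order_id", "amt": "amount"},
        "types": {"order_id": int, "amount": float},
    },
    "customers": {
        "rename": {"cust": "customer_id", "nm": "name"},
        "types": {"customer_id": int, "name": str},
    },
}


def transform(source, rows):
    """Apply the column renames and type casts the catalog records for `source`."""
    meta = CATALOG[source]
    result = []
    for row in rows:
        # Rename columns; columns without a rename rule keep their name.
        renamed = {meta["rename"].get(key, key): value for key, value in row.items()}
        # Cast values; columns without a type rule are passed through unchanged.
        cast = {
            key: meta["types"][key](value) if key in meta["types"] else value
            for key, value in renamed.items()
        }
        result.append(cast)
    return result


# The same function handles both sources; only the metadata differs.
orders = transform("orders", [{"ord_id": "7", "amt": "19.90"}])
customers = transform("customers", [{"cust": "42", "nm": "Ada"}])
```

Adding a new source in this sketch means adding a catalog entry rather than writing another transformation, which is the kind of consolidation the paragraph above describes.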


Hitachi Vantara has developed an entire data fabric solution designed to break data lake challenges into bite-size chunks so you can prove the value of your digital transformation initiatives faster. If you're struggling to take full advantage of your data lake and you'd like to learn more about Hitachi Vantara's data fabric strategies, you can do so at Hitachi Vantara's free upcoming virtual conference, DataOps.NEXT, on May 14, 2020.


In the session "Make your Data Fabrics Work Better" at 2:00 p.m. EST, Hitachi Vantara's Global Solution Lead, Mike Williams, will discuss what a data fabric is, which strategies to adopt for your existing environments, which reference architectures include data fabric elements, and what value the various strategies deliver.


Mr. Williams will present elements of Hitachi's overarching data fabric strategy and the lessons learned from an industry perspective to put everything in context. He will also give a glimpse of Hitachi Vantara's vision for DataOps going forward.


Attendees can engage with the speakers in a Q&A after every session. There are also one-on-one mentoring opportunities in which attendees can meet with professional services team members, product managers, and almost any other presenter to talk about thought leadership and how Hitachi sees data management changing.

To register for the event, just head on over to the DO.NEXT registration page. The event is free.