Establish Control of the Data Lifecycle with Dataflow Manager

By Evan Cropper posted 05-07-2020 02:14


According to a State of DataOps survey of 451 respondents, 80% of organizations manage over 100 disparate data sources, with almost half managing at least 300. At the same time, almost 30% of the survey respondents indicated that it takes a week or more to get insights from a new or existing data pipeline. In a world where real-time data insights are essential to business success, it's shocking to learn that almost a third of organizations operate at a competitive disadvantage.


Today, every industry is extremely competitive or dominated by a Fortune 100 company with robust data management practices and teams. And whether it’s improving safety and efficiency in manufacturing or delivering a customized user experience, it is critical for companies to be able to access timely, accurate, and impactful data with minimal resources.


Hitachi Vantara’s Pentaho platform has always been a natural fit for organizations that are expected to do more with less when it comes to their data assets. As an open-source data integration and business intelligence platform, Pentaho has enabled thousands of organizations to manage data onboarding, preparation, and analytics activities with the flexibility that comes with open source technologies.


Pentaho 9.0 helps data consumers (e.g., data scientists, analysts) and data curators (e.g., data engineers) become even more productive with the new Dataflow Manager. Dataflow Manager streamlines and automates your data lifecycle and consolidates hundreds of data silos, providing a “single source of truth” across all of your data.


Managing the Lifecycle of a Single Dataflow 

It’s no surprise that data pipelines become brittle, difficult to monitor, and even more difficult to troubleshoot when IT is managing hundreds of them. That’s why Hitachi Vantara developed Dataflow Manager: to help organizations manage the entire data lifecycle using a single tool. Let’s take a look at a critical challenge we’re solving with Dataflow Manager.

Executing a Dataflow 

Imagine you’re a data engineer or business analyst and you want to onboard or prepare a new or existing dataset for analysis. Wouldn’t it be nice to not have to build a pipeline from scratch? Prior to Dataflow Manager, you’d likely have to open a thick client tool and develop your dataflows from scratch. Now, Dataflow Manager’s web-based client offers a consolidated view of your data assets and dataflow templates so you can dramatically reduce the time to onboard, prepare, and enrich datasets. Furthermore, data engineers can set a run configuration for the dataflow so that it uses the proper resources, such as memory and CPU.


Monitoring Executions of a Dataflow (MVP) 

As data engineers understand, building a dataflow is just the first step in the process. Without effective monitoring, data pipelines will break or deliver unsatisfactory data. To ensure continuous delivery of your data assets, Dataflow Manager has real-time status monitoring capabilities. While your dataflow is running, or even after it has finished executing, you can easily navigate to a monitoring page in Dataflow Manager to see the status of hundreds of dataflow executions.



This is just one example of a use case being addressed by Hitachi Vantara’s new Dataflow Manager. If you'd like to learn more about Dataflow Manager, you can do so at Hitachi Vantara's free upcoming virtual conference, DataOps.NEXT, on May 14, 2020.


In the session “To Manage All Your Data Pipelines, Let’s Follow the Dataflow Studio Roadmap,” Jason Tiret, Hitachi's Senior Product Manager for Dataflow Manager, will do a deep dive into the capabilities of Dataflow Manager as he discusses the roadmap and vision for how Hitachi Vantara will help you build and manage data pipelines in the future.


Attendees will be able to engage with the speakers in a Q&A after every session. There are also one-on-one mentoring opportunities in which attendees can meet with professional services team members, product managers, and pretty much any presenter. Attendees can use these one-on-one conversations to talk about thought leadership and how Hitachi sees data management changing.


To register for the event, just head on over to this website. The event is free.