Pentaho 8.3:  An Important Part of Our DataOps Strategy

Blog post created by Evan Cropper on Jul 10, 2019

by John Magee, Vice President Portfolio Marketing


The Pentaho engine is a key component of Hitachi’s Lumada data and analytics platform for building innovative data-driven solutions for general business and IoT use cases.  Recently we delivered Pentaho 8.3, an important update to our flagship data integration and analytics software and a key part of our DataOps strategy (keep reading to learn why).

You can read the press release here: https://www.hitachivantara.com/en-us/news-resources/press-releases/2019/gl190710.html

The new release of Pentaho offers enhancements in three key areas:

  • Data pipelining: Improved drag-and-drop capabilities for accessing and blending data from hard-to-reach sources such as SAP and Amazon Kinesis. For example, SAP data is often hard to access from outside of SAP. Now, with our connector to SAP, we enable drag-and-drop onboarding, blending, offloading, and writing of data to and from SAP ERP and Business Warehouse.
  • Data visibility for governance: Enhanced capabilities for metadata management with Hitachi Content Platform, integration with IBM Information Governance Catalog (IGC), and streaming data lineage tracking. The goal is to make it easier to understand what data you have and what its retention, compliance, and other governance requirements are, so you can analyze, share, and manage it appropriately.
  • Expanded multicloud support: New hybrid cloud data management capabilities for leveraging the public cloud, including an Amazon Redshift bulk loader and Snowflake interoperability.


These new capabilities are important enhancements for our Pentaho users, but they also reflect and align with our broader portfolio strategy around DataOps. Organizations everywhere are looking to transform digitally and get more value out of their data. But getting the right information to the right place at the right time continues to be a challenge.  

Many of you have probably experienced the following DataOps challenges. Analytic velocity is slowed by data engineering work that makes it time-consuming to discover, blend, and deliver the required data. Data governance and compliance requirements place new demands on what data can be shared and how it should be managed. Moreover, edge-to-cloud infrastructures mean that data is more distributed than ever before, which introduces a new set of operational data management challenges.

This lack of data agility has existed for years despite—or in some cases, because of—all the new tools and technologies. And for most organizations, it represents the biggest obstacle to achieving the promise of analytics, machine learning, and AI to transform their operations and drive innovation. Clearly, something needs to change.

DataOps has emerged as a collaborative data management discipline focused on improving the communication, integration and automation of data flows between data managers and consumers across an organization. At Hitachi Vantara, we are increasingly seeing customers who are embracing this more modern and holistic approach to managing data that spans applications, data centers, clouds, branch offices, the IoT edge, and other places where enterprise data resides.

Our strategy is laser-focused on providing the data management infrastructure, metadata-driven data management tools, and policy-based automation that organizations need to improve data agility through DataOps. In addition to Pentaho, we’re delivering new capabilities across our portfolio to make DataOps a reality for our customers, and you can expect to see more announcements from Hitachi Vantara in the coming months that reflect our strategic focus (especially at NEXT 2019).

DataOps is still a relatively new approach for many organizations, and the first steps often involve improving data pipelines for analytics and ML, such as real-time streaming data for embedded analytics in production apps or chatbots. The end game could also be more complex data science projects that involve building data lakes and managing related infrastructure and dev tools.

Automating and accelerating the discovery, integration, and delivery of data is certainly key to shortening the time it takes to get from raw data to actionable insights. And the often-cited analogy of DataOps as the data equivalent of DevOps, in the way it seeks to make formerly offline batch development processes more collaborative and automated, is a good one. But for the enterprise customers we work with, addressing the analytic data pipeline is only one part of the opportunity they see around DataOps. The others are data governance and operational agility.

Given the data compliance and regulatory requirements most enterprises operate under today, any analytic or data science project that involves accessing, sharing, and analyzing data also needs to include appropriate data access and governance controls. And the complexity of the edge-to-core-to-cloud infrastructures that most of our customers are building out today demands new approaches to managing data, including how data retention and other policies are optimized.

We believe the right DataOps strategy needs to address all three of these key areas: analytic velocity, governance, and edge-to-cloud operational agility. The good news is that many of the core technologies required to make DataOps a reality (discovery, metadata management, policy-based governance and retention management, automated data migrations and data pipelining, and so on) can all be addressed with the right data platform for successful Data Operations.

Takeaways: DataOps is all about making companies more innovative and agile by getting the right data, to the right place, at the right time. Hitachi is laser-focused on providing a broad portfolio of tools and related services to make DataOps a reality. The new Pentaho 8.3 release provides key capabilities for customers looking to begin their DataOps journey.

Visit https://www.hitachivantara.com/go/pentaho.html to learn more.