Pentaho

2 Posts authored by: Anand Sagar Rao Vala Employee

Guest Contributor: Matt Aslett, Research Vice President, Data, Analytics & Artificial Intelligence, 451 Research

 


Most companies are increasing their investment in data-processing, analytics and machine-learning software with a desire to become more data-driven. Data – and the rapid processing of data – is a key driver in enabling companies to grasp the opportunities presented by digital transformation to deliver improved operational efficiency and competitive advantage.

 

We have moved from the transactional era, through the interaction era, to the engagement era, in which enterprises have recognized that they must store, process and analyze as much of the data available to them as possible, if not all of it, in order to survive and thrive in the digital economy. This includes data produced by the myriad sensors, embedded computers, industrial controllers and connected devices such as vehicles, wearable computing devices, robots and drones that make up the emerging Internet of Things (IoT).

 

Data from 451 Research’s Voice of the Enterprise: Internet of Things indicates that analytics is seen as the most critical technology for success in IoT projects, but also that the largest impediment to IoT projects is technology deployment and integration challenges, followed by security concerns, and a lack of a compelling business case or uncertain ROI.

 

For IoT projects, the primary use cases are optimizing operations (for preventative maintenance and reduced downtime, for example), followed by reducing risk (such as security and compliance); developing new products or services, or enhancing existing ones; and improving customer targeting to increase sales.

 

In all of these cases, while data from IoT devices is extremely valuable, and has been stored and processed alone for many years, the greater value comes from blending that IoT data with enterprise data sources. Combining IoT data with data from existing enterprise applications makes the link with customer behavior data, employee behavior data, marketing/advertising data and sales data, for example, to provide a more complete picture and ensure the IoT data is seen in the context of the business goal.

 

Data from 451 Research’s Voice of the Enterprise: Data and Analytics indicates that the complexity involved in integrating and managing data actually grows as a company becomes more data-driven. The results show that while the most data-driven companies enjoy benefits such as increased focus on competitive advantage, they also face greater data integration and preparation overheads.

 

Data from each of these sources is likely to be delivered in different formats, meaning that it needs to be blended, transformed and cleansed before it can be used to generate business insights. Data will also be delivered via different mechanisms. Although most will likely be delivered from traditional enterprise applications in batch form, increasingly it will be generated at the edge by Internet of Things devices and delivered via stream processing, for IoT analytics.
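
As a rough illustration of the blending, transforming and cleansing described above, the following Python sketch joins a batch export with a stream of device messages. The CSV layout, the JSON field names (`asset_id`, `temp_c`) and the sample values are all hypothetical and chosen only for the example; it is a minimal sketch, not a description of any particular product's pipeline.

```python
import csv
import io
import json

# Hypothetical batch export from an enterprise application (CSV).
ASSETS_CSV = """asset_id,site,owner
pump-1,Leeds,Operations
pump-2,Austin,Facilities
"""

# Hypothetical stream of IoT messages (one JSON document per message).
SENSOR_STREAM = [
    '{"asset_id": "pump-1", "temp_c": "71.5"}',
    '{"asset_id": "pump-2", "temp_c": null}',
    '{"asset_id": "PUMP-1 ", "temp_c": 73.0}',
]


def load_assets(text):
    """Parse the batch CSV into a lookup table keyed by asset_id."""
    return {row["asset_id"]: row for row in csv.DictReader(io.StringIO(text))}


def cleanse(message):
    """Normalise one raw sensor message; return None if it is unusable."""
    record = json.loads(message)
    if record.get("temp_c") is None:
        return None  # drop readings that carry no measurement
    return {
        "asset_id": record["asset_id"].strip().lower(),  # unify key format
        "temp_c": float(record["temp_c"]),               # unify value type
    }


def blend(stream, assets):
    """Yield cleansed readings enriched with enterprise context."""
    for message in stream:
        reading = cleanse(message)
        if reading is None or reading["asset_id"] not in assets:
            continue
        yield {**assets[reading["asset_id"]], **reading}


blended = list(blend(SENSOR_STREAM, load_assets(ASSETS_CSV)))
for row in blended:
    print(row)
```

Because `blend` is a generator, the same function works whether the device messages arrive as a finished list or one at a time from a live feed; only the batch lookup table has to be loaded up front.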

 

Additionally, in attempting to become more data-driven, many organizations are investing in machine-learning tools and developing ML-driven applications. The success of these projects depends on the ability of the organization to operationalize experimental data-science projects through training and testing to model deployment and management.

 

Much attention is paid to the outputs of these data-processing pipelines – including visualizations and machine-learning models used to drive business decision-making. However, intelligent and automated data-processing pipelines that can rapidly integrate data from multiple sources, including enterprise applications and IoT devices, should not be overlooked as the foundation for successful IoT projects that deliver improved operational efficiency and new revenue streams.

 

To learn more about how to combine IoT and business data to deliver business value, click on one of the links below:

 

Visit the Hitachi Vantara website.

 

Download the Business Impact Brief from 451 Research entitled "Agile Data Management as the Basis for the Data-Driven Enterprise."

Author: Rakesh Saha, Sr. Product Manager, Pentaho Product Line

 

In the world of enterprise IT, managing data in multiple clouds is now the new normal — whether it’s the result of a deliberate strategy or from shadow IT doing their own thing. Enterprises are not only moving data to the cloud at an unprecedented pace, but they are also embracing different cloud platforms from different vendors at the same time for good business and technical reasons. That means IT leaders need a plan to manage multiple clouds uniformly. But it’s not just about maintaining resource utilization views anymore. If left unchecked, multi-cloud sprawl can put your data assets at tremendous risk.

 

According to a study by Forrester Research, 65 percent of IT leaders believe “data integration becomes more complex in the public cloud.” To give you some perspective, these cloud data integration challenges ranked behind only security and compliance challenges.

 

With Pentaho 8.1, we are continuing to enhance our data integration and analytics platform to be more cloud-friendly, so that enterprises can develop data pipelines on and with data in any of the leading cloud platforms without added complexity. Following our support for AWS and then Microsoft Azure, Pentaho 8.1 now supports Google Cloud Platform.

 

By supporting Google Cloud, Pentaho 8.1 is a significant step toward helping our customers with their multi-cloud strategies.  We now provide even more choice regarding which public cloud vendor to use for their data management.

 

Pentaho 8.1 also delivers new capabilities which directly and indirectly support multi-cloud data strategies.  With Pentaho, for example, you can: 

 

  • Visually manage data in multi-cloud storage environments, now including Google Cloud Storage (see Figure 1)
  • Load data in bulk to Google BigQuery (see Figure 4)
  • Visualize and analyze data in Google BigQuery
  • Elastically deploy Pentaho in the cloud to scale up and down based on workload
  • Use Spark in the cloud (AWS EMR) for visual data processing
  • Upload and download data files to and from Google Drive

 


 

Figure 1: Job spanning on-premise to multi-cloud

 

Each cloud platform offers its own services, but data integration platforms like Pentaho also need to support a set of common components, like those shown in Figure 2.  What also differentiates us from the data integration tools specific to the vendors themselves is our flexible deployment architecture.  This means you can use Pentaho to access and process data where it lives, whether the data is in the cloud or on premises, and whether it’s in AWS, Azure or Google Cloud Platform – rather than needing to move data around – thereby reducing latency.

 


Figure 2: Job spanning on-premise to multi-cloud

 

Pentaho can now also be used to move files in any data format from on-premise storage to one cloud, and then on to another cloud vendor, thanks to the seamless integration of different cloud storage technologies via VFS (see Figure 3). Pentaho encapsulates security and other integration details, and makes it easy to load data into the appropriate cloud data management or warehouse services with new and existing capabilities.
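
To make the idea of a virtual file system concrete, here is a minimal Python sketch of the general pattern: one interface that routes each URL to a storage backend according to its scheme. This is not Pentaho's VFS API. The class names, the in-memory `MemoryStore` backend, and the bucket and file names are hypothetical and used purely for illustration; a real backend would call the relevant vendor SDK and handle credentials.

```python
from urllib.parse import urlparse


class MemoryStore:
    """Stand-in for one storage backend; a real one would call a vendor SDK."""

    def __init__(self):
        self._objects = {}

    def read(self, path):
        return self._objects[path]

    def write(self, path, data):
        self._objects[path] = data


class VirtualFileSystem:
    """Routes a URL to the backend registered for its scheme."""

    def __init__(self):
        self._backends = {}

    def register(self, scheme, backend):
        self._backends[scheme] = backend

    def _resolve(self, url):
        parts = urlparse(url)
        if parts.scheme not in self._backends:
            raise ValueError(f"no backend registered for {parts.scheme!r}")
        # Bucket (or host) plus object path identifies the file in a backend.
        return self._backends[parts.scheme], parts.netloc + parts.path

    def read(self, url):
        backend, path = self._resolve(url)
        return backend.read(path)

    def write(self, url, data):
        backend, path = self._resolve(url)
        backend.write(path, data)

    def copy(self, source_url, target_url):
        """Copy bytes between any two registered backends."""
        self.write(target_url, self.read(source_url))


vfs = VirtualFileSystem()
for scheme in ("file", "s3", "gs"):  # illustrative scheme names
    vfs.register(scheme, MemoryStore())

# Seed an "on-premise" file, then hop through two clouds.
vfs.write("file:///data/orders.csv", b"id,total\n1,9.99\n")
vfs.copy("file:///data/orders.csv", "s3://landing/orders.csv")
vfs.copy("s3://landing/orders.csv", "gs://warehouse/orders.csv")
print(vfs.read("gs://warehouse/orders.csv"))
```

The calling code only ever deals with URLs, so adding another storage service means registering one more backend rather than changing every job that moves files.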

 


Figure 3: Data loading to Google Cloud Storage

 


Figure 4: Data loading to Google BigQuery

 

After data is loaded into cloud data warehouses, it can be consumed by data pipelines running in Pentaho Data Integration and directly by data analysts using Pentaho Business Analytics.  With all these cloud data sources and our data management services, we can facilitate end-to-end ETL and analytics solutions and help solve even more problems.

 

With the emergence of multi-cloud IT deployments, data professionals need to work with data they understand and trust, and now more than ever they need a platform to harmonize that data with transformation processes across different cloud and on-premise environments. Data integration platforms like Pentaho have an enormous role to play for those enterprises and for our cloud future. Pentaho’s multi-cloud capabilities squarely address this enterprise need – especially with the new capabilities introduced in the 8.1 release.