How is Data Ops Related to Data Centric Computing?

By Hubert Yoshida posted 10-14-2020 20:59


In my last post, I talked about the movement to Data Centric Computing and the introduction of DPUs, or Data Processing Units, which work alongside CPUs and GPUs to offload the work of data movement so that CPUs and GPUs can concentrate on application processing and analytics.

According to Wikipedia: “Data-centric computing is an emerging concept that has relevance in information architecture and data center design. It highlights a radical shift in information systems that will be needed to address organizational needs for storing, retrieving, moving and processing exponentially growing data sets. … Organizations are struggling to cope with exponential data growth while seeking better approaches to extracting insights from that data using services including Big Data Analytics and Machine Learning. However, existing architectures aren't built to address service requirements at petabyte scale and beyond without significant performance limits.”

The traditional compute-centric model requires the CPU to move data from network to storage to memory to CPU/GPU and back to storage and out to the network. The Data Centric model employs a DPU to offload the data movement from the CPU so that the CPU can concentrate on application processing. The offload of some of the data movement is already available in Smart Storage and Smart NICs with the use of FPGA and NVMe technologies. Today, companies like NVIDIA and Intel are developing DPUs for that purpose. Data Centric Computing is about making it possible to process a huge amount of data by offloading the work of data acquisition and data movement.

How does Data Ops work with Data Centric Computing?

DataOps is enterprise data management for the AI era. It applies lessons learned from DevOps to data management and analytics. Effective deployment of DataOps has been shown to accelerate time to market for analytic solutions, improve data quality and compliance, and reduce the cost of data management. DataOps is not a product, service or solution. It's a methodology: a technological and cultural change to improve your organization's use of data through better collaboration and automation. Data Centric Computing offloads the work of acquiring and moving data, while DataOps focuses on extracting the value from that data.

Data Ops Framework
The framework for DataOps combines five essential elements that range from technologies up to full-on culture change.
  • The first element is enabling technologies, including IT automation and data management tools, many of which are probably in your enterprise already, as well as AI and machine learning (ML).
  • The second is an adaptive architecture that supports continuous innovations in major technologies, services and processes.
  • The third is enrichment of your data, putting it into useful context for accurate analysis. That means intelligent metadata that the system creates automatically, often at ingestion to save time later in your data pipeline.
  • The fourth is the DataOps methodology to build and deploy your analytics and data pipelines, following your data governance and model management.
  • The fifth element of a DataOps framework is the most important and most difficult: culture and people. To fulfill the potential of DataOps, you must have or build a culture of collaboration among your IT and cloud operations, data architecture and engineering, and data consumers such as data analysts and data scientists. Only then can DataOps put the right data in the right place at the right time to foster real business value.
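To make the third element more concrete, here is a minimal sketch of metadata that a system creates automatically at ingestion. It is an illustration only, not part of any particular DataOps product: the function name `enrich_at_ingestion`, the metadata field names, and the sample source `sensor-7` are all hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def enrich_at_ingestion(payload: bytes, source: str) -> dict:
    """Wrap a raw payload with automatically generated metadata.

    The field names below are hypothetical, chosen for illustration.
    """
    return {
        "payload": payload,
        "metadata": {
            # Where the data came from, supplied by the caller.
            "source": source,
            # When the record entered the pipeline (UTC, ISO 8601).
            "ingested_at": datetime.now(timezone.utc).isoformat(),
            # Size and content hash, computed once at ingestion.
            "size_bytes": len(payload),
            "sha256": hashlib.sha256(payload).hexdigest(),
        },
    }


# Hypothetical usage: a single sensor reading arriving in the pipeline.
record = enrich_at_ingestion(b"temperature=21.5", source="sensor-7")
print(record["metadata"])
```

Because the context is captured once as the data arrives, later stages of the pipeline can filter, deduplicate, or audit records by reading the metadata instead of re-scanning the payload.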

In Data Centric Computing, data is viewed as a critical and perpetual asset used in support of applications to produce deliverables. Applications come and go but the data model precedes the implementation of a given application and remains valid long after the application is gone.
Choosing a data-centric strategy for your business requires a shift in your approach to data management. Adopting the DataOps model is one way to make that shift.
