priyanka makke

Integrating Apache Hive with Kafka, Spark, and BI

Discussion created by priyanka makke on Mar 19, 2019

WHAT IS APACHE HIVE?

Apache Hive is a data warehouse system built on top of Apache Hadoop that facilitates easy data summarization, ad-hoc queries, and the analysis of large datasets stored in various databases and file systems that integrate with Hadoop, including the MapR Data Platform with MapR XD and MapR Database. Hive offers a simple way to apply structure to large amounts of unstructured data and then run batch SQL-like queries on that data. Hive integrates easily with traditional data center technologies through the familiar JDBC/ODBC interfaces.
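To make "apply structure, then query" more concrete, here is a minimal sketch in Python that only builds the HiveQL text for those two steps. The table name web_logs, its columns, and the path /data/web_logs are hypothetical and chosen for illustration; the helper functions are not part of Hive. The resulting statements could be submitted through any JDBC/ODBC client or the Hive command line.

```python
# Hypothetical names throughout: the table "web_logs", its columns, and the
# path "/data/web_logs" are illustrative and do not come from the discussion.

def create_external_table(table, columns, location, delimiter="\\t"):
    """Build HiveQL that lays a schema over files already sitting in storage."""
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )

def summarize_by(table, key, limit=10):
    """Build a batch-style HiveQL summary query: row counts per key value."""
    return (
        f"SELECT {key}, COUNT(*) AS hits\n"
        f"FROM {table}\n"
        f"GROUP BY {key}\n"
        f"ORDER BY hits DESC\n"
        f"LIMIT {limit}"
    )

ddl = create_external_table(
    "web_logs",
    [("ip", "STRING"), ("url", "STRING"), ("status", "INT")],
    "/data/web_logs",
)
query = summarize_by("web_logs", "status")

print(ddl)
print(query)
```

Because the table is declared EXTERNAL, Hive only records the schema and the location; the underlying files stay where they are, which is what lets the same data be queried without first loading it into a separate database.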

 

Merits of Hive

  • Hive was initially developed at Facebook to summarize, query, and analyze large amounts of data stored on a distributed file system. Hive makes it easy for non-programmers to read, write, and manage large datasets residing in distributed Hadoop storage using HiveQL, its SQL-like query language. Hive has gained popularity because of its ease of use and its compatibility with existing business applications through ODBC.

WHY HIVE WITH MAPR MATTERS TO YOU

  • DATA ANALYSTS
  • BI/DATA WAREHOUSE TEAMS
  • ENTERPRISE DATA ARCHITECTS
  • IT/STORAGE ADMINISTRATORS

 

BEYOND HIVE AND HADOOP

  • The MapR Database Connector for Apache Spark enables you to use MapR Database as a sink for Spark Structured Streaming or Spark Streaming.
  • The Spark MapR Database Connector enables users to perform complex SQL queries and updates on top of MapR Database, while applying critical techniques such as projection and filter pushdown, custom partitioning, and data locality.
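Projection and filter pushdown, mentioned above, can be illustrated with a toy model in plain Python. This is a sketch of the idea only: ToyTable and its scan method are hypothetical and are not the connector's API. The point is that the storage side drops unneeded rows and columns before anything is sent to the query engine.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]

class ToyTable:
    """Hypothetical stand-in for a table held by a storage system."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = list(rows)

    def scan(
        self,
        columns: Optional[List[str]] = None,
        predicate: Optional[Callable[[Row], bool]] = None,
    ) -> List[Row]:
        """Return rows, applying the filter and projection at the source."""
        out: List[Row] = []
        for row in self._rows:
            if predicate is not None and not predicate(row):
                continue  # filter pushdown: the row never leaves storage
            if columns is None:
                out.append(dict(row))
            else:
                # projection pushdown: keep only the requested columns
                out.append({c: row[c] for c in columns})
        return out

table = ToyTable([
    {"id": 1, "city": "Austin", "amount": 40},
    {"id": 2, "city": "Boston", "amount": 75},
    {"id": 3, "city": "Austin", "amount": 120},
])

# Without pushdown: fetch every row and column, then filter on the client.
everything = table.scan()
client_side = [
    {"id": r["id"], "amount": r["amount"]}
    for r in everything
    if r["amount"] > 50
]

# With pushdown: the source returns only the matching rows and columns.
pushed = table.scan(columns=["id", "amount"], predicate=lambda r: r["amount"] > 50)

print(len(everything), len(pushed))
```

Both approaches produce the same answer, but the pushed-down scan moves two narrow rows instead of three full ones. On large tables that difference is what reduces network and memory cost.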

 

If you want to know more about this topic, I recommend visiting Mindmajix for a better understanding. It can also help you advance your career.
