MapReduce programming is not easy
- MapReduce programs are written in Java
- Not everyone has a Java background
- Writing MapReduce code for intermediate and complex problems requires Java expertise
- Hadoop alone can't make our life easy
- While HDFS and MapReduce solve the problem of handling big data, writing MapReduce code is not easy
- One has to understand the logic of the overall algorithm and then convert it to the MapReduce format.
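To make the conversion step concrete, here is a minimal sketch (in plain Python, not actual Hadoop code) of the classic word-count algorithm restructured into map, shuffle, and reduce phases; the input lines are made up for illustration:

```python
from collections import defaultdict

# Hypothetical input: each element is one line of text.
lines = ["big data is big", "hadoop handles big data"]

# Map phase: emit a (word, 1) pair for every word in every line.
def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield (word, 1)

# Shuffle phase: group all values by key, as Hadoop does between map and reduce.
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce phase: sum the grouped counts for each word.
def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 3, 'data': 2, 'is': 1, 'hadoop': 1, 'handles': 1}
```

Even for this simple counting task, the algorithm has to be rethought as key-value emission and aggregation; in real Hadoop the same logic is spread across Mapper and Reducer Java classes, which is where the difficulty lies.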
Tools in the Hadoop Ecosystem
- If we don't know Java, how do we write MapReduce programs?
- Is there a tool to interact with the Hadoop HDFS environment and handle data operations without writing complicated code?
- Yes, the Hadoop ecosystem has many such utility tools.
- For example, Hive gives us an SQL-like programming interface and converts our queries to MapReduce jobs
Hadoop Ecosystem Tools
- Hive
- Pig
- Sqoop
- Flume
- HBase
- Zookeeper
- Oozie
- Mahout
and many more!!!
Hadoop Ecosystem Tools – Hive
- Hive is for data analysts with strong SQL skills, providing an SQL-like interface and a relational data model
- Hive uses a language called HiveQL, which is very similar to SQL
- Hive translates queries into a series of MapReduce jobs
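Running Hive itself requires a cluster, but the idea can be illustrated locally: a single declarative SQL query replaces the map (emit key-value pairs) and reduce (aggregate per key) code we would otherwise write by hand. This sketch uses Python's built-in sqlite3 with a made-up `page_views` table; in Hive, a near-identical HiveQL query would be compiled into MapReduce jobs over data in HDFS:

```python
import sqlite3

# Hypothetical table of page views; in Hive this data would live in HDFS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (user TEXT, url TEXT)")
conn.executemany("INSERT INTO page_views VALUES (?, ?)",
                 [("alice", "/home"), ("bob", "/home"), ("alice", "/about")])

# One declarative query replaces a hand-written map (emit (url, 1))
# and reduce (sum the counts per url).
rows = conn.execute(
    "SELECT url, COUNT(*) AS views FROM page_views "
    "GROUP BY url ORDER BY views DESC"
).fetchall()
print(rows)  # [('/home', 2), ('/about', 1)]
```

The analyst writes only the query; the translation into parallel map and reduce stages is Hive's job.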
Hadoop Ecosystem Tools – Pig
- Pig is a high-level platform for processing big data on Hadoop clusters.
- Pig consists of a data flow language, called Pig Latin, for writing queries over large datasets, and an execution environment for running those programs from a console
- Pig Latin programs consist of a series of dataset transformations that are converted, under the covers, into a series of MapReduce jobs
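To show what "a series of dataset transformations" means, here is a rough Python analogue of a hypothetical Pig Latin script (LOAD, then FILTER, then GROUP and aggregate); the records and field names are invented for illustration, and real Pig would run each step as MapReduce over HDFS data:

```python
from itertools import groupby

# Hypothetical records, as a Pig LOAD might produce: (user, age, clicks).
records = [("alice", 25, 10), ("bob", 17, 3), ("carol", 32, 7), ("dave", 25, 5)]

# FILTER step (Pig: adults = FILTER records BY age >= 18;).
adults = [r for r in records if r[1] >= 18]

# GROUP and aggregate step (Pig: GROUP adults BY age; then SUM(clicks)).
adults.sort(key=lambda r: r[1])  # groupby needs records sorted by the key
clicks_by_age = {age: sum(r[2] for r in grp)
                 for age, grp in groupby(adults, key=lambda r: r[1])}
print(clicks_by_age)  # {25: 15, 32: 7}
```

Each named intermediate result (`adults`, `clicks_by_age`) mirrors a Pig Latin relation: the program is a pipeline of transformations rather than a set of Java classes.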