In this session we will learn the basics of Pig: what Pig is, the Pig architecture, Pig Latin scripts, basic Pig operations, loading data into Pig, grouping, filtering, sorting, functions in Pig, joins in Pig, and storing and exporting data out of Pig. In short, this tutorial covers Pig and its ecosystem in detail. It will teach you how to use the Pig tool within the Hadoop ecosystem; once you have learned the basics, you can move on to more advanced operations.
As we already know, MapReduce has some issues. The first is that one needs to be an expert in the Java programming language to write MapReduce code efficiently. The second is that every problem has to be recast in the MapReduce framework. It is not like a normal program, where one can simply write the code in the traditional way: every program must be converted into a map step that runs locally, whose output is then collected and passed to a reduce step. This makes MapReduce programs tough to write.

Data scientists and analysts are not expected to know Java as well as a professional Java developer does, so we need a tool within the Hadoop ecosystem that will run the MapReduce job without our actually writing the Java code. Hive does this by converting queries into MapReduce code, and there is one more tool that works in a similar way, called Pig. Pig is a high-level scripting language: code is written as a Pig Latin script, which is internally converted into MapReduce tasks. For a big data analyst this makes Pig a very useful tool. This is what is meant by MapReduce made easy.
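To make the map-then-reduce restructuring described above more concrete, here is a minimal sketch of a word count written in that style. It is plain Python, not Hadoop's Java API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made up for illustration. It only shows how even a simple task has to be split into a map step, a grouping step, and a reduce step.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the job done between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in order over an in-memory list of lines."""
    emitted = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(emitted)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["pig makes map reduce easy", "map reduce is hard"]  # made-up input
    print(word_count(sample))
```

In a real Hadoop job the map and reduce steps would run on different machines and the grouping would be handled by the framework. With Pig, this restructuring is what gets generated from the script instead of being written by hand.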
Map Reduce Made easy
In short, Pig is a simple scripting tool and a powerful alternative to writing MapReduce code directly. Apache Pig is an abstraction over MapReduce. Pig works very well for certain classes of problems, such as web log analysis and text mining. Pig can handle datasets that are slightly unstructured or semi-structured, unlike Hive, which will fail if the datasets are not in a properly structured format.
The applications of Pig are as follows:
In this session we will discuss Pig in detail, along with Pig Latin and how to write Pig Latin scripts. Both Hive and Pig have their own advantages and disadvantages: for some types of problems Hive is better, and for other classes of problems Pig is better, so there is no competition between the two. Both tools are used to reach the same kind of result, but their approaches differ, so the data scientist or analyst should decide which one suits the goal at hand. To interact with Pig we need to learn a new language, Pig Latin. It has a limited number of commands and operators and a very simple syntax, so not much time needs to be spent on learning it.
So now we will learn about Pig Latin, which is necessary to interact with Pig. For writing data analysis programs, Pig provides a high-level language known as Pig Latin. As said before, Pig Latin has very few keywords and operators and is very simple to learn. There are many built-in operators for joining, filtering and ordering; we just need to call the right operator for the right task.

Pig Latin also provides nested data types such as tuples, bags and maps, which are missing from MapReduce. So what exactly are nested data types? A bag consists of tuples, and a map consists of key-value pairs, so these types can contain one another. Their use will become clearer once we start writing Pig Latin scripts.

Pig also allows us to write user-defined functions. We can write our own functions for reading, writing, processing or reporting, and then use them in Pig, where they will be internally converted into MapReduce code. This is a really powerful feature of Pig, and it lets us cover our own business requirements.
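The nested data types and the filter, group and order operations mentioned above can be modelled roughly as follows. This is a sketch in plain Python, not Pig Latin syntax and not Pig's API: Python tuples, lists and dicts stand in for Pig tuples, bags and maps, and the records and the helper names (`filter_by`, `group_by`, `order_by`) are hypothetical.

```python
from collections import defaultdict

# A tuple is an ordered set of fields. Hypothetical fields: (user, url, status).
# A bag is a collection of tuples; a Python list stands in for it here.
records = [
    ("alice", "/home", 200),
    ("bob", "/cart", 500),
    ("alice", "/cart", 200),
    ("carol", "/home", 404),
]

# A map is a set of key-value pairs; a Python dict stands in for it.
request_info = {"method": "GET", "agent": "curl"}


def filter_by(bag, predicate):
    """Keep only the tuples for which the predicate is true."""
    return [t for t in bag if predicate(t)]


def group_by(bag, key_index):
    """Return a bag of (key, inner_bag) tuples: a bag nested inside a tuple."""
    groups = defaultdict(list)
    for t in bag:
        groups[t[key_index]].append(t)
    return [(key, inner) for key, inner in groups.items()]


def order_by(bag, key_index):
    """Sort the tuples of a bag by one field, ascending."""
    return sorted(bag, key=lambda t: t[key_index])


# Drop server errors, group the rest by url, count each group, sort by url.
not_errors = filter_by(records, lambda t: t[2] < 500)
grouped = group_by(not_errors, 1)
counts = order_by([(url, len(inner)) for url, inner in grouped], 0)

if __name__ == "__main__":
    print(grouped)
    print(counts)
```

The point to notice is the shape of `grouped`: each element is a tuple whose second field is itself a bag of the original tuples. This is the kind of nesting the text describes.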