Pig Installation
Tested on Ubuntu-14.04 LTS with hadoop-2.6.0
Prerequisites: Hadoop-2.x should be installed and working. In this tutorial, you will install Apache Pig 0.16.0.
Download Apache Pig
Follow the steps below to download Apache Pig.
First, download a stable release of Apache Pig from the project website: https://pig.apache.org/
Step 1:
Click “News”, then click the link to the release page.
Apache Pig’s release page opens. Scroll down to see short descriptions of the individual releases.
Step 2:
On the same page, under the Download section, click the link “Download a release now!”
A page listing mirror sites for the download appears. Click any of the mirror links.
Step 3:
A new tab opens with directory links for the recent Pig releases. To download an older version, click the archives link.
Step 4:
Click the pig-0.16.0 directory link. You will be redirected to Index of /apachemirror/pig/pig-0.16.0; there, click the pig-0.16.0.tar.gz link. Select Save File and click OK.
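On a machine without a browser, the same tarball can be fetched from the command line. The sketch below is only an illustration: it assumes the Apache archive keeps releases under https://archive.apache.org/dist/pig/pig-<version>/ (a location not mentioned in the steps above, so check it against the mirror page), and the fetch_pig helper name is hypothetical. By default it only prints the URL; the download itself runs when you call the helper.

```shell
# Assumed values: adjust the version to the release you want.
PIG_VERSION="0.16.0"
PIG_TARBALL="pig-${PIG_VERSION}.tar.gz"
# Assumption: the Apache archive stores releases under dist/pig/pig-<version>/.
PIG_URL="https://archive.apache.org/dist/pig/pig-${PIG_VERSION}/${PIG_TARBALL}"

# Hypothetical helper: download the tarball into the given directory.
fetch_pig() {
    target_dir="$1"
    mkdir -p "$target_dir" || return 1
    # -P tells wget which directory to save the file in.
    wget -P "$target_dir" "$PIG_URL"
}

echo "Tarball URL: ${PIG_URL}"
# Example call (needs network access):
#   fetch_pig "$HOME/Downloads"
```

Calling fetch_pig "$HOME/Downloads" would leave the file where Step 3 of the installation expects to find it.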
Install Apache Pig
After downloading the Apache Pig software, install it in your Linux environment by following the steps given below:
Step 1:
Log in as hduser (on this system, “hduser” is the user that was created when Hadoop was installed).
$ su - hduser
password:
Step 2:
Create a directory named pig next to the existing Hadoop installation. (On this system, Hadoop is installed in /usr/local/hadoop, so the pig directory is created in /usr/local while logged in as “hduser”.)
$ cd /usr/local
$ sudo mkdir pig
Step 3:
Go to the Downloads directory of the user who downloaded the file and untar pig-0.16.0.tar.gz. In the command below, “gopal” is that user's name; replace it with your own user name.
$ cd /home/gopal/Downloads
$ sudo tar -zxvf pig-0.16.0.tar.gz
Step 4:
Move the extracted pig-0.16.0 directory into the pig directory created earlier:
$ sudo mv pig-0.16.0 /usr/local/pig
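Steps 2 to 4 can also be collapsed into a single extraction, because tar can unpack straight into a target directory with its -C option. The following is a minimal sketch rather than the procedure used above: the unpack_pig helper is hypothetical, and it must run as a user who can write to the destination (for /usr/local/pig, run the equivalent mkdir and tar commands with sudo).

```shell
# Hypothetical helper: unpack a Pig tarball into a destination directory.
# Usage: unpack_pig <tarball> <destination>
unpack_pig() {
    pig_tarball="$1"   # e.g. /home/gopal/Downloads/pig-0.16.0.tar.gz
    pig_dest="$2"      # e.g. /usr/local/pig
    if [ ! -f "$pig_tarball" ]; then
        echo "tarball not found: $pig_tarball" >&2
        return 1
    fi
    mkdir -p "$pig_dest" || return 1
    # -C makes tar extract inside the destination, so no separate mv is needed.
    tar -xzf "$pig_tarball" -C "$pig_dest"
}
```

With the paths from this tutorial, the result is the same /usr/local/pig/pig-0.16.0 directory that the mv command produces.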
Configure Apache Pig
After installing Apache Pig, we have to configure it. Configuration involves two files: .bashrc and pig.properties.
Setup Environment
Edit .bashrc file
Open the .bashrc file using the following command (sudo is not needed, because the file belongs to the logged-in user):
$ gedit ~/.bashrc
Insert the following Pig variables into the .bashrc file:
#PIG VARIABLES START
export PIG_HOME=/usr/local/pig/pig-0.16.0
export PATH=$PATH:$PIG_HOME/bin
# Hadoop 2.x keeps its configuration in etc/hadoop (conf was the Hadoop 1.x layout)
export PIG_CLASSPATH=/usr/local/hadoop/etc/hadoop
#PIG VARIABLES END
Save the file, then reload .bashrc so that the changes take effect in the current shell:
$ source ~/.bashrc
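If you prefer not to edit the file by hand, the same block can be appended from a script. The sketch below is one possible approach, not part of the original procedure: add_pig_vars is a hypothetical helper, its three arguments are assumptions you supply yourself, and it uses the #PIG VARIABLES START marker to avoid adding the block twice.

```shell
# Hypothetical helper: append the Pig variables to a startup file once.
# Usage: add_pig_vars <rc-file> <pig-home> <hadoop-conf-dir>
add_pig_vars() {
    rc_file="$1"       # e.g. "$HOME/.bashrc"
    pig_home="$2"      # e.g. /usr/local/pig/pig-0.16.0
    hadoop_conf="$3"   # your Hadoop configuration directory
    # Skip the append when the start marker is already in the file.
    if grep -q '^#PIG VARIABLES START' "$rc_file" 2>/dev/null; then
        echo "Pig variables already present in $rc_file"
        return 0
    fi
    # Single quotes keep $PATH and $PIG_HOME literal so they expand at login.
    printf '%s\n' \
        '#PIG VARIABLES START' \
        "export PIG_HOME=$pig_home" \
        'export PATH=$PATH:$PIG_HOME/bin' \
        "export PIG_CLASSPATH=$hadoop_conf" \
        '#PIG VARIABLES END' >> "$rc_file"
}
```

Running it a second time leaves the file unchanged, which makes it safe to call from a provisioning script; you still need to run source ~/.bashrc afterwards.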
Verifying the Installation
Verify the installation of Apache Pig by running the version command. If the installation succeeded, the command prints the Apache Pig version:
$ cd /usr/local/pig/pig-0.16.0/bin
$ pig -version
The output reports the installed Apache Pig version (0.16.0 in this tutorial).
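When the version command fails, the cause is usually a missing PIG_HOME or a PATH that was not reloaded. The check below is a rough sketch for narrowing that down; check_pig is a hypothetical helper, and it only inspects the environment without starting Pig.

```shell
# Hypothetical helper: report whether the Pig launcher is reachable.
check_pig() {
    if [ -z "${PIG_HOME:-}" ]; then
        echo "PIG_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$PIG_HOME/bin/pig" ]; then
        echo "no executable found at $PIG_HOME/bin/pig" >&2
        return 1
    fi
    # command -v resolves the name the same way the shell would when run.
    if ! command -v pig >/dev/null 2>&1; then
        echo "pig is not on PATH" >&2
        return 1
    fi
    echo "pig found at $(command -v pig)"
}
```

If check_pig reports a problem, re-run source ~/.bashrc or open a new terminal and try again.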
Invoking the Grunt Shell
You can invoke the Grunt shell in the desired mode (local or MapReduce) using the -x option. To start it in local mode:
$ cd /usr/local/pig/pig-0.16.0/bin
$ pig -x local
Pig starts in local mode and displays the grunt> prompt. To start the Grunt shell in MapReduce mode instead:
$ cd /usr/local/pig/pig-0.16.0/bin
$ pig -x mapreduce
Pig starts in MapReduce mode, using the configured Hadoop installation, and displays the grunt> prompt.
You can exit the Grunt shell using Ctrl + D.
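In scripts it can be handy to choose the value for -x automatically. The sketch below is a simple heuristic, not something Pig does itself: the hypothetical pick_pig_mode helper suggests mapreduce when a hadoop command is on the PATH and local otherwise. Finding the command does not prove that the cluster is running, so treat the result as a suggestion.

```shell
# Hypothetical helper: choose a Pig execution mode for the -x option.
pick_pig_mode() {
    if command -v hadoop >/dev/null 2>&1; then
        echo "mapreduce"
    else
        echo "local"
    fi
}

echo "Suggested command: pig -x $(pick_pig_mode)"
```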