
Spark Installation

Tested on Ubuntu 14.04 LTS with Java 1.7.0_121 and Hadoop 2.6.0.

In order to install Apache Spark, we first need Scala. Follow Step 1 to download and install Scala.

Step 2: Spark Installation

Download the latest version of Spark from the Apache Spark downloads page. Select the release you want to use; there will be several options. Since we installed Hadoop 2.6, choose a Spark package pre-built for Hadoop 2.6. Option 4 is the download link; click it and save the file as shown in the image.
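As an alternative to the browser download, the same package can be fetched from the command line. The sketch below assumes the standard Apache release-archive URL layout and the Spark 2.1.0 / Hadoop 2.6 versions used in this guide; adjust the version variables to match the release you selected.

```shell
# Compose the archive URL for a Spark release (versions are assumptions;
# change them to match the package chosen on the downloads page).
SPARK_VER=2.1.0
HADOOP_VER=2.6
URL="https://archive.apache.org/dist/spark/spark-${SPARK_VER}/spark-${SPARK_VER}-bin-hadoop${HADOOP_VER}.tgz"
echo "$URL"

# Uncomment to download into ~/Downloads:
# wget -P ~/Downloads "$URL"
```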

Extract and Move:

    $ su - hduser

In the commands below, replace user with your own username:

    $ cd /home/user/Downloads
    $ sudo tar -zxvf spark-2.1.0-bin-hadoop2.6.tgz
    $ sudo mv spark-2.1.0-bin-hadoop2.6 /usr/local/spark

Edit .bashrc file:

    $ sudo nano ~/.bashrc

Add the following lines:

    #SPARK VARIABLES START
    export SPARK_HOME=/usr/local/spark
    export PATH=$SPARK_HOME/bin:$PATH
    #SPARK VARIABLES END

Press Ctrl+X, then Y, then Enter to save and exit.

To apply these changes to the current shell session, use the command below:

    $ source ~/.bashrc
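To confirm the variables took effect, the check below re-creates them in the current shell and verifies the PATH entry; the install path /usr/local/spark is assumed from the extract-and-move step.

```shell
# Define the same variables the .bashrc snippet sets (assumed install path).
export SPARK_HOME=/usr/local/spark
export PATH=$SPARK_HOME/bin:$PATH

# PATH should now contain Spark's bin directory:
case ":$PATH:" in
  *":/usr/local/spark/bin:"*) echo "PATH OK" ;;
  *) echo "PATH missing /usr/local/spark/bin" ;;
esac
```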

Now, to launch the Spark shell, type the following commands:

    $ cd /usr/local/spark/bin
    $ sudo ./spark-shell
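If spark-shell fails to start, it is worth checking that the launcher actually landed where we moved it. The helper below is a small sketch; /usr/local/spark is the install path assumed from the extract-and-move step, and check_launcher is a hypothetical name, not part of Spark.

```shell
# Report whether a launcher script is present and executable.
check_launcher() {
  if [ -x "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

# Assumed install path from the move step above:
check_launcher /usr/local/spark/bin/spark-shell
```

check_launcher simply wraps test -x, so it can be pointed at any binary path.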

If the installation was done correctly, you will get Spark's shell, as shown in the image below:

To exit the Spark shell, type :q (or :quit).

