Spark Installation
Statinfer
Tested on Ubuntu 14.04 LTS with Java version “1.7.0_121” and Hadoop-2.6.0
To install Apache Spark, Scala must be installed first. Follow Step 1 to download and install Scala.
Step 2: Spark Installation
Download Spark by visiting the Apache Spark downloads page and selecting the version you want to use. There are three selections to make on the page; because Hadoop 2.6 was installed earlier, choose the Spark package pre-built for Hadoop 2.6 (spark-2.1.0-bin-hadoop2.6.tgz). Option 4 is the download link; select it and save the file as shown in the image.
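The package name and download URL follow a predictable pattern, so the download can also be scripted. A minimal sketch is below; the archive.apache.org mirror path is an assumption and should be verified against the official downloads page before relying on it:

```shell
# Build the package name from the versions used in this guide.
SPARK_VERSION=2.1.0
HADOOP_PROFILE=hadoop2.6
PKG="spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}.tgz"

# The Apache archive keeps past releases (mirror path is an assumption;
# check it against the official downloads page before use).
URL="https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${PKG}"
echo "$URL"
# Then fetch it with: wget "$URL"
```

Pinning the version in a variable keeps the filename, URL, and later tar/mv commands consistent if you switch releases.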
Extract and Move:
$ su - hduser
In the command below, replace user with your own login username:
$ cd /home/user/Downloads
$ sudo tar -zxvf spark-2.1.0-bin-hadoop2.6.tgz
$ sudo mv spark-2.1.0-bin-hadoop2.6 /usr/local/spark
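The tar flags used above can be tried out safely with a stand-in tarball before touching the real archive; the directory names below mirror the Spark package but are placeholders only:

```shell
# Create a throwaway tarball shaped like the Spark package (placeholder only).
mkdir -p staging/spark-2.1.0-bin-hadoop2.6
echo "demo" > staging/spark-2.1.0-bin-hadoop2.6/RELEASE
tar -C staging -czf spark-demo.tgz spark-2.1.0-bin-hadoop2.6

# -z gunzip, -x extract, -v list files as they unpack, -f named archive:
tar -zxvf spark-demo.tgz
# mv then relocates the extracted tree, as done with /usr/local/spark above.
mv spark-2.1.0-bin-hadoop2.6 spark-extracted
cat spark-extracted/RELEASE
```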
Edit .bashrc file:
$ sudo nano ~/.bashrc
Add the following lines:
#SPARK VARIABLES START
export SPARK_HOME=/usr/local/spark
export PATH=$SPARK_HOME/bin:$PATH
#SPARK VARIABLES END
Press Ctrl+X, then Y, to save and exit. To apply these changes to the current session, run:
$ source ~/.bashrc
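Spark's own launch scripts look for a variable named SPARK_HOME (with an underscore), so the two lines are typically exported as below. The snippet re-exports them directly and verifies they took effect; in a real session, source ~/.bashrc does the exporting for you:

```shell
# The two lines as they should appear in ~/.bashrc (note SPARK_HOME):
export SPARK_HOME=/usr/local/spark
export PATH=$SPARK_HOME/bin:$PATH

# Verify: SPARK_HOME resolves and its bin directory is on PATH.
echo "$SPARK_HOME"
case ":$PATH:" in
  *":$SPARK_HOME/bin:"*) echo "PATH contains SPARK_HOME/bin" ;;
  *)                     echo "PATH is missing SPARK_HOME/bin" ;;
esac
```

If the verification fails, re-check the spelling of the variable names in ~/.bashrc and source it again.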
Now to get the shell of spark, type the following commands:
$ cd /usr/local/spark/bin
$ sudo ./spark-shell
If the installation succeeded, you will see the Spark shell prompt, as shown in the image below:
To exit the Spark shell, type :q (short for :quit).