Apache Pig is a platform for analyzing large data sets, and it is usually run on top of Hadoop. Its scripting language is called Pig Latin. Pig can execute its Hadoop jobs on MapReduce, Apache Tez, or Apache Spark. Pig Latin resembles SQL as used in relational database management systems, and it can be extended with user-defined functions written in Java, Python, JavaScript, Ruby, or Groovy. This guide shows how to install Apache Pig on Ubuntu 16.04. The official Pig website is useful for documentation and downloads:

    https://pig.apache.org
How To Install Apache Pig On Ubuntu 16.04
Apache Pig itself is very easy to install, but you must already have Apache Hadoop and Java installed on the instance. You can follow our previous guide on installing Apache Hadoop and Java before proceeding.
Download the latest version of Pig from the official website (at the time of writing this tutorial, pig-0.17.0 is the latest). You can fetch the pig-0.17.0.tar.gz file with wget. We will keep Pig in /usr/local/pig. Extracting the archive and moving it to that location is the whole installation. With the archive in the current directory, run:
    tar -xzf pig-0.17.0.tar.gz
    sudo mv pig-0.17.0 /usr/local/pig
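The download step is described above but not shown as a command. A minimal sketch follows, assuming the release is fetched from the Apache archive server; the URL layout and the helper name `pig_download_url` are assumptions for illustration, so check the download page on the official website for the current link.

```shell
# Hypothetical helper: build the download URL for a given Pig version.
# Assumption: releases are laid out as pig-<version>/pig-<version>.tar.gz
# on the Apache archive server.
pig_download_url() {
  version="$1"
  printf 'https://archive.apache.org/dist/pig/pig-%s/pig-%s.tar.gz\n' "$version" "$version"
}

# Print the URL for the version used in this guide.
pig_download_url 0.17.0
```

To actually download the archive, pass the result to wget, for example `wget "$(pig_download_url 0.17.0)"`, and then run the tar and mv commands shown above.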
Now we need to add Pig to $PATH in .bashrc:

    nano ~/.bashrc
Add PIG_HOME and append its bin directory to PATH:

    export PIG_HOME=/usr/local/pig
    export PATH=$PATH:$PIG_HOME/bin
Save the file and reload .bashrc:

    source ~/.bashrc
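After reloading, it can be worth confirming that the variables took effect before starting Pig. The function below is a small sketch of such a check; the name `check_pig_env` is hypothetical, and it only assumes the layout used in this guide, where the launcher sits at $PIG_HOME/bin/pig.

```shell
# Hypothetical helper: verify the environment set up in .bashrc.
check_pig_env() {
  # PIG_HOME must be set and non-empty.
  if [ -z "${PIG_HOME:-}" ]; then
    echo "PIG_HOME is not set" >&2
    return 1
  fi
  # The pig launcher must exist and be executable.
  if [ ! -x "$PIG_HOME/bin/pig" ]; then
    echo "no executable pig launcher in $PIG_HOME/bin" >&2
    return 1
  fi
  # $PIG_HOME/bin must be one of the PATH entries.
  case ":$PATH:" in
    *":$PIG_HOME/bin:"*) echo "ok" ;;
    *) echo "$PIG_HOME/bin is not on PATH" >&2; return 1 ;;
  esac
}
```

Running `check_pig_env` in a new terminal prints "ok" when all three conditions hold; otherwise it prints the first problem it finds and returns a non-zero status.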
Pig can run in two modes: local mode and MapReduce mode. To run Apache Pig in local mode, run:

    pig -x local

To run Apache Pig in MapReduce (cluster) mode, which is the default, run:

    pig

To check the version of Pig, run:

    pig -version
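As a quick smoke test of the installation, a short Pig Latin script can be run in local mode. The sketch below is only an illustration: the file names people.csv and adults.pig, the output directory adults_out, and the sample data are all made up for this example. It loads a small CSV file, keeps the rows with an age of 18 or more, and stores the result.

```shell
# Work in a scratch directory so nothing else is touched.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Hypothetical sample data: name,age pairs.
cat > people.csv <<'EOF'
alice,31
bob,17
carol,45
EOF

# Pig Latin script: load the CSV, keep adults, write the result.
cat > adults.pig <<'EOF'
people = LOAD 'people.csv' USING PigStorage(',') AS (name:chararray, age:int);
adults = FILTER people BY age >= 18;
STORE adults INTO 'adults_out' USING PigStorage(',');
EOF

# Run the script in local mode only if pig is available on PATH.
if command -v pig >/dev/null 2>&1; then
  pig -x local adults.pig
  cat adults_out/part-*
else
  echo "pig not found on PATH; skipping run"
fi
```

Local mode reads and writes the local file system, so no HDFS setup is needed for this check. If the run succeeds, the filtered rows are written to part files inside the adults_out directory.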