Apache BigTop can be confusing to many new users for a simple reason: people know about Hadoop without knowing about its many components and tools. Apache Ambari is another tool that confuses new users. Apache BigTop is a Big Data management distribution, and the official website has good information to clarify any doubts. Here are the SSH commands showing how to install Apache BigTop on Ubuntu 16.04.
How to Install Apache BigTop on Ubuntu 16.04
First, update and upgrade the Ubuntu operating system:
apt update
apt upgrade
Then install Java:
sudo add-apt-repository ppa:webupd8team/java
apt update
apt install oracle-java8-installer
The Java home directory is supposed to be at /usr/lib/jvm/java-8-oracle/. We can also set this Oracle Java as the system default with this package:
apt install oracle-java8-set-default
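To confirm that Java is installed correctly, a quick sanity check (the exact version string will vary with the current Oracle Java 8 release):

java -version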
Then install Maven:
apt install maven
The Maven home directory is supposed to be at /usr/share/maven. Next, we need to set the environment by appending a few lines to /etc/bash.bashrc (note the single quotes on the PATH line, so the variables are expanded when the file is sourced rather than when it is written):
echo "export JAVA_HOME=/usr/lib/jvm/java-8-oracle/" >> /etc/bash.bashrc
echo "export M2_HOME=/usr/share/maven" >> /etc/bash.bashrc
echo 'export PATH=$M2_HOME/bin:$JAVA_HOME/bin:$PATH' >> /etc/bash.bashrc
source /etc/bash.bashrc
mvn -v
Next, install the Apache BigTop GPG key; in this example we have used version 1.1.0:
wget -O- http://archive.apache.org/dist/bigtop/bigtop-1.1.0/repos/GPG-KEY-bigtop | sudo apt-key add -
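To verify the key was imported, we can search the apt keyring (an optional check; the exact uid string of the key is an assumption and may differ):

apt-key list | grep -i bigtop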
Grab the repo file; again, in this example we have used version 1.1.0. Note that this list file points at the trusty (Ubuntu 14.04) repository, which we use here on 16.04 as well:
wget -O /etc/apt/sources.list.d/bigtop-1.1.0.list http://archive.apache.org/dist/bigtop/bigtop-1.1.0/repos/trusty/bigtop.list
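It is worth a quick look at what was written, to confirm the download succeeded and to see which repository apt will use:

cat /etc/apt/sources.list.d/bigtop-1.1.0.list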
Next, update the apt cache:
apt update
Then install the BigTop utilities:
apt install bigtop-utils
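Before deciding what to install, we can list the Hadoop-related packages the BigTop repository provides:

apt-cache search hadoop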
We can install the full Hadoop stack or only parts of it, for example:
apt install hadoop* flume-* oozie* hive*
The next part is how we will run it. First, we can format the NameNode this way:
sudo /etc/init.d/hadoop-hdfs-namenode init
Then we will start the necessary Hadoop services:
for i in hadoop-hdfs-namenode hadoop-hdfs-datanode ; do sudo service $i start ; done
sudo service hadoop-yarn-resourcemanager start
sudo service hadoop-yarn-nodemanager start
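To check that everything came up, we can query the status of each service with a simple verification loop (same service names as installed above):

for i in hadoop-hdfs-namenode hadoop-hdfs-datanode hadoop-yarn-resourcemanager hadoop-yarn-nodemanager ; do sudo service $i status ; done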
Then we will check the Hadoop filesystem:
sudo hdfs dfs -ls /
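As a minimal smoke test, we can write a small file into HDFS and read it back, assuming the HDFS superuser is hdfs (the BigTop default):

sudo -u hdfs hdfs dfs -mkdir -p /user/root
sudo -u hdfs hdfs dfs -put /etc/hostname /user/root/
sudo -u hdfs hdfs dfs -cat /user/root/hostname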
Next, we will prepare to run Hive:
hadoop fs -mkdir /tmp
hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
In case these directories have not been created on the local filesystem, we will run:
sudo mkdir /var/run/hive
sudo mkdir /var/lock/subsys
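If the Hive service later fails to start for permission reasons, the runtime directory may need to be owned by the hive user (an assumption; adjust the owner to whatever user your Hive service runs as):

sudo chown hive:hive /var/run/hive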
Finally, we can run:
sudo /etc/init.d/hive-server start
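As a quick smoke test with the Hive CLI, assuming the hive client came in with the hive* packages above (the table name bigtop_test is just an example):

hive -e "CREATE TABLE IF NOT EXISTS bigtop_test (id INT);"
hive -e "SHOW TABLES;"
hive -e "DROP TABLE bigtop_test;"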