Install Apache Mahout : Ubuntu 16.04 For Machine Learning Dev

Abhishek Ghosh

By Abhishek Ghosh July 14, 2017 8:35 pm Updated on July 14, 2017

Apache Mahout is a simple programming environment and also a framework for building algorithms that run on Scala, Apache Spark, H2O, Apache Flink and so on. Samsara, an experimentation environment with R-like syntax, is part of Mahout. Here is how to install Apache Mahout on Ubuntu 16.04 for machine learning development. This guide shows commands to give the correct idea, not exact commands to copy and paste into a terminal, because this is not like installing LAMP and WordPress, where users need version-specific commands. Apache Mahout has practically no usable guide for a new user to get started; filling that gap is the purpose of this guide.

There are a lot of resources around Mahout:

https://mahout.apache.org/general/books-tutorials-and-talks.html

Install Apache Mahout : Literal Meaning Is Not Bright

Mahout and Samsara are words derived from Sanskrit. Mahout comes from the Sanskrit word mahamatra. In Bengali, Hindi and related languages, a mahout is one of the poor people who are employed, much like chauffeurs, to keep elephants: trainer, keeper, cleaner, feeder. It is a family profession in the Indian subcontinent. By the same logic, in standard European culture we would name some software “chauffeur”. Software is usually named after fruits, flowers and so on. If you search for India Mahout or India Mahoot, you will see real examples of human mahouts in 2017. People who perform manual work are given no credit for intelligence. We should respect everyone, but politeness is often confused with weakness. Samsara means family. For example, my parents, my pets and I are our “samsara”. If someone asks me “when will you do samsara”, in polite language the person is asking when I will get married. After my marriage, my parents, my wife, our pets and I will be our “samsara”. Then my wife will shout: “Look after our samsara instead of writing blogs for people like a king”. A pet elephant, horse, parakeet, cat or dog behaves like the humans who are closest to it, the owners. We do not “tame” them; they are domesticated. Apache ML would work fine as a name. A rose by any other name would smell as sweet.

It is a disagreeable, disrespectful nomenclature. Whoever chose the name needs to be thrown in front of angry pet elephants to understand the basics.

Install Apache Mahout : Steps

To build or install Apache Mahout we need at least Java, Maven, Subversion and Git. Install Subversion and Git yourself from the Ubuntu repository.
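Before going further, it may help to see which of these tools are already present. The following is only a minimal sketch: the function name missing_tools is made up for this guide, and it assumes the usual command names java, mvn, svn and git.

```shell
# Report which of the given commands are missing from PATH.
# Prints one missing command per line; prints nothing if all are present.
missing_tools() {
    for _tool in "$@"; do
        if ! command -v "$_tool" >/dev/null 2>&1; then
            echo "$_tool"
        fi
    done
}

# Tools named in this guide: java, mvn (Maven), svn (Subversion), git.
missing=$(missing_tools java mvn svn git)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
    echo "Subversion and Git can be installed with: sudo apt-get install subversion git"
else
    echo "All prerequisite tools found."
fi
```

Java and Maven are covered in the next steps, so seeing them listed as missing at this point is expected.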

Hadoop and/or Spark are not basic requirements to run Apache Mahout; some algorithms may run on a single server. But the algorithms are related to Hadoop and/or Spark, and testing the Spark-based algorithms is encouraged. So it is practical to build or install Mahout on an existing Hadoop or Spark based Big Data platform, such as the one from our guide to install Apache Spark.

Java is needed to run Hadoop, Hadoop is used by Mahout, and Maven (mvn) is the common build tool.

To install Oracle Java, go to the official web page for the latest version. Prototype commands are shown:

# note: this PPA was discontinued later; openjdk-8-jdk from the Ubuntu repo is an alternative
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
## or: sudo apt-get install oracle-java9-installer
# choose the default Java and copy the path it shows
sudo update-alternatives --config java
# /etc/environment is owned by root, so edit it with sudo
sudo nano /etc/environment
# add this line at the end of the file
JAVA_HOME="/usr/lib/jvm/java-8-oracle"
# reload and verify
source /etc/environment
echo $JAVA_HOME
java -version
## alternative: manual JDK install for all users (example archive jdk-7u45)
## http://docs.oracle.com/javase/7/docs/webnotes/install/linux/linux-jdk.html#install-64
# sudo cp jdk-7u45-linux-x64.gz /usr/local/lib/
# cd /usr/local/lib && sudo tar -xzvf jdk-7u45-linux-x64.gz
nano ~/.bashrc
# add the bin directory of the JDK you unpacked, in this form:
# export PATH=$PATH:/usr/java/jdk1.5.0_07/bin
source ~/.bashrc

You can also add the path to ~/.bash_profile or ~/.bashrc, as we did with Java in our Big Data tutorials.
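If you prefer not to open an editor, a small helper can add such a line only when it is not there yet, so running it twice does no harm. This is a sketch, not part of any official tooling; the name append_once is made up for this guide and the Java path is the one used above.

```shell
# Append a line to a file only if that exact line is not already there.
# usage: append_once "<line>" <file>
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -F: fixed string, -x: match the whole line, -q: quiet
    if ! grep -Fxq -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example calls (uncomment and adjust the path to your own install):
# append_once 'export JAVA_HOME="/usr/lib/jvm/java-8-oracle"' "$HOME/.bashrc"
# append_once 'export PATH=$PATH:$JAVA_HOME/bin' "$HOME/.bashrc"
```

The single quotes keep $PATH and $JAVA_HOME unexpanded, so they are evaluated each time the shell starts rather than once when the line is written.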

We need output like this from java -version (ABCD and XYZ stand for your own version and build numbers):

java version "ABCD"
Java(TM) SE Runtime Environment (build ABCD)
Java HotSpot(TM) 64-Bit Server VM (build XYZ, mixed mode)

Next, we need to install the current version of Maven. Use wget to download it and tar -xzvf to unpack it. You can read our other guides from the links above. Ultimately you will add the path to ~/.bash_profile or ~/.bashrc like this:

nano ~/.bashrc
# add these two lines (adjust the version to the one you unpacked)
export MAVEN_HOME=/usr/local/lib/apache-maven-3.3.3
export PATH=$PATH:$MAVEN_HOME/bin
# reload and verify
source ~/.bashrc
mvn --version
## older Maven versions used M2_HOME instead
# export M2_HOME=/usr/local/apache-maven-3.0.4
# export M2=$M2_HOME/bin
# export PATH=$M2:$PATH
# export JAVA_HOME=$HOME/programs/jdk
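The download and unpack step mentioned above could look roughly like this. It is a sketch under assumptions: the function names maven_url and install_maven are made up for this guide, the version is the one from the paths above, and the URL layout of the Apache archive is assumed, so check the printed URL before downloading.

```shell
# Version taken from the paths used in this guide; change it to the release you want.
MAVEN_VERSION="3.3.3"

# Assumption: the Apache archive keeps Maven 3 binaries under this layout.
maven_url() {
    echo "https://archive.apache.org/dist/maven/maven-3/$1/binaries/apache-maven-$1-bin.tar.gz"
}

# Download the archive to /tmp and unpack it under /usr/local/lib.
install_maven() {
    wget "$(maven_url "$1")" -O "/tmp/apache-maven-$1-bin.tar.gz" &&
    sudo tar -xzvf "/tmp/apache-maven-$1-bin.tar.gz" -C /usr/local/lib
}

# Print the URL first so it can be checked in a browser before downloading.
maven_url "$MAVEN_VERSION"
# Then run: install_maven "$MAVEN_VERSION"
```

Unpacking with -C /usr/local/lib should give the /usr/local/lib/apache-maven-3.3.3 directory that MAVEN_HOME points to.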

Now comes the Apache Mahout installation step itself. The readme in this repo:

https://github.com/apache/mahout

takes it for granted that everyone knows what they are working with; its instructions start after everything we have written above. I got misguided and installed Hadoop and Spark for Apache Mahout on a colocation server.

git clone https://github.com/apache/mahout.git mahout
# edit your environment in ~/.bash_profile or ~/.bashrc
export MAHOUT_HOME=/path/to/mahout
# for running on standalone server
export MAHOUT_LOCAL=true
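After editing the environment, a quick sanity check can confirm that MAHOUT_HOME really points at the cloned directory. This is only a sketch: the function name is_mahout_home is made up for this guide, and it assumes the checkout contains the bin/mahout launcher script.

```shell
# Return 0 if the given directory looks like a Mahout checkout,
# i.e. it contains the bin/mahout launcher script.
is_mahout_home() {
    [ -n "$1" ] && [ -f "$1/bin/mahout" ]
}

if is_mahout_home "${MAHOUT_HOME:-}"; then
    echo "MAHOUT_HOME looks fine: $MAHOUT_HOME"
    echo "MAHOUT_LOCAL is: ${MAHOUT_LOCAL:-not set}"
else
    echo "MAHOUT_HOME is unset or has no bin/mahout: '${MAHOUT_HOME:-}'"
fi
```

Run it in a new terminal, or after source ~/.bashrc, so that the exported variables are visible.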

You can also install it this way when Hadoop and Maven are already installed:

# download the source zip from a mirror listed at
# http://www.apache.org/dyn/closer.cgi/lucene/mahout/
unzip -a mahout-distribution-x.x-src.zip
# moving into /usr/local needs root
sudo mv mahout /usr/local
cd /usr/local/mahout/mahout-distribution-0*
ls
# output
bin         core          examples     LICENSE.txt  math-scala  pom.xml     src
buildtools  distribution  integration  math         NOTICE.txt  README.txt  target
mvn install

There is also a Cloudera package, which is installed from the Cloudera repository once that repository is configured:

sudo apt-get install mahout

I hope you now have some idea of how to install Apache Mahout. I myself forgot the exact steps I took among so many commands. If you are having problems, check the logs in the logs directory to see if there are any Hadoop errors or Java exceptions. Errors at the beginning are not uncommon.
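Checking the logs can be done with grep. The following is a rough sketch: the function name scan_logs and the LOG_DIR variable are made up for this guide, and it assumes the logs sit in a directory named logs under the current directory unless LOG_DIR says otherwise.

```shell
# List files under a log directory that mention an error or a Java exception.
# usage: scan_logs <log-dir>
scan_logs() {
    # -r: recurse, -l: print only file names, -E: extended regex
    grep -rlE 'ERROR|Exception' "$1" 2>/dev/null
}

# Assumption: logs live in a "logs" directory under the current directory.
scan_logs "${LOG_DIR:-logs}" || echo "No errors or exceptions found (or no logs directory)."
```

Open the listed files and read around the first exception; the later ones are often only consequences of it.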
