Apache Accumulo provides robust, scalable data storage. It is a distributed key-value store based on Google’s Bigtable design and built on top of Apache Hadoop, Apache ZooKeeper, and Apache Thrift. Accumulo is the third most popular NoSQL wide column store, behind Apache Cassandra and HBase. Here are the steps to install Apache Accumulo on Ubuntu running on a single cloud server instance. You will need to read the guides linked on this page to complete the work.
How to Install Apache Accumulo on Ubuntu
Hadoop
First install Apache Hadoop with HDFS, the storage service that Accumulo depends on. Follow our guide on installing Apache Hadoop on a single cloud server. You’ll need password-less SSH, because Hadoop’s start scripts connect to the server over SSH and must not be prompted for a password. All details are in our Hadoop installation guide. We use hadoop-2.7.3 as the example here; change the commands and directives to match your version number.
---
Edit core-site.xml:

    nano /opt/hadoop-2.7.3/etc/hadoop/core-site.xml

Make fs.defaultFS point to the correct NameNode by adding this property inside the <configuration> element:

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://localhost:9000</value>
    </property>
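To confirm that the value really ended up in the file, a small helper can read it back. This is a minimal sketch: hadoop_property is a hypothetical function name, and it assumes each <name> and <value> element sits on a single line, as in the property shown above. It is a line-based reader, not a real XML parser.

```shell
# Print the <value> of a named property in a Hadoop-style XML file.
# hadoop_property is a hypothetical helper written for this guide.
# Assumption: every <name>...</name> and <value>...</value> fits on one line.
hadoop_property() {
  # $1 = path to the XML file, $2 = property name to look up
  awk -v want="$2" '
    /<name>/ {
      # Remember the most recent property name seen.
      name = $0
      sub(/.*<name>/, "", name)
      sub(/<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      # First value that follows the wanted name: print it and stop.
      value = $0
      sub(/.*<value>/, "", value)
      sub(/<\/value>.*/, "", value)
      print value
      exit
    }
  ' "$1"
}

# Example (prints nothing if the property is absent):
#   hadoop_property /opt/hadoop-2.7.3/etc/hadoop/core-site.xml fs.defaultFS
```

Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS reports the value Hadoop itself resolved, which is the more authoritative check.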
Configure HDFS by editing hdfs-site.xml:

    nano /opt/hadoop-2.7.3/etc/hadoop/hdfs-site.xml

Add the following. Replication is set to 1 because a single server has only one DataNode:

    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    <property>
      <name>dfs.name.dir</name>
      <value>file:///opt/hadoop-2.7.3/hdfs_storage/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>file:///opt/hadoop-2.7.3/hdfs_storage/data</value>
    </property>
Configure MapReduce by editing mapred-site.xml:

    cp /opt/hadoop-2.7.3/etc/hadoop/mapred-site.xml.template /opt/hadoop-2.7.3/etc/hadoop/mapred-site.xml
    nano /opt/hadoop-2.7.3/etc/hadoop/mapred-site.xml

Add:

    <property>
      <name>mapred.job.tracker</name>
      <value>localhost:9001</value>
    </property>
Initialize the Hadoop storage directory, then start HDFS:

    cd /opt/hadoop-2.7.3/bin
    ./hdfs namenode -format
    cd /opt/hadoop-2.7.3/sbin
    ./start-dfs.sh
We only need HDFS to be running for Accumulo to work. You can check that everything is working properly with:

    jps
    netstat -tupln
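The jps check can also be scripted, so that a missing daemon is named instead of having to be spotted by eye. The following is a sketch under assumptions: check_hdfs_daemons is a hypothetical helper, and it expects the three daemons that start-dfs.sh normally launches on a single node (NameNode, DataNode, SecondaryNameNode).

```shell
# Read jps output on stdin and report which HDFS daemons are missing.
# check_hdfs_daemons is a hypothetical helper written for this guide.
# Returns 0 only when NameNode, DataNode and SecondaryNameNode are all listed.
check_hdfs_daemons() {
  jps_output=$(cat)
  missing=0
  for daemon in NameNode DataNode SecondaryNameNode; do
    # jps prints "<pid> <class name>"; compare the second field as a whole
    # so that SecondaryNameNode is not mistaken for NameNode.
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "not running: $daemon"
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   jps | check_hdfs_daemons
```

Taking the listing on stdin keeps the function independent of where jps is installed, and makes it easy to try against a saved listing.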
At http://server_ip:50070, you’ll get the NameNode interface.
ZooKeeper
Install Apache ZooKeeper with the following steps (adjust the version number to match yours; the comments describe what to do):

    wget http://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.9/zookeeper-3.4.9.tar.gz
    tar -xzvf zookeeper-3.4.9.tar.gz -C /opt/
    cp /opt/zookeeper-3.4.9/conf/zoo_sample.cfg /opt/zookeeper-3.4.9/conf/zoo.cfg
    ## Edit zoo.cfg and set a dataDir
    nano /opt/zookeeper-3.4.9/conf/zoo.cfg
    ## dataDir=/opt/zookeeper-3.4.9/datadir
    ## start zookeeper
    /opt/zookeeper-3.4.9/bin/zkServer.sh start
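If you prefer not to open an editor, the dataDir line can be set from the command line. This is a minimal sketch: set_zk_datadir is a hypothetical helper, it assumes GNU sed (the default on Ubuntu), and it assumes the directory path contains no | or & characters, since those are special in the sed expression used here.

```shell
# Set dataDir in a zoo.cfg file, creating the directory if needed.
# set_zk_datadir is a hypothetical helper written for this guide.
# Assumptions: GNU sed (for -i), and no "|" or "&" in the directory path.
set_zk_datadir() {
  # $1 = path to zoo.cfg, $2 = data directory
  mkdir -p "$2" || return 1
  if grep -q '^dataDir=' "$1"; then
    # Replace the existing dataDir line in place.
    sed -i "s|^dataDir=.*|dataDir=$2|" "$1"
  else
    # No dataDir line yet: append one.
    echo "dataDir=$2" >> "$1"
  fi
}

# Example:
#   set_zk_datadir /opt/zookeeper-3.4.9/conf/zoo.cfg /opt/zookeeper-3.4.9/datadir
```

After starting the server, /opt/zookeeper-3.4.9/bin/zkServer.sh status reports whether it is running.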
Apache Accumulo
In the previous steps, we installed the prerequisites. Get the latest binary release of Apache Accumulo from the downloads page:

    http://accumulo.apache.org/downloads/
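Download and extract it. The closer.lua address is a mirror selector, so if wget saves an HTML page rather than the archive, use the direct mirror link shown on that page instead:

    wget https://www.apache.org/dyn/closer.lua/accumulo/1.9.2/accumulo-1.9.2-bin.tar.gz
    tar -xzvf accumulo-1.9.2-bin.tar.gz -C /opt/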
Accumulo ships with a configuration script that simplifies setup. Run it:

    cd /opt/accumulo-1.9.2/bin
    ./bootstrap_config.sh
The script prompts you on the command line for the heap size (512 MB is enough for a test), the memory map type (jvm), and the Hadoop version. Make sure that HADOOP_HOME and ZOOKEEPER_HOME are set in .bashrc, with paths that match your version numbers:

    export ZOOKEEPER_HOME=/opt/zookeeper-3.4.9
    export HADOOP_HOME=/opt/hadoop-2.7.3
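A quick way to confirm that both variables are visible in the current shell, and that they point at real directories, is sketched below. check_homes is a hypothetical helper; remember to open a new shell or run source ~/.bashrc first, otherwise the exports are not loaded yet.

```shell
# Verify that HADOOP_HOME and ZOOKEEPER_HOME are set and are directories.
# check_homes is a hypothetical helper written for this guide.
check_homes() {
  status=0
  for var in HADOOP_HOME ZOOKEEPER_HOME; do
    # Indirect lookup of the variable named in $var (names are fixed above).
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "$var is not set"
      status=1
    elif [ ! -d "$value" ]; then
      echo "$var points to a missing directory: $value"
      status=1
    fi
  done
  return "$status"
}

# Example:
#   check_homes && echo "environment looks fine"
```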
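Initialize Accumulo’s HDFS folder. The command asks for an instance name and a root password; you’ll need that password later to log in to the Accumulo shell:

    /opt/accumulo-1.9.2/bin/accumulo init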
Open accumulo-site.xml, which lives in the conf directory:

    nano /opt/accumulo-1.9.2/conf/accumulo-site.xml
You’ll probably need to adjust some values:

    ...
    <name>tserver.memory.maps.max</name>
    <value>40M</value>
    ...
    <name>tserver.cache.data.size</name>
    <value>4M</value>
    ...
    <name>tserver.cache.index.size</name>
    <value>10M</value>
    ...
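You’ll need to raise the system’s maximum number of open files to 32768. By default, Accumulo is set up to run on localhost. To run it under a different host name, consult the Accumulo documentation; as a first step, map the server’s IP address to its host name in /etc/hosts:

    nano /etc/hosts
    ## add
    ## ip.add.re.ss host_name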
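For the open-file limit of 32768 mentioned above, the current value can be checked before starting Accumulo. This is a sketch: check_nofile is a hypothetical helper, and by default it looks at the soft limit of the shell it runs in (ulimit -n), which may differ from the limit of the account that actually runs Accumulo.

```shell
# Compare an open-file limit against the 32768 that this guide calls for.
# check_nofile is a hypothetical helper written for this guide.
check_nofile() {
  # $1 = limit to test; defaults to this shell's soft limit (ulimit -n)
  limit="${1:-$(ulimit -n)}"
  required=32768
  if [ "$limit" = "unlimited" ] || [ "$limit" -ge "$required" ]; then
    echo "ok: open-file limit is $limit"
  else
    echo "too low: open-file limit is $limit, need $required"
    return 1
  fi
}

# Example:
#   check_nofile
# One common way to raise the limit persistently on Ubuntu is a pair of
# "nofile" entries in /etc/security/limits.conf for the account that runs
# Accumulo (assumption: PAM limits apply to that login).
```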
Edit conf/slaves to replace localhost with that host name:

    nano /opt/accumulo-1.9.2/conf/slaves
Start Accumulo:

    /opt/accumulo-1.9.2/bin/start-all.sh
    ## start Accumulo shell
    /opt/accumulo-1.9.2/bin/accumulo shell
The Apache Accumulo web monitor UI will be at http://server_ip:9995. If there is any trouble, check the ports and the running processes.
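When checking ports, the ones used in this guide are 9000 (fs.defaultFS), 50070 (NameNode interface), 2181 (the clientPort in ZooKeeper's sample configuration) and 9995 (Accumulo monitor). The helper below is a rough sketch for scanning a netstat listing for one of them. port_listening is a hypothetical name, and it assumes the listing shows local addresses in the usual address:port form followed by whitespace.

```shell
# Read "netstat -tln" (or "ss -tln") output on stdin and succeed if some
# socket is listening on the given port.
# port_listening is a hypothetical helper written for this guide.
port_listening() {
  # $1 = port number. The local address column ends in ":<port>" followed by
  # whitespace; the foreign address of a listener is ":*", so it never matches.
  grep -Eq ":$1[[:space:]]"
}

# Example:
#   for p in 9000 50070 2181 9995; do
#     netstat -tln | port_listening "$p" || echo "nothing listening on $p"
#   done
```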