Previously, we published guides on how to install Apache Hadoop, how to install Apache Spark, and how to install Apache Hadoop and use it with Elasticsearch.
We have also covered cloud orchestration tools, a getting-started guide to Ansible, and Ansible playbooks.
If the approaches of those two sets of tutorials are combined, installing Hadoop and other complicated, time-consuming software can be made easy. In this guide, we list some resources that can help you automate the deployment of Apache Hadoop and other big data software.
---
Automated Deployment of Apache Hadoop & Big Data Software
These are relatively simple Ansible playbooks and roles to install Hadoop:
https://github.com/andrewrothstein/ansible-hadoop
https://github.com/cloudmesh/ansible-cloudmesh-hadoop
https://github.com/micafer/ansible-role-hadoop
https://github.com/indigo-dc/ansible-role-hadoop
https://github.com/myztical/hadoop-manager
Readers must carefully check each of the GitHub projects listed above. We did a simple preliminary check but have not tested them. Some of them differ from each other in deployment planning, software versions, and server operating system.
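To review the projects side by side, one option is to fetch shallow local copies of all of them first. The following is only a minimal sketch in Python, assuming git is installed and the repositories are still publicly reachable. The directory name hadoop-playbooks and the helper names clone_command and clone_all are our own, hypothetical choices. The owner name is kept in the local path because two of the listed projects are both called ansible-role-hadoop.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse

# Repository URLs as listed in this article.
REPOS = [
    "https://github.com/andrewrothstein/ansible-hadoop",
    "https://github.com/cloudmesh/ansible-cloudmesh-hadoop",
    "https://github.com/micafer/ansible-role-hadoop",
    "https://github.com/indigo-dc/ansible-role-hadoop",
    "https://github.com/myztical/hadoop-manager",
]


def clone_command(url, dest_root="hadoop-playbooks"):
    """Build a shallow `git clone` command for one repository.

    dest_root is a hypothetical local directory name. The target path
    is <dest_root>/<owner>/<repo>, so repositories that share a name
    do not overwrite each other.
    """
    owner, name = urlparse(url).path.strip("/").split("/")[:2]
    target = Path(dest_root) / owner / name
    return ["git", "clone", "--depth", "1", url, str(target)]


def clone_all(urls=REPOS, dest_root="hadoop-playbooks"):
    """Clone every repository; stops at the first git failure."""
    for url in urls:
        cmd = clone_command(url, dest_root)
        # Create <dest_root>/<owner> before git writes into it.
        Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)


# clone_all()  # uncomment to actually clone (needs git and network access)
```

After cloning, each project's README, defaults and task files can be compared for supported Hadoop versions and target operating systems before any of them is run against a server.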
Here is an Ansible playbook that installs a Hadoop cluster, with HBase, Hive, Presto for analytics, and Ganglia, Smokeping, Fluentd, Elasticsearch and Kibana for monitoring and centralized log indexing :
https://github.com/analytically/hadoop-ansible
Here is a Kafka Cookbook for Chef :
https://github.com/cerner/cerner_kafka
We understand that this is not a highly informative article, yet many new users have no idea that orchestration tools exist at all.