Apache Apex is a Hadoop YARN native big data processing platform. It was released as the industry's first YARN native engine, and it supports both real-time stream processing and batch processing. Installing Apache Apex with Hadoop requires a few prerequisites: Apache Hadoop, JDK 7, Git and Maven. You also need to decide whether you will use the original Apache distributions or vendor versions such as those from Cloudera or Hortonworks.

We write these guides mainly for sysadmins. If your interest is actual data science work, i.e. data processing, you will probably find it easier to spin up a virtual appliance, because the route described here means installing Apache Hadoop and maybe Apache BigTop as well. For data processing alone that is probably too long a route, unless you need a real deployment. Only a minimum of theory is needed before running actual tests with the Apache Apex family.
Apex has multiple parts. The core Apex platform is supplemented by Malhar, a library of connectors and logic functions. Together they provide access to HDFS, S3, NFS, FTP, Kafka, ActiveMQ, RabbitMQ, JMS, MySQL, Cassandra, MongoDB, Redis, HBase, CouchDB, JDBC, and so on. The source repositories are here:
https://github.com/apache/apex-core
https://github.com/apache/apex-malhar
Downloads are also available on the official Apache site:
https://apex.apache.org/downloads.html
Also note that the Apex CLI has to be run as a script from the /apex-core/engine/src/main/scripts directory. This often confuses new users.
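A small helper can make the location of the CLI less confusing. The following is only a sketch: the function name apex_cli_path is made up for this example, and it assumes the CLI script inside that directory is called apex, which you should verify in your own checkout.

```shell
#!/bin/sh
# Print the path of the Apex CLI script inside an apex-core checkout,
# or return 1 if nothing is found there.
# Assumption: the script in the scripts directory is named "apex".
apex_cli_path() {
    checkout=${1:-apex-core}
    script="$checkout/engine/src/main/scripts/apex"
    if [ -f "$script" ]; then
        echo "$script"
    else
        echo "no CLI script at $script" >&2
        return 1
    fi
}
```

For example, `apex_cli_path ./apex-core` prints the script path when the checkout exists, so you can change into its directory and start the CLI from there.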
Installing Apache Apex With Hadoop
"Installation" is not really the right word, since Apex simply sits on top of Hadoop. We need Git, a Java JDK (not just a JRE) and Apache Maven installed. Maven's bin directory also has to be on the PATH, which we set in the bash profile or whichever profile file you use:
# change the path to the real one; this is only an example
export PATH=$PATH:/sfw/maven/apache-maven-3.8.1/bin
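Before building, it can help to confirm that the tools are actually reachable from the shell. The following is a minimal sketch, not part of Apex itself: check_prereqs is a hypothetical helper name, and it uses javac rather than java as a rough test that a full JDK, not only a JRE, is on the PATH.

```shell
#!/bin/sh
# Report whether each named tool can be found on PATH.
# Returns 0 only if every tool is present.
check_prereqs() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# javac (not just java) suggests a JDK rather than a JRE is installed.
check_prereqs git javac mvn || echo "Install the missing tools or fix PATH before building."
```

If mvn is reported as missing even though Maven is unpacked, the export line in the profile file has most likely not been loaded into the current shell yet.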
Then run these commands:
git clone https://github.com/apache/apex-core
git clone https://github.com/apache/apex-malhar
# switch to release branch
cd apex-core
mvn clean install -DskipTests
cd ..
cd apex-malhar
mvn clean install -DskipTests
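The two builds above can also be wrapped so that the second one does not start if the first one fails. This is a sketch under assumptions: build_repo and build_all are hypothetical helper names, and the MVN variable is an override hook invented for this example (it defaults to the real mvn command).

```shell
#!/bin/sh
# MVN is a hypothetical override hook for this sketch; it defaults to mvn.
MVN="${MVN:-mvn}"

# Build one checkout in a subshell so the caller's working directory
# stays unchanged (no "cd .." needed afterwards).
build_repo() {
    repo_dir=$1
    if [ ! -d "$repo_dir" ]; then
        echo "not a directory: $repo_dir" >&2
        return 1
    fi
    ( cd "$repo_dir" && "$MVN" clean install -DskipTests )
}

# Build the given checkouts in order and stop at the first failure.
build_all() {
    for repo in "$@"; do
        build_repo "$repo" || { echo "build failed in $repo" >&2; return 1; }
    done
}
```

After both repositories are cloned, `build_all apex-core apex-malhar` runs the same Maven command in each directory, in the same order as the commands above.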
Now, for the Apache Apex CLI, go to the scripts directory described above and read this documentation:
http://apex.apache.org/docs/apex/apex_cli/