YARN stands for Yet Another Resource Negotiator. It is the cluster resource management technology introduced in Hadoop 2.0, and Hadoop developers use it extensively for writing applications. Most readers of this website are beginners in big data. For that reason, we point towards basic theoretical articles such as the differences between batch processing and stream processing, and the differences between Hadoop and Spark. Apache Hive and Apache Pig were built to make MapReduce accessible to people with limited experience in Java programming.

In earlier versions of Hadoop, the batch processing framework MapReduce was paired with the Hadoop Distributed File System (HDFS). With the addition of YARN, Hadoop 2.0 changed how Hadoop works: resource management was separated from MapReduce, so other kinds of applications can run on the same cluster. Apache Twill is an abstraction over Apache Hadoop YARN that reduces the complexity of developing distributed applications, allowing developers to focus more on application logic.
Package and Run Apache Twill
Naturally, Hadoop needs to be installed, and possibly Apache Spark. You also need Apache Maven installed.
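Before building, it can help to confirm that the required tools are on the PATH. The following is a minimal sketch and not part of Twill or Hadoop: the helper name missing_tools is made up for illustration, and the tool list (git, mvn, java, hadoop) is an assumption based on the steps in this guide.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of Hadoop or Twill.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools this guide relies on (assumed list; adjust to your setup).
missing=$(missing_tools git mvn java hadoop)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

The script only reports what is missing. It does not install anything or check versions.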
You may need to manually download the hadoop-auth jar file for org.apache.hadoop.util.PlatformName. It may be missing from an existing Hadoop installation. Apache Twill is available on GitHub, and the easiest way to install it is:

    git clone https://git-wip-us.apache.org/repos/asf/twill.git
    cd twill
    mvn install

For packaging, you can use the maven-bundle-plugin, configured in the project's pom.xml so that the dependencies are embedded under a lib directory inside the jar.
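As a rough sketch of what such a configuration can look like, the fragment below uses the maven-bundle-plugin's Embed-Dependency, Embed-Transitive and Embed-Directory instructions to place compile-scope dependencies under lib/. The plugin version and the exact instruction values are assumptions for illustration, so check them against your own project before relying on them.

```xml
<!-- Goes inside <build><plugins> in pom.xml. Version and values are assumptions. -->
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <version>2.3.7</version>
  <extensions>true</extensions>
  <configuration>
    <instructions>
      <!-- Embed compile-scope dependencies as whole jars, not unpacked classes -->
      <Embed-Dependency>*;inline=false;scope=compile</Embed-Dependency>
      <Embed-Transitive>true</Embed-Transitive>
      <!-- Place the embedded jars under lib/ inside the bundle -->
      <Embed-Directory>lib</Embed-Directory>
    </instructions>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>bundle</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With Embed-Directory set to lib, the embedded jars appear as lib/... entries in the bundle.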
The maven-bundle-plugin is maintained by the Apache Felix project. Apache Karaf began as a subproject of Apache Felix. You can find an example of a Twill YARN application on GitHub.
Then run MAVEN_OPTS="-Xmx512m" mvn clean package. That should create a .jar file under the target directory. If you use "jar -tf" to look at the contents of the jar file, it should be something like this:

    my/package/HelloWorld.class
    my/package/HelloWorld$HelloWorldRunnable.class
    lib/twill-api-0.3.0-incubating.jar
    lib/twill-core-0.3.0-incubating.jar
    lib/..

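The listing above can also be checked automatically. This is a small sketch, assuming the layout shown (application classes plus Twill jars under lib/). The helper name check_bundle_listing is hypothetical, and the jar path in the usage comment is a placeholder.

```shell
# Succeeds if the jar listing on stdin contains at least one .class file
# and at least one embedded Twill jar under lib/.
# check_bundle_listing is a hypothetical helper name.
check_bundle_listing() {
  listing=$(cat)
  printf '%s\n' "$listing" | grep -q '\.class$' &&
    printf '%s\n' "$listing" | grep -q '^lib/twill-.*\.jar$'
}

# Usage (placeholder jar path):
#   jar -tf target/your-app.jar | check_bundle_listing && echo "bundle looks right"
```

If the check fails, the usual suspect is the packaging configuration in pom.xml rather than the application code.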
To launch the application, copy the jar file to a cluster host with SCP, unjar it in some directory, and then run shell commands like these from the expanded jar directory:

    $> export HADOOP_CP=`hadoop classpath`
    $> java -cp .:lib/*:$HADOOP_CP my.package.HelloWorld

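The copy, unjar and launch steps can be wrapped in one function. This is only a sketch under assumptions: the function name launch_bundle, its four arguments and the remote file name app.jar are made up for illustration, and it assumes ssh, scp, jar, java and hadoop are available on the relevant machines.

```shell
# Copy a bundled jar to a cluster host, unpack it, and start the main class.
# launch_bundle and its arguments are hypothetical names for this sketch.
launch_bundle() {
  if [ "$#" -ne 4 ]; then
    echo "usage: launch_bundle JAR HOST REMOTE_DIR MAIN_CLASS" >&2
    return 2
  fi
  jar_file=$1 host=$2 remote_dir=$3 main_class=$4

  # Create the target directory, then copy the jar into it.
  ssh "$host" "mkdir -p '$remote_dir'" || return 1
  scp "$jar_file" "$host:$remote_dir/app.jar" || return 1

  # Unpack and run remotely. The classpath is quoted so the remote shell
  # leaves lib/* for the JVM to expand.
  ssh "$host" "cd '$remote_dir' && jar -xf app.jar && \
    java -cp \".:lib/*:\$(hadoop classpath)\" '$main_class'"
}
```

A call might look like launch_bundle target/your-app.jar user@edge-node /tmp/twill-hello my.package.HelloWorld, where the jar path, host and directory are placeholders.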
This ends this small guide.