Apache Flink is a big data processing engine that can run in both streaming and batch mode. data Artisans is the company founded by the original creators of Flink. Flink started as a project called Stratosphere, which was forked and became Apache Flink. Flink can be deployed on a local machine, on a cluster (it can run on YARN), or in the cloud. The core of Apache Flink is a distributed streaming dataflow engine written in Java and Scala. Flink natively supports the execution of iterative algorithms. Programs for Flink can be written in Java, Scala, Python, and SQL. Flink has no data storage system of its own; instead it provides data source and sink connectors for Apache Kafka, HDFS, Apache Cassandra, Elasticsearch, Amazon Kinesis, and others. Apache Beam is a unified programming model for which Flink is one of the supported backends.
Does Apache Spark not do a similar job? Yes, Flink is not alone. That is why we published an article named Apache Spark Alternatives To Overcome Integrity Issues. From that previous article:
Apache Flink is considered a powerful competitor to Apache Spark. Spark is based on resilient distributed datasets (RDDs). Flink is optimized for cyclic or iterative processes by using iterative transformations on collections. Flink is also a strong tool for batch processing.
However, this article is not a comparison. Let us move on to the steps to install Apache Flink on Ubuntu Server. We say "Ubuntu Server" to mean "no GUI"; you may use a local machine or even Ubuntu Bash on Windows 10 to test.
---
An Apache Hadoop installation is not mandatory to use Flink. Hadoop is needed only if you plan to run Flink on YARN or process data stored in HDFS.
Steps To Install Apache Flink on Ubuntu Server
Let us update and upgrade as the root user:

    apt update -y && apt upgrade -y
Wait: read ahead before running the next commands, because there are two ways to install Java. Normally we install the Java runtime (JRE) with this command:

    apt install default-jre
Next we install the Java development kit (JDK):

    apt install default-jdk
Next we set JAVA_HOME for the current bash session with the following command (add the same line to ~/.bashrc to make it permanent):

    export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
Then check it with the command below:

    echo $JAVA_HOME
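The `export` shown above only lasts for the current shell session. The following is a minimal sketch of the same idea with a guard around it; the helper name `java_home_from_path` is hypothetical, and `~/.bashrc` is assumed to be the startup file you want to edit.

```shell
# Hypothetical helper: strip the trailing "bin/java" from a resolved java
# path, mirroring the sed expression used in the export command.
java_home_from_path() {
  printf '%s\n' "$1" | sed 's:bin/java$::'
}

# Only act when a java binary is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  JAVA_HOME="$(java_home_from_path "$(readlink -f "$(command -v java)")")"
  export JAVA_HOME
  echo "JAVA_HOME=$JAVA_HOME"
  # To persist the value, append the export line to ~/.bashrc (assumption):
  # echo "export JAVA_HOME=$JAVA_HOME" >> ~/.bashrc
fi
```

If nothing is printed, no `java` binary was found on the PATH and Java still has to be installed.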
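If you have already run the steps above on your machine, you do not need to run the commands below. Alternatively, on a fresh machine, we can add the webupd8team PPA. On newer Ubuntu releases, where the python-software-properties package is no longer available, install software-properties-common instead:

    apt install python-software-properties
    sudo add-apt-repository ppa:webupd8team/java
    apt update -y && apt upgrade -y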
and then run the installer (Flink 1.5 needs Java 8, so we install the Java 8 package):

    apt install oracle-java8-installer
Download the binary distribution of Apache Flink from here:

    http://flink.apache.org/downloads.html
Flink has binary releases marked with a Hadoop version which come bundled with binaries for that Hadoop version. The binary release without bundled Hadoop can be used without Hadoop or with a Hadoop version that is installed in the environment. So read that webpage carefully.
This is an example without Hadoop (notice the file name flink-1.5.0-bin-scala_2.11.tgz):

    http://www.apache.org/dyn/closer.lua/flink/flink-1.5.0/flink-1.5.0-bin-scala_2.11.tgz
Here, as an example, is the release with Hadoop (notice the file name flink-1.5.0-bin-hadoop28-scala_2.11.tgz):

    http://www-eu.apache.org/dist/flink/flink-1.5.0/
    ## LINK to file
    http://www-eu.apache.org/dist/flink/flink-1.5.0/flink-1.5.0-bin-hadoop28-scala_2.11.tgz
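The two file names above follow the same pattern and differ only in the optional Hadoop tag. As a rough illustration, the pattern can be sketched as a small shell function; `flink_archive_name` is a hypothetical helper, and the pattern is inferred only from the two 1.5.0 file names shown here, so other releases may be named differently.

```shell
# Hypothetical helper: build the archive name from its parts.
# $1 = Flink version, $2 = Scala version, $3 = optional Hadoop tag (e.g. hadoop28)
flink_archive_name() {
  if [ -n "${3:-}" ]; then
    printf 'flink-%s-bin-%s-scala_%s.tgz\n' "$1" "$3" "$2"
  else
    printf 'flink-%s-bin-scala_%s.tgz\n' "$1" "$2"
  fi
}

flink_archive_name 1.5.0 2.11            # prints flink-1.5.0-bin-scala_2.11.tgz
flink_archive_name 1.5.0 2.11 hadoop28   # prints flink-1.5.0-bin-hadoop28-scala_2.11.tgz
```

Keeping the version and Hadoop tag as arguments makes it easier to switch between the two variants without retyping the whole name.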
As an example of installation, these are the steps:

    wget http://www-eu.apache.org/dist/flink/flink-1.5.0/flink-1.5.0-bin-hadoop28-scala_2.11.tgz
    ls -al
    tar -xzvf flink-1.5.0-bin-hadoop28-scala_2.11.tgz
    ## we can run
    # tar -xzvf flink*
    ## as command
    ls -al
    cd flink-1.5.0*
    ## start session (if start-local.sh is missing in your release, use ./bin/start-cluster.sh)
    ./bin/start-local.sh
    ## stop session (or ./bin/stop-cluster.sh)
    ./bin/stop-local.sh
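The scripts shipped under bin/ can differ between Flink releases, so a small guard can pick whichever start script is actually present before running it. This is only a sketch: `pick_start_script` is a hypothetical helper, and the directory name in the usage comment assumes the archive was extracted into ./flink-1.5.0.

```shell
# Hypothetical helper: print the first start script found in a Flink directory.
# $1 = path to the extracted Flink directory
pick_start_script() {
  for script in start-local.sh start-cluster.sh; do
    if [ -x "$1/bin/$script" ]; then
      printf '%s\n' "$1/bin/$script"
      return 0
    fi
  done
  echo "no start script found under $1/bin" >&2
  return 1
}

# Usage, assuming the archive was extracted into ./flink-1.5.0:
# "$(pick_start_script ./flink-1.5.0)"
```

The function only prints the path; running it is left to the caller so that nothing is started by accident.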
The Flink command-line interface is documented here (this page is for release 1.0):

    https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/cli.html
Go to http://localhost:8081 and make sure everything is up and running. The web frontend should report a single available TaskManager instance. The version you installed has its own official tutorials with examples:

    https://ci.apache.org/projects/flink/flink-docs-release-1.5/quickstart/setup_quickstart.html
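On a server without a browser, the TaskManager check on port 8081 can also be done from the shell. The sketch below assumes that `curl` is installed and that the web frontend answers on its `/overview` REST endpoint with JSON containing a `taskmanagers` field; `taskmanager_count` is a hypothetical helper that pulls that number out with plain sed.

```shell
# Hypothetical helper: extract the "taskmanagers" count from the JSON text
# returned by the /overview endpoint (plain sed, no JSON parser required).
taskmanager_count() {
  printf '%s' "$1" |
    sed -n 's/.*"taskmanagers"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Query the local web frontend only if curl is installed.
if command -v curl >/dev/null 2>&1; then
  overview="$(curl -s --max-time 5 http://localhost:8081/overview || true)"
  count="$(taskmanager_count "$overview")"
  if [ -n "$count" ]; then
    echo "TaskManagers reported: $count"
  else
    echo "Flink web frontend not reachable on port 8081"
  fi
fi
```

For the local setup in this tutorial the reported count should be 1.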
That ends this tutorial.