ClickHouse is a free, column-oriented database management system (DBMS) for online analytical processing (OLAP). It was developed by the Russian company Yandex for its web analytics service, Yandex.Metrica. ClickHouse allows analysis of data that is updated in real time, and the system is designed for high performance. The project was released as free software under the terms of the Apache License in June 2016. ClickHouse is also used by Cloudflare to store and process logs from its DNS servers.
The main features of ClickHouse are:
- A true column-oriented database. No extra data is stored alongside the values. For example, fixed-length values are supported, so their length does not have to be stored next to each value.
- Linear scalability. A cluster can be extended by adding servers.
- Fault tolerance. The system is a cluster of shards, in which each shard is a group of replicas.
- Asynchronous multi-master replication. Data can be written to any available replica and is then distributed to the remaining replicas. ZooKeeper is used to coordinate the processes, but does not take part in processing or executing queries.
- Ability to store and process multiple petabytes of data.
- SQL support. ClickHouse supports an extended SQL-like language that includes arrays and nested data structures, approximate calculations, and URI functions, and that can connect to external key-value stores.
- High performance.
- Vectorized computation. Data is stored by columns and is also processed in vectors (portions of columns). This approach makes it possible to achieve high CPU efficiency.
- Approximate calculations and sampling are supported.
- Distributed and parallel query processing is available (including joins).
- Data compression.
- Optimization for hard drives. The system can process data that does not fit in RAM.
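To make the column-oriented point above more concrete, here is a toy sketch in plain shell. It is only an illustration and not ClickHouse's actual storage format: the file names and sample rows are invented for the example. It stores the same three records once as rows and once as one file per column, then sums a single column.

```shell
#!/bin/sh
# Toy illustration only: ClickHouse's real on-disk format is far more elaborate.
workdir=$(mktemp -d)

# Row-oriented layout: every record keeps all of its fields together.
printf '%s\n' '1,/home,120' '2,/about,45' '3,/home,300' > "$workdir/rows.csv"

# Column-oriented layout: one file per column, values kept in the same row order.
cut -d, -f1 "$workdir/rows.csv" > "$workdir/id.col"
cut -d, -f2 "$workdir/rows.csv" > "$workdir/url.col"
cut -d, -f3 "$workdir/rows.csv" > "$workdir/duration.col"

# An aggregate over one column only needs to read that column's file.
total_duration=$(awk '{ sum += $1 } END { print sum }' "$workdir/duration.col")
echo "total duration: $total_duration"

# Bytes the aggregate has to read in each layout.
row_bytes=$(wc -c < "$workdir/rows.csv" | tr -d ' ')
col_bytes=$(wc -c < "$workdir/duration.col" | tr -d ' ')
echo "row layout reads $row_bytes bytes, column layout reads $col_bytes bytes"

rm -rf "$workdir"
```

The aggregate touches fewer bytes in the column layout because the other two columns are never read, which is the basic reason analytical queries tend to favor this kind of storage.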
Clients for connecting to the database
Options for connecting to the database include the console client, the HTTP API, or one of the wrappers (available for the scripting languages Python, PHP, Node.js, Perl, Ruby, and R, as well as the compiled languages Rust and Go). A JDBC driver is also available for ClickHouse.
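As a rough sketch of the HTTP API option, the helper below sends a query with curl. It assumes the server listens on the default HTTP port 8123 on localhost; the function name ch_http_query and the environment variables CH_URL, CH_USER, and CH_PASSWORD are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Assumptions: default HTTP port 8123 on localhost; CH_URL, CH_USER and
# CH_PASSWORD are hypothetical variables introduced for this sketch.
CH_URL="${CH_URL:-http://localhost:8123/}"

ch_http_query() {
    # Send the query text as the POST body; credentials go via HTTP basic auth.
    printf '%s' "$1" | curl --silent --show-error \
        --user "${CH_USER:-default}:${CH_PASSWORD:-}" \
        --data-binary @- "$CH_URL"
}

# Example call (requires a running server):
#   ch_http_query 'SELECT version()'
```

Keeping the query in the request body rather than in the URL avoids having to URL-encode it.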
Steps to install ClickHouse
Installing from the repository requires the apt-transport-https package, along with ca-certificates and dirmngr for importing the repository key:
    sudo apt update
    sudo apt install apt-transport-https
    sudo apt install ca-certificates dirmngr
    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4
Add the repository to your APT repositories list:
    echo "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" | sudo tee /etc/apt/sources.list.d/clickhouse.list
Proceed with the installation, then start the server and check its status:
    sudo apt update
    sudo apt install clickhouse-server clickhouse-client
    sudo service clickhouse-server start
    sudo service clickhouse-server status
Now that the ClickHouse server is running, we can connect with the client. The --password option prompts for the password:
    clickhouse-client --password
For instance, to create a test database, enter the statement below at the client prompt:
    ch:) CREATE DATABASE test;
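Once the test database exists, a natural next step is to create a table, insert a few rows, and run an aggregate. The sketch below only writes the statements to a file so they can be reviewed first. The table test.visits, its columns, the sample rows, and the CH_PASSWORD variable are all hypothetical, and the MergeTree definition assumes a reasonably recent server version.

```shell
#!/bin/sh
# Hypothetical table, columns and rows, used only for illustration.
sql_file="${TMPDIR:-/tmp}/clickhouse_demo.sql"

cat > "$sql_file" <<'SQL'
CREATE TABLE IF NOT EXISTS test.visits
(
    visit_date Date,
    url String,
    duration UInt32
)
ENGINE = MergeTree
ORDER BY (visit_date, url);

INSERT INTO test.visits VALUES ('2024-01-01', '/home', 120), ('2024-01-01', '/about', 45);

SELECT url, sum(duration) AS total_duration
FROM test.visits
GROUP BY url
ORDER BY url;
SQL

echo "Wrote $sql_file"
# To run it against a live server (CH_PASSWORD is a hypothetical variable;
# recent clients may not need --multiquery):
#   clickhouse-client --password "$CH_PASSWORD" --multiquery < "$sql_file"
```

The ORDER BY clause in the table definition sets the sorting key that MergeTree uses to arrange data on disk, so it is worth choosing columns that queries commonly filter on.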
The configuration file is /etc/clickhouse-server/config.xml.
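As a quick way to see which ports the server is configured to use, the sketch below searches the configuration file for the http_port and tcp_port elements. The helper name show_ports and the CONFIG variable are hypothetical, and the script assumes those elements each sit on a single line, as they do in the stock file.

```shell
#!/bin/sh
# Assumption: the default config path from the text; set CONFIG to inspect another file.
CONFIG="${CONFIG:-/etc/clickhouse-server/config.xml}"

show_ports() {
    # Print the lines that define the HTTP and native TCP ports.
    grep -E '<(http_port|tcp_port)>' "$1"
}

if [ -r "$CONFIG" ]; then
    show_ports "$CONFIG"
else
    echo "Cannot read $CONFIG (try sudo, or check the installation)" >&2
fi
```

After changing values in the file, restart the server with sudo service clickhouse-server restart so the new settings are picked up.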