What made us think about ClickHouse?
When a feature request from the product team doesn't work with the existing system, it is nothing short of a nightmare. Due to the need for several complex aggregated functions, a row-based database was unable to provide the desired results for this feature request.
Instead, we turned to columnar databases. That's when ClickHouse caught our attention, and we began analyzing ClickHouse closely using a number of criteria that I've explained below.
Fast, column-oriented, open-source ClickHouse SQL database (db) is great for real-time and data analysis. ClickHouse is appropriate for applications that need analytical findings in less than a second since it supports real-time query processing.
DB Engines and Table Engines supported by ClickHouse
ClickHouse supports a wide variety of database and table engines.
DB Engine
Our system's use case did not align with the use case of other database engines, thus after reading and conducting a Proof of Concept, we determined that Atomic is the most suitable database engine for ClickHouse. The most reliable engine offered by ClickHouse is Atomic. Database engines, such as MYSQL, SQLite, PostgreSQL, and others, are highly specialized for connecting to and establishing handshakes with these kinds of databases.
Table Engine
ClickHouse supports a number of table engines, however MergeTree worked best for us. When a standard insert is needed, the MergeTree table engine is recommended. Since ClickHouse uses a columnar database and writing takes time, it writes data piecemeal and handles merging afterwards. This is because data in.bin files is compressed, making real-time insertion laborious.
Features supported by ClickHouse
As mentioned earlier ClickHouse is a column-oriented SQL database.ClickHouse is an open-source database with an Apache2.0 license which makes it more reliable. The ClickHouse team has implemented the support for ML algorithms, which makes it much easier and faster to run ML over ClickHouse data.
Just like other columnar databases ClickHouse is also read optimized. ClickHouse support for real-time query processing makes it suitable for applications that require sub-second analytical results.
Also, ClickHouse supports partitioning, indexing, joins, and other DML statements like update, and delete. It gives support to batch processing.
Limitations of ClickHouse
As the famous saying goes, “Nothing is perfect in this world, everything has its pros and cons…”
If someone needs any of the below properties then one should not consider ClickHouse.
- There is no support for transactions.
- No real-time delete/update support.
- There is no support for transactions.
- By default, when performing aggregations, the intermediate query states must fit in the RAM on a single server. In such cases, ClickHouse can be configured to spill on the disk.
- Do not use ClickHouse for OLTP. ClickHouse expects data to remain immutable. Even though it is technically possible to remove big chunks of data from the ClickHouse database, it is not fast. ClickHouse simply isn't designed for data modifications. It's also inefficient at finding and retrieving single rows by keys, due to sparse indexing.
- ClickHouse does not fully support ACID transactions.
Conclusion
ClickHouse is a very robust database with a variety of table engines. It supports mostly all DML, DDL, and DCL which MySQL supports. ClickHouse gains more popularity because of its open-source nature. Columnar DBs come with some overhead cost of time when it performs DML, so we have to choose wisely.