ClickHouse version 22.08 (August 2022) was released on August 18, 2022. This version includes 12 new features, 12 performance improvements, +40 other improvements, and over 45 bug fixes.
For more information, please visit the official ClickHouse list of change logs here.
You can access the related documents and source code as follow:
- Source Code – GitHub Link
- Installation Instructions – Official Documentation
- v22.08 Release Webinar – Recording
- v22.08 Release Webinar – Slides
Let’s take a look at the main features and improvements in this version.
Query Parameters in interactive Mode
Query parameters can be set in interactive mode with “SET param_” command. With the help of this feature, as shown below, values can be assigned to the parameters, and these parameters can be queried.
ubuntu@clickhouse01:~$ clickhouse-client --password clickhouse01 :) SET param_x='chistaDATA'; SET Query id: 16fb3569-401a-4e8f-b208-6dab92afc12a Ok. 0 rows in set. Elapsed: 0.004 sec. clickhouse01 :) select {x:String} SELECT {x:String} Query id: fe1ce2e3-b9ed-4093-8847-b5551f0bca25 ┌─_CAST('chistaDATA', 'String')─┐ │ chistaDATA │ └───────────────────────────────┘ 1 row in set. Elapsed: 0.003 sec.
DELETE FROM Syntax Support
In this version, the SQL standard of “DELETE FROM” syntax is supported on merge tree tables and lightweight deletes implementation for merge tree families. This feature improves delete performance dramatically.
For testing “ALTER TABLE DELETE” and “DELETE FROM” syntax, 2 demo tables were created from “cell_towers” table.
clickhouse01 :) create table cell_towers_demo ENGINE = MergeTree ORDER BY (radio, mcc, net, created) as select * from cell_towers; clickhouse01 :) create table cell_towers_demo2 ENGINE = MergeTree ORDER BY (radio, mcc, net, created) as select * from cell_towers;
This table has more than 43 million records, and we will delete the records for area code 25464 by using both methods.
clickhouse01 :) select count() from cell_towers; SELECT count() FROM cell_towers Query id: e01de37f-81d0-4740-b25b-c98693e9d3a5 ┌──count()─┐ │ 43276158 │ └──────────┘ 1 row in set. Elapsed: 0.004 sec. clickhouse01 :) select count() from cell_towers where area=25464; SELECT count() FROM cell_towers WHERE area = 25464 Query id: 546a45f9-0b39-4b36-89a5-6cab932f21af ┌─count()─┐ │ 51 │ └─────────┘
With the “ALTER TABLE DELETE” command, it took 25.7 seconds to delete 51 records from “cell_towers_demo” table.
clickhouse01 :) SET mutations_sync = 2; clickhouse01 :) alter table cell_towers_demo delete where area= 25464; ALTER TABLE cell_towers_demo DELETE WHERE area = 25464 Query id: a41fb540-9a18-435b-a0be-d6b48439d1f6 Ok. 0 rows in set. Elapsed: 25.709 sec.
On the other hand, “DELETE FROM” command deleted the same 51 records in 0.7 seconds.
clickhouse01 :) SET allow_experimental_lightweight_delete = true; SET allow_experimental_lightweight_delete = 1 Query id: ae8a0fd4-948b-4860-aa10-ed00dd6e561d Ok. 0 rows in set. Elapsed: 0.003 sec. clickhouse01 :) SET mutations_sync = 2; clickhouse01 :) delete from cell_towers_demo2 where area= 25464; DELETE FROM cell_towers_demo2 WHERE area = 25464 Query id: c0ff8d9c-19c3-44ac-b9a3-f86d6451469d Ok. 0 rows in set. Elapsed: 0.783 sec.
Exact Rows Before Limit Setting
If exact_rows_before_limit is set to 1(default 0), ClickHouse provide exact value for rows_before_limit_at_least statistic. Because of the read whole table, the cost of the read data will increase.
In the example below, before setting “exact_rows_before_limit” to 1 “rows_before_limit_at_least” is 391. After setting the “exact_rows_before_limit” value to 1(or true), “rows_before_limit_at_least” have changed to 100000, which is the exact number of read records.
clickhouse01 :) select number % 100000 , count() from numbers_mt(10000000) group by 1 limit 1 format JSONCompact; SELECT number % 100000, count() FROM numbers_mt(10000000) GROUP BY 1 LIMIT 1 FORMAT JSONCompact Query id: f6f44389-3857-49b0-a709-21df9474e01f { "meta": [ { "name": "modulo(number, 100000)", "type": "UInt32" }, { "name": "count()", "type": "UInt64" } ], "data": [ [73035, "100"] ], "rows": 1, "rows_before_limit_at_least": 391, "statistics": { "elapsed": 0.135662472, "rows_read": 10000000, "bytes_read": 80000000 } } 1 row in set. Elapsed: 0.136 sec. Processed 10.00 million rows, 80.00 MB (73.41 million rows/s., 587.25 MB/s.)
clickhouse01 :) SET exact_rows_before_limit=true; SET exact_rows_before_limit = 1 Query id: a7f58e95-5527-4a9d-ba74-c1d00a36ebda Ok. 0 rows in set. Elapsed: 0.003 sec. clickhouse01 :) select number % 100000 , count() from numbers_mt(10000000) group by 1 limit 1 format JSONCompact; SELECT number % 100000, count() FROM numbers_mt(10000000) GROUP BY 1 LIMIT 1 FORMAT JSONCompact Query id: 03e3e497-f5d5-4cea-848b-613dbc61f557 { "meta": [ { "name": "modulo(number, 100000)", "type": "UInt32" }, { "name": "count()", "type": "UInt64" } ], "data": [ [73035, "100"] ], "rows": 1, "rows_before_limit_at_least": 100000, "statistics": { "elapsed": 0.140187318, "rows_read": 10000000, "bytes_read": 80000000 } } 1 row in set. Elapsed: 0.175 sec. Processed 10.00 million rows, 80.00 MB (57.30 million rows/s., 458.38 MB/s.)
Distributed INSERT INTO SELECT from s3Cluster
Support for parallel distributed INSERT INTO SELECT with s3Cluster table function into tables with Distributed and Replicated engine.
The query can be parallelized with the s3Cluster table function if you have a cluster with multiple nodes. Now, if the target table is Replicated or Distributed, the INSERT will be distributed throughout the cluster (will be parallelized).
INSERT INTO targetTable select * FROM s3Cluster('cluster','url');
JSON Formatted Log Output
The outputs can be collected in JSON format. The purpose is to allow easier ingestion and query in log analysis tools. To enable JSON logging support, it is enough for the uncomment <formatting> tag in config.xml file as shown below.
<!-- <formatting>json</formatting> --> ## uncomment this line <formatting>json</formatting> ## the uncommented "formatting" tag
SELECT INTO OUTFILE AND STDOUT
Support for saving query results as output files and showing these results at the screen same time.
clickhouse01 :) select radio,lon,lat,range from cell_towers limit 10 INTO OUTFILE 'results.txt' AND STDOUT FORMAT Pretty SELECT radio, lon, lat, range FROM cell_towers LIMIT 10 INTO OUTFILE 'results.txt' FORMAT Pretty Query id: f30e18d1-c31c-4c1c-b08f-c68ed8ac8bfe ┏━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ radio ┃ lon ┃ lat ┃ range ┃ ┡━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ CDMA │ -112.069237 │ 48.978268 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -112.069237 │ 48.978268 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -112.069237 │ 48.978268 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.462952 │ 44.009564 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.522812 │ 43.79319 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.315284 │ 43.838686 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.459198 │ 43.797741 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.462547 │ 44.01469 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -80.480919 │ 43.435841 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.37619 │ 43.84483 │ 1000 │ └───────┴─────────────┴───────────┴───────┘ 10 rows in set. Elapsed: 0.008 sec. clickhouse01 :) ubuntu@clickhouse01:~$ cat results.txt ┏━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ radio ┃ lon ┃ lat ┃ range ┃ ┡━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ CDMA │ -112.069237 │ 48.978268 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -112.069237 │ 48.978268 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -112.069237 │ 48.978268 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.462952 │ 44.009564 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.522812 │ 43.79319 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.315284 │ 43.838686 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.459198 │ 43.797741 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.462547 │ 44.01469 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -80.480919 │ 43.435841 │ 1000 │ ├───────┼─────────────┼───────────┼───────┤ │ CDMA │ -79.37619 │ 43.84483 │ 1000 │ └───────┴─────────────┴───────────┴───────┘ ubuntu@clickhouse01:~$
Other Notable Features/Improvements
- Extended the range of Date32 and DateTime64 to support dates from 1900 to 2299.
- ClickFiddle, a tool to check query results in almost all ClickHouse versions.
- ClickHouse Hardware Benchmark site.
- Quota key set in the native protocol.
- New settings to control schema inference from text formats.
- Ability to specify settings for an executable() table function.
- Improved memory usage during memory efficient merging of aggregation results.
- Normalize AggregateFunction types and state representations.