ClickHouse: Release 22.12 Webinar

ClickHouse
Release 22.12

Release 22.12 Webinar

1. (50 min) What's new in ClickHouse 22.12.

2. (10 min) Q&A.

Release 22.12

ClickHouse Christmas release.

β€” 🎁 17 new features

β€” ⛸️ 8 performance optimizations

β€” πŸŽ… 39 bug fixes

SQL Language Features

Numeric Literals With Underscore

Examples:

SELECT 1_000_000; SELECT 12.345_67; SELECT 0xABCD_EF01; SELECT 0b1111_0000_1010_0101; SELECT 1_50_000, 1_00_00_000; -- 1.5 lakh, 1 crore

Developers: jh0x, unbyte.

FROM table SELECT columns...

FROM github_events SELECT created_at, 'https://github.com/' || repo_name AS url, title WHERE title LIKE '%πŸŽ„%' ORDER BY created_at DESC LIMIT 1 BY title LIMIT 100

Why? — Better autocomplete in editors.

Developer: Nikolay Degterinsky.

GROUP BY ALL

SELECT county, town, district, street, median(price), count() FROM uk_price_paid WHERE toYear(date) = 2022 GROUP BY ALL ORDER BY count() DESC LIMIT 10

Developer: TaoFengLiu.

GROUP BY ALL (alternative)

SELECT county, town, district, street, median(price), count() FROM uk_price_paid WHERE toYear(date) = 2022 GROUP BY 1, 2, 3, 4 ORDER BY count() DESC LIMIT 10

GROUP BY ALL (alternative)

SELECT (county, town, district, street) AS k, median(price), count() FROM uk_price_paid WHERE toYear(date) = 2022 GROUP BY k ORDER BY count() DESC LIMIT 10

concatWithSeparator, concat_ws

SELECT concatWithSeparator(' - ', 'Hello', 'world', 'goodbye.') AS x β”Œβ”€x────────────────────────┐ β”‚ Hello - world - goodbye. β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Alias: concat_ws.

What for? — compatibility with Spark.

Developer: TaiYang Li.

Bonus: concatWithSeparatorAssumeInjective.

Decimal Operations With Specified Precision

SELECT 1::Decimal(10, 5) AS a, 3::Decimal(10, 5) AS b, a / b AS x, divideDecimal(a, b, 10) AS y β”Œβ”€a─┬─b─┬───────x─┬────────────y─┐ β”‚ 1 β”‚ 3 β”‚ 0.33333 β”‚ 0.3333333333 β”‚ β””β”€β”€β”€β”΄β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Developer: Andrey Zvonov.

SQL UDF in CREATE Queries

CREATE FUNCTION toBFloat16 AS (x) -> reinterpretAsFloat32( bitAnd(reinterpretAsUInt32(x), 0xFFFF0000)); CREATE TABLE feature_store ( time DateTime DEFAULT now(), vec Array(Float32) CODEC(ZSTD) TTL time + INTERVAL 1 DAY, vec16 Array(Float32) DEFAULT arrayMap(x -> toBFloat16(x), vec) CODEC(ZSTD), text String ) ENGINE = MergeTree ORDER BY time;

Developer: Antonio Andelic.

SQL UDF in CREATE Queries

:) INSERT INTO feature_store (vec) VALUES ([0.1, 0.2, 0.3]) Ok. :) SELECT vec, vec16 FROM feature_store β”Œβ”€vec───────────┬─vec16───────────────────────────────┐ β”‚ [0.1,0.2,0.3] β”‚ [0.099609375,0.19921875,0.29882812] β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ :) SELECT L2Distance(vec, vec16) FROM feature_store β”Œβ”€L2Distance(vec, vec16)─┐ β”‚ 0.0014615965 β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Operations &
Monitoring

system.moves table

To introspect part moves between volumes and disks.

DESCRIBE TABLE system.moves β”Œβ”€name─────────────┬─type────┐ β”‚ database β”‚ String β”‚ β”‚ table β”‚ String β”‚ β”‚ elapsed β”‚ Float64 β”‚ β”‚ target_disk_name β”‚ String β”‚ β”‚ target_disk_path β”‚ String β”‚ β”‚ part_name β”‚ String β”‚ β”‚ part_size β”‚ UInt64 β”‚ β”‚ thread_id β”‚ UInt64 β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Developer: Sergei Trifonov.

Prometheus Endpoint For ClickHouse Keeper

Monitor ClickHouse Kepeer with your favorite tools!

Probably the last missing feature in comparison to ZooKeeper.

$ cat /etc/clickhouse-keeper/config.d/prometheus.yaml prometheus: port: 9369 endpoint: /metrics

Developer: Antonio Andelic.

Bonus: write performance and stability on a very high request rate.

Constraints For MergeTree Settings

Prevent or limit users to change some settings:

CREATE TABLE ... ENGINE = MergeTree ORDER BY ... SETTINGS storage_policy = 'local', ...

$ cat /etc/clickhouse-server/users.d/table_constraints.yaml profiles: default: constraints: merge_tree_storage_policy: const: merge_tree_parts_to_throw_insert: max: 1000

Developer: Sergei Trifonov.

Security

Password Complexity Rules

$ cat /etc/clickhouse-server/config.d/rules.yaml password_complexity: - rule: pattern: '.{12}' message: 'be at least 12 characters long' - rule: pattern: '\p{N}' message: contain at least 1 numeric character - rule: pattern: '\p{Lu}' message: contain at least 1 uppercase character - rule: pattern: '[^\p{L}\p{N}]' message: contain at least 1 special character

Developer: Nikolay Degterinsky.

Password Complexity Rules

:) CREATE USER vasyan IDENTIFIED WITH sha256_password BY 'qwerty123' DB::Exception: Invalid password. The password should: be at least 12 characters long, contain at least 1 uppercase character, contain at least 1 special character.

Developer: Nikolay Degterinsky.

Note: if clickhouse-client is being used,
the password will be checked and hashed on client side.

The server will never receive the plaintext password.

Credentials Cleansing

CREATE TABLE test AS mysql( 'monty:3306', maria, table, 'videnius', 'qwerty123'); 2022.12.15 07:51:10.997810 [ 2282939 ] {ea24d544-3e40-4f2a-9f0e-2e3a35fc63c8} <Debug> executeQuery: (from [::ffff:127.0.0.1]:47320) CREATE TABLE test AS mysql('monty:3306', maria, table, 'videnius', '[HIDDEN]') (stage: Complete) :) SHOW CREATE TABLE test ... AS mysql('monty:3306', maria, table, 'videnius', '[HIDDEN]'); :) SELECT query FROM system.query_log WHERE query LIKE 'CREATE TABLE test%' CREATE TABLE test AS mysql( 'monty:3306', maria, table, 'videnius', '[HIDDEN]')

Developer: Vitaly Baranov.

Bonus

BSON Import/Export

$ clickhouse-client --time --progress --query " SELECT * FROM hits_1m INTO OUTFILE 'hits.jsonl'" 0.801 $ clickhouse-client --time --progress --query " SELECT * FROM hits_1m INTO OUTFILE 'hits.bson'" $ clickhouse-client --time --progress --query " SELECT * FROM hits_1m FORMAT BSONEachRow > hits.bson" 0.564 $ wc -c hits.jsonl hits.bson 2358439536 hits.jsonl 778316087 hits.bson

Developer: Pavel Kruglov, Mark Polokhov.

BSON Import/Export

$ time clickhouse-local --progress --query " SELECT * FROM table FORMAT Null" < hits.jsonl real 0m4,404s user 0m9,831s sys 0m1,057s $ time clickhouse-local --progress --query " SELECT * FROM table FORMAT Null" < hits.bson real 0m1,471s user 0m4,226s sys 0m0,697s

File extension detection. Schema inference. — Fully supported!

Developer: Pavel Kruglov, Mark Polokhov.

Grace Hash JOIN

SET join_algorithm = 'grace_hash';

Developer: Sergei Skvortsov.

A test: self-join for graph traversal

SELECT UserID, h1.Referer, h1.URL, h2.URL FROM hits AS h1 INNER JOIN hits AS h2 ON (h1.UserID = h2.UserID) AND (h1.URL = h2.Referer) WHERE (h1.URL != '') AND (h2.URL != '') AND (h1.Referer != '') AND (h2.Referer != '') ORDER BY UserID LIMIT 10

Grace Hash JOIN

A test: self-join for graph traversal, 100 million log of web traffic:

SET join_algorithm = 'hash'; -- 26 sec, 26.56 GiB RAM SET join_algorithm = 'parallel_hash'; -- 10 sec, 28.79 GiB RAM SET max_memory_usage = '10G', join_algorithm = 'partial_merge'; -- 1216.914 sec SET max_memory_usage = '10G', join_algorithm = 'grace_hash', grace_hash_join_initial_buckets = 128; -- 173 sec, 6.85 GiB.

https://trust.clickhouse.com/

+ penetration testing, bug bounty program, audit reports...

ClickHouse Cloud

— free 30-day trial up to 1 TB of data;

— affordable clusters for developers with full HA < $100/month

Try it! https://clickhouse.cloud/.

Q&A