ClickHouse: Release 22.4 Webinar

ClickHouse
Release 22.4

Release 22.4 Webinar

1. (30 min) What's new in ClickHouse 22.4.

2. (30 min) Q&A.

Release 22.4

April's stable release.

Gaps Filling With Interpolation

ClickHouse already has support for gaps filling in ORDER BY:

SELECT EventDate, count() FROM ... GROUP BY EventDate ORDER BY EventDate ┌──EventDate─┬─count()─┐ │ 2014-03-17 │ 3 │ │ 2014-03-19 │ 6 │ │ 2014-03-21 │ 7 │ │ 2014-03-22 │ 6 │ └────────────┴─────────┘

... ORDER BY EventDate WITH FILL STEP 1 ┌──EventDate─┬─count()─┐ │ 2014-03-17 │ 3 │ │ 2014-03-18 │ 0 │ │ 2014-03-19 │ 6 │ │ 2014-03-20 │ 0 │ │ 2014-03-21 │ 7 │ │ 2014-03-22 │ 6 │ └────────────┴─────────┘

Gaps Filling With Interpolation

ClickHouse already has support for gaps filling in ORDER BY:

ORDER BY x WITH FILL STEP 1

By irregular intervals:

ORDER BY date WITH FILL STEP INTERVAL 1 MONTH

With custom start or end:

ORDER BY x WITH FILL FROM 0 TO 5.51 STEP 0.5

By multiple keys:

ORDER BY x ASC WITH FILL STEP 0.1, y DESC WITH FILL STEP -1

Gaps Filling With Interpolation

New in 22.4: interpolation!

Example: fill gaps with previous value:

SELECT * FROM ( SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate) ORDER BY EventDate WITH FILL INTERPOLATE (c AS c) ┌──EventDate─┬─c─┐ │ 2014-03-17 │ 3 │ │ 2014-03-18 │ 3 │ │ 2014-03-19 │ 6 │ │ 2014-03-20 │ 6 │ │ 2014-03-21 │ 7 │ │ 2014-03-22 │ 6 │ └────────────┴───┘

Developers: Yakov Olkhovskiy, Nikolai Kochetov.

Gaps Filling With Interpolation

New in 22.4: interpolation!

+ supports arbitrary expressions;

+ supports multiple columns;

− aliases from SELECT cannot be referenced, requires subquery;

− only previous value can be referenced;

Do you need this feature? How you would use this feature?

Developers: Yakov Olkhovskiy, Nikolai Kochetov.

Small and Neat Things

Date and Time Construction:

:) SELECT makeDate(2022, 4, 21) ┌─makeDate(2022, 4, 21)─┐ │ 2022-04-21 │ └───────────────────────┘ :) SELECT makeDateTime(2022, 4, 21, 1, 2, 3) ┌─makeDateTime(2022, 4, 21, 1, 2, 3)─┐ │ 2022-04-21 01:02:03 │ └────────────────────────────────────┘ :) SELECT makeDateTime64(2022, 4, 21, 1, 2, 3, 111222, 6) ┌─makeDateTime64(2022, 4, 21, 1, 2, 3, 111222, 6)─┐ │ 2022-04-21 01:02:03.111222 │ └─────────────────────────────────────────────────┘

Developer: Alexander Gololobov.

Small and Neat Things

Question:

What Date and Time functions do we miss in ClickHouse?

Number to date conversions? YYYYMMDDToDate
changeYear / changeMonth / changeDay?

Small and Neat Things

toLastDayOfMonth:

SELECT toLastDayOfMonth(d) = toStartOfMonth(d + INTERVAL 1 MONTH) - INTERVAL 1 DAY AS y

— 3.5 times faster!

Developer: Habibullah Oladepo.

Small and Neat Things

INTERVAL MILLISECOND, MICROSECOND, NANOSECOND:

SELECT now64() AS x, now64() + INTERVAL 1 MICROSECOND AS y ┌───────────────────────x─┬──────────────────────────y─┐ │ 2022-04-21 07:11:38.608 │ 2022-04-21 07:11:38.608001 │ └─────────────────────────┴────────────────────────────┘

toStartOfMillisecond, -Microsecond, -Nanosecond

add/subtractMilliseconds, -Microseconds, -Nanoseconds

Developer: Andrey Zvonov.

Small and Neat Things

Full support for H3:

— h3PointDistM;
— h3PointDistKm;
— h3PointDistRads;
— h3GetRes0Indexes;
— h3GetPentagonIndexes.

Developer: Bharat Nallan.

Small and Neat Things

Question: name the technology that ClickHouse does not support:

H3, S3, S2, R2, Z3, K9, C*?

Security Improvements

Salted Passwords.

— for CREATE USER with SHA256 password, we always add a random salt;

Main motivation: it's a good practice.

Protects from brute-force attacks
if adversary has access to the password storage.

Protects from knowing if different users have identical password.

Developer: Rajkumar Varada.

Note: if users are created on version 22.4+, you cannot downgrade to 22.3.

Usability Improvements

Sanity Checks on server startup.

— displayed in clickhouse-client and in the server log;
— provide checks for common mistakes in server operation;

Examples:

— "Linux is not using fast TSC clock source."
— "Linux memory overcommit is disabled."
— "Linux threads max count is too low."
— "Rotational disk with disabled readahead is in use."

Question: have you ever look at /proc/sys/kernel/threads-max on your server?

Developer: Sergei Trifonov.

New Experimental Features

Transactions

ACID. MVCC. Snapshot Isolation.

Available for preview with non-replicated MergeTree tables.

— standard BEGIN TRANSACTION, COMMIT and ROLLBACK statements;

— atomic INSERTs of huge amounts of data;

— atomic INSERTs into multiple tables and materialized views;

— multiple consistent and isolated SELECTs from single snapshot;

— atomicity and isolation for ALTER DELETE / UPDATE;

Transactions

Available for preview with non-replicated MergeTree tables.

Next steps: Replicated tables and Distributed queries.

We need your feedback.

Developer: Alexander Tokmakov.

Performance Optimizations

Functions pull-up after ORDER BY and LIMIT:

SELECT sipHash64(number) FROM numbers(1e8) ORDER BY number LIMIT 5

Was: 2.378 sec.

Now: 0.141 sec., 17 times faster!

Developer: Nikita Taranov.

Performance Optimizations

Prediction of hash table size.

Demo.

Developer: Nikita Taranov.

Performance Optimizations

Parallel downloads.

— for SELECT FROM URL and S3 tables;

— controlled by max_download_threads and max_download_buffer_size;

Developer: Antonio Anđelić.

Parallel uploads.

— for INSERT INTO FUNCTION s3;

Developer: Sergei Trifonov.

Performance Optimizations

For ASOF JOIN*.

Developers: Raul Marin and Maksim Kita.

* ASOF JOIN — join nearest value by inexact match (<, >, ≤, ≥) — for time series correlation.

For hasAll function for arrays.

Developer: youennL-cs (ContentSquare) and Maksim Kita.

Extremely deep SIMD optimization with AVX-2 and SSE4.2 variants.

Bonus: New Documentation

Bonus: New Documentation

Advantages:
— many tutorials about integrations;
— it is fast;

Disadvantages:
— just released; still to do: dark theme;
— no pdf version;

Developer: Rich Raposa.

What's Next?

ClickHouse Cloud Private Preview

Available in May 2022!

Sign up for wait-list: https://clickhouse.com/cloud/

Serverless operation, no infrastructure to manage.

What's Next

Roadmap 2022 is published:

https://github.com/ClickHouse/ClickHouse/issues/32513

— open for discussions.

... all great stuff is discussed there, don't miss.

Q&A