1. (30 min) What's new in ClickHouse 22.4.
2. (30 min) Q&A.
April's stable release.
ClickHouse already has support for gaps filling in ORDER BY:
SELECT EventDate, count() FROM ... GROUP BY EventDate ORDER BY EventDate
┌──EventDate─┬─count()─┐
│ 2014-03-17 │ 3 │
│ 2014-03-19 │ 6 │
│ 2014-03-21 │ 7 │
│ 2014-03-22 │ 6 │
└────────────┴─────────┘
... ORDER BY EventDate WITH FILL STEP 1
┌──EventDate─┬─count()─┐
│ 2014-03-17 │ 3 │
│ 2014-03-18 │ 0 │
│ 2014-03-19 │ 6 │
│ 2014-03-20 │ 0 │
│ 2014-03-21 │ 7 │
│ 2014-03-22 │ 6 │
└────────────┴─────────┘
ClickHouse already has support for gaps filling in ORDER BY:
ORDER BY x WITH FILL STEP 1
By irregular intervals:
ORDER BY date WITH FILL STEP INTERVAL 1 MONTH
With custom start or end:
ORDER BY x WITH FILL FROM 0 TO 5.51 STEP 0.5
By multiple keys:
ORDER BY x ASC WITH FILL STEP 0.1,
y DESC WITH FILL STEP -1
New in 22.4: interpolation!
Example: fill gaps with previous value:
SELECT * FROM (
SELECT EventDate, count() AS c
FROM test.hits GROUP BY EventDate)
ORDER BY EventDate WITH FILL INTERPOLATE (c AS c)
┌──EventDate─┬─c─┐
│ 2014-03-17 │ 3 │
│ 2014-03-18 │ 3 │
│ 2014-03-19 │ 6 │
│ 2014-03-20 │ 6 │
│ 2014-03-21 │ 7 │
│ 2014-03-22 │ 6 │
└────────────┴───┘
Developers: Yakov Olkhovskiy, Nikolai Kochetov.
New in 22.4: interpolation!
+ supports arbitrary expressions;
+ supports multiple columns;
− aliases from SELECT cannot be referenced, requires subquery;
− only previous value can be referenced;
Do you need this feature? How you would use this feature?
Developers: Yakov Olkhovskiy, Nikolai Kochetov.
Date and Time Construction:
:) SELECT makeDate(2022, 4, 21)
┌─makeDate(2022, 4, 21)─┐
│ 2022-04-21 │
└───────────────────────┘
:) SELECT makeDateTime(2022, 4, 21, 1, 2, 3)
┌─makeDateTime(2022, 4, 21, 1, 2, 3)─┐
│ 2022-04-21 01:02:03 │
└────────────────────────────────────┘
:) SELECT makeDateTime64(2022, 4, 21, 1, 2, 3, 111222, 6)
┌─makeDateTime64(2022, 4, 21, 1, 2, 3, 111222, 6)─┐
│ 2022-04-21 01:02:03.111222 │
└─────────────────────────────────────────────────┘
Developer: Alexander Gololobov.
Question:
What Date and Time functions do we miss in ClickHouse?
Number to date conversions? YYYYMMDDToDate
changeYear / changeMonth / changeDay?
toLastDayOfMonth:
SELECT toLastDayOfMonth(d)
= toStartOfMonth(d + INTERVAL 1 MONTH)
- INTERVAL 1 DAY AS y
— 3.5 times faster!
Developer: Habibullah Oladepo.
INTERVAL MILLISECOND, MICROSECOND, NANOSECOND:
SELECT
now64() AS x,
now64() + INTERVAL 1 MICROSECOND AS y
┌───────────────────────x─┬──────────────────────────y─┐
│ 2022-04-21 07:11:38.608 │ 2022-04-21 07:11:38.608001 │
└─────────────────────────┴────────────────────────────┘
toStartOfMillisecond, -Microsecond, -Nanosecond
add/subtractMilliseconds, -Microseconds, -Nanoseconds
Developer: Andrey Zvonov.
Full support for H3:
— h3PointDistM;
— h3PointDistKm;
— h3PointDistRads;
— h3GetRes0Indexes;
— h3GetPentagonIndexes.
Developer: Bharat Nallan.
Question: name the technology that ClickHouse does not support:
H3, S3, S2, R2, Z3, K9, C*?
Salted Passwords.
— for CREATE USER with SHA256 password, we always add a random salt;
Main motivation: it's a good practice.
Protects from brute-force attacks
if adversary has access to the password storage.
Protects from knowing if different users have identical password.
Developer: Rajkumar Varada.
Note: if users are created on version 22.4+, you cannot downgrade to 22.3.
Sanity Checks on server startup.
— displayed in clickhouse-client and in the server log;
— provide checks for common mistakes in server operation;
Examples:
— "Linux is not using fast TSC clock source."
— "Linux memory overcommit is disabled."
— "Linux threads max count is too low."
— "Rotational disk with disabled readahead is in use."
Question: have you ever look at /proc/sys/kernel/threads-max on your server?
Developer: Sergei Trifonov.
ACID. MVCC. Snapshot Isolation.
Available for preview with non-replicated MergeTree tables.
— standard BEGIN TRANSACTION, COMMIT and ROLLBACK statements;
— atomic INSERTs of huge amounts of data;
— atomic INSERTs into multiple tables and materialized views;
— multiple consistent and isolated SELECTs from single snapshot;
— atomicity and isolation for ALTER DELETE / UPDATE;
Available for preview with non-replicated MergeTree tables.
Next steps: Replicated tables and Distributed queries.
We need your feedback.
Developer: Alexander Tokmakov.
Functions pull-up after ORDER BY and LIMIT:
SELECT sipHash64(number) FROM numbers(1e8)
ORDER BY number LIMIT 5
Was: 2.378 sec.
Now: 0.141 sec., 17 times faster!
Developer: Nikita Taranov.
Prediction of hash table size.
Demo.
Developer: Nikita Taranov.
Parallel downloads.
— for SELECT FROM URL and S3 tables;
— controlled by max_download_threads and max_download_buffer_size;
Developer: Antonio Anđelić.
Parallel uploads.
— for INSERT INTO FUNCTION s3;
Developer: Sergei Trifonov.
For ASOF JOIN*.
Developers: Raul Marin and Maksim Kita.
* ASOF JOIN — join nearest value by inexact match (<, >, ≤, ≥) — for time series correlation.
For hasAll function for arrays.
Developer: youennL-cs (ContentSquare) and Maksim Kita.
Extremely deep SIMD optimization with AVX-2 and SSE4.2 variants.
Advantages:
— many tutorials about integrations;
— it is fast;
Disadvantages:
— just released; still to do: dark theme;
— no pdf version;
Developer: Rich Raposa.
Available in May 2022!
Sign up for wait-list: https://clickhouse.com/cloud/
Serverless operation, no infrastructure to manage.
Roadmap 2022 is published:
https://github.com/ClickHouse/ClickHouse/issues/32513
— open for discussions.
... all great stuff is discussed there, don't miss.