1. (45 min) What's new in ClickHouse 22.7.
2. (15 min) Q&A.
ClickHouse "mid-summer" release.
β 24 new features
β 19 performance optimizations
SELECT
y::String || '.'
|| (y < toYear(today()) - 2000 - 1 ? '*' : m::String) AS Version,
(n <= 3 OR (is_lts AND lts_n <= 2)) ? 'βοΈ' : 'β' AS Supported
FROM (
SELECT y, m,
count() OVER (ORDER BY y DESC, m DESC) AS n,
m IN (3, 8) AS is_lts,
countIf(is_lts) OVER (ORDER BY y DESC, m DESC) AS lts_n
FROM (
WITH extractGroups(version, 'v(\d+)\.(\d+)') AS v,
v[1]::INT AS y, v[2]::INT AS m
SELECT y, m
FROM file('version_date.tsv', TSV, 'version String, date String')
ORDER BY y DESC, m DESC
LIMIT 1 BY y, m)
)
LIMIT 1 BY Version
FORMAT Markdown
WITH
count() OVER (ORDER BY y DESC, m DESC) AS n,
m IN (3, 8) AS is_lts,
countIf(is_lts) OVER (ORDER BY y DESC, m DESC) AS lts_n
SELECT
y::String || '.'
|| (y < toYear(today()) - 2000 - 1 ? '*' : m::String) AS Version,
(n <= 3 OR (is_lts AND lts_n <= 2)) ? 'βοΈ' : 'β' AS Supported
FROM (
WITH extractGroups(version, 'v(\d+)\.(\d+)') AS v,
v[1]::INT AS y, v[2]::INT AS m
SELECT y, m
FROM file('version_date.tsv', TSV, 'version String, date String')
ORDER BY y DESC, m DESC
LIMIT 1 BY y, m)
LIMIT 1 BY Version
FORMAT Markdown
Developer: Dmitriy Novik.
SELECT * FROM numbers(10) FORMAT SQLInsert
INSERT INTO table (`number`) VALUES (0), (1), (2), (3),
(4), (5), (6), (7), (8), (9)
;
Controlled by the settings:
— output_format_sql_insert_max_batch_size;
— output_format_sql_insert_table_name;
— output_format_sql_insert_include_column_names;
— output_format_sql_insert_use_replace;
— output_format_sql_insert_quote_names;
Developer: Pavel Kruglov.
db.hits.aggregate(
[
{ $group: { _id: '$SearchPhrase' } },
{ $count: 'value' },
{ $project: { key: null, value: 1, } }
],
{allowDiskUse: true});
SELECT COUNT(DISTINCT SearchPhrase)
FROM mongodb('localhost:27017', 'test', 'hits',
'root', '', structure = 'SearchPhrase String');
Developer: Anastasia Petrenko, Ksenia Sumarokova.
MongoDB query — 7 min 47 sec
ClickHouse over MongoDB query — 11 min 56 sec :(
MongoDB query — 7 min 47 sec
ClickHouse over MongoDB query — 11 min 56 sec :(
ClickHouse query — 10 ms
Bonus:
INSERT INTO FUNCTION mongodb(...)
CREATE TABLE ... ENGINE = MongoDB(...)
Developer: Anastasia Petrenko, Ksenia Sumarokova.
Create a view on top of MeiliSearch index:
CREATE TABLE t (id UInt64, data String)
ENGINE = MeiliSearch('http://meili1:7700', 't', '')
Insert data for indexing:
INSERT INTO t ...
Query the data with special search conditions:
SELECT * FROM t WHERE meiliMatch($$ "q" = "abaca" $$)
Developer: Mikhail Artemenko, Ksenia Sumarokova.
You Know, for Search!
— integration with MeiliSearch;
— simple integration with Quickwit: https://clickhouse.com/docs/en/guides/developer/full-text-search/
— ElasticSearch? Sphinx? ManticoreSearch?
— Full-text indices inside ClickHouse?
Streaming table engines in ClickHouse:
— Kafka
(works with RedPanda as well);
— RabbitMQ;
— FileLog;
...
— NATS;
— S3Queue;
Developer: Pavel Chebarykov, Ksenia Sumarokova.
What else should we add?
SQS? Kinesis? Pulsar? MySQL binlog?
Or maybe add persistent queues inside ClickHouse?
— "direct" algorithm:
to join with key-value tables by direct lookups a.k.a. nested loops.
Good if the left table is small, but the right table is like a large dictionary.
Good to use in MATERIALIZED VIEW queries.
— "parallel_hash" algorithm:
speed-up if the right hand table is large.
— "full_sorting_merge" algorithm:
when right hand side is large
and does not fit in memory and does not allow lookups.
— "grace_hash" algorithm:
expected in 22.8.
Developer: Vladimir Cherkasov, lgbo-ustc.
SET additional_result_filter = $$ name = 'Vasya' $$;
SELECT * FROM values('name String', ('Vasya'), ('Petya'))
ββnameβββ
β Vasya β
βββββββββ
SET additional_table_filters = {'t': $$ name != 'Petya' $$};
CREATE TEMPORARY TABLE t (name String);
INSERT INTO t VALUES ('Vasya'), ('Petya');
SELECT * FROM t;
ββnameβββ
β Vasya β
βββββββββ
Developer: Nikolai Kochetov.
Improvement of ORDER BY, insert and merge in MergeTree,
and window functions.
SELECT WatchID FROM hits_100m_obfuscated ORDER BY Age
Before:
Elapsed: 4.154 sec. (24.07 million rows/s., 216.64 MB/s.)
After:
Elapsed: 0.482 sec. (207.47 million rows/s., 1.87 GB/s.)
Developer: Maksim Kita.
Speed-up of SELECT with FINAL modifier.
It "simply" improves performance up to 4 times.
Especially for complex transforms like Collapsing and Replacing.
Developer: Nikita Taranov.
CREATE TABLE ... EMPTY AS SELECT ...
SELECT base58Encode('Hello, world'),
base58Decode('2NEpo7TZsLFA2wMeK')
SELECT translate('Hello, world', ', ', ';\n'),
translateUTF8('test', 'tes', 'Π’Π΅Ρ')
SELECT parseTimeDelta('1h 35min 23sec')
SET group_by_use_nulls = 1
multiMatch, multiSearch, ... — support for non-constant needles.
SET compatibility = '21.8'
Allows you to upgrade to the latest version
but keep all the default behavior unchanged, as in the specified version.
... it is not going to cover 100% of possible changes.
But should give something.
Please don't use this feature!
Developer: Pavel Kruglov.
Go driver has HTTP support in addition to Native protocol.
Developer: Artem Iurin.
Go driver now uses high-performant low-level "ch-go" implementation.
Developer: Aleksandr Razumov, Dale McDiarmid.
Grafana plugin now has support for JSON data type.
Developer: Dale McDiarmid.
Official ClickHouse support in Superset.
Developer: Dale McDiarmid.
clickhouse-diagnostics tool now packaged with ClickHouse.
Developer: Dale McDiarmid, Mikhail Shiryaev.
This is just a fraction of what we've made in a month.
More interesting stuff is being prepared!