1. (55 min) What's new in ClickHouse 22.10.
2. (5 min) Q&A.
ClickHouse October release.
β 12 new features
β 11 performance optimizations
β 40 bug fixes π₯
Always merging old parts:
ALTER TABLE t MODIFY SETTING
min_age_to_force_merge_seconds = 8640000 -- 100 days
Developer: Antonio Andelic, PengJian fastio.
Relaxing too many parts threshold:
Allow large number of parts if they are large in average.
Developer: Alexey Milovidov.
SET max_block_size = DEFAULT;
Reset to the global default value.
It will appear in the system.settings as not changed.
SELECT getSetting('max_block_size');
SELECT * FROM system.settings WHERE changed;
The main motivation — compatibility with MySQL protocol.
Developer: Arthur Filatenkov.
in clickhouse-client and clickhouse-local:
if you have skim or fzf installed:
https://github.com/lotabout/skim
git clone --depth 1 [email protected]:lotabout/skim.git ~/.skim
~/.skim/install
Demo
Developer: Azat Khuzhin.
Table is in readonly mode π
Was: around one minute for reconnection, for no reason.
Now: milliseconds π₯²
Developer: Raul Marin.
Send all snapshots to S3 for extra safety.
s3_snapshot:
endpoint: http://minio1:9001/snapshots/
access_key_id: minio
secret_access_key: minio123
All snapshots are zstd compressed.
Note: these snapshots are not automatically used for restore,
but can be easily downloaded and restored.
Developer: Antonio Andelic.
BACKUP TABLE data TO S3(
'https://mars-doc-test.s3.amazonaws.com/backup-S3/my_backup',
'ABC123', 'Abc+123')
If the table is already on S3, it will use server-side copy!
Developer: Vitaly Baranov, Azat Khuzhin.
Write once read many.
Export a MergeTree table to S3 as is: by copying the directory structure.
Attach a readonly table in zero time!
Ideal for example datasets. See also: web disks.
Scheduled for 22.11.
So, ClickHouse supports a lot of protocols:
— HTTP
— HTTPs
— Native TCP
— Native TCP wrapped in PROXYv1
— Native TCP with TLS
— MySQL (with TLS support)
— PostgreSQL (with TLS support)
— GRPC (with TLS)
— Replication protocol over HTTP
— Replication protocol over HTTPs
— Keeper client-server protocol;
— Keeper consensus protocol;
— ...
So, ClickHouse supports a lot of protocols.
How to configure all of them? What if:
— server has multiple network interfaces?
— enable one protocol on multiple ports?
— I want native TCP for localhost only and HTTPs from everywhere?
— I want different TLS certificates for different protocols?
— I want to wrap one protocol in another?
<protocols>
<tcp>
<type>tcp</type>
<host>::</host>
<port>9000</port>
<description>native protocol</description>
</tcp>
<tcp_secure>
<type>tls</type>
<impl>tcp</impl>
<port>9440</port>
<description>secure native protocol</description>
</tcp_secure>
<tcp_endpoint>
<impl>tcp</impl>
<host>0.0.0.0</host>
<port>9001</port>
<description>native protocol, another</description>
</tcp_endpoint>
<tcp_proxy>
<type>proxy1</type>
<impl>tcp</impl>
<port>9100</port>
<description>native protocol with PROXYv1</description>
</tcp_proxy>
The case: ClickHouse under proxy:
Envoy Proxy / HAProxy / CloudFlare.
ClickHouse server will receive connections from the proxy.
But it needs to know the source IP address for quotas, ACL and logging.
Solution: enable PROXYv1 protocol in the proxy
and configure it as a protocol wrapper in ClickHouse.
ClickHouse will read the header and unwrap the network packets.
Developer: Yakov Olkhovskiy.
The new configuration is fully compatible with the old one.
They can be used together, and the old configuration
represents the defaults.
What else?
— routing based on the protocol, domain name, URL and IP address?
setting the default user or default database.
— ClickHouse server as a proxy for ClickHouse :)
Developer: Yakov Olkhovskiy.
system.asynchronous_insert_log
Records individual INSERT queries,
and how they were collected into larger INSERT batches.
ββnameβββββββββββββββββββββ¬βtypeβββββββββββββββββββββββββββββββββββββββββββββββββββ
β event_date β Date β
β event_time β DateTime β
β event_time_microseconds β DateTime64(6) β
β query β String β
β database β LowCardinality(String) β
β table β LowCardinality(String) β
β format β LowCardinality(String) β
β query_id β String β
β bytes β UInt64 β
β exception β String β
β status β Enum8('Ok' = 0, 'ParsingError' = 1, 'FlushError' = 2) β
β flush_time β DateTime β
β flush_time_microseconds β DateTime64(6) β
β flush_query_id β String β
βββββββββββββββββββββββββββ΄ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Developer: Anton Popov.
Was: gzip, deflate, br, zst.
Now: lz4, snappy, bz2.
Note: these are non-standard extensions to HTTP.
But neither we care.
Developer: Nikolay Degterinsky.
SELECT mortonEncode(x, y)
SELECT mortonDecode(2, z)
— Space-filling curve.
— Binary space partitioning.
— Locality-preserving encoding.
Demo.
Developers: Constantine Peresypkin, Bakey, Olga Khvostikova, Andrey Chulkov.
randUniform
randNormal
randLogNormal
randExponential
randChiSquared
randStudentT
randFisherF
randBernoulli
randBinomial
randNegativeBinomial
randPoisson
Developer: Nikita Mikhailov.
ClickHouse Sink for Kafka Connect
Comparing to the Kafka table engine:
— Push model instead of Pull
— Exactly-once semantics
Alpha stage, welcome for evaluation and experiments!
https://github.com/ClickHouse/clickhouse-kafka-connect
Developer: Mark Zitnik.
— available since Oct 4th;
— free 14-day trial up to 10 TB of data;
Try it! https://clickhouse.cloud/.
Introducing ClickHouse Cloud Webinar
Tomorrow, Oct 27, 9:00 AM (PST) 6:00 PM (CEST)
https://clickhouse.com/company/events/cloud-beta