1. (50 min) What's new in ClickHouse 23.5.
2. (10 min) Q&A.
ClickHouse May Release.
β 29 new features πΏ
β 25 performance optimizations π΄
β 83 bug fixes π
|
|
|
|
|
|
|
|
|
|
:) SELECT * FROM numbers(10)
INTO OUTFILE 'numbers.tsv'
Ok.
:) SELECT * FROM numbers(10)
INTO OUTFILE 'numbers.tsv'
Cannot open file numbers.tsv, errno: 17,
strerror: File exists. -- this is fine
:) SELECT * FROM numbers(10)
INTO OUTFILE 'numbers.tsv' APPEND
Ok.
Developer: Manas Alekar.
Preparing data inside staging tables
and attaching it in one large batch.
CREATE TEMPORARY TABLE src (x UInt64) ENGINE = MergeTree ORDER BY x;
INSERT INTO src SELECT * FROM numbers(1000000);
...
ALTER TABLE dst ATTACH PARTITION () FROM src;
Works with REPLACE PARTITION as well.
Developer: Roman Vasin.
SELECT * FROM generateRandom('x UInt8, y String') LIMIT 10;
CREATE TEMPORARY TABLE test (x UInt8, y String);
INSERT INTO test SELECT * FROM generateRandom() LIMIT 10;
SELECT * FROM generateRandom() LIMIT 10;
DESCRIBE (SELECT * FROM generateRandom());
Demo
Developer: Pavel Kruglov.
For MySQL compatibility.
SHOW INDEX FROM hits
Row 1:
ββββββ
table: hits
non_unique: 0
key_name: PRIMARY
seq_in_index: α΄Ία΅α΄Έα΄Έ
column_name: α΄Ία΅α΄Έα΄Έ
collation: A
cardinality: α΄Ία΅α΄Έα΄Έ
sub_part: α΄Ία΅α΄Έα΄Έ
packed: α΄Ία΅α΄Έα΄Έ
null: α΄Ία΅α΄Έα΄Έ
index_type: primary
comment: α΄Ία΅α΄Έα΄Έ
index_comment: α΄Ία΅α΄Έα΄Έ
visible: YES
expression: CounterID, EventDate, UserID, EventTime, WatchID
Developer: Robert Schulze.
system.user_processes
— current memory usage and resource counters for every user.
Demo
Developer: Kovi Benjamin Antal.
SELECT *
FROM system.zookeeper_connection
Row 1:
ββββββ
name: default
host: 127.0.0.1
port: 9181
index: 0
connected_time: 1970-01-01 01:25:00
is_expired: 0
keeper_api_version: 3
client_id: 122
Developer: Mateng915.
— bcrypt for password hashing;
— allow configuring the default password hashing;
CREATE USER test IDENTIFIED BY 'my_password';
config.d/password_type.yaml:
default_password_type: sha256_password
Developer: Nikolai Degterinskiy.
— BACKUP and RESTORE works with encryption-at-rest
without decryption and re-ecryption.
Developer: Vitaly Baranov.
— space function;
— for compatibility with Spark;
Demo
Developer: Robert Schulze.
Table engine:
CREATE TABLE test ... ENGINE = AzureBlobStorage(
'https://alesapintestazure.blob.core.windows.net',
'container', 'path.tsv');
SELECT * FROM test;
INSERT INTO test ...;
Table function:
SELECT * FROM azure_blob_storage('...', 'container', 'path.tsv');
INSERT INTO FUNCTION azure_blob_storage('...', 'container', 'path.tsv');
Developer: Alexander Sapin, Smita Kulkarni.
SELECT ... FROM s3(
'https://bucket.s3.amazonaws.com/data*.parquet.zst')
SELECT ... FROM s3Cluster(
cluster, 'https://bucket.s3.amazonaws.com/data*.parquet.zst')
Since version 23.5:
SELECT ... FROM url(
'https://bucket.s3.amazonaws.com/data{01..99}.parquet.zst')
SELECT ... FROM urlCluster(
cluster, 'https://bucket.s3.amazonaws.com/data{01..99}.parquet.zst')
Developer: Attack204, Pavel Kruglov.
clickhouse keeper-client
— a tool inside ClickHouse to introspect your Keeper or ZooKeeper.
Demo
Developer: Artem Brustovetskii.
Query Results Cache
— securely separated between users by default;
— normalized queries to their AST;
— respects query settings, and non-deterministic functions;
— controllable in size for each element, or in total;
— can be enabled for active and passive usage;
— production ready since 23.5!
Geographical Data Types
— Point, Ring, Polygon, and MultiPolygon data types;
— distance, area, perimeter, union, intersection, convex hull, etc. functions;
— cartesian and spherical coordinates;
— read from WKT, output into SVG;
— production ready since 23.5!
Compressed marks and indices on disk
— available since 22.9;
— improves the speed of first queries after server startup;
— enabled by default in 23.5.
New server settings to control memory usage:
merges_mutations_memory_usage_soft_limit and
merges_mutations_memory_usage_to_ram_ratio = 0.5
Merges and mutations will not start if more than half
of server memory is already consumed by background operations.
Developer: Dmitri Novik.
New table settings to lower memory usage:
merge_max_block_size_bytes and vertical_merge_algorithm_min_bytes_to_activate
— save memory when merging tables with large rows;
Developer: Nikita Mikhaylov.
The Official Tableau Connector:
https://github.com/ClickHouse/clickhouse-tableau-connector-jdbc
Originally developed by Analytika Plus.
Updates for Grafana, Metabase, Kafka Connect, Node.js, and Go drivers.
GCP
— ClickHouse Cloud is now available in Google Cloud!
API
— now you can control your services programmatically!
— easy to use and complete REST API for services, organizations, etc.
OpenAPI; Swagger endpoint available;
— upcoming: Terraform provider;
IAM
— secure access to your S3 buckets from ClickHouse Cloud.
— Vector Search In ClickHouse
— CDC From Postgres To ClickHouse
— ClickHouse For Observability with SigNoz
— Optimization of Hash Tables
— JOINs in ClickHouse: a deep dive
ClickHouse Meetups in
New York, Berlin, Barcelona, London,
San-Francisco, Bengaluru.
ClickHouse for AI - Vectors, Embedding, Semantic Search, and more...
Video Recordings: https://www.youtube.com/c/ClickHouseDB
Now available in
— free 30-day trial with $300 credits up to 10 TB of data;
— affordable clusters for developers with full HA < $100/month
Try it! https://clickhouse.cloud/.