ClickHouse: Release 23.5 Webinar

ClickHouse
Release 23.5

Release 23.5 Webinar

1. (50 min) What's new in ClickHouse 23.5.

2. (10 min) Q&A.

Release 23.5

ClickHouse May Release.

β€” 29 new features 🌿

β€” 25 performance optimizations 🚴

β€” 83 bug fixes 🐝

Compatibility of the CASE expression

SELECT CASE WHEN number = 1 THEN 'Hello' WHEN number = 2 THEN 'World' ELSE '-' END AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ - β”‚ β”‚ Hello β”‚ β”‚ World β”‚ β”‚ - β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

SELECT CASE number WHEN 1 THEN 'Hello' WHEN 2 THEN 'World' ELSE '-' END AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ - β”‚ β”‚ Hello β”‚ β”‚ World β”‚ β”‚ - β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

SELECT CASE WHEN number = 1 THEN 'Hello' WHEN number = 2 THEN 'World' END AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ ᴺᡁᴸᴸ β”‚ β”‚ Hello β”‚ β”‚ World β”‚ β”‚ ᴺᡁᴸᴸ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

SELECT CASE number WHEN 1 THEN 'Hello' WHEN 2 THEN 'World' END AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ ᴺᡁᴸᴸ β”‚ β”‚ ᴺᡁᴸᴸ β”‚ β”‚ ᴺᡁᴸᴸ β”‚ β”‚ ᴺᡁᴸᴸ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

Compatibility of the CASE expression

SELECT CASE WHEN number = 1 THEN 'Hello' WHEN number = 2 THEN 'World' ELSE '-' END AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ - β”‚ β”‚ Hello β”‚ β”‚ World β”‚ β”‚ - β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

SELECT CASE number WHEN 1 THEN 'Hello' WHEN 2 THEN 'World' ELSE '-' END AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ - β”‚ β”‚ Hello β”‚ β”‚ World β”‚ β”‚ - β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

SELECT CASE WHEN number = 1 THEN 'Hello' WHEN number = 2 THEN 'World' END AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ ᴺᡁᴸᴸ β”‚ β”‚ Hello β”‚ β”‚ World β”‚ β”‚ ᴺᡁᴸᴸ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

SELECT CASE number WHEN 1 THEN 'Hello' WHEN 2 THEN 'World' END AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ ᴺᡁᴸᴸ β”‚ β”‚ Hello β”‚ β”‚ World β”‚ β”‚ ᴺᡁᴸᴸ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

Compatibility of the CASE expression

SELECT multiIf(number = 1, 'Hello', number = 2, 'World', '-') AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ - β”‚ β”‚ Hello β”‚ β”‚ World β”‚ β”‚ - β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

SELECT transform(number, [1, 2], ['Hello', 'World'], '-') AS res FROM numbers(4) β”Œβ”€res───┐ β”‚ - β”‚ β”‚ Hello β”‚ β”‚ World β”‚ β”‚ - β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”˜

SELECT ... INTO OUTFILE '...' APPEND

:) SELECT * FROM numbers(10) INTO OUTFILE 'numbers.tsv' Ok. :) SELECT * FROM numbers(10) INTO OUTFILE 'numbers.tsv' Cannot open file numbers.tsv, errno: 17, strerror: File exists. -- this is fine :) SELECT * FROM numbers(10) INTO OUTFILE 'numbers.tsv' APPEND Ok.

Developer: Manas Alekar.

ATTACH ... From Temporary Tables

Preparing data inside staging tables
and attaching it in one large batch.

CREATE TEMPORARY TABLE src (x UInt64) ENGINE = MergeTree ORDER BY x; INSERT INTO src SELECT * FROM numbers(1000000); ... ALTER TABLE dst ATTACH PARTITION () FROM src;

Works with REPLACE PARTITION as well.

Developer: Roman Vasin.

generateRandom is Self-Sufficient

SELECT * FROM generateRandom('x UInt8, y String') LIMIT 10;

CREATE TEMPORARY TABLE test (x UInt8, y String); INSERT INTO test SELECT * FROM generateRandom() LIMIT 10;

SELECT * FROM generateRandom() LIMIT 10; DESCRIBE (SELECT * FROM generateRandom());

Demo

Developer: Pavel Kruglov.

SHOW INDEX

For MySQL compatibility.

SHOW INDEX FROM hits Row 1: ────── table: hits non_unique: 0 key_name: PRIMARY seq_in_index: ᴺᡁᴸᴸ column_name: ᴺᡁᴸᴸ collation: A cardinality: ᴺᡁᴸᴸ sub_part: ᴺᡁᴸᴸ packed: ᴺᡁᴸᴸ null: ᴺᡁᴸᴸ index_type: primary comment: ᴺᡁᴸᴸ index_comment: ᴺᡁᴸᴸ visible: YES expression: CounterID, EventDate, UserID, EventTime, WatchID

Developer: Robert Schulze.

New System Tables

system.user_processes

— current memory usage and resource counters for every user.

Demo

Developer: Kovi Benjamin Antal.

New System Tables

SELECT * FROM system.zookeeper_connection Row 1: ────── name: default host: 127.0.0.1 port: 9181 index: 0 connected_time: 1970-01-01 01:25:00 is_expired: 0 keeper_api_version: 3 client_id: 122

Developer: Mateng915.

Boring Security Features

bcrypt for password hashing;

— allow configuring the default password hashing;

CREATE USER test IDENTIFIED BY 'my_password'; config.d/password_type.yaml: default_password_type: sha256_password

Developer: Nikolai Degterinskiy.

— BACKUP and RESTORE works with encryption-at-rest
  without decryption and re-ecryption.

Developer: Vitaly Baranov.

Boring Features

space function;

— for compatibility with Spark;

Demo

Developer: Robert Schulze.

Azure Table Function

Table engine:

CREATE TABLE test ... ENGINE = AzureBlobStorage( 'https://alesapintestazure.blob.core.windows.net', 'container', 'path.tsv'); SELECT * FROM test; INSERT INTO test ...;

Table function:

SELECT * FROM azure_blob_storage('...', 'container', 'path.tsv'); INSERT INTO FUNCTION azure_blob_storage('...', 'container', 'path.tsv');

Developer: Alexander Sapin, Smita Kulkarni.

urlCluster Table Function

SELECT ... FROM s3( 'https://bucket.s3.amazonaws.com/data*.parquet.zst')

SELECT ... FROM s3Cluster( cluster, 'https://bucket.s3.amazonaws.com/data*.parquet.zst')

Since version 23.5:

SELECT ... FROM url( 'https://bucket.s3.amazonaws.com/data{01..99}.parquet.zst')

SELECT ... FROM urlCluster( cluster, 'https://bucket.s3.amazonaws.com/data{01..99}.parquet.zst')

Developer: Attack204, Pavel Kruglov.

ClickHouse Keeper Client

clickhouse keeper-client

— a tool inside ClickHouse to introspect your Keeper or ZooKeeper.

Demo

Developer: Artem Brustovetskii.

Experimental Features Becoming Production Ready

Query Results Cache

— securely separated between users by default;
— normalized queries to their AST;
— respects query settings, and non-deterministic functions;
— controllable in size for each element, or in total;
— can be enabled for active and passive usage;
— production ready since 23.5!

Geographical Data Types

— Point, Ring, Polygon, and MultiPolygon data types;
— distance, area, perimeter, union, intersection, convex hull, etc. functions;
— cartesian and spherical coordinates;
— read from WKT, output into SVG;
— production ready since 23.5!

Features Enabled By Default

Compressed marks and indices on disk

— available since 22.9;
— improves the speed of first queries after server startup;
— enabled by default in 23.5.

Memory Usage In Background Operations

New server settings to control memory usage:

merges_mutations_memory_usage_soft_limit and
merges_mutations_memory_usage_to_ram_ratio = 0.5

Merges and mutations will not start if more than half
of server memory is already consumed by background operations.

Developer: Dmitri Novik.

New table settings to lower memory usage:

merge_max_block_size_bytes and vertical_merge_algorithm_min_bytes_to_activate
— save memory when merging tables with large rows;

Developer: Nikita Mikhaylov.

Integrations

Integrations

The Official Tableau Connector:

https://github.com/ClickHouse/clickhouse-tableau-connector-jdbc

Originally developed by Analytika Plus.

Updates for Grafana, Metabase, Kafka Connect, Node.js, and Go drivers.

Cloud Updates

GCP

— ClickHouse Cloud is now available in Google Cloud!

API

— now you can control your services programmatically!
— easy to use and complete REST API for services, organizations, etc.
  OpenAPI; Swagger endpoint available;
— upcoming: Terraform provider;

IAM

— secure access to your S3 buckets from ClickHouse Cloud.

Reading Corner

https://clickhouse.com/blog/

— Vector Search In ClickHouse
— CDC From Postgres To ClickHouse
— ClickHouse For Observability with SigNoz
— Optimization of Hash Tables
— JOINs in ClickHouse: a deep dive

ClickHouse Meetups in
New York, Berlin, Barcelona, London,
San-Francisco, Bengaluru.

ClickHouse for AI - Vectors, Embedding, Semantic Search, and more...

Video Recordings: https://www.youtube.com/c/ClickHouseDB

ClickHouse Cloud

Now available in

— free 30-day trial with $300 credits up to 10 TB of data;

— affordable clusters for developers with full HA < $100/month

Try it! https://clickhouse.cloud/.

Q&A