ClickHouse: Release 25.2 Call

ClickHouse Release 25.2

Release 25.2 Call

1. (45 min) What's new in ClickHouse 25.2.

2. (10 min) Guest Talk.

3. (5 min) Q&A.

Release 25.2

ClickHouse February Release.

— 12 new features 🧸

— 15 performance optimizations 🍫

— 72 bug fixes 💝

Small Features

Subcolumns In DEFAULT expressions

25.1:

CREATE TABLE test
(
    data JSON,
    time DateTime DEFAULT getSubcolumn(data, 'created_at')
)
ORDER BY data.actor_login::String

25.2:

CREATE TABLE test
(
    data JSON,
    time DateTime DEFAULT data.created_at
)
ORDER BY data.actor_login::String

Developer: Pavel Kruglov.

Copy-On-Write Storage Policies

Lets you instantly attach a snapshot while keeping the table writable.

$ cat config.d/cow_policy.yaml
storage_configuration:
  disks:
    web:
      type: web
      endpoint: 'https://raw.githubusercontent.com/ClickHouse/web-tables-demo/main/web/'
  policies:
    cow_policy_multi_disk:
      volumes:
        disks:
          disk:
            - web      # the read-only disk with a snapshot
            - default  # a writable disk

Copy-On-Write Storage Policies

Lets you instantly attach a snapshot while keeping the table writable.

The table will be based on an external dataset,
but modifications are performed locally.
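The copy-on-write behavior can be sketched as a toy overlay store (illustrative Python only, not ClickHouse's implementation): reads fall back to the read-only snapshot unless a local copy exists, and writes only ever touch the local, writable layer.

```python
class CopyOnWriteStore:
    """Toy copy-on-write overlay: a read-only base plus a writable layer.

    Reads prefer the writable layer and fall back to the base; writes never
    touch the base. This mirrors a policy where the first disk holds a
    read-only snapshot and the second disk absorbs all modifications.
    """

    def __init__(self, base):
        self.base = base   # read-only snapshot (e.g. the 'web' disk)
        self.local = {}    # writable layer (e.g. the 'default' disk)

    def read(self, key):
        if key in self.local:
            return self.local[key]
        return self.base[key]

    def write(self, key, value):
        self.local[key] = value  # the snapshot is never modified


snapshot = {"part_1": "immutable data"}
store = CopyOnWriteStore(snapshot)
store.write("part_2", "new data")       # lands in the local layer
store.write("part_1", "modified data")  # shadows the snapshot's copy
print(store.read("part_1"))   # modified data
print(snapshot["part_1"])     # immutable data  (snapshot untouched)
```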

Demo

See also "table branching": CREATE TABLE t2 CLONE AS t1;

Developer: Azat Khuzhin.

Attaching External Tables Without Databases

In 25.1 and before, external tables had to be located
inside a directory structure, as if they were inside a database:

$ ls -lR ./web/store/cf7/cf712b4f-2ca8-435c-ac23-c4393efe52f7
./web/store/cf7/cf712b4f-2ca8-435c-ac23-c4393efe52f7:
drwxrwxr-x 2 milovidov milovidov 4096 feb 27 08:25 all_1_27_2
drwxrwxr-x 2 milovidov milovidov 4096 feb 27 08:25 detached
-rw-rw-r-- 1 milovidov milovidov    1 feb 27 08:25 format_version.txt

CREATE TABLE uk_price_paid UUID 'cf712b4f-2ca8-435c-ac23-c4393efe52f7' ( ... )
ORDER BY (postcode1, postcode2, addr1, addr2)
SETTINGS disk = disk(
    type = web,
    endpoint = 'https://raw.githubusercontent.com/ClickHouse/web-tables-demo/main/web/');

Attaching External Tables Without Databases

In 25.2, you can attach a table by pointing a disk directly at its data directory.

$ ls -lR ./uk_price_paid
./uk_price_paid:
drwxrwxr-x 2 milovidov milovidov 4096 feb 27 08:29 all_1_27_2
drwxrwxr-x 2 milovidov milovidov 4096 feb 27 08:29 detached
-rw-rw-r-- 1 milovidov milovidov    1 feb 27 08:29 format_version.txt

CREATE TABLE uk_price_paid ( ... )
ORDER BY (postcode1, postcode2, addr1, addr2)
SETTINGS table_disk = 1, disk = disk(
    type = web,
    endpoint = 'https://raw.githubusercontent.com/ClickHouse/web-tables-demo/table_disk/uk_price_paid/');

Developer: Azat Khuzhin.

JSONCompactEachRowWithProgress

Needed for streaming events over HTTP

Since 25.1 we have:
— JSONEachRowWithProgress
— JSONStringsEachRowWithProgress

Since 25.2, also:
— JSONCompactEachRowWithProgress
— JSONCompactStringsEachRowWithProgress

They stream newline-delimited JSON events, where each event is one of:
progress, meta, row, totals, extremes, exception, as soon as they appear.

Developer: Alexey Milovidov.

JSONCompactEachRowWithProgress

$ curl http://localhost:8123/ -d "SELECT number % 3 AS k, count() FROM numbers(1e9)
    GROUP BY k WITH TOTALS FORMAT JSONCompactEachRowWithProgress"
{"progress":{"read_rows":"74631669","read_bytes":"597053352","total_rows_to_read":"1000000000","elapsed_ns":"100040117"}}
{"progress":{"read_rows":"151160199","read_bytes":"1209281592","total_rows_to_read":"1000000000","elapsed_ns":"200065327"}}
{"progress":{"read_rows":"227754138","read_bytes":"1822033104","total_rows_to_read":"1000000000","elapsed_ns":"300115954"}}
{"progress":{"read_rows":"304348077","read_bytes":"2434784616","total_rows_to_read":"1000000000","elapsed_ns":"400193152"}}
{"progress":{"read_rows":"380680380","read_bytes":"3045443040","total_rows_to_read":"1000000000","elapsed_ns":"500227569"}}
...
{"progress":{"read_rows":"762472713","read_bytes":"6099781704","total_rows_to_read":"1000000000","elapsed_ns":"1000340080"}}
{"progress":{"read_rows":"838870425","read_bytes":"6710963400","total_rows_to_read":"1000000000","elapsed_ns":"1100394164"}}
{"progress":{"read_rows":"914548638","read_bytes":"7316389104","total_rows_to_read":"1000000000","elapsed_ns":"1200466502"}}
{"progress":{"read_rows":"990880941","read_bytes":"7927047528","total_rows_to_read":"1000000000","elapsed_ns":"1300489758"}}
{"progress":{"read_rows":"1000000000","read_bytes":"8000000000","total_rows_to_read":"1000000000","elapsed_ns":"1312531422"}}
{"meta":[{"name":"k","type":"UInt8"},{"name":"count()","type":"UInt64"}]}
{"row":[0, "333333334"]}
{"row":[1, "333333333"]}
{"row":[2, "333333333"]}
{"totals":{"k":0,"count()":"1000000000"}}
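Consuming such a stream from a client boils down to reading one JSON object per line and dispatching on its single key. A minimal Python sketch (the sample events are abbreviated from the output above; the handler names are hypothetical):

```python
import json

def handle_events(lines):
    """Dispatch newline-delimited JSON events from a *WithProgress stream.

    Each line is an object with exactly one key: progress, meta, row,
    totals, extremes, or exception."""
    rows, meta = [], None
    for line in lines:
        (kind, payload), = json.loads(line).items()
        if kind == "progress":
            pass  # e.g. update a progress bar from payload["read_rows"]
        elif kind == "meta":
            meta = payload  # column names and types
        elif kind == "row":
            rows.append(payload)
        elif kind == "totals":
            print("totals:", payload)
        elif kind == "exception":
            raise RuntimeError(payload)
    return meta, rows

stream = [
    '{"progress":{"read_rows":"74631669","read_bytes":"597053352"}}',
    '{"meta":[{"name":"k","type":"UInt8"},{"name":"count()","type":"UInt64"}]}',
    '{"row":[0, "333333334"]}',
    '{"row":[1, "333333333"]}',
    '{"totals":{"k":0,"count()":"1000000000"}}',
]
meta, rows = handle_events(stream)
print(len(rows))  # 2
```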

Developer: Alexey Milovidov.

Prepared Statements in the PgSQL Protocol

Did you know? — ClickHouse supports the PostgreSQL protocol!

$ cat config.xml
<!-- Compatibility with PostgreSQL protocol. -->
<postgresql_port>9005</postgresql_port>

$ clickhouse-client
:) CREATE USER pg IDENTIFIED WITH plaintext_password BY 'qwerty'
:) GRANT SELECT ON default.hits TO pg

$ psql --host 127.0.0.1 --port 9005 --user pg default

ClickHouse will pretend to be PostgreSQL for you!

Prepared Statements in the PgSQL Protocol

$ psql --host 127.0.0.1 --port 9005 --user pg default
Password for user pg:
psql (14.15 (Ubuntu 14.15-0ubuntu0.22.04.1), server 25.3.1.1)
WARNING: psql major version 14, server major version 25.
         Some psql features might not work.
Type "help" for help.

default=> PREPARE select_test AS
default->     SELECT count(), uniq(UserID) FROM hits WHERE CounterID = $1;
SELECT 0
default=> EXECUTE select_test (62)
default-> ;
 count() | uniq(UserID)
---------+--------------
  738172 |        62292
(1 row)

Developer: Scanhex12.

Performance Improvements

Writing Bloom Filters For Parquet

When exporting to Parquet, ClickHouse writes bloom filters for columns.

When running queries on Parquet, ClickHouse uses these bloom filters.

-- Bloom filters are created by default:
:) SELECT * FROM hits INTO OUTFILE 'hits.parquet';

-- You can disable them to save a little space:
:) SET output_format_parquet_write_bloom_filter = 0;
:) SELECT * FROM hits INTO OUTFILE 'hits_old.parquet';
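Why bloom filters help: a reader can test whether a value *might* appear in a row group and skip the group entirely on a definite "no". A toy Python bloom filter illustrating the idea (illustrative only; Parquet's actual format uses split-block filters with xxHash, not this scheme):

```python
import hashlib

class BloomFilter:
    """Minimal bloom filter: k hash probes into an m-bit array.

    Membership tests can yield false positives but never false negatives,
    which is exactly what a reader needs to *safely* skip row groups."""

    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _probes(self, value):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, value):
        for p in self._probes(value):
            self.bits[p] = 1

    def might_contain(self, value):
        return all(self.bits[p] for p in self._probes(value))

# One filter per row group: skip the group when the predicate's value
# is definitely absent.
row_group = BloomFilter()
for url in ["https://clickhouse.com", "https://example.com"]:
    row_group.add(url)

print(row_group.might_contain("https://clickhouse.com"))  # True
# A value that was never added almost certainly returns False
# (false positives are possible, but rare at this fill level).
print(row_group.might_contain("https://nowhere.test"))
```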

Demo.

Developer: Michael Kolupaev.

Transitive Conditions Inference

CREATE TABLE calls
(
    start_time DateTime,
    end_time DateTime,
    message String
)
ORDER BY end_time;

-- uses the index:
SELECT uniq(message) FROM calls WHERE end_time >= '2025-08-01';

-- does not use the index:
SELECT uniq(message) FROM calls WHERE start_time >= '2025-08-01';

-- uses the index since 25.2:
SELECT uniq(message) FROM calls
WHERE start_time >= '2025-08-01' AND end_time > start_time;
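The optimization amounts to simple interval reasoning: from `start_time >= T` and `end_time > start_time`, the planner can derive `end_time > T`, which is a condition on the table's sort key. A hypothetical Python sketch of that derivation (a toy pass, not ClickHouse's implementation):

```python
def infer_transitive(conditions):
    """Toy transitive-inference pass over conditions of two shapes:
    ('>=', column, constant) and ('>', column_a, column_b).
    Whenever b >= T and a > b hold, derive a > T."""
    lower = {col: c for op, col, c in conditions if op == ">="}
    derived = []
    for op, a, b in conditions:
        if op == ">" and b in lower:  # a > b and b >= T  =>  a > T
            derived.append((">", a, lower[b]))
    return derived

conds = [(">=", "start_time", "2025-08-01"),
         (">", "end_time", "start_time")]
print(infer_transitive(conds))  # [('>', 'end_time', '2025-08-01')]
```

The derived condition is on `end_time`, the sort key, so the primary index can prune granules even though the user only constrained `start_time`.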

Developer: ShiChao Jin.

Parallel Hash JOIN Is Faster

Developer: Nikita Taranov.

Something
Interesting

Backup Database Engine

Lets you access a backup as a read-only database.

The backup is attached instantly, without copying the data.

:) BACKUP DATABASE default TO File('backup1');

:) CREATE DATABASE backup ENGINE = Backup('default', File('backup1'));

Works for full and incremental backups, on local and external storages.

A nice way to package and distribute databases!

Developer: Maksim Kita.

Partition Pruning For Iceberg

SET use_iceberg_partition_pruning = 1;

Demo.

Developer: Dan Ivanik.

Integration With The Delta Rust Kernel

A new implementation of Delta Lake support in ClickHouse,
built on the delta-kernel-rs library from Databricks.

Version 25.2 has an early-stage implementation:

CREATE TABLE t ENGINE = DeltaLake(
    s3,
    filename = 'xyz/',
    url = 'http://minio1:9001/test/')
SETTINGS allow_experimental_delta_kernel_rs = 1;

It brings support for schema evolution, deletions, etc.

Developer: Kseniia Sumarokova.

Guest Talk

Oussama Saoudi, Software Engineer at Databricks

Bonus

Database Navigation In the Web UI

Demo: https://play.clickhouse.com/

Kubenetmon — Network Metering for Kubernetes

https://github.com/ClickHouse/kubenetmon

— a service we developed to monitor data transfer in ClickHouse Cloud

— uses ClickHouse for data storage (obviously)

— now released in open source!

Developer: Ilia Andreiev.

Integrations

Postgres connector in ClickPipes is in public beta.

ClickPipes supports virtual columns.

ODBC driver improves support for PowerBI.

Java and Go improve support for the JSON data type.

dbt view materialization updates are now atomic.

Updates for Metabase and Grafana.

Meetups

— 🇨🇳 Shanghai, March 1
— 🇫🇷 Paris, March 4
— 🇺🇸 Seattle, March 5
— 🇺🇸 San Francisco, March 19
— 🇮🇳 Delhi, March 22
— 🇨🇭 Zurich, March 24
— 🇭🇺 Budapest, March 25
— 🇺🇸 Boston, March 25
— 🇮🇱 Tel Aviv, March 26
— 🇺🇸 New York, March 26
— 🇳🇴 Oslo, April 8
— 🇬🇧 London, May 14

I will attend some of these meetups in person :)

Reading Corner 🧶

https://clickhouse.com/blog/

— ClickHouse Connector For Tableau
— Billion Docs JSON Challenge
— Input Format Matchup
— ClickHouse at FOSDEM 2025: JSON, Fuzzing, DNS
— ClickHouse BYOC
— Climbing the Iceberg with ClickHouse
— Postgres CDC connector for ClickPipes
— Agent-Facing Analytics
— Open-Sourcing Kubenetmon

Q&A