ClickHouse: Release 25.11 Call

ClickHouse Release 25.11

ClickHouse release 25.11

1. (55 min) What's new in ClickHouse 25.11.

2. (5 min) Q&A.

Release 25.11

ClickHouse Thanksgiving Release.

โ€” 24 new features ๐Ÿฆƒ

โ€” 27 performance optimizations โ˜•

โ€” 97 bug fixes ๐Ÿ„

Small And Nice Features

system.unicode

A new system table for introspection and searching for code points:

:) SELECT block, general_category, groupConcat(code_point) FROM system.unicode WHERE name LIKE '%LEAF%' GROUP BY ALL

Demo ๐Ÿ€๐Ÿ๐Ÿ‚๐Ÿƒ

Developer: wxybear.

flipCoordinates

A function for Point, LineString, Ring, Polygon, MultiPolygon...

Similar to PostGIS's ST_FlipCoordinates.

:) SELECT arrayMap(x -> arrayMap(y -> reverse(y), x), polygon); -- 25.10 :) SELECT flipCoordinates(polygon); -- 25.11

For easy integration of different GIS software:

— most databases use the (lon, lat) order;
— most web maps use the (lat, lon) order.

Developer: Sachin Kumar Singh.

arrayRemove

A function for compatibility with Postgres:

-- 25.10, ClickHouse's higher-order function: :) SELECT arrayFilter(x -> x != 1, [1, 2, 1]); -- 25.11, the new function: :) SELECT arrayRemove([1, 2, 1], 1); -- 25.11, compatibility alias for Postgres: :) SELECT array_remove([1, 2, 1], 1);

Developer: Tiwary Saurav.

cume_dist

A window function, compatible with Spark, Databricks, Snowflake, Postgres.

Calculates the value of a CDF (cumulative distribution function)
— the ratio of the number of rows <= current row in the specified order.

Similar to percent_rank with a slightly different formula.

Demo

Developer: Manuel.

midpoint

A non-aggregate function, similar to least/greatest (min2/max2),
which returns the value of the same type as its arguments,
closest to the average of the arguments.

:) SELECT least(1, 9), greatest(1, 9), midpoint(1, 9) โ”Œโ”€least(1, 9)โ”€โ”ฌโ”€greatest(1, 9)โ”€โ”ฌโ”€midpoint(1, 9)โ”€โ” โ”‚ 1 โ”‚ 9 โ”‚ 5 โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ :) SELECT midpoint('2025-11-25'::Date, '2026-01-01'::Date) โ”Œโ”€midpoint(CASโ‹ฏ', 'Date'))โ”€โ” โ”‚ 2025-12-13 โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Developer: Simon Michal.

argAndMin, argAndMax

New aggregate functions, similar to argMin, argMax.

-- get the maximum price: :) SELECT max(price) FROM uk_price_paid WHERE toYear(date) = 2025; -- get the town corresponding to the maximum price: :) SELECT argMax(town, price) FROM uk_price_paid WHERE toYear(date) = 2025; -- get the (town, price) corresponding to the maximum price (25.10): :) SELECT argMax((town, price), price) FROM uk_price_paid WHERE toYear(date) = 2025; -- get the (town, price) corresponding to the maximum price (25.11): :) SELECT argAndMax(town, price) FROM uk_price_paid WHERE toYear(date) = 2025;

Developer: AbdAlRahman Gad.

Virtual Column _tags

For queries from S3-compatible object storages.

When you read files from S3, in addition to the data, you can request the following virtual columns: _path, _file, _size, _time, _etag, _tags.

Tags are a set of key-value pairs a user can attach to an object in S3.

They can be changed without rewriting the object.

ClickHouse represents tags as Map(String, String).

Developer: Zicong Qu.

Performance Improvements

Projections As Secondary Indices

You can already use projections to have different sort orders of a table:

CREATE TABLE pageviews ( CounterID UInt32, UserID UInt64, EventTime DateTime, ... PROJECTION by_time ( SELECT * ORDER BY EventTime -- projection's order ) ) ORDER BY (CounterID, UserID, EventTime) -- table's main order

And the projection with accelerate range queries or queries with ORDER BY.
However, it requires duplicating the storage for the columns (or a subset).

Can we make it work without duplication?

Projections As Secondary Indices

Since 25.11 you can create a projection with a single column, _part_offset:

CREATE TABLE pageviews ( CounterID UInt32, UserID UInt64, EventTime DateTime, ... PROJECTION by_time ( -- works as an index for the main table representation SELECT _part_offset ORDER BY EventTime ) ) ORDER BY (CounterID, UserID, EventTime)

This projection will accelerate queries using less extra storage
with the cost of worse locality of reads: useful for point queries. Demo.

Developer: Amos Bird.

Speed Up DISTINCT With Projections

A projection with GROUP BY:

CREATE TABLE pageviews ( CounterID UInt32, UserID UInt64, EventTime DateTime, MobilePhoneModel String, ... PROJECTION phones ( SELECT count() GROUP BY CounterID, MobilePhoneModel ) ) ORDER BY (CounterID, UserID, EventTime)

will be used for queries with DISTINCT by any subset of the fields:

SELECT DISTINCT MobilePhoneModel FROM pageviews

Developer: Nihal Z. Miaji.

Parallel Merge For Small GROUP BY

The case when GROUP BY has a small number of distinct keys
as they are small integers (8/16 bit), but the aggregation states are large:

SELECT number % 100 AS k, -- small integer as a key uniq(number) AS u -- large aggregation state FROM numbers_mt(1e8) GROUP BY k ORDER BY u DESC LIMIT 10

Demo

Developer: Jianfei Hu.

Lower Memory Usage For Table Metadata

By storing identical column descriptions for different data parts only once.

Example: a "CI Logs" cluster with:
2173 tables, each with 300 columns on average,
27,517 data parts,
21 trillion rows,
430 TB of compressed data,
1 shard (ClickHouse Cloud).

The optimisation save 12 GB out of 73 GB of memory!

Especially useful for tables with a large number of columns.

Developer: Azat.

Distributed Processing For Large Files

ClickHouse runs distributed queries:
— with Distributed tables;
— with Replicated tables, using "parallel replicas";
— on top of a bunch of external files, e.g., Parquet files on S3,
  using the s3Cluster table function.

But what if there is only a single (or a few) large files?

Since 25.11, ClickHouse splits the work using ranges
(row groups) inside single files!

Many servers of the cluster will "eat" different portions of a file.

Developer: Konstantin Vedernikov.

Something Interesting

Fractional LIMIT and OFFSET

Take a fraction of the resultset:

:) SELECT county, avg(price) AS price FROM uk_price_paid GROUP BY county ORDER BY price DESC LIMIT 0.25; -- take first 1/4 of the result :) SELECT county, avg(price) AS price FROM uk_price_paid GROUP BY county ORDER BY price DESC LIMIT 100 OFFSET 0.5; -- take 100 records from the middle

Use case: sampling a portion of the result without using window functions.

Developers: Ahmed Gouda.

EXECUTE AS user

Allow one user to run queries on behalf of another user:

GRANT IMPERSONATE ON user1 TO user2; GRANT IMPERSONATE ON * TO user3;

Run queries on behalf of another user:

EXECUTE AS target_user SELECT * FROM table; EXECUTE AS target_user; -- sets it for the session SELECT * FROM table;

Use-case: an app authenticates as one user, and does the work under other configured users for access rights, limits, settings, quotas, and audit.

Developer: Shankar.

Geometry Data Type

We already had all the data types for GIS:

Point, LineString, MultiLineString, Ring, Polygon, MultiPolygon

Since 25.11, we also have one unified data type: Geometry

— it can contain any geometry!

New functions, readWkt, readWkb,
which read any type of geometry as a Geometry value.

ClickHouse also supports H3 and S2 indexes, Geohashes, optimized spherical and geo distances,
Polygonal dictionaries for reverse geocoding, SVG rendering...

Developer: Konstantin Vedernikov.

ACME (Let's Encrypt) Integration ๐Ÿงช

Automatically provision TLS certificates on a distributed cluster:

$ cat /etc/clickhouse-server/config.d/ssl.yaml http_port: 80 acme: email: [email protected] terms_of_service_agreed: true # zookeeper_path: '/clickhouse/acme' # by default domains: - domain: play.clickhouse.com

Supports the HTTP-01 challenge.

The certificates are shared across the cluster with ClickHouse Keeper.

Developer: Konstantin Bogdanov, Sergey Lokhmatikov.

Bayesian Classifier For Texts ๐Ÿงช

Classifies texts based on the frequences of n-grams of bytes, code points,
or words (tokens), according to configured models with reference frequences.

nb_models: model: name: language path: '/etc/clickhouse-server/config.d/language.bin' n: 2 # n-gram size mode: codepoint # byte, codepoint, token alpha: 1.0 # optional, smoothing factor priors: # optional - prior: class: 0 value: 0.6 - prior: class: 1 value: 0.4

Developer: Nihal Z. Miaji.

Bayesian Classifier For Texts ๐Ÿงช

A simple and fast model that can detect charset, language, topic, etc.,
easily created (trained) on your corpus.

Remains practical in the LLM era for routine tasks on massive volumes of data,
but can't compete on precision.

:) SELECT naiveBayesClassifier('language', text); -- takes the model name and return the class number, according to the model.

Demo

Developer: Nihal Z. Miaji.

Data Lakes

Fabric OneLake

OneLake - Microsoft Fabric's unified data lake,
powered by the OneLake Tables APIs and Apache Iceberg.

Compatible with the Iceberg REST Catalog.

CREATE DATABASE onelake ENGINE = DataLakeCatalog('https://onelake.table.fabric.microsoft.com/iceberg') SETTINGS catalog_type = 'onelake', warehouse = 'warehouse_id/data_item_id', onelake_tenant_id = '<tenant_id>', oauth_server_uri = 'https://login.microsoftonline.com/<tenant_id>/oauth2/v2.0/token', auth_scope = 'https://storage.azure.com/.default', onelake_client_id = '<client_id>', onelake_client_secret = '<client_secret>'

Developers: Konstantin Vedernikov.

Integrations

Integrations

ClickStack now supports alerts.

Python driver supports SQLAlchemy 2.x, with Pandas 1 and 2,
r/w support for Arrow, and Polars integration.

Rust driver: type safety for reads and writes, asynchronous inserts.

ODBC driver: proper handling of network errors.

JavaScript driver: handling exceptions in the middle of result stream.

Updates for dbt, Fivetran, Flink, Spark, Kafka connectors.

ClickPipes support Azure Blob Storage as GA.

Bonus

chDB Update

chDB - a ClickHouse engine embedded in Python
and other languages

Version 3.7.1 is 1.9x faster on ClickBench.

Faster operations with ArrowTable.

stockhouse.clickhouse.com

velocity.clickhouse.com

Meetups


— ๐Ÿ‡บ๐Ÿ‡ธ House Party, The SQL, Dec 2
— ๐Ÿ‡ต๐Ÿ‡ฑ Warsaw, Nov 26
— ๐Ÿ‡บ๐Ÿ‡ธ New York, Dec 8
— ๐Ÿ‡บ๐Ÿ‡ธ San Francisco, Dec 8
— ๐Ÿ‡ฎ๐Ÿ‡ฉ Jakarta, Dec 9
— ๐Ÿ‡ฏ๐Ÿ‡ต Tokyo, Dec 15
— ๐Ÿ‡ฎ๐Ÿ‡ฑ Tel Aviv, Dec 29

Reading Corner ๐Ÿ“–

https://clickhouse.com/blog/

— chDB Kernel Upgrade Journey
— ClickHouse integrates with OneLake
— Alerting arrives in ClickStack
— ClickHouse welcomes LibreChat
— Global weather data from flying airplanes
— Merkle Science fights crypto crime at scale
— How we made our internal data warehouse AI-first
— Building StockHouse: Real-time market analytics
— ClickHouse Partners with Japan Cloud
— Team's productivity metrics

Q&A