1. (50 min) What's new in ClickHouse 23.4.
2. (10 min) Q&A.
ClickHouse Spring Release.
β 15 new features π³
β 11 performance optimizations πΌ
β 36 bug fixes π¦
SELECT UserID,
SearchPhrase,
RegionID,
FROM test.hits
Was: Expected one of: token, Dot, OR, AND, BETWEEN, NOT BETWEEN, LIKE, ILIKE, NOT LIKE, NOT ILIKE, IN, NOT IN, GLOBAL IN, GLOBAL NOT IN, MOD, DIV, IS NULL, IS NOT NULL, alias, AS, Comma, FROM, PREWHERE, WHERE, GROUP BY, WITH, HAVING, WINDOW, ORDER BY, LIMIT, OFFSET, SETTINGS, UNION, EXCEPT, INTERSECT, INTO OUTFILE, FORMAT, end of query
Now: Just works!
Developer: Nikolai Degterinskiy.
clickhouse-client --max_threads 1 --query "SELECT 1"
clickhouse-client --max-threads 1 --query "SELECT 1"
Was: Code: 552. DB::Exception: Unrecognized option '--max-threads'.
Now: Just works!
Bonus:
clickhouse-client βmax-threads 1 --query "SELECT 1"
β It also works π
Developer: Alexey Golub.
We have many quantile functions:
β quantile, quantileDeterministic, quantileExact, quantileExactWeighted, quantileTiming, quantileTDigest, quantileBFloat16...
So, we decided to confuse you with the Greenwald-Khanna algorithm...
Demo
Developer: TaiYang Li.
SELECT kolmogorovSmirnovTest(ResolutionWidth, ResolutionHeight)
FROM hits
ββkolmogorovSmirnovTest(ResolutionWidth, ResolutionHeight)ββ
β (0.9962541421161303,0) β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Developer: FFFFFFFHHHHHHH.
The algorithm developed in 1918. It's obsolete by all means.
SELECT soundex('Hello, world!') AS x
ββxβββββ
β H464 β
ββββββββ
SELECT soundex('Hallo, wereld!') AS x
ββxβββββ
β H464 β
ββββββββ
... but we need it for compatibility with other SQL DBMS.
Developer: FriendLey.
ββfieldββββββββββββββββββ¬βtypeββββββ¬βnullββ¬βkeyββββββ¬βdefaultββ¬βextraββ
β AdvEngineID β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β Age β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β BrowserCountry β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β BrowserLanguage β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β CLID β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ClientEventTime β DateTime β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ClientIP β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ClientTimeZone β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β CodeVersion β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ConnectTiming β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β CookieEnable β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β CounterClass β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β CounterID β Int32 β 0 β PRI SOR β α΄Ία΅α΄Έα΄Έ β β
β DNSTiming β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β DontCountHits β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β EventDate β Date β 0 β PRI SOR β α΄Ία΅α΄Έα΄Έ β β
β EventTime β DateTime β 0 β PRI SOR β α΄Ία΅α΄Έα΄Έ β β
β FUniqID β Int64 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β FetchTiming β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β FlashMajor β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β FlashMinor β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β FlashMinor2 β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β FromTag β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β GoodEvent β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β HID β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β HTTPError β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β HasGCLID β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β HistoryLength β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β HitColor β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IPNetworkID β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β Income β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β Interests β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IsArtifical β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IsDownload β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IsEvent β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IsLink β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IsMobile β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IsNotBounce β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IsOldCounter β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IsParameter β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β IsRefresh β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β JavaEnable β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β JavascriptEnable β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β LocalEventTime β DateTime β 0 β β α΄Ία΅α΄Έα΄Έ β β
β MobilePhone β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β MobilePhoneModel β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β NetMajor β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β NetMinor β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β OS β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β OpenerName β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β OpenstatAdID β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β OpenstatCampaignID β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β OpenstatServiceName β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β OpenstatSourceID β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β OriginalURL β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β PageCharset β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ParamCurrency β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ParamCurrencyID β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ParamOrderID β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ParamPrice β Int64 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β Params β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β Referer β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β RefererCategoryID β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β RefererHash β Int64 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β RefererRegionID β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β RegionID β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β RemoteIP β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ResolutionDepth β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ResolutionHeight β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ResolutionWidth β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ResponseEndTiming β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β ResponseStartTiming β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β Robotness β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SearchEngineID β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SearchPhrase β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SendTiming β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β Sex β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SilverlightVersion1 β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SilverlightVersion2 β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SilverlightVersion3 β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SilverlightVersion4 β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SocialAction β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SocialNetwork β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SocialSourceNetworkID β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β SocialSourcePage β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β Title β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β TraficSourceID β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β URL β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β URLCategoryID β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β URLHash β Int64 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β URLRegionID β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β UTMCampaign β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β UTMContent β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β UTMMedium β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β UTMSource β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β UTMTerm β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β UserAgent β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β UserAgentMajor β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β UserAgentMinor β String β 0 β β α΄Ία΅α΄Έα΄Έ β β
β UserID β Int64 β 0 β PRI SOR β α΄Ία΅α΄Έα΄Έ β β
β WatchID β Int64 β 0 β PRI SOR β α΄Ία΅α΄Έα΄Έ β β
β WindowClientHeight β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β WindowClientWidth β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β WindowName β Int32 β 0 β β α΄Ία΅α΄Έα΄Έ β β
β WithHash β Int16 β 0 β β α΄Ία΅α΄Έα΄Έ β β
βββββββββββββββββββββββββ΄βββββββββββ΄βββββββ΄ββββββββββ΄ββββββββββ΄ββββββββ
A worse version of SELECT * FROM system.columns.
This is compatible with MySQL, but ugly.
Developer: Robert Schulze.
SELECT * FROM file('hits.parquet', ParquetMetadata)
FORMAT PrettyJSONEachRow
Represents the metadata about row groups and column statistics as a table.
Demo
Developer: Pavel Kruglov.
SELECT extractKeyValuePairs(
'hello="world", abc="def"',
'=',
', ',
'"') AS x
ββxββββββββββββββββββββββββββββββ
β {'hello':'world','abc':'def'} β Map(String, String)
βββββββββββββββββββββββββββββββββ
Developer: Arthur Passos.
SELECT replaceAll('Hello, world',
number % 2 ? 'Hello' : 'world',
number % 3 ? 'Goodbye' : 'Greetings') AS x
FROM numbers(2)
ββxβββββββββββββββββ
β Hello, Greetings β
β Goodbye, world β
ββββββββββββββββββββ
— replaceOne, replaceAll, replaceRegexpOne, replaceRegexpAll...
— both needle and replacement arguments can be non-constant.
Developer: Robert Schulze.
From S3 and URLs.
100 times faster.
Demo.
Developer: Michael Kolupaev.
Before: connect_timeout_with_failover_ms = 50.
Allowed quick failover on unavailable replicas.
But required tuning on cross-continent connections.
Now: connect_timeout_with_failover_ms = 1000,
but hedged_connection_timeout_ms = 50.
It starts to connect to other replicas after 50 ms,
but doesn't drop the connection to the first replica,
and continue connecting to multiple replcias in parallel.
Developers: Pavel Kruglov.
Support for Iceberg v2.
Support for partitioned and non-partitioned Iceberg, Hudi and DeltaLake.
Taking Delta format's checkpoints for reading optimization.
Tracking the changes of the data.
Developers: Kseniia Sumarokova.
Looker:
— works with ClickHouse through MySQL protocol.
— next: Quicksight?
Kafka Connector
— Added support for Nullable types
— Added retry mechanism for connection errors
— Published on Confluent Hub
Updates for Metabase, Golang, Python, and Java drivers.
— A Deep Dive into Apache Parquet with ClickHouse
— A Deep Dive into JOIN algorithms in ClickHouse
— A Story Of A Serial Product Manager
https://github.com/ClickHouse/NoiSQL
— Generating music with SQL queries in ClickHouse
— free 30-day trial with $300 credits up to 10 TB of data;
— affordable clusters for developers with full HA < $100/month
Try it! https://clickhouse.cloud/.