Author: Alexey Milovidov, 2018-10-11.
Support for ON.
— AND chains of equality comparisons.
Support for qualified names and aliases
for both left and right tables.
Available since version 18.6.
Before:
SELECT C_REGION, sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorder
ANY INNER JOIN
(
SELECT C_REGION, C_CUSTKEY AS LO_CUSTKEY
FROM customer
) USING (LO_CUSTKEY)
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1)
AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25)
GROUP BY C_REGION;
After:
SELECT C_REGION, sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorder ANY INNER JOIN customer
ON LO_CUSTKEY=C_CUSTKEY
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1)
AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25)
GROUP BY C_REGION;
SET join_default_strictness = 'any';
SELECT C_REGION, sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorder INNER JOIN customer
ON LO_CUSTKEY=C_CUSTKEY
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1)
AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25)
GROUP BY C_REGION;
Correct behaviour of *.
You can leave old behaviour with the 'asterisk_left_columns_only' setting.
Available since 18.12.
SELECT 1 - 0.9 - 0.1 AS x
┌───────────────────────x─┐
│ -2.7755575615628914e-17 │
└─────────────────────────┘
Decimal numbers with fixed point.
Example:
— DECIMAL(16, 6);
— Decimal128(6);
Late beta. Available in 18.12,
under 'allow_experimental_decimal_type' setting.
Number of digits — up to 38.
Stored as integer number Int32, Int64 or Int128 containing all the digits.
For most of the operations, performance is about the same as for integers.
There are overflow checks
— can be turned off with 'decimal_check_overflow' setting.
What we already have had before:
Information about single queries:
— system.processes, SHOW PROCESSLIST;
— system.query_log;
— server text log.
Global performance metrics:
— system.events;
— system.metrics;
— system.asynchronous_metrics;
— also sent to Graphite;
We've added info about real CPU and IO usage:
RealTime, UserTime, SystemTime,
SoftPageFaults, HardPageFaults,
VoluntaryContextSwitches,
InvoluntaryContextSwitches
IOWait, CPUWait,
CPUVirtualTime,
ReadBytes, WriteBytes,
ReadChars, WriteChars
We've added info about real CPU and IO usage:
Half of them — from getrusage,
another half — from Linux kernel, obtained through AF_NETLINK socket.
Requires Linux and CAP_NET_ADMIN capability
— set at the package install time.
All performance counters became hierarchical:
— global counters
— in system.events table;
— counters per each query
— in system.processes and system.query_log;
— counters per each query execution thread
— in system.query_thread_log table;
All performance counters became hierarchical:
— in system.processes and system.query_log
we've added ProfileEvents nested data structure
with all the counters;
— added Settings nested structure
with all changed settings;
— added thread_numbers column
with numbers of all query executing threads;
— added system.query_thread_log table;
— settings: log_profile_events, log_query_settings, log_query_threads;
Information about queries in server text log:
— every line in log related to query contains query_id.
Receive query execution log directly in clickhouse-client:
SET send_logs_level = 'trace';
It works independent of server global log level.
For distributed queries it will accumulate logs from all participating servers.
example.yandex.ru :) SET send_logs_level = 'debug'
example.yandex.ru :) SELECT SearchPhrase, count() c FROM test.hits GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10
[example.yandex.ru] 2018.08.16 02:06:17.302711 {e70872c6-30e5-4774-a965-ab9fcabedf02} [ 45 ]
<Debug> executeQuery: (from [::1]:56440) SELECT SearchPhrase, count() c FROM test.hits GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10
[example.yandex.ru] 2018.08.16 02:06:17.303662 {e70872c6-30e5-4774-a965-ab9fcabedf02} [ 45 ]
<Debug> test.hits (SelectExecutor): Key condition: unknown
[example.yandex.ru] 2018.08.16 02:06:17.303697 {e70872c6-30e5-4774-a965-ab9fcabedf02} [ 45 ]
<Debug> test.hits (SelectExecutor): MinMax index condition: unknown
[example.yandex.ru] 2018.08.16 02:06:17.303716 {e70872c6-30e5-4774-a965-ab9fcabedf02} [ 45 ]
<Debug> test.hits (SelectExecutor): Selected 1 parts by date, 1 parts by key, 1084 marks to read from 1 ranges
[example.yandex.ru] 2018.08.16 02:06:17.304632 {e70872c6-30e5-4774-a965-ab9fcabedf02} [ 45 ]
<Debug> executeQuery: Query pipeline:
Limit
Expression
MergeSorting
PartialSorting
Expression
ParallelAggregating
Expression × 16
MergeTreeThread
┌─SearchPhrase────────────┬───────c─┐
│ яндекс │ 1655 │
│ весна 2014 мода │ 1549 │
│ анджелина джоли │ 1245 │
...
[example.yandex.ru] 2018.08.16 02:06:17.395362 {e70872c6-30e5-4774-a965-ab9fcabedf02} [ 45 ] <Debug>
MemoryTracker: Peak memory usage (total): 86.74 MiB.
10 rows in set. Elapsed: 0.093 sec. Processed 8.87 million rows, 108.44 MB (94.92 million rows/s., 1.16 GB/s.)
Available in 18.12.
What we want to do:
— resource pools (workload management);
— throttling resource usage for proper share;
What we want to do:
— write ClickHouse text logs into system table in ClickHouse;
— write global system metrics (like CPU, iowait, LA)
into system table in ClickHouse;
— calculated histogram/quantile metrics for certain processes
(example: 90% query execution time);
— add per-column and per-file statistics in query_log;
— add counters for all error codes (system.errors table);
— send all metrics to the client;
— url;
— file;
— cluster;
— mysql;
— odbc.
Encryption (https) for replication traffic
Authentication between replicas
Encryption for traffic of Distributed tables
Safe usage of ODBC drivers for external sources
SELECT substringUTF8(Title, 1, 50) AS k, count() AS c FROM hits_1000m_transformed WHERE match(k, '^[\x01-\x7F]+$') GROUP BY k ORDER BY c DESC LIMIT 100 ┌─k──────────────────────────────────────────────────┬───────c─┐ │ Kunstkamera Club-Pickup Bai Porta.ru │ 211122 │ │ Kanye Kedax^ │ 187249 │ │ OWAProfiles. Colletina Portal Digested Gabbas - IR │ 186538 │ │ Verver Board │ 87895 │ │ OWAProfit-Parts, Bless Hotels Centre (Lassilev.24l │ 74841 │ │ Skins Resort & France.Ru / Compilation │ 60256 │ │ Demio - Livejour ownitsubishi - World of Tanks │ 56643 │ │ Ford Megane Shot comparen Payne Money Bannette for │ 54032 │ │ OWAProfit-Partner: Kalus30,eos3D.com :: Nokia! │ 53593 │ │ open shoe (Calence 2 64Gb Black (Sol │ 36314 │ │ Osunshine Player. │ 33316 │ │ OWAProfit-Parts Body Big Natural Dead: A Hauptmand │ 28153 │ │ PAOLO CONTRACK Atlant, Buckle | WWWW.PHOTODOM.ria. │ 24317 │ │ Opel (Count of Tanks - World of Tanks official for │ 23525 │ │ Kanuni (Palacticarefunds official forum - Page | W │ 23340 │ │ Ahmed the sidence (46E) │ 22812 │ │ Lexus - DimKrollsky feasantech tanks official foru │ 18929 │ │ Nokia Acrosoft │ 17973 │ │ Iveco x Hundai ; Toyota RAV-4 Black Of Thalassia.r │ 17908 │ │ OWAProfitween Espace Wolf' Pacification Port and T │ 16393 │
SELECT count() FROM hits
WHERE
((EventDate >= '2018-08-01')
AND (EventDate <= '2018-08-03')
AND (CounterID >= 34))
OR ((EventDate >= '2018-08-04')
AND (EventDate <= '2018-08-05')
AND (CounterID <= 101500))
— 649 533 033 rows/sec.
SET compile_expressions = 1;
SELECT count() FROM hits
WHERE
((EventDate >= '2018-08-01')
AND (EventDate <= '2018-08-03')
AND (CounterID >= 34))
OR ((EventDate >= '2018-08-04')
AND (EventDate <= '2018-08-05')
AND (CounterID <= 101500))
— 865 491 052 rows/sec.
— +33% speed up!
Web site: https://clickhouse.com/
Google groups: https://groups.google.com/forum/#!forum/clickhouse
Maillist: [email protected]
Telegram chat: https://telegram.me/clickhouse_en
GitHub: https://github.com/ClickHouse/ClickHouse/
Twitter: https://twitter.com/ClickHouseDB
Up next: ClickHouse Meetup in Beijing on October 28.