ClickHouse Features to Blow Your Mind

Author: Alexey Milovidov, 2019-10-02.


Per-Column Compression Codecs

col type CODEC(codecs...)

Available codecs:
— LZ4 (default);
— ZSTD; — level can be specified: ZSTD(1);
— LZ4HC; — level can be specified;
— NONE;
— Delta(N); — N is the size of the data type in bytes.

Codecs can be chained together:

time DateTime CODEC(Delta, LZ4)

— Delta is not a compression codec by itself; it must be chained with a second codec.

Per-column codecs have priority over <compression> config settings.

Per-Column Compression Codecs

SELECT
    name, type,
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    data_uncompressed_bytes / data_compressed_bytes AS ratio,
    compression_codec
FROM system.columns
WHERE (database = 'test') AND (table = 'hits')
ORDER BY data_compressed_bytes DESC
LIMIT 10

┌─name────────────┬─type─────┬─compressed─┬─uncompressed─┬──────────────ratio─┬─compression_codec─┐
│ Referer         │ String   │ 180.19 MiB │ 582.99 MiB   │ 3.2353881463220975 │                   │
│ URL             │ String   │ 128.93 MiB │ 660.58 MiB   │  5.123600238954646 │                   │
│ Title           │ String   │ 95.29 MiB  │ 595.01 MiB   │  6.244488505685867 │                   │
│ WatchID         │ UInt64   │ 67.28 MiB  │ 67.70 MiB    │ 1.0062751884416956 │                   │
│ URLHash         │ UInt64   │ 37.09 MiB  │ 67.70 MiB    │ 1.8254645825020759 │                   │
│ ClientEventTime │ DateTime │ 31.42 MiB  │ 33.85 MiB    │ 1.0772947535816229 │                   │
│ EventTime       │ DateTime │ 31.40 MiB  │ 33.85 MiB    │ 1.0780959105750834 │                   │
│ UTCEventTime    │ DateTime │ 31.39 MiB  │ 33.85 MiB    │ 1.0783175064258996 │                   │
│ HID             │ UInt32   │ 28.28 MiB  │ 33.85 MiB    │   1.19709852035762 │                   │
│ RefererHash     │ UInt64   │ 27.68 MiB  │ 67.70 MiB    │  2.445798559204409 │                   │
└─────────────────┴──────────┴────────────┴──────────────┴────────────────────┴───────────────────┘

Per-Column Compression Codecs

ALTER TABLE test.hits MODIFY COLUMN ClientEventTime CODEC(Delta, LZ4)

Changes are applied lazily: only for new data and while merging.

ALTER TABLE test.hits UPDATE ClientEventTime = ClientEventTime WHERE 1

— a trick to rewrite column data on disk.

— also executed in the background; see the system.mutations table.

Per-Column Compression Codecs

SELECT
    name, type,
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    data_uncompressed_bytes / data_compressed_bytes AS ratio,
    compression_codec
FROM system.columns
WHERE (database = 'test') AND (table = 'hits') AND (name = 'ClientEventTime')
ORDER BY data_compressed_bytes DESC
LIMIT 10

┌─name────────────┬─type─────┬─compressed─┬─uncompressed─┬──────────────ratio─┬─compression_codec────┐
│ ClientEventTime │ DateTime │ 19.47 MiB  │ 33.85 MiB    │ 1.7389218149308554 │ CODEC(Delta(4), LZ4) │
└─────────────────┴──────────┴────────────┴──────────────┴────────────────────┴──────────────────────┘

ALTER TABLE test.hits
    MODIFY COLUMN ClientEventTime CODEC(Delta(4), ZSTD),
    UPDATE ClientEventTime = ClientEventTime WHERE 1

┌─name────────────┬─type─────┬─compressed─┬─uncompressed─┬─────────────ratio─┬─compression_codec────────┐
│ ClientEventTime │ DateTime │ 14.00 MiB  │ 33.85 MiB    │ 2.417489322394391 │ CODEC(Delta(4), ZSTD(1)) │
└─────────────────┴──────────┴────────────┴──────────────┴───────────────────┴──────────────────────────┘

LowCardinality Data Type

Just replace String with LowCardinality(String)
for string fields with a low number of unique values.

... and it will magically work faster.

For high-cardinality fields it will still work, but it is pointless.

Examples:

city name — ok;
domain of URL — ok;
search phrase — bad;
URL — bad;
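As a sketch (the table and column names here are hypothetical), the type is used directly in the column definition:

```sql
-- Domain has few unique values, so LowCardinality stores a dictionary
-- of distinct strings plus small per-row keys.
CREATE TABLE hits_example
(
    EventDate Date,
    Domain LowCardinality(String),  -- low cardinality: ok
    URL String                      -- high cardinality: keep plain String
)
ENGINE = MergeTree ORDER BY EventDate
```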

LowCardinality Data Type

SELECT count() FROM hits_333 WHERE URLDomain LIKE '%aena.es%'

┌─count()─┐
│     101 │
└─────────┘

1 rows in set. Elapsed: 0.446 sec. Processed 333.36 million rows, 7.32 GB
(747.87 million rows/s., 16.43 GB/s.)

LowCardinality Data Type

ALTER TABLE hits_333 MODIFY COLUMN URLDomain LowCardinality(String)

Ok. 0 rows in set. Elapsed: 16.228 sec.

LowCardinality Data Type

SELECT count() FROM hits_333 WHERE URLDomain LIKE '%aena.es%'

┌─count()─┐
│     101 │
└─────────┘

1 rows in set. Elapsed: 0.244 sec. Processed 333.36 million rows, 1.72 GB
(1.37 billion rows/s., 7.04 GB/s.)


Two times faster!

TTL expressions

— for columns:

CREATE TABLE t
(
    date Date,
    ClientIP UInt32 TTL date + INTERVAL 3 MONTH
)

— for all table data:

CREATE TABLE t (date Date, ...)
ENGINE = MergeTree ORDER BY ...
TTL date + INTERVAL 3 MONTH

Tiered Storage

Example: store hot data on SSD and archive data on HDDs.

Multiple storage policies can be configured and used on a per-table basis.

Tiered Storage

Step 1: configure available disks (storage paths):

<disks>
    <fast_disk> <!-- disk name -->
        <path>/mnt/fast_ssd/clickhouse</path>
    </fast_disk>
    <disk1>
        <path>/mnt/hdd1/clickhouse</path>
        <keep_free_space_bytes>10485760</keep_free_space_bytes>
    </disk1>
    ...

Tiered Storage

Step 2: configure storage policies:

<policies>
    <ssd_and_hdd>
        <volumes>
            <hot>
                <disk>fast_ssd</disk>
                <max_data_part_size_bytes>1073741824</max_data_part_size_bytes>
            </hot>
            <cold>
                <disk>disk1</disk>
            </cold>
        </volumes>
        <move_factor>0.2</move_factor>
    </ssd_and_hdd>
    ...

Tiered Storage

Step 3: use the configured policy for your table:

CREATE TABLE table
(
    ...
)
ENGINE = MergeTree
ORDER BY ...
SETTINGS storage_policy = 'ssd_and_hdd'

Tiered Storage

The data will be moved between volumes automatically.

You can also do it manually:

ALTER TABLE table MOVE PART|PARTITION ... TO VOLUME|DISK ...

— available in 19.15.

ASOF JOIN

(by Citadel Securities)

Join data by inexact (nearest) match.
Usually by date/time.

Example:
— to correlate stock prices with weather sensors.
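A minimal sketch of the syntax (table and column names are hypothetical): for each left-table row, ASOF JOIN picks the right-table row with the nearest preceding value of the last column in USING.

```sql
-- For every trade, attach the latest quote at or before trade time.
-- 'symbol' is matched exactly; 'time' is the inexact (nearest) match.
SELECT symbol, time, price, bid, ask
FROM trades
ASOF JOIN quotes USING (symbol, time)
```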

Data Skipping Indices

Collect a summary of column/expression values for every N granules.

Use these summaries to skip data while reading.

Indices are available for MergeTree family of table engines.

SET allow_experimental_data_skipping_indices = 1;

Data Skipping Indices

CREATE TABLE table (... INDEX name expr TYPE type(params...) GRANULARITY n ...)

ALTER TABLE ... ADD INDEX name expr TYPE type(params...) GRANULARITY n

ALTER TABLE ... DROP INDEX name

Secondary Index Types

minmax
— the summary is just the min/max boundaries of the values;
— use when values are correlated with the table order,
  or are locally distributed, or sparse;

set(k)
— the summary is a set of all distinct values, but no larger than k;
— use when values are sparse or have low cardinality;
— a reasonable value of k is about a hundred;

Used for comparison and IN operators.

Secondary Index Types

Full text search indices (highly experimental)

ngrambf_v1(chars, size, hashes, seed)

tokenbf_v1(size, hashes, seed)

Used for equals comparison, IN and LIKE.
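As a hedged sketch, a token bloom filter index could be declared like this; the parameters are the bloom filter size in bytes, the number of hash functions, and a seed (the table and column names are hypothetical):

```sql
-- Index tokens (words) of Message in a bloom filter per 4 granules.
ALTER TABLE logs
    ADD INDEX message_tokens Message
    TYPE tokenbf_v1(8192, 3, 0) GRANULARITY 4
```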

Data Skipping Indices

SELECT count() FROM test.hits WHERE URLDomain LIKE '%aena.es%'

┌─count()─┐
│       1 │
└─────────┘

Processed 8.87 million rows

Data Skipping Indices

SET allow_experimental_data_skipping_indices = 1;

ALTER TABLE test.hits
    ADD INDEX domain_index URLDomain TYPE set(1000) GRANULARITY 1;

OPTIMIZE TABLE test.hits FINAL;

Data Skipping Indices

SELECT count() FROM test.hits WHERE URLDomain LIKE '%aena.es%'

┌─count()─┐
│       1 │
└─────────┘

Processed 65.54 thousand rows

Advanced Text Processing

Multiple substring search

Multiple regexp search

Fuzzy string comparison and search

Fuzzy regexp match

SELECT count() FROM hits_100m
WHERE multiSearchAny(URL,
    ['chelyabinsk.74.ru', 'doctor.74.ru', 'transport.74.ru', 'm.74.ru',
     'chel.74.ru', 'afisha.74.ru', 'diplom.74.ru', '//chel.ru',
     'chelyabinsk.ru', 'cheldoctor.ru'])

Advanced Text Processing

— multiSearchAny
— multiSearchFirstPosition
— multiSearchFirstIndex
— multiSearchAllPositions
+ -UTF8, -CaseInsensitive, -CaseInsensitiveUTF8

— multiMatchAny
— multiMatchAnyIndex
— multiFuzzyMatchAny
— multiFuzzyMatchAnyIndex

— ngramDistance
— ngramSearch
+ -UTF8, -CaseInsensitive, -CaseInsensitiveUTF8

 

Advanced Text Processing

SELECT DISTINCT
    SearchPhrase,
    ngramDistance(SearchPhrase, 'clickhouse') AS dist
FROM hits_100m_single
ORDER BY dist ASC
LIMIT 10

┌─SearchPhrase────┬───────dist─┐
│ tickhouse       │ 0.23076923 │
│ clockhouse      │ 0.42857143 │
│ house           │  0.5555556 │
│ clickhomecyprus │ 0.57894737 │
│ 1click          │        0.6 │
│ uhouse          │        0.6 │
│ teakhouse.ru    │      0.625 │
│ teakhouse.com   │ 0.64705884 │
│ madhouse        │  0.6666667 │
│ funhouse        │  0.6666667 │
└─────────────────┴────────────┘

10 rows in set. Elapsed: 1.267 sec. Processed 100.00 million rows, 1.52 GB
(78.92 million rows/s., 1.20 GB/s.)

MySQL Protocol Support

— enable <mysql_port> in clickhouse-server/config.xml;

— connect with your favorite mysql client;

— TLS and sha256 authentication are supported;

— available from version 19.9;

MySQL Protocol Support

$ mysql -u default --port 9336 --host 127.0.0.1
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 0
Server version: 19.9.1.1-ClickHouse

MySQL Protocol Support

mysql> SELECT URL AS k, count() FROM default.hits_333
    -> GROUP BY k ORDER BY count() DESC LIMIT 10;
+---------------------------------------------+---------+
| k                                           | count() |
+---------------------------------------------+---------+
| http://smeshariki.ru/GameMain.aspx#location | 3116222 |
| http://pogoda.yandex.ru/moscow/             | 2944772 |
| http://maps.yandex.ru/                      | 1740193 |
| http://newsru.com/                          | 1381836 |
| http://radiorecord.ru/xml/sms_frame.html    | 1351173 |
| goal://www.gtavicecity.ru/advert_site       | 1190643 |
| http://auto.ria.ua/                         | 1069057 |
| http://video.yandex.ru/                     | 1002206 |
| http://loveplanet.ru/a-folders/#page/1      |  989686 |
| http://pogoda.yandex.ru/saint-petersburg/   |  971312 |
+---------------------------------------------+---------+
10 rows in set (11.86 sec)
Read 333454281 rows, 28.95 GiB in 11.860 sec., 28116209 rows/sec., 2.44 GiB/sec.

HDFS import/export

(contributed by TouTiao/ByteDance)

SELECT * FROM hdfs(
    'hdfs://hdfs1:9000/file', 'TSV', 'id UInt64, text String');

INSERT INTO TABLE FUNCTION hdfs(
    'hdfs://hdfs1:9000/file', 'TSV', 'id UInt64, text String')
VALUES ...

CREATE TABLE (...) ENGINE = HDFS('hdfs://hdfs1:9000/file', 'TSV');


Drawback: not all authentication methods are supported.

Table Functions

— url;

— file;

— cluster;

— mysql;

— odbc;

— hdfs;

— input (since 19.15).

Table Function 'input'

For data transformation at INSERT time:

INSERT INTO table SELECT *, domain(URL) FROM input(TSV, 'URL String, ...')

You pipe your data to this query as with a usual INSERT.

Data is transformed on the fly by SELECT expression.

Examples:
— calculate data for columns;
— skip unneeded columns;
— do filtering, aggregations, joins.

— available in 19.15 testing.

New Formats

Protobuf

— efficient implementation, no excessive copies/allocations
(ClickHouse style);

— transparent type conversions between Proto's and ClickHouse types (UInt8, Int64, DateTime <-> sint64, uint64, sint32, uint32, String <-> bytes, string, etc.);

— support for Nested types via repeated Messages or parallel repeated fields;

format_schema setting must be specified.
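For example (a hedged sketch; the schema file and message names are hypothetical), format_schema points at a .proto file and the message type inside it:

```sql
-- 'schema.proto' must be available in the server's format schemas
-- directory; 'Event' is a message type defined inside it.
INSERT INTO events
SETTINGS format_schema = 'schema.proto:Event'
FORMAT Protobuf
```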

New Formats

Parquet

— a columnar format, implemented naturally without unpacking columns;

— transparent type conversions also supported.

ORC

— since 19.14 (input only).

Template Format

Allows you to define a template for data formatting/parsing.

A template contains substitutions and delimiters.

Each substitution specifies data escaping rule:
Quoted, Escaped, CSV, JSON, XML, Raw.

Website ${domain:Quoted} has ${count:Raw} pageviews.

You can specify a template for rows, a delimiter between rows,
and a template to wrap the resultset.

Example: to parse web access logs.
Example: to parse deeply nested JSON.
Example: generate HTML right in ClickHouse.
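Putting it together (a sketch; the file name and the exact setting values are assumptions): the row template above lives in a file referenced by the format_template_row setting:

```sql
-- The file 'row.format' contains:
--   Website ${domain:Quoted} has ${count:Raw} pageviews.
SELECT domain(URL) AS domain, count() AS count
FROM hits
GROUP BY domain
FORMAT Template
SETTINGS format_template_row = 'row.format',
         format_template_rows_between_delimiter = '\n'
```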

Data Gaps Filling

SELECT EventDate, count() FROM table
GROUP BY EventDate ORDER BY EventDate

┌──EventDate─┬─count()─┐
│ 2019-09-01 │       5 │
│ 2019-09-02 │       3 │
│ 2019-09-04 │       4 │
│ 2019-09-05 │       1 │
└────────────┴─────────┘

Data Gaps Filling

SELECT EventDate, count() FROM table
GROUP BY EventDate ORDER BY EventDate WITH FILL

┌──EventDate─┬─count()─┐
│ 2019-09-01 │       5 │
│ 2019-09-02 │       3 │
│ 2019-09-03 │       0 │
│ 2019-09-04 │       4 │
│ 2019-09-05 │       1 │
└────────────┴─────────┘

Data Gaps Filling

WITH FILL — a modifier for ORDER BY element;

WITH FILL FROM start

WITH FILL FROM start TO end

WITH FILL FROM start TO end STEP step

WITH FILL can be applied for any elements in ORDER BY:

ORDER BY EventDate WITH FILL, EventTime WITH FILL STEP 3600
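The FROM/TO/STEP form can be tried as a self-contained sketch with the numbers table function:

```sql
-- Source rows are 0 and 1; WITH FILL inserts the missing steps
-- 3, 6, 9, with default values in the other columns.
SELECT number AS n, 'real' AS src
FROM numbers(2)
ORDER BY n WITH FILL FROM 0 TO 10 STEP 3
```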

— available in version 19.14.

Developers — Anton Popov, Yandex; Dmitri Utkin, HSE Moscow

JSON Functions

— the world's fastest implementation;

simdjson by Daniel Lemire when AVX2 is available,
rapidjson otherwise;

— supports extraction of nested fields;

SELECT JSONExtractString(
    '{"hello": {"world": [123, "ClickHouse"]}}',
    'hello', 'world', 2) AS s

┌─s──────────┐
│ ClickHouse │
└────────────┘

JSON Functions

— JSONHas;
— JSONExtractUInt/Int/Float/Bool/String;
— JSONExtract, JSONExtractRaw;
— JSONType, JSONLength;
— JSONExtractKeysAndValues;

Server Logs for Introspection

in system tables:

— system.query_log;

— system.query_thread_log;

— system.part_log;

— system.trace_log;

— system.text_log;

— system.metric_log;

Server Logs for Introspection

system.text_log

Now we write ClickHouse logs into ClickHouse!

DESCRIBE TABLE system.text_log

┌─name──────────┬─type───────────────────┐
│ event_date    │ Date                   │
│ event_time    │ DateTime               │
│ microseconds  │ UInt32                 │
│ thread_name   │ LowCardinality(String) │
│ thread_number │ UInt32                 │
│ os_thread_id  │ UInt32                 │
│ level         │ Enum8('Fatal' = 1, '...│
│ query_id      │ String                 │
│ logger_name   │ LowCardinality(String) │
│ message       │ String                 │
│ revision      │ UInt32                 │
│ source_file   │ LowCardinality(String) │
│ source_line   │ UInt64                 │
└───────────────┴────────────────────────┘

Server Logs for Introspection

system.metric_log

— for those who forgot to set up monitoring.

— records all ClickHouse metrics every second (by default).

SELECT
    toStartOfMinute(event_time) AS h,
    sum(ProfileEvent_UserTimeMicroseconds) AS user_time,
    bar(user_time, 0, 60000000, 80) AS bar
FROM system.metric_log
WHERE event_date = today()
GROUP BY h ORDER BY h

Server Logs for Introspection

SELECT
    toStartOfMinute(event_time) AS h,
    sum(ProfileEvent_UserTimeMicroseconds) AS user_time,
    bar(user_time, 0, 60000000, 80) AS bar
FROM system.metric_log
WHERE event_date = today()
GROUP BY h ORDER BY h

┌───────────────────h─┬─user_time─┬─bar───────────────────────────────────────────────┐
│ 2019-09-05 04:12:00 │         0 │                                                   │
│ 2019-09-05 04:13:00 │         0 │                                                   │
│ 2019-09-05 04:14:00 │    524000 │ ▋                                                 │
│ 2019-09-05 04:15:00 │  15880000 │ █████████████████████▏                            │
│ 2019-09-05 04:19:00 │  36724000 │ ████████████████████████████████████████████████▊ │
│ 2019-09-05 04:20:00 │  17508000 │ ███████████████████████▎                          │
│ 2019-09-05 04:21:00 │         0 │                                                   │
│ 2019-09-05 04:22:00 │         0 │                                                   │
│ 2019-09-05 04:23:00 │         0 │                                                   │
│ 2019-09-05 04:24:00 │         0 │                                                   │
│ 2019-09-05 04:25:00 │         0 │                                                   │
│ 2019-09-05 04:26:00 │         0 │                                                   │
│ 2019-09-05 04:27:00 │         0 │                                                   │
│ 2019-09-05 04:28:00 │         0 │                                                   │
│ 2019-09-05 04:29:00 │     80000 │                                                   │
│ 2019-09-05 04:30:00 │         0 │                                                   │
│ 2019-09-05 04:31:00 │         0 │                                                   │

Sampling Profiler on a Query Level

Record code locations where the query was executing
in every execution thread, at each moment of time with some period.

If a query is slow — where exactly in the code is it slow?

— where did a specific query spend its time?

— where was time spent across queries of some kind?

— where was time spent for queries of some user?

— where was time spent for queries, cluster-wide?

Developer — Nikita Lapkov, HSE Moscow; et al.

Sampling Profiler on a Query Level

1. Turn on one of the following settings (or both):

SET query_profiler_cpu_time_period_ns = 1000000;
SET query_profiler_real_time_period_ns = 1000000;

2. Run your queries.
Recorded samples will be saved into system.trace_log table.

event_date:    2019-09-05
event_time:    2019-09-05 05:47:44
revision:      54425
timer_type:    CPU
thread_number: 149
query_id:      b1d8e7f9-48d8-4cb3-a768-0a6683f6f061
trace:         [140171472847748,61781958,110943821,117594728,117595220,115654933,
120321783,63251928,111161800,120329436,120331356,120308294,120313436,120319113,
120143313,115666412,120146905,111013972,118237176,111013972,117990912,111013972,
110986070,110986938,61896391,61897898,61887509,156206624,140171472807643]

Sampling Profiler on a Query Level

trace — an array of addresses in machine code (stack trace);

Translate address to function name:
— demangle(addressToSymbol(trace[1]))
Translate address to source file name and line number:
— addressToLine(trace[1])

* don't forget to install the clickhouse-common-static-dbg package

Example: top functions:

SELECT
    count(),
    demangle(addressToSymbol(trace[1] AS addr)) AS symbol
FROM system.trace_log
WHERE event_date = today()
GROUP BY symbol
ORDER BY count() DESC
LIMIT 10

Sampling Profiler on a Query Level

Example: top functions:

┌─count()─┬─symbol──────────────────────────────────────────────────────────────┐
│     517 │ void LZ4::(anonymous namespace)::decompressImpl<32ul, false>(char const*, char*, unsigned long) │
│     480 │ void DB::deserializeBinarySSE2<4>(DB::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::ReadBuffer&, unsigned long) │
│     457 │ DB::VolnitskyBase<true, true, DB::StringSearcher<true, true> >::search(unsigned char const*, unsigned long) const │
│     270 │ read │
│     163 │ void LZ4::(anonymous namespace)::decompressImpl<16ul, true>(char const*, char*, unsigned long) │
│     130 │ void LZ4::(anonymous namespace)::decompressImpl<16ul, false>(char const*, char*, unsigned long) │
│      58 │ CityHash_v1_0_2::CityHash128WithSeed(char const*, unsigned long, std::pair<unsigned long, unsigned long>) │
│      44 │ void DB::deserializeBinarySSE2<2>(DB::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::ReadBuffer&, unsigned long) │
│      37 │ void LZ4::(anonymous namespace)::decompressImpl<8ul, true>(char const*, char*, unsigned long) │
│      32 │ memcpy │
└─────────┴─────────────────────────────────────────────────────────────────────┘

Sampling Profiler on a Query Level

Example: top of contexts (stacks) for a query:

SELECT
    count(),
    arrayStringConcat(arrayMap(x -> concat(
        demangle(addressToSymbol(x)),
        '\n    ', addressToLine(x)), trace), '\n') AS sym
FROM system.trace_log
WHERE query_id = '1a1272b5-695a-4b17-966d-a1701b61b3eb'
    AND event_date = today()
GROUP BY trace
ORDER BY count() DESC
LIMIT 10

count(): 154
sym: DB::VolnitskyBase<true, true, DB::StringSearcher<true, true> >::search(unsigned char const*, unsigned long) const
    /opt/milovidov/ClickHouse/build_gcc9/dbms/programs/clickhouse
DB::MatchImpl<true, false>::vector_constant(DB::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, DB::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul>&)
    /opt/milovidov/ClickHouse/build_gcc9/dbms/programs/clickhouse
DB::FunctionsStringSearch<DB::MatchImpl<true, false>, DB::NameLike>::executeImpl(DB::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long)
    /opt/milovidov/ClickHouse/build_gcc9/dbms/programs/clickhouse
DB::PreparedFunctionImpl::execute(DB::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long, bool)
    /home/milovidov/ClickHouse/build_gcc9/../dbms/src/Functions/IFunction.cpp:464
DB::ExpressionAction::execute(DB::Block&, bool) const
    /usr/local/include/c++/9.1.0/bits/stl_vector.h:677
DB::ExpressionActions::execute(DB::Block&, bool) const
    /home/milovidov/ClickHouse/build_gcc9/../dbms/src/Interpreters/ExpressionActions.cpp:759
DB::FilterBlockInputStream::readImpl()
    /home/milovidov/ClickHouse/build_gcc9/../dbms/src/DataStreams/FilterBlockInputStream.cpp:84
DB::IBlockInputStream::read()
    /usr/local/include/c++/9.1.0/bits/stl_vector.h:108
DB::ExpressionBlockInputStream::readImpl()
    /home/milovidov/ClickHouse/build_gcc9/../dbms/src/DataStreams/ExpressionBlockInputStream.cpp:34
DB::IBlockInputStream::read()
    /usr/local/include/c++/9.1.0/bits/stl_vector.h:108
DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::thread(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long)
    /usr/local/include/c++/9.1.0/bits/atomic_base.h:419
ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::*)(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long), DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>*, std::shared_ptr<DB::ThreadGroupStatus>, unsigned long&>(void (DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::*&&)(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long), DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>*&&, std::shared_ptr<DB::ThreadGroupStatus>&&, unsigned long&)::{lambda()#1}::operator()() const
    /usr/local/include/c++/9.1.0/bits/shared_ptr_base.h:729
ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>)
    /usr/local/include/c++/9.1.0/bits/atomic_base.h:551
execute_native_thread_routine
    /home/milovidov/ClickHouse/ci/workspace/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:81
start_thread
    /lib/x86_64-linux-gnu/libpthread-2.27.so
clone
    /build/glibc-OTsEL5/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:97

Row-Level Security

<users>
    <user_name>
        ...
        <databases>
            <database_name>
                <table_name>
                    <filter>IsYandex = 1</filter>
                </table_name>
            </database_name>
        </databases>
    </user_name>
</users>

Upcoming

Autumn 2019

— Indexing by Z-order curve;

— DDL queries for dictionaries;

— S3 import/export;

— Parallel parsing of data formats;

— Speedup of INSERT with VALUES with expressions;

— Aggregate functions for data clustering;

— Optimization of GROUP BY with the table's order key.

October 2019

— Initial implementation of RBAC;

— Initial implementation of Merge JOIN;

Autumn-Winter 2019

— More than initial implementation of RBAC;

— More than initial implementation of Merge JOIN;

— Workload management;


Web site: https://clickhouse.com/

Mailing list: [email protected]

YouTube: https://www.youtube.com/c/ClickHouseDB

Telegram chat: https://telegram.me/clickhouse_ru, clickhouse_en

GitHub: https://github.com/ClickHouse/ClickHouse/

Twitter: https://twitter.com/ClickHouseDB

Google groups: https://groups.google.com/forum/#!forum/clickhouse