Aivazovsky
www.tooistanbul.com
CREATE TABLE hits
(
URL String,
Domain String,
CONSTRAINT c_valid_url CHECK isValidUTF8(URL),
CONSTRAINT c_domain CHECK Domain = domain(URL)
)
Checked on INSERT.
Developer — Gleb Novikov, HSE Moscow
INSERT INTO hits VALUES ('https://yandex.ru/', 'google.com')
Exception: Constraint `c_domain` for table `hits`
is violated at row 1.
Expression: (Domain = domain(URL)).
Column values:
URL = 'https://yandex.ru/',
Domain = 'google.com'.
Not checked on background merges.
Not checked on ALTER UPDATE.
ALTER TABLE hits ADD CONSTRAINT
c_valid_url CHECK isValidUTF8(URL)
Existing data is not checked when CONSTRAINT is added.
ALTER TABLE hits DROP CONSTRAINT c_valid_url
SELECT count()
FROM test.hits
WHERE CounterID = {id:UInt32}
AND SearchPhrase = {phrase:String}
— substitutions have names and types;
— substituted at the AST level.
TODO:
— substitutions for sets in the IN operator;
— substitutions for table, column, database names;
— support for custom server URLs with predefined queries.
Developer — Alexander Tretyakov, HSE Moscow
$ curl "http://localhost:8123/" --data-binary \
"SELECT count() FROM test.hits WHERE CounterID = {id:UInt32}"
Exception: Query parameter `id` was not set.
$ curl "http://localhost:8123/?param_id=34" --data-binary \
"SELECT count() FROM test.hits WHERE CounterID = {id:UInt32}"
1962
$ curl "http://localhost:8123/?param_id=34;%20DROP%20DATABASE%20test" ...
"SELECT count() FROM test.hits WHERE CounterID = {id:UInt32}"
Exception: Value 34; DROP DATABASE test
cannot be parsed as UInt32 for query parameter 'id'
$ curl "http://localhost:8123/?param_id=34;%20DROP%20DATABASE%20test" ...
"SELECT count() FROM test.hits WHERE CounterID = {id:String}"
Exception: Illegal types of arguments (UInt32, String) of function equals
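The same parameterized queries also work from clickhouse-client, which accepts --param_<name> arguments (a sketch; exact flag spelling depends on the client version):
$ clickhouse-client --param_id=34 --query \
    "SELECT count() FROM test.hits WHERE CounterID = {id:UInt32}"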
SELECT EventDate, count() FROM table
GROUP BY EventDate ORDER BY EventDate
┌──EventDate─┬─count()─┐
│ 2019-09-01 │       5 │
│ 2019-09-02 │       3 │
│ 2019-09-04 │       4 │
│ 2019-09-05 │       1 │
└────────────┴─────────┘
SELECT EventDate, count() FROM table
GROUP BY EventDate ORDER BY EventDate WITH FILL
┌──EventDate─┬─count()─┐
│ 2019-09-01 │       5 │
│ 2019-09-02 │       3 │
│ 2019-09-03 │       0 │
│ 2019-09-04 │       4 │
│ 2019-09-05 │       1 │
└────────────┴─────────┘
WITH FILL — a modifier for ORDER BY element;
WITH FILL FROM start
WITH FILL FROM start TO end
WITH FILL FROM start TO end STEP step
WITH FILL can be applied for any elements in ORDER BY:
ORDER BY EventDate WITH FILL, EventTime WITH FILL STEP 3600
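For instance, a sketch reusing the query above, generating rows for a fixed date range:
SELECT EventDate, count() FROM table
GROUP BY EventDate
ORDER BY EventDate WITH FILL
    FROM toDate('2019-09-01') TO toDate('2019-09-07') STEP 1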
— available in version 19.14.
Developers — Anton Popov, Yandex; Dmitri Utkin, HSE Moscow
— for columns:
CREATE TABLE t (
date Date,
ClientIP UInt32 TTL date + INTERVAL 3 MONTH
— for all table data:
CREATE TABLE t (date Date, ...)
ENGINE = MergeTree ORDER BY ...
TTL date + INTERVAL 3 MONTH
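Put together, a complete table combining both kinds of TTL might look like this (a sketch; the table name and engine clause are illustrative):
CREATE TABLE visits
(
    date Date,
    ClientIP UInt32 TTL date + INTERVAL 3 MONTH,
    URL String
)
ENGINE = MergeTree
ORDER BY date
TTL date + INTERVAL 6 MONTH
Here the ClientIP column is cleared after 3 months and whole rows are removed after 6 months.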
Example: store hot data on SSD and archive data on HDDs.
Multiple storage policies can be configured and used on a per-table basis.
Step 1: configure available disks (storage paths):
<disks>
    <fast_disk> <!-- disk name -->
        <path>/mnt/fast_ssd/clickhouse</path>
    </fast_disk>
    <disk1>
        <path>/mnt/hdd1/clickhouse</path>
        <keep_free_space_bytes>10485760</keep_free_space_bytes>
    </disk1>
...
Step 2: configure storage policies:
<policies>
    <ssd_and_hdd>
        <volumes>
            <hot>
                <disk>fast_ssd</disk>
                <max_data_part_size_bytes>1073741824</max_data_part_size_bytes>
            </hot>
            <cold>
                <disk>disk1</disk>
            </cold>
        </volumes>
        <move_factor>0.2</move_factor>
    </ssd_and_hdd>
...
Step 3: use the configured policy for your table:
CREATE TABLE table
(
...
)
ENGINE = MergeTree
ORDER BY ...
SETTINGS storage_policy = 'ssd_and_hdd'
The data will be moved between volumes automatically.
You can also do it manually:
ALTER TABLE table
MOVE PART|PARTITION ...
TO VOLUME|DISK ...
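For example (a sketch; the partition and part names are hypothetical):
ALTER TABLE table MOVE PARTITION 201909 TO VOLUME 'cold'
ALTER TABLE table MOVE PART '201909_1_1_0' TO DISK 'fast_disk'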
— available in 19.15.
Protobuf
— efficient implementation, no excessive copies/allocations
(ClickHouse style);
— transparent type conversions between Protobuf and ClickHouse types (UInt8, Int64, DateTime <-> sint64, uint64, sint32, uint32; String <-> bytes, string; etc.);
— support for Nested types via repeated Messages or parallel repeated fields;
— format_schema setting must be specified.
Parquet
— columnar format; naturally implemented without unpacking of columns;
— transparent type conversions also supported.
ORC (contributed by ArenaData)
— since 19.14 (input only).
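For example, Parquet works for both export and import (a sketch; file and table names are illustrative):
$ clickhouse-client --query \
    "SELECT * FROM test.hits LIMIT 1000 FORMAT Parquet" > hits.parquet
$ cat hits.parquet | clickhouse-client --query \
    "INSERT INTO test.hits2 FORMAT Parquet"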
Allows defining a template for data formatting/parsing.
A template contains substitutions and delimiters.
Each substitution specifies a data escaping rule:
Quoted, Escaped, CSV, JSON, XML, Raw.
Website ${domain:Quoted} has ${count:Raw} pageviews.
You can specify a template for rows, a delimiter between rows
and a template to wrap the result set.
Example: parse web access logs.
Example: parse deeply nested JSON.
Example: generate HTML right in ClickHouse.
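A possible invocation with the row template above (a sketch; the template file name is hypothetical, and the settings follow the Template format documentation):
$ echo 'Website ${domain:Quoted} has ${count:Raw} pageviews.' > row.format
$ clickhouse-client --query "
    SELECT domain(URL) AS domain, count() AS count
    FROM test.hits GROUP BY domain ORDER BY count DESC LIMIT 10
    SETTINGS format_template_row = 'row.format'
    FORMAT Template"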
— url;
— file;
— cluster;
— mysql;
— odbc;
— hdfs;
— input (since 19.15).
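For instance, the url table function queries remote data ad hoc (a sketch; the URL and structure are illustrative):
SELECT count()
FROM url('https://example.com/hits.tsv', TSV, 'URL String, Domain String')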
For data transformation at INSERT time:
INSERT INTO table
SELECT *, domain(URL)
FROM input(TSV, 'URL String, ...')
You pipe your data to this query as with a usual INSERT.
Data is transformed on the fly by SELECT expression.
Examples:
— calculate data for columns;
— skip unneeded columns;
— do filtering, aggregations, joins.
— available in 19.15 testing.
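End to end it might look like this (a sketch; note that in released versions input() takes only the structure, and the input format is given by the trailing FORMAT clause):
$ cat data.tsv | clickhouse-client --query "
    INSERT INTO hits
    SELECT URL, domain(URL)
    FROM input('URL String')
    FORMAT TSV"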
with respect to table sorting key
SELECT event_time, message
FROM system.text_log
ORDER BY event_date DESC, event_time DESC
LIMIT 10
┌──────────event_time─┬─message───────────────────────────────────────────┐
│ 2019-09-05 04:20:35 │ Renaming temporary part tmp_insert_201909_10_10_0 │
│ 2019-09-05 04:20:35 │ Flushing system log │
│ 2019-09-05 04:20:34 │ Flushing system log │
│ 2019-09-05 04:20:34 │ Renaming temporary part tmp_insert_201909_10_10_0 │
│ 2019-09-05 04:20:28 │ Renaming temporary part tmp_insert_201909_9_9_0 to│
│ 2019-09-05 04:20:28 │ Flushing system log │
│ 2019-09-05 04:20:27 │ Renaming temporary part tmp_insert_201909_9_9_0 to│
...
└─────────────────────┴───────────────────────────────────────────────────┘
10 rows in set. Elapsed: 0.291 sec. Processed 42.21 million rows, 6.10 GB
(145.13 million rows/s., 20.98 GB/s.)
with respect to table sorting key
:) SET optimize_read_in_order = 1
:) SELECT event_time, message
FROM system.text_log
ORDER BY event_date DESC, event_time DESC
LIMIT 10
┌──────────event_time─┬─message───────────────────────────────────────────┐
│ 2019-09-05 04:28:59 │ Merged 6 parts: from 201909_1_96_19 to 201909_101_│
│ 2019-09-05 04:28:59 │ Renaming temporary part tmp_merge_201909_1_101_20 │
│ 2019-09-05 04:28:59 │ Merge sorted 761 rows, containing 202 columns (202│
│ 2019-09-05 04:28:59 │ Reading 2 marks from part 201909_101_101_0, total │
│ 2019-09-05 04:28:59 │ Reading 2 marks from part 201909_100_100_0, total │
...
└─────────────────────┴───────────────────────────────────────────────────┘
10 rows in set. Elapsed: 0.012 sec. Processed 54.41 thousand rows, 7.42 MB
(4.68 million rows/s., 637.76 MB/s.)
with respect to table sorting key
— ORDER BY table sorting key;
— ASC / DESC / mixed:
ORDER BY event_date DESC, event_time ASC
— or by the prefix:
ORDER BY event_date DESC
— or by expressions with monotonic functions of them;
ORDER BY event_date DESC, -toUInt32(event_time)
— available from version 19.14;
Developers — Anton Popov, Yandex; Anastasiya Rodigina, HSE Moscow
in system tables:
— system.query_log;
— system.query_thread_log;
— system.part_log;
— system.trace_log;
— system.text_log;
— system.metric_log;
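For example, system.query_log makes it easy to spot the slowest recent queries (a sketch; column names as in recent releases):
SELECT query_duration_ms, query
FROM system.query_log
WHERE event_date = today() AND type = 'QueryFinish'
ORDER BY query_duration_ms DESC
LIMIT 10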
system.text_log
Now we write ClickHouse logs into ClickHouse!
DESCRIBE TABLE system.text_log
┌─name──────────┬─type───────────────────┐
│ event_date    │ Date                   │
│ event_time    │ DateTime               │
│ microseconds  │ UInt32                 │
│ thread_name   │ LowCardinality(String) │
│ thread_number │ UInt32                 │
│ os_thread_id  │ UInt32                 │
│ level         │ Enum8('Fatal' = 1, '...│
│ query_id      │ String                 │
│ logger_name   │ LowCardinality(String) │
│ message       │ String                 │
│ revision      │ UInt32                 │
│ source_file   │ LowCardinality(String) │
│ source_line   │ UInt64                 │
└───────────────┴────────────────────────┘
system.metric_log
— for those who forgot to set up monitoring.
— records all ClickHouse metrics every second (by default).
SELECT
toStartOfMinute(event_time) AS h,
sum(ProfileEvent_UserTimeMicroseconds) AS user_time,
bar(user_time, 0, 60000000, 80) AS bar
FROM system.metric_log WHERE event_date = today()
GROUP BY h ORDER BY h
┌───────────────────h─┬─user_time─┬─bar───────────────────────────────────────────────┐
│ 2019-09-05 04:12:00 │         0 │                                                   │
│ 2019-09-05 04:13:00 │         0 │                                                   │
│ 2019-09-05 04:14:00 │    524000 │ ▋                                                 │
│ 2019-09-05 04:15:00 │  15880000 │ █████████████████████▏                            │
│ 2019-09-05 04:19:00 │  36724000 │ ████████████████████████████████████████████████▊ │
│ 2019-09-05 04:20:00 │  17508000 │ ███████████████████████▎                          │
│ 2019-09-05 04:21:00 │         0 │                                                   │
│ 2019-09-05 04:22:00 │         0 │                                                   │
│ 2019-09-05 04:23:00 │         0 │                                                   │
│ 2019-09-05 04:24:00 │         0 │                                                   │
│ 2019-09-05 04:25:00 │         0 │                                                   │
│ 2019-09-05 04:26:00 │         0 │                                                   │
│ 2019-09-05 04:27:00 │         0 │                                                   │
│ 2019-09-05 04:28:00 │         0 │                                                   │
│ 2019-09-05 04:29:00 │     80000 │                                                   │
│ 2019-09-05 04:30:00 │         0 │                                                   │
│ 2019-09-05 04:31:00 │         0 │                                                   │
Records code locations where the query was executing,
in every execution thread, sampled with a configurable period.
If a query is slow — where exactly (in the code) was it slow?
— where did a specific query spend its time?
— where was time spent for queries of some kind?
— where was time spent for queries of some user?
— where was time spent for queries, cluster-wide?
Developer — Nikita Lapkov, HSE Moscow; et al.
1. Turn on one of the following settings (or both):
SET query_profiler_cpu_time_period_ns = 1000000;
SET query_profiler_real_time_period_ns = 1000000;
2. Run your queries.
Recorded samples will be saved into system.trace_log table.
event_date: 2019-09-05
event_time: 2019-09-05 05:47:44
revision: 54425
timer_type: CPU
thread_number: 149
query_id: b1d8e7f9-48d8-4cb3-a768-0a6683f6f061
trace: [140171472847748,61781958,110943821,117594728,117595220,115654933,
120321783,63251928,111161800,120329436,120331356,120308294,120313436,120319113,
120143313,115666412,120146905,111013972,118237176,111013972,117990912,111013972,
110986070,110986938,61896391,61897898,61887509,156206624,140171472807643]
trace — an array of addresses in machine code (stack trace);
Translate address to function name:
— demangle(addressToSymbol(trace[1]))
Translate address to source file name and line number:
— addressToLine(trace[1])
* don't forget to install the clickhouse-common-static-dbg package
Example: top functions:
SELECT count(),
demangle(addressToSymbol(trace[1] AS addr)) AS symbol
FROM system.trace_log
WHERE event_date = today()
GROUP BY symbol
ORDER BY count() DESC LIMIT 10
Example: top functions:
┌─count()─┬─symbol──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 517 │ void LZ4::(anonymous namespace)::decompressImpl<32ul, false>(char const*, char*, unsigned long) │
│ 480 │ void DB::deserializeBinarySSE2<4>(DB::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::ReadBuffer&, unsigned long) │
│ 457 │ DB::VolnitskyBase<true, true, DB::StringSearcher<true, true> >::search(unsigned char const*, unsigned long) const │
│ 270 │ read │
│ 163 │ void LZ4::(anonymous namespace)::decompressImpl<16ul, true>(char const*, char*, unsigned long) │
│ 130 │ void LZ4::(anonymous namespace)::decompressImpl<16ul, false>(char const*, char*, unsigned long) │
│ 58 │ CityHash_v1_0_2::CityHash128WithSeed(char const*, unsigned long, std::pair<unsigned long, unsigned long>) │
│ 44 │ void DB::deserializeBinarySSE2<2>(DB::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::ReadBuffer&, unsigned long) │
│ 37 │ void LZ4::(anonymous namespace)::decompressImpl<8ul, true>(char const*, char*, unsigned long) │
│ 32 │ memcpy │
└─────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Example: top contexts (stack traces) for a query:
SELECT count(),
arrayStringConcat(arrayMap(x -> concat(
demangle(addressToSymbol(x)),
'\n ',
addressToLine(x)), trace), '\n') AS sym
FROM system.trace_log
WHERE query_id = '1a1272b5-695a-4b17-966d-a1701b61b3eb'
AND event_date = today()
GROUP BY trace
ORDER BY count() DESC
LIMIT 10
count(): 154
sym: DB::VolnitskyBase<true, true, DB::StringSearcher<true, true> >::search(unsigned char const*, unsigned long) const
/opt/milovidov/ClickHouse/build_gcc9/dbms/programs/clickhouse
DB::MatchImpl<true, false>::vector_constant(DB::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, DB::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul>&)
/opt/milovidov/ClickHouse/build_gcc9/dbms/programs/clickhouse
DB::FunctionsStringSearch<DB::MatchImpl<true, false>, DB::NameLike>::executeImpl(DB::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long)
/opt/milovidov/ClickHouse/build_gcc9/dbms/programs/clickhouse
DB::PreparedFunctionImpl::execute(DB::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long, bool)
/home/milovidov/ClickHouse/build_gcc9/../dbms/src/Functions/IFunction.cpp:464
DB::ExpressionAction::execute(DB::Block&, bool) const
/usr/local/include/c++/9.1.0/bits/stl_vector.h:677
DB::ExpressionActions::execute(DB::Block&, bool) const
/home/milovidov/ClickHouse/build_gcc9/../dbms/src/Interpreters/ExpressionActions.cpp:759
DB::FilterBlockInputStream::readImpl()
/home/milovidov/ClickHouse/build_gcc9/../dbms/src/DataStreams/FilterBlockInputStream.cpp:84
DB::IBlockInputStream::read()
/usr/local/include/c++/9.1.0/bits/stl_vector.h:108
DB::ExpressionBlockInputStream::readImpl()
/home/milovidov/ClickHouse/build_gcc9/../dbms/src/DataStreams/ExpressionBlockInputStream.cpp:34
DB::IBlockInputStream::read()
/usr/local/include/c++/9.1.0/bits/stl_vector.h:108
DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::thread(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long)
/usr/local/include/c++/9.1.0/bits/atomic_base.h:419
ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::*)(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long), DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>*, std::shared_ptr<DB::ThreadGroupStatus>, unsigned long&>(void (DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::*&&)(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long), DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>*&&, std::shared_ptr<DB::ThreadGroupStatus>&&, unsigned long&)::{lambda()#1}::operator()() const
/usr/local/include/c++/9.1.0/bits/shared_ptr_base.h:729
ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>)
/usr/local/include/c++/9.1.0/bits/atomic_base.h:551
execute_native_thread_routine
/home/milovidov/ClickHouse/ci/workspace/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:81
start_thread
/lib/x86_64-linux-gnu/libpthread-2.27.so
clone
/build/glibc-OTsEL5/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:97
<users>
    <user_name>
        ...
        <databases>
            <database_name>
                <table_name>
                    <filter>IsYandex = 1</filter>
                </table_name>
            </database_name>
        </databases>
    </user_name>
</users>
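The effect, using the placeholder names from the config above (a sketch): every query this user runs against that table is filtered transparently.
SELECT count() FROM database_name.table_name
-- counts only the rows where IsYandex = 1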
Automatically tunes index granularity if you have fat columns.
Everything works out of the box.
— available in 19.13.
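Under the hood it is controlled by the MergeTree setting index_granularity_bytes (non-zero enables the adaptive mode; 10485760 by default). A sketch, if you want to set it explicitly:
CREATE TABLE t (...)
ENGINE = MergeTree ORDER BY ...
SETTINGS index_granularity_bytes = 10485760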
CentOS, RedHat... GosLinux.
Supported versions starting from CentOS 5.
sudo yum install yum-utils
sudo rpm --import \
https://repo.yandex.ru/clickhouse/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo \
https://repo.yandex.ru/clickhouse/rpm/stable/x86_64
sudo yum install clickhouse-server clickhouse-client
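Then start the server and connect (a sketch; service management differs between distributions):
$ sudo /etc/init.d/clickhouse-server start
$ clickhouse-client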
Ben Morlok, www.flickr.com, CC BY-SA 2.0
— Indexing by Z-order curve;
— DDL queries for dictionaries;
— S3 import/export;
— Parallel parsing of data formats;
— Speedup of INSERT with VALUES with expressions;
— Aggregate functions for data clustering;
— Optimization of GROUP BY with the table's order key;
— Initial implementation of RBAC;
— Initial implementation of Merge JOIN;
— More than initial implementation of RBAC;
— More than initial implementation of Merge JOIN;
— Workload management.
Aivazovsky
Web site: https://clickhouse.com/
GitHub: https://github.com/ClickHouse/ClickHouse/
Mailing list: [email protected]
+ meetups. Moscow, Saint Petersburg, Novosibirsk, Ekaterinburg, Minsk...
... Berlin, Paris, Amsterdam, Madrid, Munich, San Francisco,
... Beijing, Shenzhen, Shanghai, Hong Kong, Tokyo, Istanbul, Ankara, Singapore.