https://clickhouse.com/presentations/meetup28/new_features/
CONSTRAINTs Prepared Statements Format ORC Format Template ORDER BY optimization WITH FILL text_log metric_log Query Profiler Globs for File/URL/HDFS RPM packages neighbor os_schedule_priority ALTER MODIFY SETTING SELECT FROM VALUES COLUMNS matcher LIVE VIEW Settings constraints Row Level Security Secondary Indices TTL expressions MySQL protocol Adaptive index granularity Advanced text search JSON functions Cascade MATERIALIZED VIEWs WITH ROLLUP/CUBE ASOF JOIN
Tiered Storage
— Vladimir Chebotarev already presented it.
Merge JOIN
— Artem Zuikov already presented it.
campaigns_dictionary.xml:
<yandex>
<dictionary>
<name>campaigns</name>
<source>
<mysql>
<password>...</password>
<port>3306</port>
<user>metrika</user>
<replica>
<host>mtexample.metr.yandex.net</host>
<priority>1</priority>
</replica>
<db>dicts</db>
<table>campaign_names</table>
</mysql>
</source>
<lifetime>
<min>14400</min>
<max>18000</max>
</lifetime>
<layout>
<cache>
<size_in_cells>1048576</size_in_cells>
</cache>
</layout>
<structure>
<id>
<name>OrderID</name>
</id>
<attribute>
<name>cid</name>
<type>UInt32</type>
<null_value>0</null_value>
</attribute>
<attribute>
<name>name</name>
<type>String</type>
<null_value />
</attribute>
</structure>
</dictionary>
</yandex>
CREATE DICTIONARY db.dict
(
region_id Int64 DEFAULT -1,
name String DEFAULT 'unknown',
population UInt64
)
PRIMARY KEY region_id
SOURCE(MYSQL(
USER 'root' PASSWORD '...'
DB 'dictionaries' TABLE 'regions'
REPLICA(PRIORITY 1 HOST 'server1' PORT 3306)
REPLICA(PRIORITY 2 HOST 'server2' PORT 3306)))
LAYOUT(HASHED())
LIFETIME(MIN 300 MAX 600);
SELECT dictGetString('db.dict', 'name', toInt64(213));
Created dictionaries are also available as tables to read.
SELECT * FROM db.dict;
— available since 19.17.
Разработчики — Александр Сапин, Максим Сабянин
— available since 19.16.
Разработчики — Алексей Басманов, Антон Попов
— available since 19.15.
Разработчик — Azat Khuzhin
clickhouse-local \
-S 'WatchID UInt64, JavaEnable UInt8, Title String, ...' \
--query "SELECT count() FROM table" < hits.tsv
Было: 46.2 sec., 192 096 rows/sec., 168 MB/sec.
Стало: 4.5 sec., 1 935 419 rows/sec., 1.7 GB/sec.
Works for TSV, CSV, JSONEachRow.
Works in clickhouse-client, clickhouse-local,
HTTP interface; File, URL, HDFS, S3 tables.
Enabled with input_format_parallel_parsing.
And it's already enabled by default!
— available since 19.18.
Разработчики — Никита Михайлов, Олег Ершов
CREATE TABLE test.null (x UInt64, d DateTime) ENGINE = Null
clickhouse-client --query "INSERT INTO test.null VALUES" <...
Inserting 1 mln. records like (1, '2019-12-11 00:00:00'):
— 0.108 sec.
Inserting 1 mln. records like (1, now()):
— 40.4 sec. — в 400 раз медленнее.
Inserting 1 mln. records like (1, now()) с включенной опцией
input_format_values_deduce_templates_of_expressions:
— 0.112 sec. — почти без регрессии производительности.
— available since 19.16.
Разработчик — Александр Токмаков
[11825667, "закрепление журные людьми и"]
[7588965, "выпискусстановке фото 3д фирма"]
[30622428, "губка перерьер ванновгород главный"]
[15009727, "uhfvvf yfbck.n gtycb. ltym hj;ltyb 3"]
[31235604, "порно друзья дачные коллетение спит 66"]
[1036200, "oomph"]
[22446879, "ионистранны в туре 17 март 2014 смотреть"]
[792887, "пп оптики билей ремиум новый фасонвар"]
[315961, "рангику"]
[1151353, "дом-2 последние смотреть онлайн фигур"]
— available since 19.18.
Разработчик — Михаил Коротов
Controlled by the settings:
format_custom_escaping_rule
format_custom_field_delimiter
format_custom_row_before_delimiter
format_custom_row_after_delimiter
format_custom_row_between_delimiter
format_custom_result_before_delimiter
format_custom_result_after_delimiter
Allow to define a template for data formatting/parsing.
A template contains substitutions and delimieters.
Each substitution specifies data escaping rule:
Quoted, Escaped, CSV, JSON, XML, Raw.
Website ${domain:Quoted} has ${count:Raw} pageviews.
You can specify a template for rows, a delimiter between rows
and a template to wrap resultset.
Example: to parse web access logs.
Example: to parse deeply nested JSON.
Example: generate HTML right in ClickHouse.
input_format_csv_unquoted_null_literal_as_null
— parse unquoted NULL in CSV as NULL (otherwise you have to \N).
input_format_null_as_default
— parse NULL as default value for non-Nullable columns.
input_format_tsv_empty_as_default
— parse empty values in TSV as default values.
input_format_with_names_use_header
— use headers in TSV and CSV to determine column order;
— fill missing columns with default values.
input_format_defaults_for_omitted_fields
— calculate complex default values for JSONEachRow, TSV, CSV.
For tables with File, URL, HDFS, S3 engines:
SELECT * FROM url('https://site.ru/data.gz',
TSV, 'phrase String, counter_id UInt32, ...')
INSERT INTO TABLE FUNCTION url(
'https://site.ru/data.gz',
TSV, 'phrase String, counter_id UInt32, ...')
Autodetect by file extension or explicit: gz/gzip, br/brotli, none, auto:
SELECT * FROM url('https://site.ru/data', TSV,
'phrase String, counter_id UInt32, ...', 'gzip')
Разработчик — Андрей Бодров
TODO: поддержка Brotli.
Table function s3 and table engine S3:
SELECT * FROM s3(...)
INSERT INTO TABLE FUNCTION s3(...)
CREATE TABLE table ENGINE = S3(...)
— full support in 19.18.
Разработчик — Владимир Чеботарёв
greatCircleDistance
— improved performance 3 times
with the code from Andrey Aksenov;
greatCircleAngle
— the distance on sphere in degrees;
geoDistance
— calculates the distance on WGS-84 ellipsoid
(under certain assumptions);
— available since 19.18.
С версии 19.15 по 19.17 была регрессия производительности.
Теперь всё нормально.
— available since 19.18 и будет бэкпортировано.
Разработчик — Михаил Филимонов, при участии Ивана Лежанкина
arraySplit multiMatchAllIndices multiFuzzyMatchAllIndices arrayCompact isValidJSON arrayFill arrayReverseFill JSONExtractArrayRaw FQDN
char repeat
Комбинаторы -OrNull и -OrDefault для агрегатных функций.
Bloom filter secondary index for arrays.
Virtual column _shard_num.
Skipping unknown settings in distributed queries.
Allow using FINAL and SAMPLE simultaneously.
Optimization of uniqCombined; uniqCombined64.
Instant count() in trivial queries.
RBAC
DateTime64
Polymorphic Data Parts
Ручка для Prometheus
Кэш результатов запросов
Custom HTTP handlers
https://clickhouse.com/docs/ru/extended_roadmap/
~ 500 задач с подробным описанием — зависимости, исполнители...
из них ~ 20 are already done :)
Web site: https://clickhouse.com/
Maillist: [email protected]
YouTube: https://www.youtube.com/c/ClickHouseDB
Telegram chat: https://telegram.me/clickhouse_ru, clickhouse_en
GitHub: https://github.com/ClickHouse/ClickHouse/
Twitter: https://twitter.com/ClickHouseDB
Google groups: https://groups.google.com/forum/#!forum/clickhouse