ClickHouse Roadmap 2018..2019

Author: Alexey Milovidov, 2018-11-26.

Things We're Not Afraid to Talk About

December 2018

LowCardinality data types in production.

Choice of compression algorithm at the level of individual columns.

Support for computed DEFAULT expressions when importing JSONEachRow.

Support for the Parquet format for import and export.

January 2019

Import/export of data to HDFS using a table function.

Import/export of data to S3 using a table function.

Ability to add new columns to the sorting key of MergeTree tables.

Reduction of metadata volume in ZooKeeper.

February 2019

Ability to create dictionaries via DDL queries.

Adaptive index granularity in MergeTree tables.

Things We Can Talk About with Caution

Access Rights Management

Access restriction at the level of tables, columns, and rows (row-level security).

Role-based access control (RBAC).

Ability to connect an external system for authentication (LDAP, Kerberos).

Resource Separation for Queries

Configurable resource pools: CPU share, I/O, network, RAM.

JOIN Support Development

Multiple JOINs without using nested subqueries.

Merge JOIN for joining very large sets.

Bucket-Shuffle JOIN for optimizing large distributed JOINs.

(Spring/Summer 2019)

Things We Can Only Tell Our Good Friends

Secondary Indexes

More precisely: index structures for data skipping.

min/max, distinct values, micro bloom-filter.
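
To illustrate the idea of data skipping with a min/max index, here is a minimal Python sketch. It is not ClickHouse code: the `Granule` class, the `build_granules` and `scan_equal` helpers, and the fixed granule size are all assumptions made for illustration. Each block of rows stores its minimum and maximum, and a lookup reads only the blocks whose range could contain the target.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Granule:
    """A block of column values plus its min/max summary (hypothetical layout)."""
    values: List[int]
    min_value: int
    max_value: int


def build_granules(column: List[int], granule_size: int) -> List[Granule]:
    """Split a column into fixed-size blocks and record min/max for each."""
    granules = []
    for start in range(0, len(column), granule_size):
        chunk = column[start:start + granule_size]
        granules.append(Granule(chunk, min(chunk), max(chunk)))
    return granules


def scan_equal(granules: List[Granule], target: int) -> Tuple[List[int], int]:
    """Return matching values and the number of granules actually read."""
    matches: List[int] = []
    granules_read = 0
    for granule in granules:
        # Skip the block entirely if the target lies outside its [min, max] range.
        if target < granule.min_value or target > granule.max_value:
            continue
        granules_read += 1
        matches.extend(v for v in granule.values if v == target)
    return matches, granules_read
```

The skip is only a filter: a block whose range contains the target still has to be read and checked, so the benefit depends on how well the values cluster within blocks.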

Machine Learning Methods as Aggregate Functions

Ability to create and apply a model directly in ClickHouse.

ASOF JOIN

What is it?
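
The slide leaves the question open. One common reading of an as-of join, assumed here, is: for each row on the left, take the row on the right with the same key and the latest timestamp that is not later than the left row's timestamp. The Python sketch below illustrates that reading only. The function name `asof_join` and the `(key, time, payload)` tuple layout are hypothetical and say nothing about how ClickHouse would implement or spell the feature.

```python
from bisect import bisect_right
from collections import defaultdict


def asof_join(left, right):
    """Join each left row to the latest right row with the same key and time <= left time.

    Both inputs are lists of (key, time, payload) tuples (an assumed layout).
    Left rows without a match get None as the joined payload.
    """
    times = defaultdict(list)
    payloads = defaultdict(list)
    # Sort the right side by (key, time) so each per-key list is in time order.
    for key, t, payload in sorted(right, key=lambda row: (row[0], row[1])):
        times[key].append(t)
        payloads[key].append(payload)

    joined = []
    for key, t, payload in left:
        key_times = times.get(key, [])
        # Index of the first right timestamp strictly greater than t.
        i = bisect_right(key_times, t)
        match = payloads[key][i - 1] if i > 0 else None
        joined.append((key, t, payload, match))
    return joined
```

A typical use is attaching the most recent quote to each trade, or the most recent sensor reading to each event, when timestamps on the two sides never line up exactly.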

Optimization of ORDER BY and GROUP BY on key columns.

SELECT * FROM sensors ORDER BY time DESC LIMIT 10
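
The intuition behind this optimization can be sketched in Python, assuming the data is already stored in ascending order of the key (here, `time`). The two function names are hypothetical. A query like the one above then needs only the last few rows instead of a full sort.

```python
from itertools import islice


def top_n_desc_full_sort(rows, n):
    """Baseline: sort everything in descending order, then keep the first n rows."""
    return sorted(rows, reverse=True)[:n]


def top_n_desc_presorted(rows_sorted_asc, n):
    """Rows are already in ascending key order: walk backwards and stop after n rows."""
    return list(islice(reversed(rows_sorted_asc), n))
```

Both functions return the same rows when the input is already sorted ascending, but the second touches only `n` of them.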

Expanding Capabilities for Working with Geo-data

Functions for working with geohash.

Dictionaries of polygons for region-by-location lookups.

Advanced String Processing Algorithms

Min-hash algorithm for fuzzy search of near-duplicates.

Fast matching of a large number of substrings.

Ability to build an auxiliary structure that speeds up brute-force substring search within strings.
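
As a rough illustration of the min-hash item above, the following Python sketch estimates how similar two strings are by comparing short signatures instead of the full texts. Everything in it is an assumption made for the example: the character 3-gram shingles, the use of seeded `blake2b` hashes, and the function names. It does not describe the functions planned for ClickHouse.

```python
import hashlib


def shingles(text, k=3):
    """Set of overlapping character k-grams (the whole text if it is shorter than k)."""
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def _seeded_hash(seed, shingle):
    """Deterministic 64-bit hash of a shingle; each seed acts as a different hash function."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(text, num_hashes=64, k=3):
    """For each seeded hash function, keep the minimum hash over all shingles."""
    parts = shingles(text, k)
    return [min(_seeded_hash(seed, s) for s in parts) for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree; approximates Jaccard similarity."""
    equal = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return equal / len(sig_a)
```

Near-duplicates share most of their shingles, so most positions of their signatures coincide. More hash functions give a tighter estimate at the cost of a longer signature.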

Data Storage on Multiple Volumes

Separation of hot and cold data on SSD and HDD.

Ability to use JBOD.

Proper Buffering for MergeTree

Getting rid of problems with frequent inserts.

(Fall/Winter 2019)
