ClickHouse meetup in Saint Petersburg

Author: Alexey Milovidov, 2017-03-06.

ClickHouse: Present and Future

Team

Now 5 developers.

Previously

— HTTP and executable sources;

— merge optimization, vertical merge;

— distributed query tracing;

— clickhouse-local;

— BETWEEN, || operators;

— functions for converting between UUID and text;

New in Query Language

— KILL QUERY;

— LIMIT BY;

— SELECT INTO OUTFILE;
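
Of these, LIMIT BY is the least self-explanatory: a clause such as `LIMIT 2 BY domain` keeps at most two rows for each distinct value of `domain`, in the order the rows arrive. The Python sketch below only emulates that behaviour on an in-memory list; the function name `limit_by` and the sample rows are made up for illustration and are not part of ClickHouse.

```python
from collections import defaultdict

def limit_by(rows, key, n):
    """Keep at most n rows per distinct key(row), preserving input order."""
    seen = defaultdict(int)   # how many rows were already kept per key
    kept = []
    for row in rows:
        k = key(row)
        if seen[k] < n:
            seen[k] += 1
            kept.append(row)
    return kept

# Hypothetical (domain, hits) rows, already sorted by hits within each domain.
rows = [("a.com", 10), ("a.com", 7), ("a.com", 3), ("b.org", 5)]
top2 = limit_by(rows, key=lambda r: r[0], n=2)
```

With these rows, the third `a.com` entry is dropped and the single `b.org` entry is kept.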

Interfaces

— ability to get progress in HTTP headers;

— ability to skip errors in text formats;

— proper HTTP response codes;
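
A client can turn progress headers into a completion estimate. The sketch below assumes the progress header value is a JSON object with string counters named `read_rows` and `total_rows`; the helper name `progress_fraction` and the sample value are hypothetical, so check the actual header format of your server version before relying on it.

```python
import json

def progress_fraction(header_value):
    """Return read_rows / total_rows from a progress header value.

    Returns None when total_rows is missing or zero (progress unknown).
    """
    data = json.loads(header_value)
    total = int(data.get("total_rows", 0))
    if total == 0:
        return None
    return int(data.get("read_rows", 0)) / total

# Made-up header value, for illustration only.
sample = '{"read_rows":"250","read_bytes":"4000","total_rows":"1000"}'
fraction = progress_fraction(sample)
```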

Build

— «proper» build and packages;

— system.build_options table;

Dictionaries

— cached external dictionary performance;

— cached external dictionary instrumentation;

— HTTPS dictionaries;

Instrumentation

— information about index memory usage;

— information about uncompressed column sizes;

— metrics for cache RAM consumption;

— metrics about merges;

Optimizations

— DISTINCT optimization;

— gzip performance in HTTP interface;

— mark cache optimization;

Functions

— proper comparison logic, least, greatest;

— groupUniqArray for all data types;

— decodeURLComponent;
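
For readers unfamiliar with decodeURLComponent: it reverses percent-encoding in URL parts. The Python sketch below shows the same kind of transformation using the standard library's `urllib.parse.unquote`. It is an analogy only; how ClickHouse treats malformed escapes may differ, and the sample URL is invented.

```python
from urllib.parse import unquote

# Percent-encoded URL, as it might be stored in a log column (made-up value).
encoded = "https%3A%2F%2Fexample.com%2F%3Fq%3Da%20b"

# unquote replaces each %XX escape with the character it encodes.
decoded = unquote(encoded)
```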

Something Else

— protection against accidental DROP TABLE;

— use_client_time_zone; timezone in config;

— fsync_metadata;

Community

— integration with Grafana, Redash, Apache Zeppelin, Superset;

— proper packages for CentOS, RHEL, GosLinux;

— native protocol driver for Go and C++;

— ability to pass X-ClickHouse-* headers;

— benchmarks: NYC Taxi, Percona (Spark);

— Greenplum benchmark;

— English Telegram chat;

— meetings and talks (Brussels, Paris);

ClickHouse vs. Spark

https://www.percona.com/blog/2017/02/13/clickhouse-new-opensource-columnar-database/

ClickHouse vs. Greenplum

TODO (March-April 2017)

— distributed DDL queries;

— configs in ZooKeeper;

— full NULL support;

TODO (Spring-Summer 2017)

— ODBC driver functionality on Windows;

— rewrite query analysis: proper JOIN support;

Additional

[email protected]