Not-So-Dumb Pointers in ClickHouse

The Foundation of ClickHouse is Columns

On disk — columns.
Data is stored by columns.

In RAM — columns.
Data is processed by columns.

How Columns are Stored in RAM

As chunks of columns — for example, 65,536 elements.

Chunk size — depends on many things.

For SELECT — see the max_block_size setting.

How Columns are Stored in RAM

Represented as objects with the IColumn interface.

Variants — ColumnVector<T>, ColumnString, ColumnArray...

ColumnVector<T> — almost like std::vector<T>.
But under the IColumn interface.
And instead of std::vector<T> — PODArray<T> (why?).

How Columns are Stored in RAM

Previously it was std::vector. PODArray — is just an optimization.

PODArray:
— no extra memset;
— has 15 bytes padding at the end;
— uses an allocator interface different from std::allocator, which sometimes allows mremap.

How Columns are Stored in RAM

ColumnString — consists of two components:

1. Bytes laid out sequentially.
2. Offsets to the i+1 string.

h e l l o \0 w o r l d \0
6 12

How Columns are Stored in RAM

ColumnConst

Made of one nested column,
containing one value.

What the IColumn Interface Provides

Basic operations:
— cut — extract part of a column, for LIMIT implementation;
— filter — for WHERE implementation;
— compareAt, permute — for ORDER BY implementation;
...

Almost All Operations are Immutable

virtual Ptr filter(const Filter & filt, ssize_t result_size_hint) const = 0;

Instead of modifying contents, they create
and return a new column object.

This is normal, since operations are "coarse-grained".

But there are also "fine-grained", mutating operations.

IColumn, what it's responsible for:

— storing data in RAM;
— common operations on columns.

IColumn, what it's similar to:

— Apache Arrow;
— arrays in NumPy;
— arrays in APL, J, K.

Motivation

Isolate maximally efficient
inner loops from wrapper code.

Code doesn't have to be efficient as a whole.

Optimizable places should be localizable.

«vectorized engine»

Bonus:
— SIMD instructions;
— clever optimizations for homogeneous data
(IColumn::filter, LIKE function implementation);

Common Subexpression Elimination

SELECT f(x + y), g(x + y)

— x + y — is the same expression appearing multiple times.

SELECT x + y AS a, x + y AS b

— a and b — are actually the same columns.

Simple, elegant and incorrect solution:
— use shared_ptr<IColumn>.

Columns a and b would just reference the same thing.

Common Subexpression Elimination

SELECT f(x + y), g(x + y)

The query AST can also be glued — turned from a tree into a DAG:

  x   y   x   y              x   y
   ↘ ↙     ↘ ↙                ↘ ↙
    +       +                  +
    ↓       ↓      −−−>       ↙ ↘
    f       g                f   g
     ↘     ↙                 ↓   ↓
      SELECT                 SELECT

Use shared_ptr<IAST> for nodes and share identical nodes.

All Function Computations are Immutable

Computations:

c1 = x;
c2 = y;
c3 = plus(c1, c2);
c4 = f(c3);
c5 = g(c3);

— can be executed independently,
in arbitrary order of graph traversal
and in parallel.

Problems with Using shared_ptr

Some operations are mutable.

For example:
— IColumn::insert — insert one value at the end of the column;
— IColumn::insertRangeFrom — insert a chunk of another column at the end.

Usage example:

INSERT INTO table SELECT a, b FROM other

— gluing small blocks into larger ones during INSERT;
— but mutable operations cannot be applied to shared data.

Persistent data structures?

Problems with Using shared_ptr

If the AST is glued into a DAG with shared_ptr,
then rewriting one piece of the query changes another.

Obvious Solution

Use shared_ptr<const T> for shared data
and unique_ptr<T> when data needs to be modified.

Example:

// Create object and modify it.
std::unique_ptr<T> u{std::make_unique<T>()};
u->modify();

// When object is ready, it can be shared.
std::shared_ptr<const T> s1{std::move(u)};
std::shared_ptr<const T> s2 = s1;

// Nobody can modify the shared object.

Obvious Solution

If we need to modify a shared object?
— just clone it:

// Create object and modify it.
std::unique_ptr<T> u{std::make_unique<T>()};
u->modify();

// When object is ready, it can be shared.
std::shared_ptr<const T> s1{u.release()};
std::shared_ptr<const T> s2 = s1;

// Nobody can modify the shared object.
// But can clone it, then modify the new copy.
std::unique_ptr<T> u2{s2->clone()};
u2->modify();

Obvious Solution

Gluing small blocks into larger ones.

Was:

result->insertRangeFrom(*source, 0, length);

Became:

result_mutable = result->clone();
result_mutable->insertRangeFrom(*source, 0, length);
result = result_mutable;

Extra copying, even if refcount == 1.
In case of many iterations, all copies except the first
will be guaranteed to be redundant.

Obvious Solution

Extra copying, even if refcount == 1.

In case of many iterations, all copies
except the first will be guaranteed to be redundant.

Can we clone an object only if refcount > 1 and otherwise
magically convert shared_ptr to unique_ptr?

No, because they have different memory layout and everything else.

sizeof(unique_ptr<T>) == sizeof(void*)
sizeof(shared_ptr<T>) == 2 * sizeof(void*)

(in libc++ or libstdc++ implementation)

Copy-on-write?

The copy-on-write technique is notoriously known
from old implementations of std::string.

Example: libstdc++ with old C++ ABI by default in gcc before 5.1.

CoW-strings are slow.

https://www.youtube.com/watch?v=rJWSSWYL83U

Copy-on-write?

Why CoW-strings — are not very good:

— CoW implementation requires refcount tracking, which requires atomic operations in multithreaded programs;

— but in typical cases, most strings are small and copying them (with SSO or a good allocator) is cheaper than changing refcnt;

— copying on modification is done implicitly, which requires extremely complex code;

— especially for proper support of non-const iterator
and non-const operator[];

— starting with C++11, move replaces copying of temporary objects;

Copy-on-write?

CoW should NOT be used:

1. For small objects that are easy to copy.

— but if object is heavy, it's better to share it
and increase refcount instead of copying.

Example: fbstring uses CoW for long strings and SSO for small ones.

2. If copying is performed implicitly.

— but you can make one method that gives access
to the non-const part of the interface (prepare object for modification)
clone if refcount > 1 or "reinterpret_cast" otherwise.

Reasonable CoW

// Creating and assigning to immutable ptr.
COW<T>::Ptr x = COW<T>::create(1);
// Sharing single immutable object in two ptrs.
COW<T>::Ptr y = x;
// Now x and y are shared.

// Change value of x.
{
    // Creating mutable ptr.
    // It can clone an object under the hood if it was shared.
    COW<T>::MutablePtr mutate_x = std::move(*x).mutate();
    // Using non-const methods of an object.
    mutate_x->set(2);
    // Assigning pointer 'x' to mutated object.
    x = std::move(mutate_x);
}
// Now x and y are unshared and have different values.

Reasonable CoW

COW<T>::Ptr — behaves similarly to std::shared_ptr<const T>

COW<T>::MutablePtr — behaves similarly to std::unique_ptr<T>

Converting mutable to immutable:

COW<T>::MutablePtr x;
COW<T>::Ptr y = std::move(x);

Converting immutable to mutable:

COW<T>::Ptr x;
COW<T>::MutablePtr y = std::move(*x).mutate();

— performs copying if refcount > 1.

Reasonable CoW

std::move(*x).mutate();

std::move(x)->mutate();

1. In C++ there's no rvalue qualified operator->
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3723.html

2. We want to invalidate the value itself, not the Ptr.
(we can still use the assignment operator on Ptr)
Update after the talk: actually there are no problems with calling
the assignment operator after move.

Implementation

template <typename Derived>
class COW : public
    boost::intrusive_ref_counter<Derived>

— we'll use an intrusive pointer
— add refcount to the object.

class IColumn : public COW<IColumn>
{
private:
    friend class COW<IColumn>;
    virtual MutablePtr clone() const = 0;

— COW<T>::mutate method will call T::clone if needed.

Implementation

template <typename Derived>
class COW : public boost::intrusive_ref_counter<Derived>
{
    template <typename T>
    class IntrusivePtr : public boost::intrusive_ptr<T> {...}

protected:
    template <typename T>
    class mutable_ptr : public IntrusivePtr<T> {...}

public: using MutablePtr = mutable_ptr<Derived>;

protected:
    template <typename T>
    class immutable_ptr : public IntrusivePtr<const T> {...}

public: using Ptr = immutable_ptr<Derived>;
}

template <typename T>
class mutable_ptr : public IntrusivePtr<T>
{
...
public:
    /// Copy: not possible.
    mutable_ptr(const mutable_ptr &) = delete;

    /// Move: ok.
    mutable_ptr(mutable_ptr &&) = default;
    mutable_ptr & operator=(mutable_ptr &&) = default;

    /// Initializing from temporary of compatible type.
    template <typename U>
    mutable_ptr(mutable_ptr<U> && other) : Base(std::move(other)) {}

    mutable_ptr() = default;
    mutable_ptr(std::nullptr_t) {}
};

template <typename T>
class immutable_ptr : public IntrusivePtr<const T>
{ ...
public:
    /// Copy from immutable ptr: ok.
    immutable_ptr(const immutable_ptr &) = default;
    immutable_ptr & operator=(const immutable_ptr &) = default;

    template <typename U>
    immutable_ptr(const immutable_ptr<U> & other) : Base(other) {}

    /// Move: ok.
    immutable_ptr(immutable_ptr &&) = default;
    immutable_ptr & operator=(immutable_ptr &&) = default;

    /// Initializing from temporary of compatible type.
    template <typename U>
    immutable_ptr(immutable_ptr<U> && other) : Base(std::move(other)) {}

    /// Move from mutable ptr: ok.
    template <typename U>
    immutable_ptr(mutable_ptr<U> && other) : Base(std::move(other)) {}

    /// Copy from mutable ptr: not possible.
    template <typename U>
    immutable_ptr(const mutable_ptr<U> &) = delete;

    immutable_ptr() = default;
    immutable_ptr(std::nullptr_t) {}
};

class COW : public boost::intrusive_ref_counter<Derived>
{
protected:
    MutablePtr shallowMutate() const
    {
        if (this->use_count() > 1)
            return derived()->clone();
        else
            return assumeMutable();
    }

public:
    MutablePtr mutate() const && { return shallowMutate(); }

    MutablePtr assumeMutable() const
    {
        return const_cast<COW*>(this)->getPtr();
    }

    Derived & assumeMutableRef() const
    {
        return const_cast<Derived &>(*derived());
    }

Inheritance

template <typename Base, typename Derived>
class COWHelper : public Base

class IColumn
    : public COW<IColumn>
{
    friend class COW<IColumn>;
    virtual MutablePtr clone() const = 0;
    virtual ~IColumn() {}
};

class ConcreteColumn
    : public COWHelper<IColumn, ConcreteColumn>
{
    friend class COWHelper<IColumn, ConcreteColumn>;
};

Inheritance

Inheritance diagram:

boost::intrusive_ref_counter<IColumn>
                ↑
          COW<IColumn>
                ↑
             IColumn
                ↑
   COWHelper<IColumn, ConcreteColumn>
                ↑
          ConcreteColumn

Composition / Aggregation

class ColumnPair final
    : public COWHelper<IColumn, ColumnPair>
{
private:
    SomePtr a;
    SomePtr b;
}

It's unclear whether to make class members MutablePtr or Ptr?

Composition / Aggregation

Approach 1: immutable members.

Object contains immutable members inside.
In all non-const methods, mutate is called,
modifications are made and assigned back.

Disadvantages:

Extra checks and atomic operations
on each call to non-const methods.

Composition / Aggregation

Approach 2: mutable members.

Object contains mutable members inside.
Two objects cannot have shared class members.

Disadvantages:

Shared class members are needed.
Example: columns with arrays having matching lengths.

Composition / Aggregation

Approach 3: immutable members and deep-mutation.

Object contains immutable members inside.
The mutate method performs deep mutation of all class members,
which guarantees their uniqueness in mutable objects.
In all non-const methods assumeMutableRef is used.

Disadvantages:

Complex implementation.
The assumeMutableRef method is unsafe.

Composition / Aggregation

std::experimental::propagate_const

Composition / Aggregation

Approach 4: chameleon_ptr.

Object contains chameleon members inside.
Which behave as immutable if the object is const
and as mutable if the object is not const.

template <typename T>
class chameleon_ptr
{
private:
 immutable_ptr value;
public:
 const T & operator*() const { return *value; }
 T & operator*() { return value->assumeMutableRef(); }
...
}

Type Polymorphism by const

using T = X;
using const T = Y;

Where X and Y — are different types with the same memory layout.
T works approximately like union { X x; Y y; }.

template <typename T>
using COWPtr<T> = mutable_ptr<T>;

template <typename T>
using const COWPtr<T> = immutable_ptr<T>;

Type Polymorphism by const

It's unlikely worth doing this in C++.

C++ is already too complex :)

Not-So-Dumb Pointersin ClickHouse

The Foundation of ClickHouse is Columns

How Columns are Stored in RAM

How Columns are Stored in RAM

How Columns are Stored in RAM

How Columns are Stored in RAM

How Columns are Stored in RAM

What the IColumn Interface Provides

Almost All Operations are Immutable

IColumn, what it's responsible for:

IColumn, what it's similar to:

Motivation

Common Subexpression Elimination

Common Subexpression Elimination

All Function Computations are Immutable

Problems with Using shared_ptr

Problems with Using shared_ptr

Obvious Solution

Obvious Solution

Obvious Solution

Obvious Solution

Copy-on-write?

Copy-on-write?

Copy-on-write?

Reasonable CoW

Reasonable CoW

Reasonable CoW

Implementation

Implementation

Inheritance

Inheritance

Composition / Aggregation

Composition / Aggregation

Composition / Aggregation

Composition / Aggregation

Composition / Aggregation

Composition / Aggregation

Type Polymorphism by const

Type Polymorphism by const

.

Not-So-Dumb Pointers
in ClickHouse