Course Overview
Table Engines In ClickHouse

MergeTree Engine Family

Lesson #2

In this lesson we will:

  • Lesson contents 1
  • Lesson contents 2
  • Lesson contents 3

About MergeTree

The MergeTree table engines are the most robust and heavily used engines of the choices available. There are actually a number of table engines in the same family, all that work slightly differently. What they have in common is that they accept inserts, but then work behind the scenes to organise and merge the rows to optimise the table data.

Types Of MergeTree

There are five table engines in the MergeTree family:

  • Merge Tree - The basic table which works by accepting data and then
  • Replacing Merge Tree - This will replace rows with more recent rows
  • Summing Merge Tree - This will combine rows and sum up data for instance we can aggregate by hour

Using The Merge Tree

The table engine is specified at the time of table creation using the ENGINE parameter. Let's go ahead and create three tables with three different engines which we will later user to explain the concept:

CREATE TABLE CUSTOMERS ENGINE=MergeTree()

CREATE TABLE CUSTOMERS ENGINE=ReplacingMergeTree()
CREATE TABLE CUSTOMERS ENGINE=SummingMergeTree()

Merge Tree Exploration

When you create data in a ClickHouse table, the rows are stored in segments.

create table

Replacing Merge Tree

In some tables, we are only interested in the most recent value of a particular data item. For instance, if we have a table of bank balances, there will be many instances where we only care about the current value and not the billions of historical values across the customer basis. In this situation, the Replacing Mergre Tree could help by dropping old values each time a new value is inserted for us.

We can begin by creating a table with the ReplacingMergeTree engine. We need to specify a PRIMARY KEY to describe which objects are replaced. To illustrate the point, we will use a compound primary key which combines two fields.

create table

We can insert data.

insert into orders values ( 1, 2, 3 );
insert into orders values ( 1, 2, 3 );
insert into orders values ( 1, 2, 3 );

A lot of the MergeTree operations work behind the scenes. Though there are no guarantees, issuing the optimize

optimize table orders

And with a quick check we can see that no data has been merged:

select * from orders

We will now insert a row which we expect to be replaced, using the same order_id and customer_id.

insert into orders values ( 1, 2, -99 );

Optimise the table again:

optimize table orders

And with one more check we will see that the latest order has been replaced:

select * from orders

Summing Merge Tree

The Summing Merge Tree works by adding up values. This is sometimes referred to as Rolling Up.

An example use case might be adding up the total value of orders for a given product

create table

We can insert data.

insert into orders values ( 1, 2, 3 );
insert into orders values ( 1, 2, 3 );
insert into orders values ( 1, 2, 3 );
Next Lesson:
02

Log Table Engine Family

In this lesson we will learn about the ClickHouse Log Family of table engines.

0h 15m




Work With The Experts In Real-Time Analytics & AI

we help enterprise organisations deploy powerful real-time Data, Analytics and AI solutions based on ClickHouse, the worlds fastest open-source database.

Join our mailing list for regular insights:

We help enterprise organisations deploy advanced data, analytics and AI enabled systems based on modern cloud-native technology.

© 2024 Ensemble. All Rights Reserved.