Course Overview
Ingesting Data Into ClickHouse

Ingesting Data

Lesson #1

In this lesson we will:

  • Discuss the challenges of ingesting data into ClickHouse
  • Consider batch and streaming ingestion patterns
  • Weigh building our own ingestion processes against using third-party ETL tools

Ingesting Data

The first task with a new ClickHouse instance is to load, or ingest, some data into it.

Of course, everyone is familiar with the trusty INSERT statement, which can be issued at the SQL prompt, but in the real world the main task is more likely to involve taking extracts from external systems and data stores and loading them into our ClickHouse instance.
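As a reminder, the simplest form of ingestion looks like this. The table and column names below are purely illustrative, not part of the course material:

```sql
-- A minimal example table; names and types are assumptions for illustration
CREATE TABLE trips (
    trip_id UInt64,
    pickup_time DateTime,
    fare Decimal(8, 2)
)
ENGINE = MergeTree
ORDER BY pickup_time;

-- The trusty INSERT, issued directly at the SQL prompt
INSERT INTO trips VALUES (1, '2024-01-15 08:30:00', 12.50);
```

This works well for ad hoc loads, but as the rest of this lesson explains, real-world ingestion rarely stays this simple.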

This can be a complex and messy exercise. Data might be in a variety of inconvenient formats which we have to parse before we can bring it into ClickHouse. It may need cleaning and transforming into more appropriate formats, and it may contain errors which we need to handle.
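ClickHouse can take on some of this parsing work itself. As a sketch (the file name is hypothetical, and the table is assumed to exist), clickhouse-client can load a local CSV file and tolerate a bounded number of malformed rows rather than failing the whole load:

```sql
-- Load a hypothetical local CSV file, parsing the header row and
-- allowing up to 10 malformed rows before the load is aborted
INSERT INTO trips
FROM INFILE 'trips.csv'
SETTINGS input_format_allow_errors_num = 10
FORMAT CSVWithNames;
```

Settings such as input_format_allow_errors_num trade strictness for resilience; whether that trade-off is acceptable depends on how dirty the source data is and how much silent data loss we can tolerate.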

Data could arrive in infrequent and very large batches, or in very frequent small batches. Increasingly, it could also be streamed in real-time over APIs and platforms such as Kafka.
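For the Kafka case, ClickHouse offers a native Kafka table engine. A hedged sketch of the usual pattern follows; the broker address, topic, and consumer group names are assumptions, and the target trips table is the illustrative one above:

```sql
-- A Kafka engine table acts as a consumer of the topic
CREATE TABLE trips_queue (
    trip_id UInt64,
    pickup_time DateTime,
    fare Decimal(8, 2)
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list = 'trips',
         kafka_group_name = 'clickhouse-trips',
         kafka_format = 'JSONEachRow';

-- A materialized view moves each consumed block into a MergeTree
-- table, where it becomes queryable
CREATE MATERIALIZED VIEW trips_consumer TO trips AS
SELECT trip_id, pickup_time, fare
FROM trips_queue;
```

The queue table itself is only a consumer; it is the materialized view that continuously lands the streamed rows in durable storage.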

These ingestion processes also usually have to run continually as new data is generated. This means they need to be automated and monitored for errors on an ongoing basis, and we have to handle situations such as retries and late-arriving data.
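One common way to make retries safer is to land data in a table that deduplicates by key, so re-running a failed load does not leave lasting duplicates. A sketch, with illustrative names:

```sql
-- ReplacingMergeTree keeps only the row with the latest ingested_at
-- value per sorting key once background merges run, so re-inserting
-- the same batch after a retry eventually collapses to one row
CREATE TABLE trips_dedup (
    trip_id UInt64,
    pickup_time DateTime,
    fare Decimal(8, 2),
    ingested_at DateTime DEFAULT now()
)
ENGINE = ReplacingMergeTree(ingested_at)
ORDER BY trip_id;
```

Note that this deduplication is eventual: duplicates can be visible until merges happen, and queries that must not see them need SELECT ... FINAL, at some performance cost.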

As ClickHouse developers we have a choice: we can take this process on ourselves, or we can reach for third-party ETL tools to manage some of this complexity for us.

Next Lesson 01: Using Airbyte With ClickHouse Cloud

In this lesson we will learn how to use Airbyte to bring data into ClickHouse Cloud.

0h 15m




Work With The Experts In Real-Time Analytics & AI

We help enterprise organisations deploy powerful real-time Data, Analytics and AI solutions based on ClickHouse, the world's fastest open-source database.


© 2024 Ensemble. All Rights Reserved.