In this lesson we will:
- Introduce the concept of functions and the value they bring;
- Describe lambda functions which are defined in SQL directly within ClickHouse;
- Describe executable functions which are defined in external programming languages and configured into ClickHouse.
Introduction To Functions
ClickHouse queries and statements can incorporate a number of built-in functions that implement things like numeric operations, logical operators, working with JSON or manipulating strings. The following are examples of using these built in functions:
SELECT randNormal(10, 2) FROM numbers(10);
SELECT isValidJSON('{"hello": "world"}');
It is also possible to extend ClickHouse by adding bespoke functions which can then be used in the same way as part of SQL statements. These functions can help to make our queries more concise, flexible and modular by breaking out logic into a reusable piece of logic.
There are two types of functions available to us as ClickHouse developers - lambda functions defined in SQL, or executable functions defined in other programming languages.
Lambda Functions
The most common approach to defining functions is do so using the CREATE FUNCTION statement directly within ClickHouse.
This involves specifying a lambda function which takes one or more paraneters, applies the business logic, then returns some value.
For instance, we can create a function called odd_or_even which takes an integer and returns a string odd or even in the following way:
CREATE FUNCTION odd_or_even AS (n) -> if(n % 2, 'odd', 'even');
Once defined, this can then be used inline in a SQL statement:
SELECT odd_or_even(3)
Outputs:
odd
Of course, this is a very simple example, but we could implement more complex business logic specific to your domain.
The advantage of this is that the logic is defined in one place which can be reused through your application, making future maintenence easier. It also helps to make the queries where the function is used more descriptive and concise.
Executable Functions
It is also possible to define executable UDFs which are are defined in external programming or scripting languages which are then integrated into ClickHouse by adding entries into configuration files.
This is more flexible because it is possible to use the full features of your programming environment including complex business logic and integration with external libraries.
The disadvantage is the performance in that they could be slow to execute. Each time we call the function, we have to call out of process.
Nonetheless, these executable functions still have potential value.