In this lesson we will:
- Learn about Kafka partitions;
- Learn how to create and administer partitions;
- Learn about event ordering and how it relates to partitions;
- Learn about partitions keys
- Use the kafka-console-producer script.
What Are Kafka Partitions?
Partitions are one of the main features of Kafka for achieving parrelism and therefore performance.
A partition is a sub-division of a topic. For instance, our new_pizza_orders topic could be divided into 20 partitions numbered 0-19.
Partitions can also be spread across your broker cluster. For instance, on a 5 server broker cluster, each node could process 4 of the 20 partitions.
They key thing is that wherever they reside, each of these partions can be written to by producers and read from by consumers in parallel allowing us to significantly increase throughput and latency of our Kafka broker and cluster.
Creating Partitions
As already seen, when we explicitly create a topic with kafka-topics.sh, we need to specify a number of partitions. This time, we can pass a higher number to the partitions flag such as 5:
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --topic new_pizza_orders --create --partitions 5 --replication-factor 1
If we now describe the topic:
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --topic new_pizza_orders --describe
We can see that we now have 5 partitions numbered 0-4 configured on the topic:
Topic: new_pizza_orders TopicId: 8pj25fZsSFqDip7VvE3EHg PartitionCount: 5 ReplicationFactor: 1 Configs: segment.bytes=1073741824
Topic: new_pizza_orders Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Topic: new_pizza_orders Partition: 1 Leader: 0 Replicas: 0 Isr: 0
Topic: new_pizza_orders Partition: 2 Leader: 0 Replicas: 0 Isr: 0
Topic: new_pizza_orders Partition: 3 Leader: 0 Replicas: 0 Isr: 0
Topic: new_pizza_orders Partition: 4 Leader: 0 Replicas: 0 Isr: 0
The leader refers to which broker is managing the partition. In this instance, because we only have a single broker setup on the training virtual machine, all partitions for the topic are managed on broker 0.
The number of partitions can also be adjusted after creation. Let's adjust the number of partitions to 10:
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --alter --topic new_pizza_orders --partitions 10
Kafka does not support reducing the number of partitions at this time. We can only increase the number with the alter command.
Event Ordering & Partitions
As previously mentioned, the order in which we process events will be important. For instance, it makes no sense to update a customer before the customer has been created in our system.
Using partitions has implication for event ordering.
- The ordering of events is only guaranteed at a partition level
- We could therefore read and process events outs of order even if they are on the same topic
The key role is that data which has to be processed in order therefore has to go onto the same partition
Keys And Partition Order
Kafka producers help with this ordering challenge by putting all events with the same key onto the same partition.
For instance, all events for Customer ID 1234 could be routed to the same partition to avoid the situation of updating before creation.
Sometimes we need more complex behaviour than this though:
- Partitioning by a non keyed variable
- Perhaps our keys are not evenly distributed so a simple hash function over the key is not suitable
This is adjusted using a feature called the partitioning strategy.
Scenario Setup
We will use command line scripts to demonstrate the behaviour of partitions.
Create A New Topic
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic new_orders --partitions 5
Create A Console Producer
./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic new_orders
Create A Console Consumer
./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic new_orders
Describe Consumer Groups
./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --all-groups
Adjust Number Of Partitions
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --alter --topic new_orders --partitions 10