Apache Kafka Ideas – Part 3

Consumer Groups

So far the following ideas have been introduced: topic, message, partition, producer, consumer and broker. By now, you should understand how Kafka stores messages on disk using a commit log, topics and partitions. You should also know how a message is structured.

It’s time to introduce consumer groups, which are the missing piece of message distribution in Kafka.

In the previous post I mentioned that certain rules define how messages are assigned to partitions. Let’s assume the same round robin selection as before.

Hire Event Exercise

To illustrate how consumer groups are used, let’s first define an example scenario. I will use the hire topic with two partitions introduced earlier in the series. There will be a single producer sending messages to the topic. When a person is hired by a fictitious company, Building Architects & Co., that person is registered in an HCM Application. On the first day of work, each new employee is required to have their work suit ready and needs to be assigned to their first task based on the current stream of work and their declared experience. Those two things are handled by different applications in the company; let’s call them Equipment Registry and Work Manager. The picture below illustrates the required flow:

Hire Event

It’s not hard to guess that Kafka and the hire topic will be used to meet those requirements. The producer, the HCM Application, will send hire events to the hire topic. Messages will be placed in partitions 0 and 1 in a round robin fashion. In this example, John Smith’s hire event was placed in partition 0.
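The round robin placement described above can be sketched in a few lines of Python. This is only a toy model of the simplified rule assumed in this series, not the partitioner of a real Kafka client. The function name `assign_round_robin` and every employee name other than John Smith are made up for illustration.

```python
from itertools import cycle


def assign_round_robin(messages, num_partitions):
    """Place each message on the next partition in turn: 0, 1, 0, 1, ..."""
    partitions = {p: [] for p in range(num_partitions)}
    next_partition = cycle(range(num_partitions))
    for message in messages:
        partitions[next(next_partition)].append(message)
    return partitions


# Hypothetical hire events; only John Smith appears in the original example.
hires = ["John Smith", "Anna Nowak", "Mark Lee"]
hire_topic = assign_round_robin(hires, num_partitions=2)
print(hire_topic)
# {0: ['John Smith', 'Mark Lee'], 1: ['Anna Nowak']}
```

John Smith’s event lands in partition 0, as in the picture, and the following events alternate between the two partitions.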

Hire Event Question

The question is: how do consumers know which messages are meant for them, which ones to read and which ones they have already read? All messages sent to the hire topic should be delivered to both applications: Work Manager and Equipment Registry.

Consumer Groups

Message distribution is managed with the concept of a consumer group:

  • a consumer group tells the broker what kind of service is connecting to it
  • a consumer group can be compared to a class in Java, and a consumer that belongs to that group can be thought of as an instance of that class
  • each consumer can belong to one group only
  • one group can have many consumers
  • the number of active consumers in a group cannot exceed the number of partitions in a topic; any additional consumers in the group stay idle:
number of partitions >= number of active consumers in a group
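To make the last rule more tangible, here is a minimal sketch of how the partitions of a topic could be spread over the consumers of one group. The modulo-based `assign_partitions` function is my own simplification for illustration; in a real cluster the assignment is negotiated by Kafka itself and the strategy is configurable.

```python
def assign_partitions(partitions, consumers):
    """Give every partition exactly one owner within a single consumer group.

    Consumers beyond the number of partitions end up with an empty list,
    which is the "idle consumer" case.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


one_consumer = assign_partitions([0, 1], ["A"])
two_consumers = assign_partitions([0, 1], ["A", "B"])
three_consumers = assign_partitions([0, 1], ["A", "B", "C"])

print(one_consumer)     # {'A': [0, 1]}
print(two_consumers)    # {'A': [0], 'B': [1]}
print(three_consumers)  # {'A': [0], 'B': [1], 'C': []}
```

With three consumers and two partitions, Consumer C has nothing to read, so only two consumers in the group are active.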

Kafka tracks which messages have been read on two levels simultaneously:

  • consumer group level
  • partition level

This means that two consumers belonging to the same consumer group cannot read messages from the same partition. If there are two consumers in a group and a topic consists of two partitions, one consumer will read from partition 0 and the other from partition 1. This is best illustrated with a few use cases of consumers and consumer groups.
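The two-level tracking can be pictured as one read position per (consumer group, partition) pair. The sketch below is a simplified in-memory model under that assumption; the `TopicLog` class and its `poll` method are hypothetical names, and unlike real Kafka, where consumers commit their positions, the position here advances automatically on every read.

```python
class TopicLog:
    """Toy topic that remembers a read position per (group, partition)."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self.offsets = {}  # (group, partition) -> next position to read

    def append(self, partition, message):
        self.partitions[partition].append(message)

    def poll(self, group, partition):
        """Return messages this group has not read yet on this partition."""
        start = self.offsets.get((group, partition), 0)
        unread = self.partitions[partition][start:]
        self.offsets[(group, partition)] = start + len(unread)
        return unread


log = TopicLog(num_partitions=2)
log.append(0, "red")
log.append(1, "blue")

first_read = log.poll("WM Group", 0)    # ['red']
second_read = log.poll("WM Group", 0)   # [] (already read by this group)
other_group = log.poll("ER Group", 0)   # ['red'] (separate position)
```

Reading in one group does not affect what another group sees, because each group keeps its own position on each partition.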

Use Cases

UC1: Topic with a single partition and a single consumer

All messages from partition 0 are received by Consumer A.

UC1 Single Partition Single Consumer

UC2: Topic with two partitions and a single consumer

The red message from partition 0 and the blue message from partition 1 are both received by Consumer A.

UC2 Two Partitions Single Consumer

UC3: Topic with two partitions and two consumers in the same CG

Consumer A and Consumer B both belong to the same consumer group. The red message from partition 0 is received by Consumer A, while the blue message from partition 1 now goes to Consumer B.

UC3 Two Partitions Two Consumers

UC4: Topic with two partitions, four consumers and two CGs

Consumers A and D belong to the grey group, Consumers B and C to the green group. Each consumer in a group is attached to a single partition. Each message from a partition goes to exactly one consumer per group: the red message is received by A and B, the blue message is received by C and D. In other words, each group receives all messages from all partitions.

UC4 Two Partitions Four Consumers Different CG

As you can see, the number of consumers in a group has to meet the inequality defined above, i.e. there can be at most as many active consumers in a single group as there are partitions in a topic.
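The use cases above can be replayed with a small model. The `deliver` function below is a hypothetical helper written for this post: it gives each partition to one consumer per group using a simple modulo rule, which is an assumption and not Kafka’s actual assignment algorithm.

```python
def deliver(partitions, groups):
    """Return the messages each consumer receives.

    partitions: {partition id: [messages]}
    groups:     {group name: [consumer names]}
    Within a group, every partition is read by exactly one consumer.
    """
    received = {}
    partition_ids = sorted(partitions)
    for consumers in groups.values():
        for consumer in consumers:
            received[consumer] = []
        for index, partition_id in enumerate(partition_ids):
            owner = consumers[index % len(consumers)]
            received[owner].extend(partitions[partition_id])
    return received


topic = {0: ["red"], 1: ["blue"]}

uc2 = deliver(topic, {"group": ["A"]})
uc3 = deliver(topic, {"group": ["A", "B"]})
uc4 = deliver(topic, {"grey": ["A", "D"], "green": ["B", "C"]})

print(uc2)  # {'A': ['red', 'blue']}
print(uc3)  # {'A': ['red'], 'B': ['blue']}
print(uc4)  # {'A': ['red'], 'D': ['blue'], 'B': ['red'], 'C': ['blue']}
```

In UC4 the red message reaches A and B, and the blue message reaches C and D, so each group sees both messages exactly once.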

Solving Hire Event Exercise

Below is an illustration of a solution for the hire event exercise:

Hire Event Solved

In the solution, Work Manager belongs to WM Group and Equipment Registry to ER Group. Thanks to this setup, both applications receive all messages sent by the HCM Application.

Solving the Problems from Part 1

Overcoming Message Queue Limitation

The challenge with a message queue was adding a third service (D) in such a way that it received all messages. As you probably remember, this was not possible with queues, since only some messages would go to D. With Kafka it is possible by:

  • using 2 partitions and the same consumer group for B and C to load balance work between those services
  • placing the third service D in another consumer group, where it reads from partitions 0 and 1 and therefore receives all messages

Overcoming Message Queue Limitation

Overcoming Publish – Subscribe Limitation

The requirement was to load balance services B and C. With the traditional publish – subscribe pattern, all messages would be received by B, C, B’ and C’, hence the work would be doubled rather than load balanced. To load balance the work in Kafka, the following setup can be used:

  • 2 partitions and 2 consumer groups
  • grouping services: B and B’, C and C’ to load balance work between them

Overcoming Publish – Subscribe Limitation

I added a data table as a destination to illustrate that services B and B’, and likewise C and C’, deliver the same functionality.
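The difference between the two setups can be counted in a rough sketch. Both functions below are hypothetical models written for this post, the message names are invented, and the modulo rule for picking a partition owner is an assumption made for simplicity.

```python
def plain_pub_sub(partitions, subscribers):
    """Classic publish-subscribe: every subscriber gets every message."""
    all_messages = [m for pid in sorted(partitions) for m in partitions[pid]]
    return {name: list(all_messages) for name in subscribers}


def with_consumer_groups(partitions, groups):
    """Kafka-style: each partition goes to one consumer per group."""
    received = {}
    for members in groups.values():
        for member in members:
            received[member] = []
        for index, pid in enumerate(sorted(partitions)):
            received[members[index % len(members)]].extend(partitions[pid])
    return received


# Four made-up messages spread over two partitions.
topic = {0: ["m1", "m3"], 1: ["m2", "m4"]}

fan_out = plain_pub_sub(topic, ["B", "B'", "C", "C'"])
balanced = with_consumer_groups(
    topic, {"B group": ["B", "B'"], "C group": ["C", "C'"]}
)

print(sum(len(msgs) for msgs in fan_out.values()))   # 16
print(sum(len(msgs) for msgs in balanced.values()))  # 8
print(balanced["B"], balanced["B'"])  # ['m1', 'm3'] ['m2', 'm4']
```

With plain publish – subscribe, four messages are processed sixteen times. With two consumer groups they are processed eight times, once per group, and each instance handles half of the messages.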

Next Steps

This post completes the set of basic Kafka Ideas, which explain why one would want to use Kafka. In future Kafka Ideas posts you will see how to install, configure and use Kafka. There are also advanced topics that were omitted for simplicity but are worth analysing to understand Kafka even better.
