Apache Kafka for Developers #6: Kafka Consumer Partition Rebalancing

Partition rebalancing in Kafka redistributes a topic's partitions across the consumers in a consumer group so that the load stays evenly spread. A rebalance can occur in the following situations:

New Consumer Joins the Group

When a new consumer joins the group, Kafka needs to redistribute the partitions to include the new consumer.

Consumer Leaves the Group

If a consumer crashes or is shut down, Kafka redistributes its partitions among the remaining consumers.

Partition Changes in a Topic

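Adding partitions to a topic the group subscribes to also triggers a rebalance. Note that Kafka only allows the partition count of an existing topic to be increased, not reduced.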

Kafka Rebalancing Protocols

Eager Rebalancing Protocol

Also known as "stop-the-world" rebalancing, this protocol stops all consumers in the group during a rebalance. Every consumer gives up its partitions, rejoins the group, and receives a new partition assignment.

It is simple and straightforward. However, it can cause significant disruption and downtime, especially in large consumer groups, because every consumer stops processing until the rebalance completes.

Incremental Cooperative Rebalancing Protocol

This protocol supports incremental rebalancing: consumers give up only the partitions that are actually moving to another consumer and keep processing the partitions they retain.

It reduces disruption and downtime, making it ideal for large consumer groups. However, a rebalance may need more than one round to complete, which makes the protocol more complex to implement and manage than eager rebalancing.
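
The difference between the two protocols can be illustrated with a toy model. This is only a sketch and not Kafka's implementation: the function name `revoked_partitions`, the protocol labels, and the dictionary layout are assumptions made for illustration. It computes which partitions each consumer has to give up when a fourth consumer joins a group of three.

```python
def revoked_partitions(old, new, protocol):
    """Return, per consumer, the partitions it gives up during a rebalance.

    old and new map a consumer name to the set of partitions it owns.
    protocol is "eager" or "cooperative" (labels chosen for this sketch).
    """
    revoked = {}
    for consumer, owned in old.items():
        if protocol == "eager":
            # Eager: every consumer gives up everything it owns, then rejoins.
            revoked[consumer] = set(owned)
        else:
            # Cooperative: only partitions moving to another consumer are given up.
            revoked[consumer] = set(owned) - new.get(consumer, set())
    return revoked


old = {"C1": {0, 3}, "C2": {1, 4}, "C3": {2, 5}}
new = {"C1": {0, 3}, "C2": {1, 4}, "C3": {2}, "C4": {5}}

eager = revoked_partitions(old, new, "eager")
cooperative = revoked_partitions(old, new, "cooperative")

print(sum(len(p) for p in eager.values()))        # 6 partitions interrupted
print(sum(len(p) for p in cooperative.values()))  # 1 partition interrupted
```

In this model the eager protocol interrupts all six partitions, while the cooperative protocol interrupts only the single partition that changes owner.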

Partition assignment strategies

During a rebalance, Kafka uses a partition assignment strategy to determine how partitions are assigned to consumers. Here are the available strategies:

Round-Robin Assignment

This is one of the simplest partition assignment strategies in Kafka. Partitions are assigned to consumers one after another in a circular fashion. This approach spreads the partitions as evenly as possible among consumers, so that no single consumer is overloaded while others remain underutilized.

This assignor uses the Eager Rebalancing Protocol.

Use the setting below to enable round-robin assignment:
partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor


Example:

Consider a scenario with 3 consumers and 6 partitions:
  • Topic: A
  • Consumers: Consumer 1, Consumer 2, Consumer 3
  • Partitions: Partition 0, Partition 1, Partition 2, Partition 3, Partition 4, Partition 5
Assignment before a new consumer joins:

Consumer 1: Partition 0, Partition 3
Consumer 2: Partition 1, Partition 4
Consumer 3: Partition 2, Partition 5

Assignment after a new consumer, Consumer 4, joins:

Consumer 1: Partition 0, Partition 4
Consumer 2: Partition 1, Partition 5
Consumer 3: Partition 2
Consumer 4: Partition 3
 
This assignment is suitable when all partitions carry a similar load and the consumers have similar processing capabilities.
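
The round-robin idea can be sketched in a few lines. This is a simplified model for a single topic where every consumer subscribes to it, not the actual RoundRobinAssignor code; the function name `round_robin_assign` is hypothetical.

```python
def round_robin_assign(consumers, partitions):
    """Deal partitions out to consumers one at a time, in a circular fashion."""
    ordered = sorted(consumers)
    assignment = {c: [] for c in ordered}
    for i, partition in enumerate(sorted(partitions)):
        # The i-th partition goes to consumer number (i mod group size).
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


print(round_robin_assign(["C1", "C2", "C3"], range(6)))
# {'C1': [0, 3], 'C2': [1, 4], 'C3': [2, 5]}
print(round_robin_assign(["C1", "C2", "C3", "C4"], range(6)))
# {'C1': [0, 4], 'C2': [1, 5], 'C3': [2], 'C4': [3]}
```

Both outputs match the example above. Note that three of the six partitions (3, 4 and 5) change owner when the fourth consumer joins, because the strategy does not take the previous assignment into account.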


Range Assignment

This approach allocates partitions to consumers in contiguous ranges, making it beneficial when partitions have a natural order and consumers can efficiently manage a range of partitions. Partitions are sorted and divided into contiguous ranges, with each range assigned to a specific consumer.

This assignor uses the Eager Rebalancing Protocol.

Use the setting below to enable range assignment:
partition.assignment.strategy=org.apache.kafka.clients.consumer.RangeAssignor

Example:

Consider a scenario with 3 consumers and 6 partitions:
  • Topic: A
  • Consumers: Consumer 1, Consumer 2, Consumer 3
  • Partitions: Partition 0, Partition 1, Partition 2, Partition 3, Partition 4, Partition 5
Assignment before a new consumer joins:

Consumer 1: Partition 0, Partition 1
Consumer 2: Partition 2, Partition 3
Consumer 3: Partition 4, Partition 5

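Each consumer gets a contiguous range of partitions.

Assignment after a new consumer, Consumer 4, joins:

Consumer 1: Partition 0, Partition 1
Consumer 2: Partition 2, Partition 3
Consumer 3: Partition 4
Consumer 4: Partition 5

Six partitions do not divide evenly among four consumers, so the first two consumers (6 mod 4 = 2) each receive one extra partition.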

 
This assignment is suitable when partitions have a natural order and consumers can handle a range of partitions efficiently.
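
The arithmetic behind range assignment can be sketched as follows. This is a simplified single-topic model, not the RangeAssignor source; the function name `range_assign` is hypothetical. Consumers are sorted, each gets `partitions // consumers` partitions, and the first `partitions % consumers` consumers get one extra.

```python
def range_assign(consumers, num_partitions):
    """Give each consumer a contiguous block of partition numbers."""
    ordered = sorted(consumers)
    base, extra = divmod(num_partitions, len(ordered))
    assignment = {}
    start = 0
    for i, consumer in enumerate(ordered):
        # The first `extra` consumers absorb the remainder, one partition each.
        length = base + (1 if i < extra else 0)
        assignment[consumer] = list(range(start, start + length))
        start += length
    return assignment


print(range_assign(["C1", "C2", "C3"], 6))
# {'C1': [0, 1], 'C2': [2, 3], 'C3': [4, 5]}
print(range_assign(["C1", "C2", "C3", "C4"], 6))
# {'C1': [0, 1], 'C2': [2, 3], 'C3': [4], 'C4': [5]}
```

Because the remainder always goes to the consumers that sort first, those consumers can end up with more work than the others when the group subscribes to several topics.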

Sticky Assignment

This approach aims to minimize partition movement by sticking to previous assignments as much as possible while still keeping the assignment balanced. It reduces the overhead of rebalancing and helps consumers keep any local state or caches they have built up for their partitions.

This assignor uses the Eager Rebalancing Protocol.

Use the setting below to enable sticky assignment:
partition.assignment.strategy=org.apache.kafka.clients.consumer.StickyAssignor

Example:

Consider a scenario with 3 consumers and 6 partitions:
  • Topic: A
  • Consumers: Consumer 1, Consumer 2, Consumer 3
  • Partitions: Partition 0, Partition 1, Partition 2, Partition 3, Partition 4, Partition 5
Assignment before a new consumer joins:

Consumer 1: Partition 0, Partition 3
Consumer 2: Partition 1, Partition 4
Consumer 3: Partition 2, Partition 5

Each consumer gets two partitions, distributed in a round-robin fashion initially.

Assignment after a new consumer, Consumer 4, joins:

Consumer 1: Partition 0, Partition 3
Consumer 2: Partition 1, Partition 4
Consumer 3: Partition 2
Consumer 4: Partition 5

This assignment is ideal for reducing the overhead of rebalancing and maintaining data locality.
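
The sticky idea can be approximated with a small greedy sketch. This is not the StickyAssignor algorithm, which is considerably more involved; the function name `sticky_assign` and the tie-breaking rule (the consumer that sorts last donates first) are assumptions chosen so that the result matches the example above.

```python
def sticky_assign(previous, consumers, partitions):
    """Keep previous owners where possible, then move as little as needed."""
    partitions = set(partitions)
    # Start from the previous assignment of every consumer still in the group.
    assignment = {
        c: [p for p in previous.get(c, []) if p in partitions]
        for c in sorted(consumers)
    }
    owned = {p for ps in assignment.values() for p in ps}
    # Partitions without a surviving owner go to the least-loaded consumer.
    for p in sorted(partitions - owned):
        target = min(assignment, key=lambda c: len(assignment[c]))
        assignment[target].append(p)
    # Move one partition at a time until loads differ by at most one.
    while True:
        donor = max(sorted(assignment, reverse=True), key=lambda c: len(assignment[c]))
        receiver = min(assignment, key=lambda c: len(assignment[c]))
        if len(assignment[donor]) - len(assignment[receiver]) <= 1:
            return assignment
        assignment[receiver].append(assignment[donor].pop())


before = {"C1": [0, 3], "C2": [1, 4], "C3": [2, 5]}
after = sticky_assign(before, ["C1", "C2", "C3", "C4"], range(6))
print(after)
# {'C1': [0, 3], 'C2': [1, 4], 'C3': [2], 'C4': [5]}
```

Only Partition 5 changes owner here, compared with three partitions in the round-robin example for the same scenario.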

Cooperative Sticky Assignment

Like Sticky Assignment, this approach minimizes partition movement by keeping previous assignments wherever possible, which reduces the overhead of rebalancing and helps maintain data locality.

This assignor uses the Incremental Cooperative Rebalancing Protocol.

Use the setting below to enable cooperative sticky assignment:
partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor

Example:

Consider a scenario with 3 consumers and 6 partitions:
  • Topic: A
  • Consumers: Consumer 1, Consumer 2, Consumer 3
  • Partitions: Partition 0, Partition 1, Partition 2, Partition 3, Partition 4, Partition 5
Assignment before a new consumer joins:

Consumer 1: Partition 0, Partition 3
Consumer 2: Partition 1, Partition 4
Consumer 3: Partition 2, Partition 5

Each consumer gets two partitions, distributed in a round-robin fashion initially.

Assignment after a new consumer, Consumer 4, joins:

Consumer 1: Partition 0, Partition 3
Consumer 2: Partition 1, Partition 4
Consumer 3: Partition 2
Consumer 4: Partition 5

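The resulting assignment is the same as with Sticky Assignment. The key difference is that Sticky Assignment uses the Eager Rebalancing Protocol, while Cooperative Sticky Assignment uses the Incremental Cooperative Rebalancing Protocol, so Consumers 1 and 2 are never interrupted and Consumer 3 keeps processing Partition 2 throughout.

This assignment is suitable for large consumer groups where minimizing disruption during rebalancing is critical.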
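
How the cooperative protocol carries out the move in the example above can be modelled as two rebalance rounds. This is a simplified sketch, not Kafka's implementation; the function name `cooperative_rounds` and the dictionary layout are hypothetical. In the first round a partition that is changing owner is revoked from its old owner but not yet given to the new one; in the second round it is handed over.

```python
def cooperative_rounds(old, target):
    """Return the assignments after round 1 and round 2 of a cooperative rebalance."""
    # Map each partition to its current owner, ignoring consumers that left the group.
    owner_before = {p: c for c, ps in old.items() if c in target for p in ps}
    # Round 1: a consumer receives a partition only if nobody else still owns it.
    round1 = {
        c: [p for p in ps if owner_before.get(p, c) == c]
        for c, ps in target.items()
    }
    # Round 2: the revoked partitions are now free and go to their new owners.
    round2 = {c: list(ps) for c, ps in target.items()}
    return round1, round2


old = {"C1": [0, 3], "C2": [1, 4], "C3": [2, 5]}
target = {"C1": [0, 3], "C2": [1, 4], "C3": [2], "C4": [5]}

round1, round2 = cooperative_rounds(old, target)
print(round1)  # {'C1': [0, 3], 'C2': [1, 4], 'C3': [2], 'C4': []}
print(round2)  # {'C1': [0, 3], 'C2': [1, 4], 'C3': [2], 'C4': [5]}
```

Between the two rounds only Partition 5 is unowned; every other partition is processed without interruption.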

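In Kafka clients from version 3.0 onward, the default value of partition.assignment.strategy is a list containing the Range Assignor followed by the Cooperative Sticky Assignor. The Range Assignor is the one used by default; it allocates partitions to consumers in contiguous ranges, so each consumer receives a set of numerically adjacent partitions. Having the Cooperative Sticky Assignor in the list allows a group to switch to cooperative rebalancing later by removing the Range Assignor in a single rolling restart. In earlier versions, the default is the Range Assignor alone.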