In a traditional Kafka cluster setup, ZooKeeper is essential for managing and coordinating the cluster. Its responsibilities include:
- Storing metadata about Kafka brokers, topics, partitions, and their configurations
- Maintaining Kafka topic information, such as the number of partitions, replication factor, and partition leader.
- Electing leaders for each partition to ensure there is always a leader available to handle read and write requests.
- Electing one of the nodes as the Kafka Controller, which manages the leader-follower relationship for partitions.
- Monitoring the health of Kafka brokers and notifying the controller of any broker failures, enabling quick failover and recovery.
- Maintaining Access Control Lists (ACLs) for all topics in the cluster.
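For context, a ZooKeeper-backed broker is wired to the ensemble through the `zookeeper.connect` property in `server.properties`. The values below are placeholders for illustration, not a recommended production setup:

```properties
# server.properties (ZooKeeper mode) -- illustrative values only
broker.id=0
listeners=PLAINTEXT://:9092
log.dirs=/tmp/kafka-logs
# Comma-separated list of ZooKeeper ensemble nodes
zookeeper.connect=localhost:2181
```

Every broker in the cluster points at the same ZooKeeper ensemble, which is exactly the external dependency that KRaft removes.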
However, there are several complexities to using ZooKeeper:
- Kafka and ZooKeeper are separate systems, which adds complexity and increases the risk of misconfiguration when managing a Kafka cluster.
- Storing metadata in ZooKeeper can become a bottleneck as the Kafka cluster grows.
- Loading metadata from ZooKeeper can be slow, particularly during startup or controller elections.
- Synchronizing metadata between ZooKeeper and Kafka requires careful handling during version updates.
Apache Kafka Raft (KRaft) is the consensus protocol introduced to remove ZooKeeper from cluster metadata management. The KRaft architecture moves metadata management into Kafka itself, eliminating the dependency on an external system.
Advantages of KRaft:
- It uses a quorum-based controller, which ensures that metadata is consistently replicated across the cluster.
- The removal of ZooKeeper simplifies operational tasks, making it easier to monitor, administer, and troubleshoot Kafka clusters
- KRaft allows Kafka to scale more efficiently, even when a cluster reaches millions of partitions.
- It allows a single security model for the entire system
- It is production ready from Kafka version 3.3.1 onwards
- During startup or controller failover, a new controller can take over immediately because the metadata is already replicated across the other controllers in the quorum.
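As a sketch, a minimal combined broker/controller node in KRaft mode might be configured like this; the node ID, ports, and paths are placeholder values:

```properties
# server.properties (KRaft mode) -- illustrative values only
# This node acts as both a broker and a quorum controller
process.roles=broker,controller
node.id=1
# Quorum voters in the form <node.id>@<host>:<controller-port>
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
inter.broker.listener.name=PLAINTEXT
log.dirs=/tmp/kraft-combined-logs
```

Before the first start, the log directory must be formatted with a cluster ID, for example by generating one with `kafka-storage.sh random-uuid` and then running `kafka-storage.sh format -t <uuid> -c server.properties`.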
Quorum Controllers:
- Metadata is managed by a group of nodes known as quorum controllers.
- These controllers use the Raft consensus algorithm to ensure all nodes agree on the metadata state.
- Quorum controllers use an event-driven protocol to replicate metadata changes across all controllers in the quorum.
- All metadata changes are stored in a dedicated Kafka topic called __cluster_metadata.
- This topic has a single partition containing all information related to topics, partitions, and configurations.
- One of the quorum controllers acts as the leader and manages the metadata.
- The follower controllers replicate the metadata for failover purposes.
- For every metadata change, the leader controller sends an acknowledgment to all follower controllers.
- The change is considered committed only after confirmation from a majority of the controllers.
- If the leader becomes unreachable, the followers initiate a new leader election.
- A follower becomes a candidate, requests votes from other nodes, and the node with the majority of votes becomes the new leader.
- The leader sends regular heartbeat messages to followers to maintain authority.
- If a follower doesn't receive a heartbeat within a set time, it assumes the leader has failed and starts a new election.
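The election steps above can be sketched as a toy simulation. This is plain Python, not Kafka's actual implementation; the `Node` class and single-round vote request are simplified assumptions made for illustration:

```python
class Node:
    """A toy quorum member: tracks its current term, vote, and role."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.term = 0
        self.voted_for = None
        self.role = "follower"

def start_election(candidate, peers):
    """A follower whose heartbeat timer expired becomes a candidate,
    bumps its term, votes for itself, and requests votes from peers."""
    candidate.term += 1
    candidate.role = "candidate"
    candidate.voted_for = candidate.node_id
    votes = 1  # the candidate's own vote
    for peer in peers:
        # A peer grants its vote only if it has not yet voted in this term.
        if peer.term < candidate.term:
            peer.term = candidate.term
            peer.voted_for = candidate.node_id
            votes += 1
    # Becoming leader requires a majority of the whole quorum.
    quorum_size = len(peers) + 1
    candidate.role = "leader" if votes > quorum_size // 2 else "follower"
    return candidate.role

# Three-node quorum: node 0's heartbeat timer fires first, so it
# starts the election and wins with all three votes.
nodes = [Node(i) for i in range(3)]
print(start_election(nodes[0], nodes[1:]))  # leader
```

The majority rule in the last step is what makes the commit and election guarantees in the list above hold: two candidates can never both collect a majority in the same term.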
Apache Kafka for Developers Journey:
- Apache Kafka for Developers #1: Introduction to Kafka and Comparison with RabbitMQ
- Apache Kafka for Developers #2: Kafka Architecture and Components
- Apache Kafka for Developers #3: Kafka Topic Replication
- Apache Kafka for Developers #4: Kafka Producer and Acknowledgements
- Apache Kafka for Developers #5: Kafka Consumer and Consumer Group
- Apache Kafka for Developers #6: Kafka Consumer Partition Rebalancing
- Apache Kafka for Developers #7: Kafka Consumer Commit Offset
- Apache Kafka for Developers #8: Kafka Consumer Auto Offset Reset
- Apache Kafka for Developers #9: Replacing ZooKeeper with KRaft
- Apache Kafka for Developers #10: Setting Up Kafka Locally with Docker
- Apache Kafka for Developers #11: Creating and Managing Kafka Topics
- Apache Kafka for Developers #12: Setting Up a Kafka Producer in Node.js using KafkaJS
Happy Coding :)