Runbook

Kafka Topic Partitions Out of Sync Incident

Overview

In this type of incident, the partitions of a Kafka topic fall out of sync: one or more replicas lag behind the partition leader and drop out of the in-sync replica (ISR) set. As a result, some messages may not be delivered to the intended consumers. Common causes include network issues, hardware failures, and bugs in the Kafka software. The consequences can be serious: data loss, message duplication, or other errors in downstream systems that depend on the topic. Identify and resolve the root cause as soon as possible to keep the data pipeline reliable and consistent.

Parameters

Debug

List all topics in Kafka
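
One way to script this step is to wrap the stock `kafka-topics.sh --list` command. The sketch below is a minimal illustration, not the runbook's original command: the function names are hypothetical, and it assumes `kafka-topics.sh` is on the `PATH` and that the broker address (for example `localhost:9092`) is supplied by the operator.

```python
import subprocess


def list_topics_command(bootstrap_server, cli="kafka-topics.sh"):
    """Build the argument list that asks the cluster for all topic names."""
    return [cli, "--bootstrap-server", bootstrap_server, "--list"]


def parse_topic_list(output):
    """The CLI prints one topic per line; drop blank lines and whitespace."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_topics(bootstrap_server):
    """Run the CLI and return the topic names (requires a reachable broker)."""
    result = subprocess.run(
        list_topics_command(bootstrap_server),
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with a non-zero status
    )
    return parse_topic_list(result.stdout)
```

Calling `list_topics("localhost:9092")` against a running cluster should return the topic names as a list, which makes it easy to confirm that the affected topic exists before going further.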

Describe a specific topic to check if it has multiple partitions
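
The describe step can be sketched as follows, assuming the usual `kafka-topics.sh --describe` output, where each partition appears on its own line with a `Partition: <id>` field. The helper names and the sample topic `orders` are hypothetical; the exact output layout can vary between Kafka versions, so treat the parsing as an assumption to verify against your cluster.

```python
import re


def describe_topic_command(bootstrap_server, topic, cli="kafka-topics.sh"):
    """Build the argument list that describes a single topic."""
    return [cli, "--bootstrap-server", bootstrap_server, "--describe", "--topic", topic]


def partition_ids(describe_output):
    """Collect partition ids from the per-partition lines of the output.

    The summary line uses 'PartitionCount:', which this pattern does not
    match because it requires the literal field name 'Partition:'.
    """
    return [int(m) for m in re.findall(r"\bPartition:\s*(\d+)", describe_output)]


def has_multiple_partitions(describe_output):
    """True when the described topic has more than one partition."""
    return len(partition_ids(describe_output)) > 1
```

Feed the CLI output (for example from `subprocess.run(..., capture_output=True, text=True).stdout`) into `has_multiple_partitions` to confirm the topic is actually partitioned.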

Check if any of the Kafka brokers are down
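
A quick first check is whether each broker's listener port still accepts TCP connections. The sketch below uses only the Python standard library; the function names and the broker addresses passed in are hypothetical. Note the limitation: a successful connect only shows that something is listening on the port, not that the broker is healthy or has joined the cluster.

```python
import socket


def broker_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_brokers(brokers, timeout=3.0):
    """Given (host, port) pairs, return the 'host:port' strings that failed."""
    return [
        f"{host}:{port}"
        for host, port in brokers
        if not broker_reachable(host, port, timeout)
    ]
```

For example, `unreachable_brokers([("kafka-1", 9092), ("kafka-2", 9092)])` returns the subset of brokers that did not accept a connection; an empty list means every listed port answered.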

Check if there are any under-replicated partitions
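
A partition is under-replicated when its in-sync replica set (`Isr`) is smaller than its assigned replica set (`Replicas`). The `kafka-topics.sh --describe` command accepts an `--under-replicated-partitions` flag that filters for these directly. As a complementary sketch, the hypothetical helper below parses plain describe output and reports which replicas are missing from the ISR; it assumes the common tab-separated `Topic / Partition / Leader / Replicas / Isr` line layout.

```python
import re

# One per-partition line of `kafka-topics.sh --describe` output (assumed layout).
PARTITION_LINE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(\S+)"
    r"\s+Replicas:\s*([\d,]*)\s+Isr:\s*([\d,]*)"
)


def _broker_ids(field):
    """Turn a comma-separated id list such as '1,2,3' into a set of ints."""
    return {int(part) for part in field.split(",") if part}


def under_replicated(describe_output):
    """Return (topic, partition, missing_broker_ids) for lagging partitions."""
    findings = []
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match is None:
            continue  # summary lines and blanks carry no replica details
        topic, partition, _leader, replicas, isr = match.groups()
        missing = sorted(_broker_ids(replicas) - _broker_ids(isr))
        if missing:
            findings.append((topic, int(partition), missing))
    return findings
```

The missing broker ids point at where to look next: if the same broker is absent from the ISR of many partitions, that broker (or its network path) is the likely culprit.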

Repair

Check the topic configuration to ensure that replication and partition counts are correct.
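
The configuration check can be sketched as a comparison between the `PartitionCount` and `ReplicationFactor` fields on the summary line of `kafka-topics.sh --describe` and the values the topic is supposed to have. The function names and expected values below are hypothetical; the intended layout has to come from your own topic specification.

```python
import re


def topic_layout(describe_output):
    """Extract (partition_count, replication_factor) from the summary line."""
    partitions = re.search(r"PartitionCount:\s*(\d+)", describe_output)
    replication = re.search(r"ReplicationFactor:\s*(\d+)", describe_output)
    if partitions is None or replication is None:
        raise ValueError("no topic summary line found in describe output")
    return int(partitions.group(1)), int(replication.group(1))


def layout_mismatches(describe_output, expected_partitions, expected_replication_factor):
    """Return human-readable descriptions of any deviation from the expected layout."""
    actual_partitions, actual_rf = topic_layout(describe_output)
    problems = []
    if actual_partitions != expected_partitions:
        problems.append(
            f"partition count is {actual_partitions}, expected {expected_partitions}"
        )
    if actual_rf != expected_replication_factor:
        problems.append(
            f"replication factor is {actual_rf}, expected {expected_replication_factor}"
        )
    return problems
```

An empty result means the layout matches. If it does not, keep in mind that Kafka only allows the partition count to be increased (via `kafka-topics.sh --alter --partitions`), and that changing the replication factor requires a partition reassignment with `kafka-reassign-partitions.sh`.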
