Runbook

Time synchronization issue impacting Kafka cluster


Overview

This incident type covers a time synchronization failure on the hosts of a Kafka cluster. Kafka uses record timestamps for time-based log retention and timestamp-based offset lookups, so clock drift on brokers or clients can lead to premature or delayed data deletion, apparent data loss or duplication, and service disruptions that degrade the whole system. It requires immediate attention from the engineering team to investigate and resolve the root cause.

Parameters

Debug

Check system time and date
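
A minimal sketch of this check, assuming a systemd host where `timedatectl show` prints one `Key=Value` property per line (for example `Timezone` and `NTPSynchronized`). The helper names are hypothetical, and hosts without systemd need a different command.

```python
import subprocess


def parse_timedatectl_show(text):
    """Turn `timedatectl show` output (Key=Value per line) into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def clock_is_synchronized(props):
    """True only when systemd reports the clock as NTP-synchronized."""
    return props.get("NTPSynchronized") == "yes"


def read_timedatectl():
    """Run timedatectl locally; returns None if it is missing or fails."""
    try:
        result = subprocess.run(
            ["timedatectl", "show"],
            capture_output=True, text=True, check=True, timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_timedatectl_show(result.stdout)
```

Run `read_timedatectl()` on each broker host and compare the `Timezone` value and the result of `clock_is_synchronized()` across hosts.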

Check if NTP is installed and running
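
One way to sketch the "installed" half of this check is to look for a time daemon binary. The candidate names (`chronyd`, `ntpd`) and the extra directories are assumptions; your distribution may ship a different daemon or install it elsewhere.

```python
import os
import shutil

# Assumed locations: daemon binaries often live outside a normal user's PATH.
EXTRA_DIRS = ("/usr/sbin", "/sbin")


def installed_time_daemons(candidates=("chronyd", "ntpd")):
    """Return {daemon name: path} for each candidate binary that is found."""
    search_path = os.pathsep.join(
        [os.environ.get("PATH", os.defpath), *EXTRA_DIRS]
    )
    found = {}
    for name in candidates:
        location = shutil.which(name, path=search_path)
        if location:
            found[name] = location
    return found
```

An empty result suggests that no known time daemon is installed on the host, although a daemon installed under an unexpected name or path would also be missed.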

Check the status of the NTP service
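
A rough sketch of this step, assuming the host uses systemd. `systemctl is-active` prints a state such as `active`, `inactive`, or `failed`. The unit names in the default list are assumptions that vary by distribution.

```python
import subprocess


def service_state(unit):
    """Return systemd's state string for a unit, or 'unknown' on any failure."""
    try:
        # No check=True: is-active exits non-zero for units that are not active.
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return "unknown"
    return result.stdout.strip() or "unknown"


def first_active(units=("chronyd", "chrony", "ntpd", "ntp", "systemd-timesyncd")):
    """Return the first unit reported as active, or None if none is."""
    for unit in units:
        if service_state(unit) == "active":
            return unit
    return None
```

If `first_active()` returns `None`, no time synchronization service from the list is running on the host.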

Check the NTP configuration file for the correct NTP servers
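
The sketch below lists the time sources declared in an ntpd or chrony configuration, assuming they are given with `server`, `pool`, or `peer` directives. The file location is a parameter because it varies; `/etc/ntp.conf`, `/etc/chrony.conf`, and `/etc/chrony/chrony.conf` are common locations, not guaranteed ones.

```python
def configured_time_sources(config_text):
    """Return (directive, address) pairs for each declared time source."""
    sources = []
    for raw in config_text.splitlines():
        # Drop trailing comments, then split the directive from its arguments.
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] in ("server", "pool", "peer"):
            sources.append((fields[0], fields[1]))
    return sources


def read_time_sources(path):
    """Read a config file and return the time sources it declares."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return configured_time_sources(handle.read())
```

Compare the returned addresses with the time sources your organization has approved; every broker host should list the same ones.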

Check the NTP query responses
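
As a sketch, the peer table printed by `ntpq -p` can be parsed to read each peer's offset. This assumes the classic ten-column layout (remote, refid, st, t, when, poll, reach, delay, offset, jitter) with a one-character tally code in the first column, where `*` marks the current system peer. Offsets are in milliseconds. Hosts running chrony need `chronyc` output instead, which this does not handle.

```python
def parse_ntpq_peers(text):
    """Parse `ntpq -p` output into a list of per-peer dicts."""
    peers = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, the column header, and the ==== separator.
        if not stripped or stripped.startswith("remote") or set(stripped) == {"="}:
            continue
        tally = line[0]              # '*', '+', '-', ' ', ...
        fields = line[1:].split()
        if len(fields) < 10:
            continue
        try:
            offset_ms = float(fields[8])
        except ValueError:
            continue
        peers.append({
            "tally": tally,
            "remote": fields[0],
            "reach": fields[6],
            "offset_ms": offset_ms,
        })
    return peers


def system_peer_offset_ms(peers):
    """Offset of the peer marked '*', or None if no peer is selected."""
    for peer in peers:
        if peer["tally"] == "*":
            return peer["offset_ms"]
    return None
```

A result of `None` from `system_peer_offset_ms()` means no peer has been selected, which usually indicates the host is not synchronizing at all.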

Check the Kafka logs for any time synchronization errors
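
A hedged sketch for scanning a broker log. The search patterns are assumptions about what a time-related problem might look like (Kafka's `InvalidTimestampException` class name, plus generic "timestamp ... out of range" and "clock" wording). Adjust them to what your brokers actually log. The log path is a parameter because it depends on the installation.

```python
import re

# Assumed patterns, not an exhaustive or authoritative list.
TIME_PATTERNS = re.compile(
    r"InvalidTimestampException|timestamp.*out of range|clock",
    re.IGNORECASE,
)


def time_related_lines(lines):
    """Return the lines that match any of the assumed time-related patterns."""
    return [line.rstrip("\n") for line in lines if TIME_PATTERNS.search(line)]


def scan_log(path):
    """Scan one log file and return its time-related lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return time_related_lines(handle)
```

Matches are only candidates for review; the word "clock" in particular can appear in unrelated messages.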

Check the status of the Kafka service
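
One simple liveness probe, sketched under the assumption that the broker listens on a TCP port you know (9092 is Kafka's default plaintext port; yours may differ), is to attempt a connection. An open port shows only that something is listening there, not that the broker is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("localhost", 9092)` run on a broker host returns `False` when nothing accepts connections on that port.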

Check the Kafka configuration file for the correct time zone
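
To our knowledge the broker properties file has no dedicated time zone key; the time zone normally comes from the host or from the JVM's `-Duser.timezone` option. The sketch below therefore lists the time-related broker settings instead, assuming a simple `key=value` properties file. The key prefixes are an assumption about which settings matter, for example `log.message.timestamp.type`. Line continuations and `:` separators are not handled.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def time_related_settings(props, prefixes=("log.message.timestamp.", "log.retention.")):
    """Return only the settings whose keys start with one of the prefixes."""
    return {key: value for key, value in props.items() if key.startswith(prefixes)}
```

Settings that are absent from the file fall back to broker defaults, so an empty result is not an error in itself.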

Possible root cause: the time zone settings on the affected servers were incorrect or were changed.

Repair

Check the time synchronization settings on all servers hosting the Kafka cluster and ensure that they are set to the correct time zone and are synced with a reliable time source.
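
The cross-host comparison can be sketched as follows, assuming you have already collected each host's time zone and synchronization state by whatever means you use. The host names, the dictionary shape, and the `Etc/UTC` default are all hypothetical.

```python
def hosts_needing_repair(host_states, expected_timezone="Etc/UTC"):
    """Return {host: [issues]} for hosts with a wrong time zone or unsynced clock."""
    problems = {}
    for host in sorted(host_states):
        state = host_states[host]
        issues = []
        timezone = state.get("timezone")
        if timezone != expected_timezone:
            issues.append(f"timezone is {timezone!r}, expected {expected_timezone!r}")
        if not state.get("synchronized", False):
            issues.append("clock is not synchronized")
        if issues:
            problems[host] = issues
    return problems
```

Hosts missing from the result match the expected time zone and report a synchronized clock; fix the remaining hosts, then collect their state again and rerun the check.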
