This incident type covers slow query latency on a coordinator node in a Cassandra cluster that results in client timeouts. The coordinator is the node that receives a client request, forwards it to the replicas that own the data, and returns the result to the client. If it cannot process queries quickly enough, clients time out and cannot retrieve the data they need. Possible causes include high load on the cluster, network issues, and hardware problems.
Parameters
Debug
Check the status of the Cassandra cluster
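One way to script this check is sketched below, assuming `nodetool` is on the PATH of the host where it runs. It reads `nodetool status` output and lists every node whose state is not `UN` (Up/Normal). The helper names `parse_status`, `unhealthy_nodes`, and `fetch_status` are illustrative and not part of any Cassandra tooling.

```python
import re
import subprocess

# Node rows in `nodetool status` start with a two-letter code:
# U/D (Up/Down) followed by N/L/J/M (Normal/Leaving/Joining/Moving).
NODE_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def parse_status(output: str) -> list[tuple[str, str]]:
    """Return (state, address) pairs found in `nodetool status` output."""
    nodes = []
    for line in output.splitlines():
        match = NODE_ROW.match(line.strip())
        if match:
            nodes.append((match.group(1), match.group(2)))
    return nodes


def unhealthy_nodes(output: str) -> list[tuple[str, str]]:
    """Keep only the nodes that are not Up/Normal."""
    return [(state, addr) for state, addr in parse_status(output) if state != "UN"]


def fetch_status() -> str:
    """Run `nodetool status` locally and return its text output."""
    result = subprocess.run(
        ["nodetool", "status"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `unhealthy_nodes(fetch_status())` on a cluster node returns an empty list when every node reports `UN`.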
List the Cassandra keyspaces to see if there are any issues with replication
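As a rough sketch, the replication settings can be reviewed programmatically once they have been read from the cluster, for example with `SELECT keyspace_name, replication FROM system_schema.keyspaces` (available in Cassandra 3.0 and later). The function below only inspects data that has already been fetched. Its name, the input shape, and the `min_rf` threshold of 3 are assumptions made for illustration.

```python
def replication_warnings(keyspaces: dict[str, dict[str, str]], min_rf: int = 3) -> list[str]:
    """Flag keyspaces whose replication factor is below `min_rf`.

    `keyspaces` maps keyspace name -> replication map, in the shape stored
    in the `replication` column of system_schema.keyspaces.
    """
    warnings = []
    for name, replication in sorted(keyspaces.items()):
        # The strategy is stored as a fully qualified class name.
        strategy = replication.get("class", "").rsplit(".", 1)[-1]
        # Every key other than "class" carries a replication factor:
        # "replication_factor" for SimpleStrategy, one key per datacenter
        # for NetworkTopologyStrategy.
        for key, value in sorted(replication.items()):
            if key == "class":
                continue
            # Transient replication writes values like "3/1"; use the total.
            factor = int(value.split("/")[0])
            if factor < min_rf:
                warnings.append(f"{name}: {key}={value} is below {min_rf} ({strategy})")
    return warnings
```

A low replication factor is not an error in itself, but it concentrates reads on fewer replicas, which can contribute to coordinator timeouts under load.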
View the Cassandra system log to look for errors related to the coordinator node
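The log review can be narrowed with a small filter like the sketch below. It assumes the log lives at `/var/log/cassandra/system.log` (a common location for package installs) and that each line starts with its log level, as in the default logging pattern. The keyword list and function names are illustrative choices, not an official set.

```python
from pathlib import Path

# Assumed default for package installs; adjust for your deployment.
DEFAULT_LOG = Path("/var/log/cassandra/system.log")

# Illustrative keywords; extend with whatever the incident suggests.
KEYWORDS = ("timeout", "timed out", "dropped", "gcinspector")


def suspicious_lines(lines, levels=("ERROR", "WARN"), keywords=KEYWORDS):
    """Yield lines logged at one of `levels` or mentioning any keyword."""
    level_prefixes = tuple(levels)  # str.startswith needs a tuple
    for line in lines:
        text = line.rstrip("\n")
        lowered = text.lower()
        if text.startswith(level_prefixes) or any(k in lowered for k in keywords):
            yield text


def scan_log(path: Path = DEFAULT_LOG, tail: int = 5000) -> list[str]:
    """Filter the last `tail` lines of the log file."""
    with path.open(errors="replace") as handle:
        recent = handle.readlines()[-tail:]
    return list(suspicious_lines(recent))
```

Running `scan_log()` on the coordinator host returns the matching lines in file order, which makes it easier to line them up with the time of the client timeouts.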
Check the load on the Cassandra coordinator node
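One signal of coordinator load is a backlog in its thread pools, which `nodetool tpstats` reports. The sketch below parses the thread pool table from that output and lists pools with pending or blocked tasks. The column layout is assumed to be name, Active, Pending, Completed, Blocked, All time blocked; it can differ between Cassandra versions, so treat the parser and its function names as illustrative.

```python
def parse_tpstats(output: str) -> dict[str, dict[str, int]]:
    """Parse the thread pool table at the top of `nodetool tpstats` output."""
    pools = {}
    in_table = False
    for line in output.splitlines():
        if line.startswith("Pool Name"):
            in_table = True
            continue
        if not in_table:
            continue
        if not line.strip():
            break  # a blank line ends the thread pool table
        parts = line.split()
        # Skip rows that do not match the assumed six-column layout.
        if len(parts) != 6 or not all(p.isdigit() for p in parts[1:]):
            continue
        name, active, pending, _completed, blocked, _all_time_blocked = parts
        pools[name] = {
            "active": int(active),
            "pending": int(pending),
            "blocked": int(blocked),
        }
    return pools


def backlogged_pools(pools: dict[str, dict[str, int]], max_pending: int = 0) -> list[str]:
    """Return names of pools with pending tasks above the limit or any blocked tasks."""
    return sorted(
        name
        for name, stats in pools.items()
        if stats["pending"] > max_pending or stats["blocked"] > 0
    )
```

Sustained pending counts on the read, mutation, or native transport pools suggest the node is receiving more work than it can process.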
Check the network latency between nodes in the Cassandra cluster
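A rough way to measure this from the coordinator is to time a TCP connection to each peer. The sketch below assumes the default inter-node `storage_port` of 7000; clusters with a different port or with firewalled hosts need other values. TCP connect time is only an approximation of network latency, and the function names are illustrative.

```python
import socket
import time


def tcp_connect_ms(host: str, port: int = 7000, timeout: float = 2.0) -> float:
    """Return the time in milliseconds taken to open a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # the connection is closed as soon as it is established
    return (time.perf_counter() - start) * 1000.0


def probe_peers(hosts, port: int = 7000, timeout: float = 2.0) -> dict:
    """Map each host to its connect time in ms, or None if unreachable."""
    results = {}
    for host in hosts:
        try:
            results[host] = tcp_connect_ms(host, port, timeout)
        except OSError:  # covers refused connections and timeouts
            results[host] = None
    return results
```

Comparing the values across peers helps separate one slow network path from cluster-wide slowness.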
View the Cassandra nodetool output to see if there are any issues with the cluster
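The following sketch gathers several read-only `nodetool` reports in one pass, assuming `nodetool` is on the PATH. The subcommands `info`, `proxyhistograms`, `compactionstats`, and `netstats` are standard; the choice of these four and the `collect_nodetool` helper itself are illustrative.

```python
import subprocess

# Read-only nodetool subcommands that are commonly useful for latency triage.
DIAGNOSTICS = ("info", "proxyhistograms", "compactionstats", "netstats")


def collect_nodetool(commands=DIAGNOSTICS, runner=subprocess.run, timeout=30):
    """Run each subcommand and return {subcommand: output or error text}."""
    outputs = {}
    for command in commands:
        try:
            result = runner(
                ["nodetool", command],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            # Keep stderr when the command fails so the reason is visible.
            outputs[command] = result.stdout if result.returncode == 0 else result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            outputs[command] = f"failed to run: {exc}"
    return outputs
```

Of these reports, `proxyhistograms` is the most direct view of coordinator-level read and write latency.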
Repair
Increase the capacity of the Cassandra cluster by adding more nodes to distribute the load and reduce query latency.