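The Etcd high fsync durations incident occurs when the time Etcd takes to fsync its write-ahead log (WAL) to disk exceeds the configured alert threshold. Etcd reports this latency in the etcd_disk_wal_fsync_duration_seconds histogram. The usual causes are slow or contended storage and heavy write load. Because every write must be persisted to the WAL before it is acknowledged, sustained high fsync latency slows client requests, can delay heartbeats and trigger leader elections, and so affects both the performance and the availability of the Etcd service. The responsible team should diagnose and resolve the underlying cause promptly.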
Debug
Check if Etcd service is running
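A minimal sketch of this check, assuming Etcd runs as a systemd unit named `etcd`. The unit name is an assumption, and container or static-pod deployments need a different check. The `run` parameter is a hypothetical hook so the function can be exercised without systemd.

```python
import subprocess


def is_service_active(unit="etcd", run=subprocess.run):
    """Return True when `systemctl is-active <unit>` prints 'active'."""
    try:
        result = run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # systemctl is not installed on this host
        return False
    return result.stdout.strip() == "active"


if __name__ == "__main__":
    print("etcd active:", is_service_active())
```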
Check Etcd log for any errors or warnings
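One way to scan the log, sketched under the assumption that Etcd writes JSON-formatted log lines with `level` and `msg` fields. The function name `filter_alerts` and the set of levels are illustrative choices, not part of Etcd.

```python
import json

# Levels treated as worth reporting (an assumption for this sketch).
ALERT_LEVELS = {"warn", "warning", "error", "fatal", "panic"}


def filter_alerts(lines):
    """Yield (level, msg) for JSON log lines at warning level or above."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip non-JSON lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        level = str(entry.get("level", "")).lower()
        if level in ALERT_LEVELS:
            yield level, entry.get("msg", "")
```

The function accepts any iterable of lines, so it can be fed from `sys.stdin` when the log is piped in, for example from `journalctl -u etcd -o cat` on a systemd host. Warnings that mention slow fdatasync are the most relevant ones for this incident.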
Check Etcd metrics for fsync duration
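The sketch below reads the `etcd_disk_wal_fsync_duration_seconds` histogram from Prometheus text output and estimates a percentile from the cumulative buckets. The metrics URL is an assumption (default client port, no TLS or authentication), and the helper names are hypothetical.

```python
import re
import urllib.request

METRIC = "etcd_disk_wal_fsync_duration_seconds"
BUCKET_RE = re.compile(
    r"^" + re.escape(METRIC) + r'_bucket\{[^}]*le="([^"]+)"[^}]*\}\s+(\S+)'
)


def fetch_metrics(url="http://127.0.0.1:2379/metrics"):
    """Download the metrics page as text (URL is an assumed default)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def parse_buckets(text):
    """Return sorted (upper_bound, cumulative_count) pairs for the metric."""
    buckets = []
    for line in text.splitlines():
        match = BUCKET_RE.match(line)
        if match:
            # float() understands "+Inf", used for the last bucket
            buckets.append((float(match.group(1)), float(match.group(2))))
    return sorted(buckets)


def quantile_upper_bound(buckets, q):
    """Smallest bucket bound whose cumulative count covers fraction q."""
    if not buckets or buckets[-1][1] == 0:
        return None
    target = q * buckets[-1][1]
    for bound, count in buckets:
        if count >= target:
            return bound
    return buckets[-1][0]
```

Calling `quantile_upper_bound(parse_buckets(fetch_metrics()), 0.99)` gives a coarse upper bound on the 99th percentile. The counters are cumulative since the process started, so this reflects lifetime behavior; for recent behavior, compare two scrapes or query a Prometheus server. The Etcd documentation suggests that the 99th percentile of this metric should stay below roughly 10 ms.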
Check system load and resource usage
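For an fsync problem, the share of CPU time spent waiting on I/O is often more telling than the raw load average. The sketch below computes it from two samples of the aggregate `cpu` line in `/proc/stat`, which assumes a Linux host. The function names are hypothetical.

```python
import os
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_fraction(before, after):
    """Fraction of CPU time spent in iowait between two samples."""
    # Use the first 8 counters (user..steal); guest time is already in user.
    deltas = [a - b for a, b in zip(after[:8], before[:8])]
    total = sum(deltas)
    return deltas[4] / total if total > 0 else 0.0


def read_cpu_sample(path="/proc/stat"):
    """Read and parse the first line of /proc/stat (Linux only)."""
    with open(path) as handle:
        return parse_cpu_line(handle.readline())


def sample_iowait(interval=1.0):
    """Measure the iowait fraction over `interval` seconds."""
    first = read_cpu_sample()
    time.sleep(interval)
    return iowait_fraction(first, read_cpu_sample())


def load_per_core():
    """1-minute load average divided by the number of CPUs."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)
```

A persistently high value from `sample_iowait()` alongside a modest `load_per_core()` points toward the disk rather than the CPU.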
Check network connectivity and latency to Etcd server
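A rough connectivity and latency probe can time a TCP connection to the Etcd client port. This sketch assumes the default client port 2379; the host name in the usage note is a placeholder, and the function name is hypothetical. It measures only TCP connect time, not request latency.

```python
import socket
import time


def tcp_connect_latency(host, port=2379, timeout=2.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return None
```

For example, `tcp_connect_latency("etcd-1.example.internal")` returns the connect time in seconds, or `None` if the member is unreachable. Network latency does not slow fsync itself, but ruling it out helps separate disk problems from connectivity problems.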
Check disk I/O performance
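As a rough stand-in for a dedicated benchmark tool, the sketch below writes small blocks to a scratch file and times an fsync after each write, which loosely mimics the WAL write pattern. The block size, write count, and function names are assumptions. Run it in a scratch directory on the same filesystem as the Etcd data directory, never inside the data directory itself.

```python
import math
import os
import tempfile
import time


def fsync_latencies(directory, writes=100, block_size=2048):
    """Write small blocks, fsync after each, return durations in seconds."""
    payload = b"\0" * block_size
    durations = []
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync-probe-")
    try:
        for _ in range(writes):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # force the write to stable storage
            durations.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)  # clean up the scratch file
    return durations


def percentile(values, q):
    """Nearest-rank percentile (q between 0 and 1) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]
```

Comparing `percentile(fsync_latencies(scratch_dir), 0.99)` with the fsync duration reported by Etcd shows whether the disk alone explains the latency. The probe adds its own I/O load, so keep the write count small on a busy member.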
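High load: If the Etcd service is experiencing high write traffic or a sudden spike in requests, every write must be appended to the WAL and fsynced before it is acknowledged. The disk can fall behind under that load, which raises the fsync duration and leads to this incident.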
Repair
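Optimize the Etcd storage and configuration to reduce the fsync duration. Place the data directory on low-latency SSD storage, give the WAL its own disk with the --wal-dir option so that it does not compete with snapshot and backend writes, and keep other disk-heavy workloads off the Etcd volume or raise the I/O priority of the Etcd process (for example with ionice). If the storage cannot be improved immediately, raising --heartbeat-interval and --election-timeout can make the cluster more tolerant of slow fsyncs while the disk issue is addressed.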
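Before changing the configuration, it can help to confirm whether the WAL currently shares a filesystem with the rest of the Etcd data. The sketch below compares device IDs and assumes the default layout, with `member/wal` and `member/snap` under the data directory. The function names and the example path are hypothetical.

```python
import os


def same_device(path_a, path_b):
    """True when both paths live on the same filesystem (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def wal_shares_disk(data_dir, wal_dir=None):
    """Check whether the WAL directory shares a filesystem with snapshots.

    If wal_dir is not given, the default <data_dir>/member/wal is assumed.
    """
    if wal_dir is None:
        wal_dir = os.path.join(data_dir, "member", "wal")
    snap_dir = os.path.join(data_dir, "member", "snap")
    return same_device(wal_dir, snap_dir)
```

For example, `wal_shares_disk("/var/lib/etcd")` returning `True` suggests that a dedicated WAL disk may be worth trying. Matching device IDs identify the same filesystem, not necessarily the same physical disk, so two different IDs can still map to one device when partitions or volume managers are involved.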