Runbook
CoreDNS Excessive Cache Utilization Incident
Overview
This incident occurs when the CoreDNS cache grows too large and the service consumes an excessive amount of memory. CoreDNS is a flexible, extensible DNS server used in Kubernetes clusters for service discovery. When cache utilization becomes too high, DNS lookups can slow down or fail, and CoreDNS pods that exceed their memory limit may be killed and restarted, which disrupts every workload that depends on name resolution. The incident requires immediate attention from the DevOps team to identify the root cause and implement a fix that prevents it from recurring.
Parameters
Debug
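List the CoreDNS pods. They typically run in the kube-system namespace rather than default, so check both if you are unsure where CoreDNS is deployed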
Check the logs of the CoreDNS pods to see if there are any errors or warnings
Check the resource usage of the CoreDNS pods to see if they are consuming too much memory or CPU
Check the CoreDNS configuration file (the Corefile, usually stored in a ConfigMap) for misconfigurations that could be causing the excessive cache utilization, in particular the TTL and capacity settings of the cache plugin
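Restart the CoreDNS pods to see whether that resolves the issue temporarily. A restart empties the in-memory cache, so it relieves the symptom without addressing the cause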
Check the cluster events for anything related to the excessive cache utilization, such as pod restarts or out-of-memory kills
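The debug steps above can be sketched as a small Python helper that builds one kubectl command per step and extracts the cache settings from a Corefile. This is a minimal sketch, not the runbook's official tooling: the namespace `kube-system`, the label `k8s-app=kube-dns`, the ConfigMap name `coredns`, and the Deployment name `coredns` are assumptions based on common kubeadm defaults, and the function names `debug_commands` and `cache_directives` are hypothetical. `kubectl top` also assumes that metrics-server is installed in the cluster.

```python
import shlex

# Assumed defaults (not stated in this runbook); adjust for your cluster.
NAMESPACE = "kube-system"
SELECTOR = "k8s-app=kube-dns"
CONFIGMAP = "coredns"
DEPLOYMENT = "deployment/coredns"


def debug_commands(namespace=NAMESPACE, selector=SELECTOR):
    """Return (step, kubectl argv) pairs in the order of the debug steps."""
    ns = ["-n", namespace]
    sel = ["-l", selector]
    return [
        ("list pods", ["kubectl", "get", "pods", *ns, *sel, "-o", "wide"]),
        ("check logs", ["kubectl", "logs", *ns, *sel, "--tail=100"]),
        # Requires the Metrics API (metrics-server) to be available.
        ("check resource usage", ["kubectl", "top", "pods", *ns, *sel]),
        ("show Corefile",
         ["kubectl", "get", "configmap", CONFIGMAP, *ns, "-o", "yaml"]),
        # Temporary relief only: a restart empties the in-memory cache.
        ("restart pods",
         ["kubectl", "rollout", "restart", DEPLOYMENT, *ns]),
        ("check events",
         ["kubectl", "get", "events", *ns, "--sort-by=.lastTimestamp"]),
    ]


def cache_directives(corefile):
    """Return every `cache` directive in a Corefile as a list of its lines.

    A directive is either a single line (`cache 30`) or a braced block
    (`cache 300 { ... }`). Text after `#` is treated as a comment.
    """
    found, current, depth = [], None, 0
    for raw in corefile.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if current is None:
            if line.split()[0] != "cache":
                continue
            current, depth = [], 0
        current.append(line)
        depth += line.count("{") - line.count("}")
        if depth <= 0:  # single-line directive, or block just closed
            found.append(current)
            current = None
    return found


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is changed
    # (the restart step in particular) without a deliberate decision.
    for step, argv in debug_commands():
        print(f"# {step}")
        print(shlex.join(argv))
```

To run a step rather than print it, pass its argument list to `subprocess.run(argv, check=True)`. When reviewing the output of `cache_directives`, a very long TTL or very large `success` and `denial` capacities are the first settings to question.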
Repair