CoreDNS Excessive Cache Utilization Incident

Overview

This incident type occurs when the CoreDNS cache grows excessively large. CoreDNS is a flexible, extensible DNS server that Kubernetes clusters use for service discovery and DNS-based load balancing. When cache utilization becomes too high, the CoreDNS pods consume more memory, which can lead to performance degradation, service disruptions, and even crashes. The incident requires immediate attention from the DevOps team, both to identify the root cause and to implement a fix that prevents it from recurring.

Debug

List the CoreDNS pods. In most clusters they run in the kube-system namespace rather than the default namespace.
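A minimal sketch of this step, assuming CoreDNS runs in the kube-system namespace with the label k8s-app=kube-dns, as in a kubeadm-provisioned cluster. Both values are assumptions and can be overridden through the NAMESPACE and SELECTOR variables; the function name is illustrative.

```shell
# Assumed defaults (kubeadm conventions); adjust for your cluster.
NAMESPACE="${NAMESPACE:-kube-system}"
SELECTOR="${SELECTOR:-k8s-app=kube-dns}"

# List the CoreDNS pods with their status, restart count, and node.
list_coredns_pods() {
  kubectl get pods -n "$NAMESPACE" -l "$SELECTOR" -o wide
}
```

Calling list_coredns_pods from a shell with cluster access prints one row per pod. A high restart count can hint that the pods are being killed for exceeding their memory limit.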

Check the logs of the CoreDNS pods for errors or warnings
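One way to approach the log check is sketched below. It assumes the same namespace and label as a kubeadm-style install (kube-system, k8s-app=kube-dns) and filters with a simple case-insensitive pattern; the function name and the 500-line tail are illustrative choices.

```shell
# Assumed defaults (kubeadm conventions); adjust for your cluster.
NAMESPACE="${NAMESPACE:-kube-system}"
SELECTOR="${SELECTOR:-k8s-app=kube-dns}"

# Print recent CoreDNS log lines that mention errors or warnings.
coredns_log_problems() {
  # --prefix tags each line with its source pod; --tail bounds output per pod.
  kubectl logs -n "$NAMESPACE" -l "$SELECTOR" --tail=500 --prefix \
    | grep -iE 'error|warn'
}
```

If coredns_log_problems prints nothing, the recent log window contained no matching lines; widening --tail may be worthwhile before drawing conclusions.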

Check the resource usage of the CoreDNS pods to see if they are consuming too much memory or CPU
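The resource check could look like the following sketch. It assumes metrics-server is installed (kubectl top depends on it) and that CoreDNS uses the kubeadm-style namespace and label; the function name is illustrative.

```shell
# Assumed defaults (kubeadm conventions); adjust for your cluster.
NAMESPACE="${NAMESPACE:-kube-system}"
SELECTOR="${SELECTOR:-k8s-app=kube-dns}"

coredns_usage() {
  # Live CPU and memory per container (requires metrics-server).
  kubectl top pods -n "$NAMESPACE" -l "$SELECTOR" --containers
  # Configured memory limits, for comparison with the live numbers.
  kubectl get pods -n "$NAMESPACE" -l "$SELECTOR" \
    -o custom-columns='NAME:.metadata.name,MEM_LIMIT:.spec.containers[*].resources.limits.memory'
}
```

Memory usage that sits close to the configured limit is consistent with an oversized cache, though other causes are possible.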

Check the CoreDNS configuration file (the Corefile) for misconfigurations, such as an oversized cache capacity or very long TTLs, that could be causing the excessive cache utilization
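As a sketch of the configuration check, the functions below print the Corefile and then just its cache settings. They assume the Corefile is stored under the Corefile key of a ConfigMap named coredns in kube-system, which is the kubeadm layout; the function names and the CONFIGMAP variable are illustrative.

```shell
# Assumed defaults (kubeadm conventions); adjust for your cluster.
NAMESPACE="${NAMESPACE:-kube-system}"
CONFIGMAP="${CONFIGMAP:-coredns}"

# Print the full Corefile stored in the ConfigMap.
show_corefile() {
  kubectl get configmap "$CONFIGMAP" -n "$NAMESPACE" -o jsonpath='{.data.Corefile}'
}

# Print the cache directive and the few lines after it (its option block, if any).
show_cache_settings() {
  show_corefile | grep -A 4 -E '^[[:space:]]*cache'
}
```

In the cache plugin, the number after cache is the maximum TTL in seconds, and the optional success and denial lines inside its block set the capacity of each cache.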

Restart the CoreDNS pods to see whether that resolves the issue temporarily. The cache is held in memory, so a restart empties it.
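A rolling restart is one way to do this without dropping all DNS capacity at once. The sketch assumes CoreDNS is managed by a Deployment named coredns in kube-system (the kubeadm layout); the function name and the 120-second timeout are illustrative.

```shell
# Assumed defaults (kubeadm conventions); adjust for your cluster.
NAMESPACE="${NAMESPACE:-kube-system}"
DEPLOYMENT="${DEPLOYMENT:-coredns}"

# Replace the CoreDNS pods one by one, then wait for the rollout to finish.
restart_coredns() {
  kubectl rollout restart "deployment/$DEPLOYMENT" -n "$NAMESPACE" &&
    kubectl rollout status "deployment/$DEPLOYMENT" -n "$NAMESPACE" --timeout=120s
}
```

If memory usage climbs back to the same level soon after restart_coredns completes, the restart only masked the problem and the cache settings still need attention.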

Check the Kubernetes events for the CoreDNS namespace to see if any of them relate to the excessive cache utilization
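The events check might be sketched as follows, assuming CoreDNS lives in kube-system and that its pod names contain the string coredns (true for kubeadm installs, but an assumption elsewhere). The function name is illustrative.

```shell
# Assumed default (kubeadm convention); adjust for your cluster.
NAMESPACE="${NAMESPACE:-kube-system}"

# Show events in the namespace, oldest first, limited to CoreDNS objects.
coredns_events() {
  kubectl get events -n "$NAMESPACE" --sort-by=.lastTimestamp | grep -i coredns
}
```

Events with reasons such as BackOff or Killing around the time of the incident are worth correlating with the memory figures gathered earlier.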

Repair

Increase the resources allocated to the CoreDNS pods, such as memory or CPU, to accommodate the increased cache utilization. If the nodes themselves are short on capacity, add resources to the Kubernetes cluster as well.
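For the CoreDNS pods themselves, raising the memory request and limit could look like this sketch. The Deployment name and namespace follow the kubeadm layout, and the 128Mi and 256Mi figures are purely illustrative placeholders, not recommendations; size them from observed usage.

```shell
# Assumed defaults (kubeadm conventions); adjust for your cluster.
NAMESPACE="${NAMESPACE:-kube-system}"
DEPLOYMENT="${DEPLOYMENT:-coredns}"
# Illustrative values only; derive real ones from observed usage.
MEMORY_REQUEST="${MEMORY_REQUEST:-128Mi}"
MEMORY_LIMIT="${MEMORY_LIMIT:-256Mi}"

# Update the memory request and limit on the CoreDNS Deployment.
raise_coredns_memory() {
  kubectl set resources "deployment/$DEPLOYMENT" -n "$NAMESPACE" \
    --requests="memory=$MEMORY_REQUEST" --limits="memory=$MEMORY_LIMIT"
}
```

On managed clusters where CoreDNS is installed as an add-on, a direct change like this may be reverted by the add-on manager, in which case the provider's own configuration mechanism is the safer route.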

Configure CoreDNS to limit the maximum size of the cache to prevent it from consuming too many resources.
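Limiting the cache is done in the Corefile through the cache plugin, whose success and denial options each take a capacity (the number of cached responses). The sketch below prints an example stanza and opens the ConfigMap for editing. The capacities 5000 and 2500 are illustrative, the ConfigMap name and namespace follow the kubeadm layout, and the function names are hypothetical.

```shell
# Assumed defaults (kubeadm conventions); adjust for your cluster.
NAMESPACE="${NAMESPACE:-kube-system}"
CONFIGMAP="${CONFIGMAP:-coredns}"

# Print an example cache stanza: 30 s maximum TTL, with explicit capacities
# for successful and negative (denial) responses. Values are illustrative.
print_cache_stanza() {
  cat <<'EOF'
cache 30 {
    success 5000
    denial 2500
}
EOF
}

# Open the Corefile for editing so the stanza can replace the existing cache line.
edit_corefile() {
  kubectl edit configmap "$CONFIGMAP" -n "$NAMESPACE"
}
```

If the Corefile enables the reload plugin, CoreDNS should pick up the change on its own after a short delay; otherwise a rolling restart of the CoreDNS pods applies it.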
