This incident type covers high CPU usage on Kubernetes DNS pods in a test environment. It is raised when a query alert monitor reports that the average CPU usage of the DNS pods has exceeded its configured threshold. Sustained high CPU on the DNS pods can slow or fail name resolution for workloads in the cluster, so the cause should be investigated and remediated before it leads to further issues.
Parameters
Debug
1. Check the CPU usage of Kubernetes DNS pods
2. Check the resource limits and requests of Kubernetes DNS pods
3. Check the status of the Kubernetes cluster
4. Check the logs of the Kubernetes DNS pods
5. Check the status of the Kubernetes DNS service
6. Check the CPU usage of the node(s) hosting the Kubernetes DNS pods
7. Check the Kubernetes events related to the incident
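The seven checks above can be sketched as kubectl invocations. The Python helper below only builds the command lines and parses `kubectl top pods` output; it does not run anything against a cluster. It assumes the DNS pods run in the `kube-system` namespace with the label `k8s-app=kube-dns` and that the service is named `kube-dns`, which is common but not guaranteed; the function names are illustrative and not part of any existing tool.

```python
import shlex

# Assumptions (adjust for your cluster): namespace, label selector, service name.
NAMESPACE = "kube-system"
SELECTOR = "k8s-app=kube-dns"
SERVICE = "kube-dns"


def debug_commands(namespace=NAMESPACE, selector=SELECTOR, service=SERVICE):
    """Return (description, kubectl argv) pairs, one per debug step."""
    pods = ["-n", namespace, "-l", selector]
    resources = "jsonpath={.items[*].spec.containers[*].resources}"
    return [
        ("CPU usage of DNS pods", ["kubectl", "top", "pods", *pods]),
        ("Resource requests and limits",
         ["kubectl", "get", "pods", *pods, "-o", resources]),
        ("Cluster status", ["kubectl", "get", "nodes"]),
        ("DNS pod logs", ["kubectl", "logs", *pods, "--tail=100"]),
        ("DNS service status",
         ["kubectl", "get", "service", service, "-n", namespace]),
        ("Node CPU usage", ["kubectl", "top", "nodes"]),
        ("Recent events",
         ["kubectl", "get", "events", "-n", namespace,
          "--sort-by=.lastTimestamp"]),
    ]


def format_commands():
    """Render the steps as numbered, shell-quoted command lines."""
    return [f"{i}. {desc}: {shlex.join(argv)}"
            for i, (desc, argv) in enumerate(debug_commands(), start=1)]


def parse_cpu_millicores(value):
    """Convert a CPU quantity such as '250m' or '1' to millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    return int(round(float(value) * 1000))


def parse_top_pods(output):
    """Map pod name to CPU millicores from `kubectl top pods` text output."""
    usage = {}
    for line in output.strip().splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] == "NAME":
            continue  # skip blank lines and the header row
        usage[fields[0]] = parse_cpu_millicores(fields[1])
    return usage


def average_exceeds(usage, threshold_millicores):
    """True when the mean CPU across pods is above the threshold."""
    if not usage:
        return False
    return sum(usage.values()) / len(usage) > threshold_millicores
```

Printing each entry of `format_commands()` gives a checklist that can be pasted into a shell. Feeding the output of step 1 to `parse_top_pods` and then `average_exceeds` gives a rough local check against whatever threshold the alert monitor uses.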
Repair
Optimize the configuration of the DNS pods to reduce their resource consumption and improve efficiency, for example by adjusting CPU requests and limits to match observed usage, or by switching to a more lightweight DNS solution.
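As one possible way to adjust requests and limits, the sketch below builds a `kubectl set resources` command for the DNS deployment. The deployment name `coredns`, the namespace `kube-system`, and the CPU values in the usage note are placeholders rather than recommendations; pick values based on the usage observed during debugging. The helper names are hypothetical.

```python
def to_millicores(cpu):
    """Convert a CPU quantity such as '100m' or '0.5' to millicores."""
    if cpu.endswith("m"):
        return int(cpu[:-1])
    return int(round(float(cpu) * 1000))


def set_resources_command(deployment, namespace, cpu_request, cpu_limit):
    """Build the kubectl argv that updates CPU requests and limits.

    Raises ValueError if the request is larger than the limit, since the
    API server would reject such a pod spec anyway.
    """
    if to_millicores(cpu_request) > to_millicores(cpu_limit):
        raise ValueError("CPU request must not exceed CPU limit")
    return [
        "kubectl", "set", "resources", "deployment", deployment,
        "-n", namespace,
        f"--requests=cpu={cpu_request}",
        f"--limits=cpu={cpu_limit}",
    ]
```

For example, `set_resources_command("coredns", "kube-system", "100m", "200m")` yields the arguments for `kubectl set resources deployment coredns -n kube-system --requests=cpu=100m --limits=cpu=200m`. Changing the pod template triggers a rolling restart of the DNS pods, so apply it at a time when brief DNS disruption in the test environment is acceptable.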