This incident occurs when a Kubernetes DaemonSet fails to run its pod on every node where it should be scheduled, for reasons such as missing images, initialization failures, or insufficient cluster resources. It is triggered when the number of desired pods minus the number of running pods is greater than zero. It can reduce the availability of services running on Kubernetes and requires immediate attention.
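The trigger condition above can be checked by hand with a small helper. This is a minimal sketch, not the alerting rule itself: the function name `daemonset_missing_pods` is hypothetical, and it assumes that "running pods" is approximated by the `numberReady` field of the DaemonSet status.

```shell
# Hypothetical helper: print desired pods minus ready pods for one DaemonSet.
# Assumption: .status.numberReady stands in for "running pods".
# Usage: daemonset_missing_pods <daemonset-name> <namespace>
daemonset_missing_pods() {
  dmp_desired=$(kubectl get daemonset "$1" -n "$2" \
    -o jsonpath='{.status.desiredNumberScheduled}')
  dmp_ready=$(kubectl get daemonset "$1" -n "$2" \
    -o jsonpath='{.status.numberReady}')
  # Treat an empty answer as 0 so the arithmetic does not fail.
  echo $(( ${dmp_desired:-0} - ${dmp_ready:-0} ))
}
```

A result greater than zero corresponds to the condition that triggers this incident.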
Parameters
Debug
Get the list of DaemonSets in the cluster
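One way to do this, as a sketch: the wrapper name `list_daemonsets` is hypothetical, and listing all namespaces is an assumption about how wide the search should be.

```shell
# Hypothetical helper: list DaemonSets in every namespace,
# including desired/current/ready counts and node selectors.
list_daemonsets() {
  kubectl get daemonsets --all-namespaces -o wide
}
```

Rows where READY is lower than DESIRED are the ones to investigate.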
Describe a specific DaemonSet to check its status
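A possible form of this step is shown here. The function name `describe_daemonset` is hypothetical; the DaemonSet name and namespace are passed as arguments.

```shell
# Hypothetical helper: show status, pod template and recent events.
# Usage: describe_daemonset <daemonset-name> <namespace>
describe_daemonset() {
  kubectl describe daemonset "$1" --namespace "$2"
}
```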
Check the status of the pods created by the DaemonSet
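A sketch of this check follows. It assumes you already know the label selector that the DaemonSet uses for its pods (for example `app=node-exporter`, a made-up value); the function name `daemonset_pods` is hypothetical.

```shell
# Hypothetical helper: list the pods matching a DaemonSet's label selector.
# Usage: daemonset_pods <namespace> <label-selector>
daemonset_pods() {
  # -o wide adds the node name, which shows which nodes have no pod.
  kubectl get pods --namespace "$1" --selector "$2" -o wide
}
```

Pods in states such as `Pending`, `ImagePullBackOff`, or `CrashLoopBackOff` point to the failure causes listed at the top of this runbook.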
Check the logs of a specific pod to see if there are any errors
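This could look like the following sketch. The function name `pod_logs` and the 100-line tail are assumptions chosen for illustration.

```shell
# Hypothetical helper: print the last 100 log lines of every container in a pod.
# Usage: pod_logs <pod-name> <namespace>
pod_logs() {
  kubectl logs "$1" --namespace "$2" --all-containers --tail=100
}
```

If a container keeps restarting, adding `--previous` to the `kubectl logs` call shows the output of the last terminated instance.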
Check the events related to the DaemonSet to see if there were any issues during the creation of the pods
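A minimal sketch of this step, assuming the events are still retained by the cluster (Kubernetes keeps them only for a limited time). The function name `daemonset_events` is hypothetical.

```shell
# Hypothetical helper: list events whose involved object is the DaemonSet,
# oldest first.
# Usage: daemonset_events <daemonset-name> <namespace>
daemonset_events() {
  kubectl get events --namespace "$2" \
    --field-selector "involvedObject.kind=DaemonSet,involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}
```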
Check the status of the nodes in the cluster to see if there are any issues
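As an illustration, the helper below prints only the nodes whose STATUS column is not exactly `Ready`, which also catches cordoned nodes (`Ready,SchedulingDisabled`). The function name `nodes_not_ready` is hypothetical, and the filter relies on the default column layout of `kubectl get nodes`.

```shell
# Hypothetical helper: print the names of nodes that are not plainly Ready.
# Assumption: column 1 is NAME and column 2 is STATUS in the default output.
nodes_not_ready() {
  kubectl get nodes --no-headers | awk '$2 != "Ready" { print $1 }'
}
```

Running `kubectl describe node` on any node it prints shows the node's conditions and taints in more detail.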
Check the resource usage of the nodes to see if there are any resource constraints
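Two hedged ways to look at this are sketched below. `kubectl top nodes` only works if a metrics API (typically metrics-server) is installed, which is an assumption about the cluster. The second helper extracts the "Allocated resources" section from `kubectl describe node` and assumes that section is followed by an "Events:" line, as in current kubectl output. Both function names are hypothetical.

```shell
# Hypothetical helper: live CPU and memory usage per node.
# Assumption: a metrics API such as metrics-server is available.
node_resource_usage() {
  kubectl top nodes
}

# Hypothetical helper: requests and limits already allocated on one node.
# Usage: node_allocated_resources <node-name>
node_allocated_resources() {
  kubectl describe node "$1" | sed -n '/Allocated resources:/,/Events:/p'
}
```

A node whose allocated requests are close to 100% cannot schedule another DaemonSet pod even if its live usage looks low.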
Possible cause: the pod spec in the DaemonSet configuration is incorrect or incomplete, for example an image reference that cannot be pulled.
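One quick way to review part of the pod spec is to print each container's name and image from the DaemonSet's pod template, as in this sketch. The function name `daemonset_images` is hypothetical, and it covers only the `containers` list, not init containers.

```shell
# Hypothetical helper: print "<container-name><TAB><image>" for each container
# in the DaemonSet's pod template.
# Usage: daemonset_images <daemonset-name> <namespace>
daemonset_images() {
  kubectl get daemonset "$1" --namespace "$2" \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.image}{"\n"}{end}'
}
```

For the full template, `kubectl get daemonset <name> -n <namespace> -o yaml` prints the complete spec.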
Repair
Review the resource requests and limits configured for the DaemonSet and adjust them as necessary.
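The adjustment could be applied with `kubectl set resources`, as in the sketch below. The function name `set_daemonset_resources` is hypothetical, and the values in the usage comment are placeholders rather than recommendations; pick numbers that match the workload and the node capacity.

```shell
# Hypothetical helper: update requests and limits on all containers of a
# DaemonSet, then wait for the rollout to finish.
# Usage: set_daemonset_resources <name> <namespace> <requests> <limits>
# Example (placeholder values):
#   set_daemonset_resources node-exporter monitoring \
#     cpu=100m,memory=128Mi cpu=200m,memory=256Mi
set_daemonset_resources() {
  kubectl set resources daemonset "$1" --namespace "$2" \
    --requests="$3" --limits="$4" &&
    kubectl rollout status "daemonset/$1" --namespace "$2"
}
```

Changing the pod template this way triggers a rolling update of the DaemonSet's pods, so expect each pod to be restarted.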