Runbook

Kubernetes HPA Status Incident

Overview

A Kubernetes HPA (Horizontal Pod Autoscaler) status incident occurs when the Horizontal Pod Autoscaler is not functioning as expected. The HPA automatically scales the number of pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization or other configured metrics. When it fails, too few pods may be running to handle incoming load, which can lead to service disruptions.

Debug

Check the status of all HorizontalPodAutoscalers in the namespace

Check the status of a specific HorizontalPodAutoscaler in the namespace
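The HPA status checks above can be partly automated. The sketch below is a minimal example, not a definitive tool. It assumes the HPAs are served as autoscaling/v2 objects (so conditions appear under status.conditions) and that the JSON comes from `kubectl get hpa -n <namespace> -o json`. The function names and the sample data are hypothetical.

```python
import json
import subprocess


def unhealthy_hpa_conditions(hpa_list):
    """Return (hpa_name, condition_type, reason) for conditions that suggest trouble."""
    # Condition values that indicate a healthy HPA; any other value is flagged.
    healthy = {"AbleToScale": "True", "ScalingActive": "True", "ScalingLimited": "False"}
    findings = []
    for hpa in hpa_list.get("items", []):
        name = hpa["metadata"]["name"]
        for cond in hpa.get("status", {}).get("conditions", []):
            expected = healthy.get(cond.get("type"))
            if expected is not None and cond.get("status") != expected:
                findings.append((name, cond["type"], cond.get("reason", "")))
    return findings


def fetch_hpas(namespace):
    """Run kubectl and parse its JSON output (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "hpa", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


# Hand-written sample (an assumption, not real cluster output).
sample = {
    "items": [
        {
            "metadata": {"name": "web"},
            "status": {
                "conditions": [
                    {"type": "AbleToScale", "status": "True", "reason": "SucceededGetScale"},
                    {"type": "ScalingActive", "status": "False", "reason": "FailedGetResourceMetric"},
                ]
            },
        }
    ]
}

for finding in unhealthy_hpa_conditions(sample):
    print(finding)
```

Against a live cluster, `unhealthy_hpa_conditions(fetch_hpas("my-namespace"))` would apply the same check, where "my-namespace" is a placeholder.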

Check the status of all pods in the namespace

Check the CPU and memory usage of a specific pod in the namespace

Check the logs of a specific pod in the namespace

Check the status of the Kubernetes cluster metrics server
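One way to check the metrics server is to inspect the APIService that exposes the resource metrics API. The sketch below assumes the JSON was obtained with `kubectl get apiservice v1beta1.metrics.k8s.io -o json`. The helper name and the sample objects are hypothetical.

```python
def apiservice_available(apiservice):
    """Return (available, reason) based on the APIService's Available condition."""
    for cond in apiservice.get("status", {}).get("conditions", []):
        if cond.get("type") == "Available":
            return cond.get("status") == "True", cond.get("reason", "")
    # No Available condition reported: treat the API as not available.
    return False, "NoAvailableCondition"


# Hand-written samples (assumptions, not real cluster output).
healthy = {"status": {"conditions": [
    {"type": "Available", "status": "True", "reason": "Passed"}]}}
broken = {"status": {"conditions": [
    {"type": "Available", "status": "False", "reason": "MissingEndpoints"}]}}

print(apiservice_available(healthy))
print(apiservice_available(broken))
```

If the APIService is not available, the HPA cannot read CPU or memory metrics, so its ScalingActive condition will typically be False as well.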

Check whether the cluster has insufficient resources available to support the desired number of pods.
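A common sign of insufficient cluster resources is pods stuck in the Pending phase because the scheduler cannot place them. The sketch below is a minimal example that assumes the input is the parsed output of `kubectl get pods -n <namespace> -o json`. The function name and sample data are hypothetical.

```python
def unschedulable_pods(pod_list):
    """Return names of Pending pods that the scheduler reported as Unschedulable."""
    names = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") != "Pending":
            continue
        for cond in status.get("conditions", []):
            if (cond.get("type") == "PodScheduled"
                    and cond.get("status") == "False"
                    and cond.get("reason") == "Unschedulable"):
                names.append(pod["metadata"]["name"])
                break
    return names


# Hand-written sample (an assumption, not real cluster output).
pods = {
    "items": [
        {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}},
        {
            "metadata": {"name": "web-2"},
            "status": {
                "phase": "Pending",
                "conditions": [
                    {"type": "PodScheduled", "status": "False", "reason": "Unschedulable"}
                ],
            },
        },
    ]
}

print(unschedulable_pods(pods))
```

The message field on the same condition usually explains which resource is short, for example CPU or memory on the available nodes.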

Repair

Verify that the HPA is correctly configured for the deployment or stateful set in question, and that the minimum and maximum number of pods are set appropriately.
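The configuration review above can be sketched as two small checks. This is an illustrative example under stated assumptions: the HPA and workload are passed in as parsed JSON (for instance from `kubectl get hpa <name> -o json` and `kubectl get deployment <name> -o json`), and the function names are hypothetical. The CPU request check matters because utilization-based CPU targets are computed relative to container requests.

```python
def check_hpa_spec(hpa):
    """Return a list of human-readable problems found in an HPA's spec."""
    spec = hpa.get("spec", {})
    problems = []
    min_replicas = spec.get("minReplicas", 1)  # the API defaults minReplicas to 1
    max_replicas = spec.get("maxReplicas")
    if max_replicas is None:
        problems.append("maxReplicas is not set")
    elif max_replicas < min_replicas:
        problems.append("maxReplicas is lower than minReplicas")
    elif max_replicas == min_replicas:
        problems.append("minReplicas equals maxReplicas, so the HPA can never scale")
    if not spec.get("scaleTargetRef", {}).get("name"):
        problems.append("scaleTargetRef.name is missing")
    return problems


def containers_missing_cpu_request(workload):
    """Return names of containers in a Deployment or StatefulSet without a CPU request."""
    containers = workload["spec"]["template"]["spec"]["containers"]
    return [c["name"] for c in containers
            if "cpu" not in c.get("resources", {}).get("requests", {})]


# Hand-written samples (assumptions, not real cluster output).
hpa = {"spec": {"scaleTargetRef": {"kind": "Deployment", "name": "web"},
                "minReplicas": 3, "maxReplicas": 3}}
deployment = {"spec": {"template": {"spec": {"containers": [
    {"name": "app", "resources": {"requests": {"cpu": "250m"}}},
    {"name": "sidecar", "resources": {}},
]}}}}

print(check_hpa_spec(hpa))
print(containers_missing_cpu_request(deployment))
```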

Check the metrics the HPA uses to decide whether to scale the number of pods up or down. Ensure that the metrics are correctly defined and that they reflect the actual resource utilization of the pods.
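To judge whether the observed metrics should have triggered scaling, it can help to reproduce the HPA's basic calculation by hand. The Kubernetes documentation gives it as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), with no change when the ratio is within a tolerance (0.1 by default). The sketch below is a simplification: it ignores not-ready pods, missing metrics, and stabilization windows, and the function names are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Approximate the replica count the HPA would compute for one metric."""
    ratio = current_metric / target_metric
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def clamp_replicas(replicas, min_replicas, max_replicas):
    """Apply the HPA's minReplicas and maxReplicas bounds."""
    return max(min_replicas, min(max_replicas, replicas))


# 4 pods averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))
# 4 pods averaging 63% CPU against a 60% target: inside the tolerance band.
print(desired_replicas(4, 63, 60))
# The same scale-up request limited by maxReplicas=5.
print(clamp_replicas(desired_replicas(4, 90, 60), 2, 5))
```

If the value computed here differs from what the HPA reports as desired replicas, the metric the HPA sees likely differs from the one you are looking at.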

If the metrics are not available or not working as expected, consider using alternative metrics to determine scaling. For example, you can use custom metrics or metrics from external monitoring systems.
