Apache Airflow is a platform for creating, scheduling, and monitoring workflows. A worker node is the component that executes tasks, often running many jobs in parallel. A worker node is overloaded when it cannot keep up with the number of tasks assigned to it, which causes task and workflow failures. This incident type covers an overloaded Apache Airflow worker node, which must be addressed so the platform can continue to function properly.
Parameters
Debug
Check CPU usage of worker nodes
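A minimal sketch of this check on a Linux worker: compare the 1-minute load average against the core count. A sustained load well above the number of cores suggests the node is CPU-overloaded.

```shell
#!/bin/sh
# Report core count and 1-minute load average on this worker node.
# Run directly on the worker (or over SSH); assumes a Linux host.
cores=$(nproc)
load=$(cut -d ' ' -f1 /proc/loadavg)
echo "cores=${cores} load_1m=${load}"
```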
Check memory usage of worker nodes
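Memory pressure can be summarized with free(1); an "available" figure close to zero means new tasks risk being killed by the OOM killer.

```shell
#!/bin/sh
# Print a one-line memory summary for this worker node.
# Row 2 of `free -m` is the "Mem:" line; column 7 is "available".
free -m | awk 'NR==2 {printf "total=%sMB used=%sMB available=%sMB\n", $2, $3, $7}'
```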
Check disk usage of worker nodes
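A full log or scratch volume is a frequent cause of task failures, so check both the root filesystem and the Airflow log directory. The log path below assumes the default AIRFLOW_HOME of ~/airflow; adjust to your deployment.

```shell
#!/bin/sh
# Show free space on the root filesystem and the Airflow logs directory.
df -h /
# Log directory may live on a separate volume; path is an assumption.
df -h "${AIRFLOW_HOME:-$HOME/airflow}/logs" 2>/dev/null || true
```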
Check airflow worker logs for errors
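One way to surface recent problems is to grep the worker's log directory for error-like lines. The log location is an assumption (AIRFLOW_HOME defaulting to ~/airflow); point LOG_DIR at wherever your worker or Celery logs are written.

```shell
#!/bin/sh
# Show the 20 most recent error-like lines from the worker logs.
LOG_DIR="${AIRFLOW_HOME:-$HOME/airflow}/logs"
grep -riE "error|exception|timed out" "$LOG_DIR" 2>/dev/null | tail -n 20
```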
Check the number of running tasks on the worker nodes
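Airflow launches each task as an "airflow tasks run ..." subprocess, so counting those processes on the node gives a rough count of in-flight tasks to compare against the worker's configured concurrency.

```shell
#!/bin/sh
# Count task subprocesses currently executing on this worker.
# The [a] bracket trick keeps the grep process itself out of the count.
running=$(ps -eo args | grep -c "[a]irflow tasks run" || true)
echo "running_tasks=${running}"
```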
A common cause is misconfiguration: worker node resources such as CPU, memory, or disk space provisioned below what the workload requires.
Repair
Increase the capacity of the worker node by adding more resources such as CPU, memory, or storage.
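If adding resources is not immediately possible, the load on the node can also be capped by lowering the worker's task concurrency. A sketch for a CeleryExecutor deployment, assuming the default airflow.cfg layout (the value 8 is illustrative; the default is 16):

```
[celery]
# Lower this if individual tasks are resource-heavy, so fewer run at once.
worker_concurrency = 8
```

Restart the worker after changing this setting so it takes effect.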
Learn more
Related Runbooks
Check out these related runbooks to help you debug and resolve similar issues.