This incident type covers an alert raised by a monitoring system when the number of pending tasks in an Elasticsearch cluster is high. Pending tasks are cluster-level changes (for example index creation, mapping updates, or shard allocation) that are queued on the master node and have not yet been executed. A growing backlog may indicate that the cluster is overloaded and cannot keep up with incoming changes, which can lead to degraded performance or downtime. Investigate and resolve the incident promptly so the cluster returns to normal operation.
Parameters
Debug
Check whether the Elasticsearch service is running
Check the Elasticsearch logs for errors or warnings
Check the status of the Elasticsearch cluster
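One possible way to script this check is to read the `/_cluster/health` endpoint and keep only the fields that matter for a pending-task alert. This is a minimal sketch, not the runbook's original command: the `ES_URL` value and the helper names `fetch_json` and `summarize_health` are assumptions made for illustration, and the cluster is assumed to be reachable without authentication.

```python
import json
import urllib.request

# Assumption: adjust to the endpoint of the cluster under investigation.
ES_URL = "http://localhost:9200"


def fetch_json(path, base_url=ES_URL, timeout=10):
    """GET a JSON document from the Elasticsearch REST API."""
    with urllib.request.urlopen(f"{base_url}{path}", timeout=timeout) as resp:
        return json.load(resp)


def summarize_health(health):
    """Keep the /_cluster/health fields most relevant to a task backlog.

    Fields missing from the response are reported as None.
    """
    keys = (
        "status",
        "number_of_nodes",
        "number_of_pending_tasks",
        "task_max_waiting_in_queue_millis",
        "relocating_shards",
        "initializing_shards",
        "unassigned_shards",
    )
    return {key: health.get(key) for key in keys}
```

Calling `summarize_health(fetch_json("/_cluster/health"))` against a live cluster returns a small dictionary. A `red` or `yellow` status together with many relocating or initializing shards suggests the backlog is related to shard movement.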
Check the status of the Elasticsearch nodes in the cluster
Check the number of pending tasks in the Elasticsearch cluster
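The backlog itself can be inspected through the `/_cluster/pending_tasks` endpoint, which lists each queued task with its priority and time in queue. The sketch below is illustrative only: the default URL, the `PENDING_THRESHOLD` value, and the function names are assumptions, and the threshold should match whatever your alert uses.

```python
import json
import urllib.request
from collections import Counter

# Assumption: an arbitrary example threshold, not a recommended value.
PENDING_THRESHOLD = 100


def fetch_pending_tasks(base_url="http://localhost:9200", timeout=10):
    """Return the list of tasks from GET /_cluster/pending_tasks."""
    url = f"{base_url}/_cluster/pending_tasks"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp).get("tasks", [])


def summarize_pending(tasks, threshold=PENDING_THRESHOLD):
    """Count pending tasks, group them by priority, and find the longest wait."""
    by_priority = Counter(task.get("priority", "UNKNOWN") for task in tasks)
    max_wait_ms = max(
        (task.get("time_in_queue_millis", 0) for task in tasks), default=0
    )
    return {
        "count": len(tasks),
        "by_priority": dict(by_priority),
        "max_wait_ms": max_wait_ms,
        "over_threshold": len(tasks) > threshold,
    }
```

Running `summarize_pending(fetch_pending_tasks())` gives a quick view of how large the queue is and how long the oldest task has waited. The `source` field of each task, not summarized here, usually indicates what kind of change is queued.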
Check the Elasticsearch metrics
The Elasticsearch cluster may lack sufficient resources, such as memory or CPU, to handle the volume of tasks it is receiving.
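To test the resource hypothesis above, one option is to read per-node heap and CPU usage from `/_cat/nodes` with `format=json` and flag nodes that look saturated. This is a rough sketch under stated assumptions: the limits of 85% heap and 90% CPU are arbitrary illustration values, and `fetch_node_rows`, `overloaded_nodes`, and `_to_int` are hypothetical helper names.

```python
import json
import urllib.request


def fetch_node_rows(base_url="http://localhost:9200", timeout=10):
    """Return one dict per node from the _cat/nodes API in JSON form."""
    url = f"{base_url}/_cat/nodes?format=json&h=name,heap.percent,cpu,load_1m"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def _to_int(value):
    """The cat API returns values as strings; treat unparsable ones as 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def overloaded_nodes(rows, heap_limit=85, cpu_limit=90):
    """Return names of nodes whose heap or CPU usage reaches the given limit."""
    flagged = []
    for row in rows:
        heap = _to_int(row.get("heap.percent"))
        cpu = _to_int(row.get("cpu"))
        if heap >= heap_limit or cpu >= cpu_limit:
            flagged.append(row.get("name"))
    return flagged
```

If `overloaded_nodes(fetch_node_rows())` returns most of the cluster, scaling is a more likely fix than tuning allocation settings.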
Repair
Define variables
Scale the Elasticsearch cluster
Update the Elasticsearch cluster settings to change the concurrent rebalance limit.
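The rebalance limit is controlled by the `cluster.routing.allocation.cluster_concurrent_rebalance` setting, which can be changed through `PUT /_cluster/settings`. The sketch below shows one way to do that; the URL, the helper names, and the local check that the value is a positive integer are assumptions for illustration, and the right value depends on your cluster.

```python
import json
import urllib.request

REBALANCE_KEY = "cluster.routing.allocation.cluster_concurrent_rebalance"


def rebalance_settings(limit):
    """Build a persistent cluster-settings body for the rebalance limit."""
    # Conservative local guard chosen for this sketch.
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return {"persistent": {REBALANCE_KEY: limit}}


def put_cluster_settings(body, base_url="http://localhost:9200", timeout=10):
    """Send the settings body with PUT /_cluster/settings and return the reply."""
    request = urllib.request.Request(
        f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `put_cluster_settings(rebalance_settings(1))` would restrict the cluster to one concurrent shard rebalance. Lowering the limit reduces the number of shard-movement tasks generated at once, at the cost of slower rebalancing.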
Update the Elasticsearch cluster settings to set the desired number of concurrent recoveries.
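Concurrent recoveries per node are governed by `cluster.routing.allocation.node_concurrent_recoveries`. Before changing it, it can help to see which value is currently in effect. The following sketch assumes the settings were fetched with `GET /_cluster/settings?include_defaults=true&flat_settings=true`; the helper names and the URL are hypothetical, and the chosen number is only an example.

```python
import json
import urllib.request

RECOVERIES_KEY = "cluster.routing.allocation.node_concurrent_recoveries"


def effective_setting(settings, key=RECOVERIES_KEY):
    """Return the value in effect: transient, then persistent, then default."""
    for scope in ("transient", "persistent", "defaults"):
        value = settings.get(scope, {}).get(key)
        if value is not None:
            return value
    return None


def recoveries_settings(count):
    """Build a persistent cluster-settings body for concurrent recoveries."""
    if not isinstance(count, int) or count < 1:
        raise ValueError("count must be a positive integer")
    return {"persistent": {RECOVERIES_KEY: count}}


def apply_settings(body, base_url="http://localhost:9200", timeout=10):
    """Send the body with PUT /_cluster/settings and return the parsed reply."""
    request = urllib.request.Request(
        f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

A typical sequence would be to check `effective_setting(...)` on the fetched settings, then call `apply_settings(recoveries_settings(2))` with whatever value suits the cluster, and watch the pending task count afterwards.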
Learn more
Related Runbooks
Check out these related runbooks to help you debug and resolve similar issues.