This incident type covers situations where the number of 5xx errors on NGINX upstreams is higher than usual. It indicates a possible issue with the NGINX server or the upstream application server. It requires immediate attention because it can affect the availability and performance of the application.
Parameters
Debug
1. Check the status of the NGINX service
2. Check if the NGINX process is running
3. Check the error log for any relevant information
4. Check the access log to see if there are any suspiciously high traffic spikes
5. Check the server configuration for any errors
6. Check if there are any issues with upstream servers
7. Check the network connectivity to the server
8. Check if there are any firewall rules blocking traffic
9. Check if there are any other services running on the same host that could be interfering with NGINX
10. Check if there are any resource constraints on the server
11. Check if there are any other system-level issues that may be impacting NGINX
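The log checks in the list above (steps 3 and 4) can be sketched as a small script. This is a minimal sketch, assuming the access log uses NGINX's default "combined" format; the function name summarize_5xx and the SAMPLE lines are made up for illustration and are not real log data.

```python
import re
from collections import Counter

# Picks the timestamp, request line and status code out of a
# "combined"-format access log entry.
LINE_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) '
)

# Hypothetical sample entries, used only to demonstrate the function.
SAMPLE = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /api HTTP/1.1" 502 157 "-" "curl/8.0"',
    '203.0.113.8 - - [10/Oct/2023:13:55:40 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
    '203.0.113.9 - - [10/Oct/2023:13:55:59 +0000] "POST /api HTTP/1.1" 504 160 "-" "curl/8.0"',
]


def summarize_5xx(lines):
    """Return (total_requests, count_5xx, Counter of 5xx per minute)."""
    total = 0
    errors = 0
    per_minute = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not combined-format entries
        total += 1
        if match.group("status").startswith("5"):
            errors += 1
            # "10/Oct/2023:13:55:36 +0000" -> "10/Oct/2023:13:55"
            per_minute[match.group("time")[:17]] += 1
    return total, errors, per_minute


total, errors, per_minute = summarize_5xx(SAMPLE)
print(f"{errors} of {total} requests returned 5xx")
for minute, count in sorted(per_minute.items()):
    print(minute, count)
```

To run it against a real log, pass an open file instead of SAMPLE, for example open("/var/log/nginx/access.log"). That path is a common default and an assumption here; the actual location depends on the access_log directive in the server configuration.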
Check if the upstream server is responding with an HTTP 200 status code
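One way to run this check is to request the upstream directly, bypassing NGINX, and inspect the status code. The following is a rough sketch using only the Python standard library; the function name upstream_status and the URL in the usage note are placeholders, not values from this runbook.

```python
import urllib.error
import urllib.request


def upstream_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no HTTP response arrived.

    Note: urlopen follows redirects, so a 301 that ends in a 200 is
    reported as 200.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The upstream answered, but with a 4xx or 5xx status.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout and similar.
        return None


def describe(status):
    """Turn a status code (or None) into a short verdict."""
    if status is None:
        return "unreachable"
    return "healthy" if status == 200 else f"unhealthy (HTTP {status})"
```

For example, print(describe(upstream_status("http://127.0.0.1:8080/"))) would report on an upstream listening on local port 8080, assuming that is where the application runs. A result of "unreachable" points to network or process problems, while a 5xx status points to the application itself.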
Repair
Increase the number of NGINX worker processes (the worker_processes directive in the main configuration) so that NGINX can handle the increased traffic without becoming overloaded.
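Before changing the setting, it may help to compare the configured value with the number of CPU cores, since NGINX defaults to a single worker and also accepts the value auto. The sketch below is illustrative only: the helper names are hypothetical, it looks only at the text it is given (it does not follow include files), and it assumes that matching the core count is the target.

```python
import os
import re

# Matches an uncommented `worker_processes <number|auto>;` directive.
DIRECTIVE_RE = re.compile(r"^\s*worker_processes\s+(auto|\d+)\s*;", re.MULTILINE)


def configured_workers(conf_text):
    """Return the worker_processes value as a string, or None if it is not set."""
    match = DIRECTIVE_RE.search(conf_text)
    return match.group(1) if match else None


def suggest_workers(conf_text, cpu_count):
    """Return a suggested worker count, or None if no change seems needed."""
    value = configured_workers(conf_text)
    if value == "auto":
        return None  # NGINX already sizes workers to the detected cores
    current = int(value) if value is not None else 1  # NGINX default is 1
    return cpu_count if current < cpu_count else None


example_conf = "user nginx;\nworker_processes 2;\nevents { worker_connections 1024; }\n"
suggestion = suggest_workers(example_conf, os.cpu_count() or 1)
if suggestion is not None:
    print(f"consider: worker_processes {suggestion};")
```

After editing the configuration, validate it with nginx -t and reload NGINX for the change to take effect. If the 5xx responses originate in the upstream application rather than in NGINX, adding workers is unlikely to help on its own.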