Runbook
Troubleshooting connectivity issues between tasks in an Amazon ECS cluster using service discovery
Overview
This runbook covers connectivity issues between tasks in an Amazon ECS (Elastic Container Service) cluster that uses service discovery. Service discovery lets services find and reach one another by name, without knowing each other's IP addresses. Areas to investigate include the service discovery configuration, DNS resolution, the task definition and network mode, security groups, the task IAM role, VPC configuration, the ECS agent, ECS service event messages, logs, application-level configuration, and health checks. The Debug steps below walk through several of these checks.
Parameters
Debug
Confirm that the ECS service is associated with a Service Discovery namespace
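One way to run this check is with boto3. The sketch below is a minimal example, not the runbook's original command: the function names and the `cluster` and `service` parameters are hypothetical, and it assumes AWS credentials and a default region are already configured. It reads the `serviceRegistries` entries from `describe_services` and looks up the namespace of each Cloud Map service.

```python
def cloud_map_service_ids(describe_services_response: dict) -> dict:
    """Map each ECS service name to the Cloud Map service IDs it registers with."""
    result = {}
    for service in describe_services_response.get("services", []):
        ids = []
        for registry in service.get("serviceRegistries", []):
            # registryArn looks like
            # arn:aws:servicediscovery:<region>:<account>:service/srv-...
            ids.append(registry["registryArn"].rsplit("/", 1)[-1])
        result[service["serviceName"]] = ids
    return result


def fetch_namespaces(cluster: str, service: str) -> dict:
    """Call AWS and return {cloud_map_service_id: namespace_id} for one ECS service."""
    import boto3  # imported lazily so the parsing helper works without boto3

    ecs = boto3.client("ecs")
    sd = boto3.client("servicediscovery")
    response = ecs.describe_services(cluster=cluster, services=[service])
    namespaces = {}
    for ids in cloud_map_service_ids(response).values():
        for service_id in ids:
            detail = sd.get_service(Id=service_id)
            namespaces[service_id] = detail["Service"]["NamespaceId"]
    return namespaces
```

An empty result from `fetch_namespaces` means the ECS service has no service registry attached, so its tasks are not being registered in any namespace.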
Check if the DNS records of the tasks are correctly registered in the AWS Cloud Map service
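A possible way to verify registration is to compare the instances Cloud Map holds with what DNS actually returns. The sketch below assumes boto3 with configured credentials; the function names and the `cloud_map_service_id` and `hostname` parameters are hypothetical. For a private DNS namespace, the resolution step only works from a host inside the VPC.

```python
import socket


def registered_ipv4s(list_instances_response: dict) -> set:
    """Collect the IPv4 addresses Cloud Map holds for a service's instances."""
    ips = set()
    for instance in list_instances_response.get("Instances", []):
        ip = instance.get("Attributes", {}).get("AWS_INSTANCE_IPV4")
        if ip:
            ips.add(ip)
    return ips


def resolve_ipv4s(hostname: str) -> set:
    """Resolve a name with the local resolver; return an empty set on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    except socket.gaierror:
        return set()
    return {info[4][0] for info in infos}


def compare(registered: set, resolved: set) -> dict:
    """Report addresses that appear on only one side."""
    return {
        "registered_not_resolved": sorted(registered - resolved),
        "resolved_not_registered": sorted(resolved - registered),
    }


def fetch_registered_ipv4s(cloud_map_service_id: str) -> set:
    """Call AWS: list the instances of one Cloud Map service (first page only)."""
    import boto3  # imported lazily so the helpers above work without boto3

    sd = boto3.client("servicediscovery")
    return registered_ipv4s(sd.list_instances(ServiceId=cloud_map_service_id))
```

Addresses under `resolved_not_registered` suggest stale DNS answers, while addresses under `registered_not_resolved` may simply reflect that a DNS answer does not have to list every registered instance. Services that use only SRV records need a DNS library that can query SRV, which this sketch does not cover.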
Review the task definition and check the network mode
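The network mode matters because, per the ECS service discovery documentation, tasks in `awsvpc` mode can register A or SRV records, while tasks in `bridge` or `host` mode can register only SRV records. The sketch below shows one way to read the mode with boto3; the function names and the `task_definition` parameter are hypothetical, and the fallback to `bridge` assumes Linux containers on EC2.

```python
def network_mode_of(describe_task_definition_response: dict) -> str:
    """Return the task definition's network mode."""
    task_definition = describe_task_definition_response["taskDefinition"]
    # Assumption: when networkMode is absent, treat it as the Linux/EC2
    # default of "bridge".
    return task_definition.get("networkMode", "bridge")


def supported_dns_record_types(network_mode: str) -> list:
    """DNS record types ECS service discovery can register for a network mode."""
    if network_mode == "awsvpc":
        return ["A", "SRV"]
    if network_mode in ("bridge", "host"):
        return ["SRV"]
    return []  # e.g. "none": the task has no external connectivity


def fetch_network_mode(task_definition: str) -> str:
    """Call AWS: task_definition is a family, family:revision, or full ARN."""
    import boto3  # imported lazily so the helpers above work without boto3

    ecs = boto3.client("ecs")
    response = ecs.describe_task_definition(taskDefinition=task_definition)
    return network_mode_of(response)
```

If clients look up a plain hostname (an A record) but the tasks run in `bridge` or `host` mode, the lookup cannot succeed even when registration is working.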
Confirm that the tasks are launched in the expected subnets
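For tasks in `awsvpc` mode, the subnet of each task appears in the elastic network interface attachment that `describe_tasks` returns. The sketch below is one way to collect those subnets and flag tasks outside an expected set. Function and parameter names are hypothetical, and it reads only the first page of `list_tasks` results.

```python
def task_subnets(describe_tasks_response: dict) -> dict:
    """Map task ARN to subnet ID using each task's ENI attachment."""
    subnets = {}
    for task in describe_tasks_response.get("tasks", []):
        for attachment in task.get("attachments", []):
            if attachment.get("type") != "ElasticNetworkInterface":
                continue
            for detail in attachment.get("details", []):
                if detail.get("name") == "subnetId":
                    subnets[task["taskArn"]] = detail["value"]
    return subnets


def tasks_outside(subnets_by_task: dict, expected_subnets: set) -> dict:
    """Return the tasks whose subnet is not in the expected set."""
    return {
        arn: subnet
        for arn, subnet in subnets_by_task.items()
        if subnet not in expected_subnets
    }


def fetch_task_subnets(cluster: str, service: str) -> dict:
    """Call AWS: look up the subnets of the tasks that belong to one service."""
    import boto3  # imported lazily so the helpers above work without boto3

    ecs = boto3.client("ecs")
    arns = ecs.list_tasks(cluster=cluster, serviceName=service)["taskArns"]
    if not arns:
        return {}
    return task_subnets(ecs.describe_tasks(cluster=cluster, tasks=arns))
```

Tasks in `bridge` or `host` mode have no network interface attachment of their own, so they will be missing from the result; their subnet is that of the container instance they run on.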
Ensure health checks are correctly set up
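A rough way to review health checks is to look at two things: which containers in the task definition define a `healthCheck`, and which running tasks report a `healthStatus` other than `HEALTHY`. The sketch below assumes responses from the ECS `describe_task_definition` and `describe_tasks` calls; the function names are hypothetical, and the sketch does not inspect load balancer or Cloud Map health check settings.

```python
def containers_without_health_check(describe_task_definition_response: dict) -> list:
    """Names of containers whose definition has no healthCheck block."""
    task_definition = describe_task_definition_response["taskDefinition"]
    containers = task_definition.get("containerDefinitions", [])
    return [c["name"] for c in containers if not c.get("healthCheck")]


def tasks_not_healthy(describe_tasks_response: dict) -> dict:
    """Map task ARN to health status for every task that is not HEALTHY."""
    result = {}
    for task in describe_tasks_response.get("tasks", []):
        status = task.get("healthStatus", "UNKNOWN")
        if status != "HEALTHY":
            result[task["taskArn"]] = status
    return result


def fetch_health_report(cluster: str, service: str, task_definition: str) -> dict:
    """Call AWS and combine both checks for one service (first page of tasks)."""
    import boto3  # imported lazily so the helpers above work without boto3

    ecs = boto3.client("ecs")
    definition = ecs.describe_task_definition(taskDefinition=task_definition)
    arns = ecs.list_tasks(cluster=cluster, serviceName=service)["taskArns"]
    tasks = ecs.describe_tasks(cluster=cluster, tasks=arns) if arns else {}
    return {
        "containers_without_health_check": containers_without_health_check(definition),
        "tasks_not_healthy": tasks_not_healthy(tasks),
    }
```

A status of `UNKNOWN` is not necessarily a failure: it is also what a task reports when none of its containers define a health check, or while checks are still being evaluated.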
Review security groups associated with the task or service to make sure inbound and outbound traffic is allowed between tasks
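The sketch below illustrates one way to test whether a security group's inbound rules admit traffic on a given port from another task. It is a simplified check under stated assumptions: TCP only, IPv4 CIDR ranges and security group references only (no IPv6 ranges or prefix lists), and hypothetical function and parameter names. The input is one entry of the `SecurityGroups` list returned by the EC2 `describe_security_groups` call.

```python
import ipaddress


def rule_covers_port(permission: dict, port: int) -> bool:
    """True if an IpPermissions entry applies to TCP traffic on this port."""
    protocol = permission.get("IpProtocol")
    if protocol == "-1":  # "-1" means all protocols and all ports
        return True
    if protocol != "tcp":
        return False
    return permission.get("FromPort", 0) <= port <= permission.get("ToPort", 65535)


def allows_inbound(security_group: dict, port: int,
                   source_group_id: str = None, source_ip: str = None) -> bool:
    """True if any inbound rule admits the port from the given group or IPv4 address."""
    for permission in security_group.get("IpPermissions", []):
        if not rule_covers_port(permission, port):
            continue
        if source_group_id and any(
            pair.get("GroupId") == source_group_id
            for pair in permission.get("UserIdGroupPairs", [])
        ):
            return True
        if source_ip and any(
            ipaddress.ip_address(source_ip)
            in ipaddress.ip_network(ip_range["CidrIp"], strict=False)
            for ip_range in permission.get("IpRanges", [])
        ):
            return True
    return False


def fetch_security_group(group_id: str) -> dict:
    """Call AWS: return the description of one security group."""
    import boto3  # imported lazily so the helpers above work without boto3

    ec2 = boto3.client("ec2")
    return ec2.describe_security_groups(GroupIds=[group_id])["SecurityGroups"][0]
```

Outbound rules use the same structure under the `IpPermissionsEgress` key, so the sending task's group can be checked the same way by reading that list instead. Network ACLs are a separate layer that this sketch does not examine.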
Repair