Amazon ECS: a retrospective
After a few years working with Kubernetes in EKS, plain VMs, Lambdas and similar compute solutions, I finally got my hands on ECS. I won’t go into much detail on the why but, essentially, I felt EKS/k8s was too much for the team that was going to work with it (not due to lack of skill, but rather lack of time/manpower), and ECS seemed to abstract away a lot of things, so I decided to go for it.
As expected, documentation is vast and finding examples is quite easy. The problem is that some details are unclear, harder to find, or even just omitted from the documentation.
Connecting services
If you’re deploying multiple services in your cluster, there’s a chance some of them need to talk to each other.
ECS gives you two options for this: Service Discovery and Service Connect.
In short, Service Discovery employs a DNS-based service lookup approach, while Service Connect uses a service mesh by deploying (and managing) sidecar proxy containers along with your services. Both use AWS Cloud Map; the difference is that Service Discovery relies on DNS lookups, while Service Connect queries Cloud Map directly via its API.
If you’re interested in more details, check this article and this Stack Overflow answer.
Service Connect: things to know
Before going into specifics, one basic concept to get out of the way: services that connect to other services are “clients”, and services that accept connections from other services are “servers”.
For example, say you have a REST API server and a service that sends emails to users on request. The API server asks the emailer service to send messages. The API server is a “client”; the emailer service is a “server”.
Deployment order matters!
Maybe this is obvious and it’s just that I had no previous service mesh/discovery experience. Maybe not.
Either way, after configuring my setup and double-checking it many times, I was still not able to make service A talk to service B.
Eventually, I stumbled upon this demo video, in which an AWS developer says that service deployment order matters.
Servers need to be deployed before clients in order to be discoverable.
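If the services are managed with Terraform/OpenTofu, one way to make that order explicit is the depends_on meta-argument. This is only a sketch of one possible approach, not something prescribed by the AWS documentation. It reuses the demo_app_service (client) and demo_api_service (server) resource names from the examples further down and assumes both live in the same configuration.

```hcl
# Client service: it is created or updated only after the server service
# listed in depends_on has been applied.
resource "aws_ecs_service" "demo_app_service" {
  name            = "demo-${var.env}-service"
  cluster         = aws_ecs_cluster.ecs_cluster.id
  task_definition = aws_ecs_task_definition.demo_app_task.arn
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = local.ecs_security_groups
    assign_public_ip = false
  }

  # load_balancer block omitted for brevity

  service_connect_configuration {
    enabled   = true
    namespace = aws_service_discovery_http_namespace.private_dns_namespace.arn
  }

  # Assumption: ordering the two resources is enough for the client to see the server.
  depends_on = [aws_ecs_service.demo_api_service]
}
```

Note that depends_on only orders the API calls. If the server’s tasks also need to be up before the client deploys, setting wait_for_steady_state = true on the server resource is an option worth testing in your own setup.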
Fields to configure
Check this documentation page before starting to create your services.
This is useful because you will need to know, for example, that each entry in portMappings of a server service needs a name, because that is how the serviceConnectConfiguration.service config identifies where incoming traffic to the service is directed.
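As a rough illustration, a server task definition with a named port mapping could look like the sketch below. The local.demo_api_port_name and local.demo_api_container_port values match the service example in this post; the family, container name, image variable and CPU/memory sizing are assumptions made up for the example.

```hcl
resource "aws_ecs_task_definition" "demo_api_task" {
  family                   = "demo-api-${var.env}" # hypothetical family name
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 256 # assumed sizing
  memory                   = 512 # assumed sizing

  container_definitions = jsonencode([
    {
      name      = "demo-api-${var.env}-container" # hypothetical container name
      image     = var.demo_api_image              # hypothetical variable
      essential = true
      portMappings = [
        {
          # This name is what port_name in service_connect_configuration refers to.
          name          = local.demo_api_port_name
          containerPort = local.demo_api_container_port
          protocol      = "tcp"
        }
      ]
    }
  ])
}
```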
Here’s a redacted version of what the aws_ecs_service Terraform/OpenTofu objects look like for client and server:
- client
resource "aws_ecs_service" "demo_app_service" {
  name            = "demo-${var.env}-service"
  cluster         = aws_ecs_cluster.ecs_cluster.id
  task_definition = aws_ecs_task_definition.demo_app_task.arn
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = local.ecs_security_groups
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.admin_website_app_tg.arn
    container_name   = "hbh-admin-website-${var.env}-container"
    container_port   = local.admin_website_container_port
  }

  service_connect_configuration {
    enabled   = true
    namespace = aws_service_discovery_http_namespace.private_dns_namespace.arn
  }
}
- server
resource "aws_ecs_service" "demo_api_service" {
  name            = "demo-api-${var.env}-service"
  cluster         = aws_ecs_cluster.ecs_cluster.id
  task_definition = aws_ecs_task_definition.demo_api_task.arn
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = local.ecs_security_groups
    assign_public_ip = false
  }

  service_connect_configuration {
    enabled   = true
    namespace = aws_service_discovery_http_namespace.private_dns_namespace.arn

    service {
      port_name = local.demo_api_port_name

      client_alias {
        port = local.demo_api_container_port
      }
    }
  }
}
Discovery names
By default, you can refer to a service by portName.namespace, where portName is the name given to the port in the portMappings of the server service, and namespace is the name of the Cloud Map private DNS namespace in use (it can be created explicitly, or one will be created otherwise).
This can be overridden via discoveryName in the serviceConnectService object.
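In Terraform/OpenTofu terms, the override maps to the discovery_name argument of the service block. The fragment below is a sketch that belongs inside the server’s aws_ecs_service resource; the "demo-api" value is a hypothetical name picked for illustration.

```hcl
# Fragment: goes inside the server's aws_ecs_service resource.
service_connect_configuration {
  enabled   = true
  namespace = aws_service_discovery_http_namespace.private_dns_namespace.arn

  service {
    port_name = local.demo_api_port_name

    # Hypothetical value. With no dns_name set on the client alias, clients
    # would reach the service as demo-api.<namespace> instead of <portName>.<namespace>.
    discovery_name = "demo-api"

    client_alias {
      port = local.demo_api_container_port
    }
  }
}
```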
Health checks
Your typical health check will be a curl request. ECS executes this in the container itself, just like Docker’s HEALTHCHECK (if the image defines one, it is replaced with the task definition health check configuration). Make sure curl is installed in the image, otherwise the check will fail and the logs are not going to be clear about why.
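For reference, here is a sketch of what such a health check could look like when container definitions are built with jsonencode. The /health path is a hypothetical endpoint and the timing values are assumptions to adjust to your application.

```hcl
locals {
  # Assign this object to the healthCheck key of a container definition.
  demo_api_health_check = {
    # CMD-SHELL runs the string through the container's default shell,
    # so both a shell and curl must exist in the image.
    command     = ["CMD-SHELL", "curl -f http://localhost:${local.demo_api_container_port}/health || exit 1"]
    interval    = 30 # seconds between checks
    timeout     = 5  # seconds before a single check counts as failed
    retries     = 3  # consecutive failures before the container is marked unhealthy
    startPeriod = 10 # grace period in seconds before failures start counting
  }
}
```

Inside the container definition object, this would be used as healthCheck = local.demo_api_health_check.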
Task Role vs Task Execution Role
I can’t say this is well documented, but when I jumped into creating a cluster and my first services/tasks with OpenTofu, it was immediately clear that these two roles existed.
The Task Role is an IAM role attached to a specific task definition that grants the containers permission to access AWS resources. The role is assumed by the containers running in the task.
The Task Execution Role is an IAM role, also attached to the task definition, that ECS itself uses to get the task running. This is typically what allows ECS to pull a private image from ECR, send logs to CloudWatch, or read Secrets Manager secrets for the environment variables.
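A minimal sketch of how the two roles could be declared is shown below. All resource and role names are hypothetical; the trust policy principal and the AmazonECSTaskExecutionRolePolicy managed policy are the standard AWS ones.

```hcl
# Trust policy shared by both roles: ECS tasks are allowed to assume them.
data "aws_iam_policy_document" "ecs_tasks_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

# Execution role: used by ECS to pull images and ship logs.
resource "aws_iam_role" "demo_execution_role" {
  name               = "demo-${var.env}-execution-role" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.ecs_tasks_assume.json
}

resource "aws_iam_role_policy_attachment" "demo_execution_managed" {
  role       = aws_iam_role.demo_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

# Task role: assumed by your application code inside the containers.
# Attach only the permissions the application itself needs.
resource "aws_iam_role" "demo_task_role" {
  name               = "demo-${var.env}-task-role" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.ecs_tasks_assume.json
}
```

The task definition then references them through execution_role_arn = aws_iam_role.demo_execution_role.arn and task_role_arn = aws_iam_role.demo_task_role.arn. Reading Secrets Manager secrets needs extra permissions on the execution role beyond the managed policy.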
Debugging with ECS Exec
If you’re used to using kubectl to run commands from within Kubernetes pods/containers, you can achieve the same with ECS Exec.
It looks like this:
aws ecs execute-command --cluster cluster-name \
  --task task-id \
  --container container-name \
  --interactive \
  --command "/bin/sh"
Not all clusters and tasks support this, so make sure your configuration meets the requirements. There’s this handy tool that helps with that.
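On the Terraform/OpenTofu side, two pieces are usually involved: the service needs enable_execute_command = true, and the task role needs permission to open SSM message channels. The sketch below covers the second piece and assumes a task role resource named aws_iam_role.demo_task_role already exists in your configuration (a hypothetical name).

```hcl
# Permissions the task role needs so the agent inside the container can open a session.
data "aws_iam_policy_document" "ecs_exec" {
  statement {
    actions = [
      "ssmmessages:CreateControlChannel",
      "ssmmessages:CreateDataChannel",
      "ssmmessages:OpenControlChannel",
      "ssmmessages:OpenDataChannel",
    ]
    resources = ["*"]
  }
}

resource "aws_iam_role_policy" "demo_task_ecs_exec" {
  name   = "ecs-exec"                     # hypothetical policy name
  role   = aws_iam_role.demo_task_role.id # hypothetical task role
  policy = data.aws_iam_policy_document.ecs_exec.json
}
```

Tasks that were started before enable_execute_command was turned on have to be replaced (for example with a new deployment) before the command works against them.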
2024-08-14