Matt Silbernagel

Deploying Elixir on ECS - Part 1

August 23, 2020

I love PaaS systems like Heroku for deploying simple Elixir web services. They make deployment relatively painless, but they limit the power of the BEAM by making distributed clustering impossible. For a project that requires distribution, ECS is a good option. This series of posts will lay out how to build the infrastructure, set up CI/CD, and connect the Elixir nodes into a distributed cluster.

The Infrastructure

Below I’ve split the Terraform configuration into sections and walk through each one. Installing and configuring Terraform for your AWS account is outside the scope of this article, but HashiCorp provides a great introduction.

Initialize Terraform

To start with, you’ll need to tell terraform that you want to use the AWS provider. Add this to a file called main.tf and run terraform init.

provider aws {
  profile = "default"
  region  = "us-east-1"
}
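
If you’re on Terraform 0.13 or later, you may also want to pin the provider version so upgrades don’t surprise you. A minimal sketch (the version constraint here is an example, not a recommendation):

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0" # pin to a known-good major version
    }
  }
}
```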

I typically keep my Terraform files in an infrastructure folder in the root of my project.

Add a VPC

One requirement for ECS is a VPC. Most likely you’ll want to build a new VPC and use that, but for brevity you can just import the default VPC that comes with your AWS account. In the AWS console, go to VPCs, find your default VPC’s ID (it’ll start with vpc-), and note its CIDR block.

Add to your terraform file:

resource aws_vpc main {
  cidr_block = "your_vpc_CIDR_block"
  tags = {
    Name = "Default VPC"
  }
}

data aws_subnet_ids vpc_subnets {
  vpc_id = aws_vpc.main.id
}

data aws_subnet default_subnet {
  count = length(data.aws_subnet_ids.vpc_subnets.ids)
  id    = tolist(data.aws_subnet_ids.vpc_subnets.ids)[count.index]
}

Save, run terraform import aws_vpc.main your_vpc_id, and then terraform apply to pull in all of the subnets, which are needed for subsequent steps.

This should import the current state of your default VPC and allow you to pass it around to other terraform modules.
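
Alternatively, if you’d rather not import the VPC into your state, Terraform can look up the default VPC as a data source. A sketch (note the rest of this post references aws_vpc.main, so you’d swap those references for data.aws_vpc.default):

```hcl
# read the account's default VPC instead of managing it in state
data aws_vpc default {
  default = true
}

data aws_subnet_ids default_subnets {
  vpc_id = data.aws_vpc.default.id
}
```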

Build the container repo

You’ll need a place to upload your container images so that ECS can pull them in. AWS offers ECR (Elastic Container Registry), which is essentially a private Docker registry.

To create the registry add to your terraform:

resource "aws_ecr_repository" "repo" {
  name                 = "your_repo"  # give this a better name
  image_tag_mutability = "MUTABLE"

  image_scanning_configuration {
    scan_on_push = true
  }
}

output repo_url {
  value = aws_ecr_repository.repo.repository_url
}

This creates a place to push our images to from our CI/CD process. Notice the output is the URL of the created repository; this will be important later when we talk about deployment.
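
Untagged images can also pile up in ECR as you push new builds. If you like, a lifecycle policy can expire them automatically; a sketch (the rule priority and image count are arbitrary choices):

```hcl
resource aws_ecr_lifecycle_policy repo_policy {
  repository = aws_ecr_repository.repo.name

  policy = jsonencode({
    rules = [
      {
        rulePriority = 1
        description  = "expire untagged images beyond the 10 most recent"
        selection = {
          tagStatus   = "untagged"
          countType   = "imageCountMoreThan"
          countNumber = 10
        }
        action = {
          type = "expire"
        }
      }
    ]
  })
}
```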

Build the ALB (Application Load Balancer)

This will be the public entry point to your web service, directing traffic to one of your containers. To keep things simple, this shows how to allow port 80 traffic, but I’ve commented the locations that would need to change for port 443.

If you want to use SSL, you’ll need to generate a certificate for your domain name. If you manage your domain with Route53, this is easy enough to do in AWS Certificate Manager.
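
If you go the SSL route, the certificate itself can also live in Terraform. A minimal sketch, assuming a Route53-managed domain (example.com is a placeholder; DNS validation also requires creating the validation records in Route53, which is omitted here):

```hcl
resource aws_acm_certificate cert {
  domain_name       = "example.com" # placeholder; use your own domain
  validation_method = "DNS"
}

# the ARN to plug into certificate_arn on the listener below
output cert_arn {
  value = aws_acm_certificate.cert.arn
}
```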

# configure the ALB target group
resource aws_lb_target_group lb_target_group {
  name        = "your-app-tg" # choose a name that makes sense
  port        = 4000          # Expose port 4000 from our container
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id # our default vpc id
  target_type = "ip"
  health_check {
    path = "/health" # your app must return a 200 on this route
    port = "4000"
  }
  stickiness {
    type            = "lb_cookie"
    enabled         = "true"
    cookie_duration = "3600"
  }
}

resource aws_lb_listener ecs_listener {
  load_balancer_arn  = aws_lb.load_balancer.arn
  port               = "80"     # 443 if using SSL
  protocol           = "HTTP"   # HTTPS if using SSL

  # uncomment following lines if using SSL
  # ssl_policy        = "ELBSecurityPolicy-2016-08" 
  # certificate_arn   = ""      # the ARN of a valid cert from Certificate Manager

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.lb_target_group.arn
  }
}

resource aws_lb load_balancer {
  name               = "ecs-app-lb" # ALB names allow only letters, numbers, and hyphens
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.lb_security_group.id]
  subnets            = data.aws_subnet.default_subnet.*.id

  enable_deletion_protection = true
}

# needed to allow web traffic to hit the ALB
resource aws_security_group lb_security_group {
  name        = "lb_security_group"
  description = "Allow all outbound traffic and https inbound"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP"  # use HTTPS if ssl is enabled
    from_port   = 80      # use 443 if ssl is enabled
    to_port     = 80      # use 443 if ssl is enabled
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# the URL where your app will be accessible
output dns {
  value = aws_lb.load_balancer.dns_name
}

Configure ECS

And now, finally, our ECS configuration. ECS has the concept of Clusters, which are groups of Services, each running one or more instances of a Task as described by a Task Definition. The following configuration will build one cluster with one service that runs two instances of a task.

Task Definition

The Task Definition is basically a description of how to run your container. Later on when we deploy, we’ll create new versions of this initial Task Definition that point to different versions of your docker image. We can then instruct the ECS service to use our new Task Definition and start new tasks with newer versions of our code.

The Task Definition also needs a couple of IAM roles, plus a CloudWatch log group so the task can write its output.

# the task definition revision the service runs;
# bump this as new revisions are registered
variable task_version {
  default = 1
}

# this is the role that your container runs as
# you can give it permissions to other parts of AWS that it may need to access
# like S3 or DynamoDB for instance.
resource aws_iam_role ecs_role {
  name = "ecs_role"
  assume_role_policy = <<-EOF
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "",
        "Effect": "Allow",
        "Principal": {
          "Service": "ecs-tasks.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }
  EOF
}

# this role and the following permissions are required
# for the ECS service to pull the container from ECR
# and write log events
resource aws_iam_role ecs_execution_role {
  name = "ecs_execution_role"
  assume_role_policy = <<-EOF
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "",
        "Effect": "Allow",
        "Principal": {
          "Service": "ecs-tasks.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }
  EOF
}

resource aws_iam_policy ecs_policy {
  name = "ecs_policy"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
      {
          "Effect": "Allow",
          "Action": [
              "ecr:GetAuthorizationToken",
              "ecr:BatchCheckLayerAvailability",
              "ecr:GetDownloadUrlForLayer",
              "ecr:BatchGetImage",
              "logs:CreateLogStream",
              "logs:PutLogEvents"
          ],
          "Resource": "*"
      }
  ]
}
EOF
}

resource aws_iam_policy_attachment attach_ecs_policy {
  name        = "attach-ecs-policy"
  roles       = [aws_iam_role.ecs_execution_role.name]
  policy_arn  = aws_iam_policy.ecs_policy.arn
}
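
As an aside, AWS ships a managed policy, AmazonECSTaskExecutionRolePolicy, that grants the same ECR pull and log write permissions; attaching it is an alternative to maintaining the custom policy above:

```hcl
# attach the AWS-managed execution policy instead of the custom ecs_policy
resource aws_iam_role_policy_attachment execution_role_policy {
  role       = aws_iam_role.ecs_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
```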

resource aws_cloudwatch_log_group log_group {
  name = "/ecs/your_app"
}



resource aws_ecs_task_definition task_definition {
  family                    = "your_app_task"
  task_role_arn             = aws_iam_role.ecs_role.arn
  execution_role_arn        = aws_iam_role.ecs_execution_role.arn
  requires_compatibilities  = ["FARGATE"]
  memory                    = 8192
  cpu                       = 4096

  network_mode              = "awsvpc"

  container_definitions     = <<-EOF
  [
    {
      "cpu": 0,
      "image": "${aws_ecr_repository.repo.repository_url}:latest",
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "${aws_cloudwatch_log_group.log_group.name}",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "portMappings": [
        {
          "hostPort": 4000,
          "protocol": "tcp",
          "containerPort": 4000
        }
      ],
      "environment": [],
      "mountPoints": [],
      "volumesFrom": [],
      "essential": true,
      "links": [],
      "name": "your_app"
    }
  ]
  EOF
}

Cluster and Service

These are pretty easy. We just need to define the cluster, the service that runs our tasks, and a security group for those tasks.


# this gets your AWS account id 
# needed to build the task ARN later
data "aws_caller_identity" "current" {}

resource aws_ecs_cluster ecs_cluster {
  name = "your_app_cluster"
}

resource aws_ecs_service service {
  name            = "your_app_service"
  cluster         = aws_ecs_cluster.ecs_cluster.id

  task_definition = "arn:aws:ecs:us-east-1:${data.aws_caller_identity.current.account_id}:task-definition/${aws_ecs_task_definition.task_definition.family}:${var.task_version}"
  desired_count   = 2
  launch_type     = "FARGATE"
  network_configuration {
    security_groups   = [aws_security_group.security_group.id]
    subnets           = data.aws_subnet.default_subnet.*.id
    assign_public_ip  = true # this seems to be required to access the container repo
  }
  load_balancer {
    target_group_arn = aws_lb_target_group.lb_target_group.arn
    container_name   = "your_app"
    container_port   = "4000"
  } 
}

# needed so that our container can access the outside world
# and traffic in your VPC can access the containers
resource aws_security_group security_group {
  name        = "your_app_ecs"
  description = "Allow all outbound traffic"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP/S Traffic"
    from_port   = 0
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.main.cidr_block]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

The final file

Assuming you have the necessary permissions, you should be able to terraform plan and terraform apply the following file.

provider aws {
  profile = "default"
  region  = "us-east-1"
}

variable app_name {
  default = "ecs_app"  
}

variable task_version {
  default = 1
}

resource aws_vpc main {
  cidr_block = "172.31.0.0/16"
  tags = {
    Name = "Default VPC"
  }
}

resource "aws_ecr_repository" "repo" {
  name                 = "${var.app_name}_repo"  
  image_tag_mutability = "MUTABLE"

  image_scanning_configuration {
    scan_on_push = true
  }
}

data aws_subnet_ids vpc_subnets {
  vpc_id = aws_vpc.main.id
}

data aws_subnet default_subnet {
  count = length(data.aws_subnet_ids.vpc_subnets.ids)
  id    = tolist(data.aws_subnet_ids.vpc_subnets.ids)[count.index]
}

data "aws_caller_identity" "current" {}

resource aws_lb_target_group lb_target_group {
  name        = "ecs-app-tg" 
  port        = 4000
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id 
  target_type = "ip"
  health_check {
    path = "/health" 
    port = "4000"
  }
  stickiness {
    type            = "lb_cookie"
    enabled         = "true"
    cookie_duration = "3600"
  }
}

resource aws_lb_listener ecs_listener {
  load_balancer_arn = aws_lb.load_balancer.arn
  port              = "80"
  protocol          = "HTTP"
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.lb_target_group.arn
  }
}

resource aws_lb load_balancer {
  name               = "ecs-app-lb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.lb_security_group.id]
  subnets            = data.aws_subnet.default_subnet.*.id

  enable_deletion_protection = true
}

resource aws_security_group lb_security_group {
  name        = "lb_security_group"
  description = "Allow all outbound traffic and https inbound"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource aws_ecs_cluster ecs_cluster {
  name = "${var.app_name}_cluster"
}

resource aws_ecs_task_definition task_definition {
  family                    = "${var.app_name}_task"
  task_role_arn             = aws_iam_role.ecs_role.arn
  execution_role_arn        = aws_iam_role.ecs_execution_role.arn
  requires_compatibilities  = ["FARGATE"]
  memory                    = 8192
  cpu                       = 4096

  network_mode              = "awsvpc"

  container_definitions     = <<-EOF
  [
    {
      "cpu": 0,
      "image": "${aws_ecr_repository.repo.repository_url}:latest",
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "${aws_cloudwatch_log_group.log_group.name}",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "portMappings": [
        {
          "hostPort": 4000,
          "protocol": "tcp",
          "containerPort": 4000
        }
      ],
      "environment": [],
      "mountPoints": [],
      "volumesFrom": [],
      "essential": true,
      "links": [],
      "name": "${var.app_name}"
    }
  ]
  EOF
}

resource aws_ecs_service service {
  name            = "${var.app_name}_service"
  cluster         = aws_ecs_cluster.ecs_cluster.id

  task_definition = "arn:aws:ecs:us-east-1:${data.aws_caller_identity.current.account_id}:task-definition/${aws_ecs_task_definition.task_definition.family}:${var.task_version}"
  desired_count   = 2
  launch_type     = "FARGATE"
  network_configuration {
    security_groups   = [aws_security_group.security_group.id]
    subnets           = data.aws_subnet.default_subnet.*.id
    assign_public_ip  = true
  }
  load_balancer {
    target_group_arn = aws_lb_target_group.lb_target_group.arn
    container_name   = var.app_name
    container_port   = "4000"
  }
}

resource aws_security_group security_group {
  name        = var.app_name 
  description = "Allow all outbound traffic"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP/S Traffic"
    from_port   = 0
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.main.cidr_block]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource aws_iam_role ecs_role {
  name = "ecs_role"
  assume_role_policy = <<-EOF
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "",
        "Effect": "Allow",
        "Principal": {
          "Service": "ecs-tasks.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }
  EOF
}

resource aws_iam_role ecs_execution_role {
  name = "ecs_execution_role"
  assume_role_policy = <<-EOF
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "",
        "Effect": "Allow",
        "Principal": {
          "Service": "ecs-tasks.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }
  EOF
}

resource aws_iam_policy ecs_policy {
  name = "ecs_policy"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
      {
          "Effect": "Allow",
          "Action": [
              "ecr:GetAuthorizationToken",
              "ecr:BatchCheckLayerAvailability",
              "ecr:GetDownloadUrlForLayer",
              "ecr:BatchGetImage",
              "logs:CreateLogStream",
              "logs:PutLogEvents"
          ],
          "Resource": "*"
      }
  ]
}
EOF
}

resource aws_iam_policy_attachment attach_ecs_policy {
  name        = "attach-ecs-policy"
  roles       = [aws_iam_role.ecs_execution_role.name]
  policy_arn  = aws_iam_policy.ecs_policy.arn
}

resource aws_cloudwatch_log_group log_group {
  name = "/ecs/${var.app_name}"
}

output repo_url {
  value = aws_ecr_repository.repo.repository_url
}

output dns {
  value = aws_lb.load_balancer.dns_name
}

Wrap up

With the provided Terraform file, you should be able to get the infrastructure set up. Of course, there is no image to pull and run yet, so ECS will repeatedly try to start the task and fail.

In Part 2 we’ll push a Docker container with a simple Phoenix app to our private image repo and instruct ECS to pull and run it.
