terraform-ecs. Registered container instance is showing 0

狂风中的少年 posted on 2019-11-28 12:56:49

To troubleshoot ECS problems, you can follow the steps below.

  1. Click the service name (nginx) and check whether any tasks are in pending status. If so, you will normally also see a lot of stopped tasks, which means the containers are not healthy.

  2. Click the service name, then Events, and check for error events that can help with troubleshooting.

  3. Click ECS Instances and check whether any instances are in the list. If not, no EC2 instance has successfully registered itself with the ECS cluster.

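If you use the AWS ECS-optimized AMI, the ECS agent is already installed. If you use your own AMI, you need to add the userdata script below. Either way, the agent registers with the cluster named default unless ECS_CLUSTER is set, so the cluster name must end up in /etc/ecs/ecs.config.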

ecs-userdata.tpl

#!/bin/bash
echo "ECS_CLUSTER=${ecs_cluster_name}" >> /etc/ecs/ecs.config

Update the Terraform code:

data "template_file" "ecs_user_data" {
  # Render ecs-userdata.tpl with the cluster name substituted in.
  template = "${file("ecs-userdata.tpl")}"

  vars = {
    ecs_cluster_name = "${var.ecs_cluster_name}"
  }
}


resource "aws_launch_configuration" "demo" {
  ...
  user_data = "${data.template_file.ecs_user_data.rendered}"
  ...
}

  4. Enable Docker container logs. The easiest way is to send the logs to AWS CloudWatch.

Add the resource below first.

resource "aws_cloudwatch_log_group" "app_logs" {
  name              = "demo"
  retention_in_days = 14
}

Then add the snippet below to the container definition inside the task definition.

"logConfiguration": {
  "logDriver": "awslogs",
  "options": {
    "awslogs-group": "${aws_cloudwatch_log_group.app_logs.name}",
    "awslogs-region": "${var.region}"
  }
},

After you apply the change, go to CloudWatch Logs and check whether there are any error logs.
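
As a rough illustration of where the logConfiguration snippet sits, the sketch below embeds it in a complete aws_ecs_task_definition. The resource name nginx, the image, memory size, port mapping and the awslogs-stream-prefix value are assumptions made for illustration and do not come from the original answer. awslogs-stream-prefix is optional for tasks running on EC2 instances.

```hcl
# Minimal sketch: a task definition whose container ships logs to CloudWatch.
# Family, container name, image, memory, ports and stream prefix are
# illustrative assumptions; adjust them to your own service.
resource "aws_ecs_task_definition" "nginx" {
  family = "nginx"

  container_definitions = <<DEFINITION
[
  {
    "name": "nginx",
    "image": "nginx:latest",
    "memory": 128,
    "essential": true,
    "portMappings": [
      { "containerPort": 80, "hostPort": 80 }
    ],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "${aws_cloudwatch_log_group.app_logs.name}",
        "awslogs-region": "${var.region}",
        "awslogs-stream-prefix": "nginx"
      }
    }
  }
]
DEFINITION
}
```

Referencing aws_cloudwatch_log_group.app_logs.name inside the heredoc also makes Terraform create the log group before the task definition, so the containers do not start against a missing group.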

  5. Change the trust policy of the IAM role so that both ecs.amazonaws.com and ec2.amazonaws.com can assume it:

"Principal": { "Service": ["ecs.amazonaws.com", "ec2.amazonaws.com"] },

Hope these steps are helpful for you.

Further reading:

Launching an Amazon ECS Container Instance

Here are a few things to check in the AWS Console:

  • Make sure that you are using an Amazon ECS-optimized AMI.

    On these instances, once you log in as root, the start ecs command should be available.

    Terraform example:

    data "aws_ami" "ecs_ami" {
      most_recent = true
      owners      = ["amazon"]
    
      filter {
        name   = "name"
        values = ["amzn-ami-*-amazon-ecs-optimized"]
      }
    }
    
  • Check whether the EC2 instances have been spun up.

  • Check your load balancer Target Group to see why the instances are not registered: look at the health status of the instances in the Targets tab, the Attributes in the Description tab, and the Health checks tab.

  • Check whether the ECS agent is running on the EC2 instances.

    1. Log in to the EC2 instance as root.
    2. Run docker ps and check whether the ecs-agent container is running.
    3. If it is not, start it manually with start ecs or restart ecs.

    Note: If the docker, start or restart commands are missing, you are not using an ECS-optimized AMI.

  • Check whether (and when) the instances get terminated.

  • Once the instances have the ECS agent running, make sure they are assigned to the right cluster, e.g.

    root# cat /etc/ecs/ecs.config
    ECS_CLUSTER=demo
    
  • Note the IAM role of the running EC2 instance, then make sure that the AmazonEC2ContainerServiceforEC2Role policy is attached to that role.

  • In the Trust relationships tab of that role, make sure the EC2 service (ec2.amazonaws.com) is allowed to assume the role. Example role trust policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "Service": "ec2.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
    

    Terraform example:

    data "aws_iam_policy_document" "instance" {
      provider = "aws.auto-scale-group"
    
      statement {
        effect  = "Allow"
        actions = ["sts:AssumeRole"]
    
        principals {
          type        = "Service"
          identifiers = ["ec2.amazonaws.com"]
        }
      }
    }
    

    See: What is the purpose of AssumeRolePolicyDocument in IAM?

    You also need aws_iam_instance_profile and aws_iam_role, e.g.

    resource "aws_iam_instance_profile" "instance" {
      provider = "aws.auto-scale-group"
      name     = "myproject-profile-instance"
      role     = "${aws_iam_role.instance.name}"
    
      lifecycle {
        create_before_destroy = true
      }
    }
    
    resource "aws_iam_role" "instance" {
      provider           = "aws.auto-scale-group"
      name               = "myproject-role"
      path               = "/"
      assume_role_policy = "${data.aws_iam_policy_document.instance.json}"
    
      lifecycle {
        create_before_destroy = true
      }
    }
    
  • Now, your cluster should be ready to go.
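
The policy attachment and the wiring of the pieces from the list above could look roughly like the sketch below. It assumes the aws_iam_role.instance, aws_iam_instance_profile.instance and data.aws_ami.ecs_ami blocks shown earlier, plus a template_file data source named ecs_user_data that writes ECS_CLUSTER into /etc/ecs/ecs.config. The resource names ecs_instance and ecs, the name prefix and the instance type are hypothetical, and the provider alias used in the examples above is left out for brevity.

```hcl
# Attach the AWS-managed ECS instance policy to the instance role,
# so the ECS agent is allowed to register the instance with the cluster.
resource "aws_iam_role_policy_attachment" "ecs_instance" {
  role       = "${aws_iam_role.instance.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}

# Launch configuration that ties together the ECS-optimized AMI,
# the instance profile and the userdata that sets ECS_CLUSTER.
# name_prefix and instance_type are placeholder values (assumptions).
resource "aws_launch_configuration" "ecs" {
  name_prefix          = "myproject-ecs-"
  image_id             = "${data.aws_ami.ecs_ami.id}"
  instance_type        = "t2.micro"
  iam_instance_profile = "${aws_iam_instance_profile.instance.name}"
  user_data            = "${data.template_file.ecs_user_data.rendered}"

  lifecycle {
    create_before_destroy = true
  }
}
```

If any one of the three inputs is missing (AMI with the agent, instance profile with the policy, or the cluster name in the agent config), the instances boot normally but never appear under ECS Instances.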

