`aws ecs execute-command` results in `TargetNotConnectedException` `The execute command failed due to an internal error`

Question (votes: 0, answers: 4)

I am running a Docker image on an ECS cluster so that I can shell into it and run some simple tests. However, when I run this:

aws ecs execute-command  \
  --cluster MyEcsCluster \
  --task $ECS_TASK_ARN \
  --container MainContainer \
  --command "/bin/bash" \
  --interactive

I get this error:

The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.


An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.

I can confirm that the task, the container, and the agent are all running:

aws ecs describe-tasks \
  --cluster MyEcsCluster \
  --tasks $ECS_TASK_ARN \
  | jq '.'
      "containers": [
        {
          "containerArn": "<redacted>",
          "taskArn": "<redacted>",
          "name": "MainContainer",
          "image": "confluentinc/cp-kafkacat",
          "runtimeId": "<redacted>",
          "lastStatus": "RUNNING",
          "networkBindings": [],
          "networkInterfaces": [
            {
              "attachmentId": "<redacted>",
              "privateIpv4Address": "<redacted>"
            }
          ],
          "healthStatus": "UNKNOWN",
          "managedAgents": [
            {
              "lastStartedAt": "2021-09-20T16:26:44.540000-05:00",
              "name": "ExecuteCommandAgent",
              "lastStatus": "RUNNING"
            }
          ],
          "cpu": "0",
          "memory": "4096"
        }
      ],

I am defining the ECS cluster and the task definition with CDK TypeScript code:

    new Cluster(stack, `MyEcsCluster`, {
        vpc,
        clusterName: `MyEcsCluster`,
    })

    const taskDefinition = new FargateTaskDefinition(stack, `TestTaskDefinition`, {
        family: `TestTaskDefinition`,
        cpu: 512,
        memoryLimitMiB: 4096,
    })
    taskDefinition.addContainer("MainContainer", {
        image: ContainerImage.fromRegistry("confluentinc/cp-kafkacat"),
        command: ["tail", "-F", "/dev/null"],
        memoryLimitMiB: 4096,
        // Some internet searches suggested setting this flag. This didn't seem to help.
        readonlyRootFilesystem: false,
    })
amazon-web-services docker amazon-ecs aws-fargate
4 Answers
49
votes

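The ECS Exec Checker should be able to figure out what is wrong with your setup. Can you give it a try?

The check-ecs-exec.sh script lets you check and validate that both your CLI environment and your ECS cluster/task are ready for ECS Exec, by calling various AWS APIs on your behalf.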
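
As a rough illustration of the kind of task-side check involved (this is not the script's actual implementation), the sketch below inspects one task object from the JSON returned by `aws ecs describe-tasks`. The `enableExecuteCommand` field and the `ExecuteCommandAgent` managed agent are taken from that output; the helper name `execReadiness` and its return shape are made up for illustration.

```typescript
// Minimal shape of the parts of `aws ecs describe-tasks` output used here.
interface ManagedAgent { name: string; lastStatus?: string }
interface Container { name: string; managedAgents?: ManagedAgent[] }
interface Task { enableExecuteCommand?: boolean; containers?: Container[] }

// Hypothetical helper: lists reasons a task does not look ready for ECS Exec.
function execReadiness(task: Task, containerName: string): string[] {
  const problems: string[] = [];

  // ECS Exec has to be switched on when the task is launched.
  if (task.enableExecuteCommand !== true) {
    problems.push("enableExecuteCommand is not true on the task");
  }

  const container = (task.containers ?? []).filter(
    (c) => c.name === containerName,
  )[0];
  if (!container) {
    problems.push(`container ${containerName} not found`);
    return problems;
  }

  // The managed agent must exist and report RUNNING.
  const agent = (container.managedAgents ?? []).filter(
    (a) => a.name === "ExecuteCommandAgent",
  )[0];
  if (!agent) {
    problems.push("ExecuteCommandAgent is missing");
  } else if (agent.lastStatus !== "RUNNING") {
    problems.push(`ExecuteCommandAgent is ${agent.lastStatus ?? "in an unknown state"}`);
  }
  return problems;
}
```

An empty list only says the task side looks fine. The question shows the agent as RUNNING while the call still fails, so other causes have to be checked separately.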


8
votes

Building on @clay's comment: I was also missing the `ssmmessages:*` permissions.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-exec.html#ecs-exec-required-iam-permissions says that a policy such as

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssmmessages:CreateControlChannel",
                "ssmmessages:CreateDataChannel",
                "ssmmessages:OpenControlChannel",
                "ssmmessages:OpenDataChannel"
            ],
            "Resource": "*"
        }
    ]
}

should be attached to the role used as the "task role" (not the "task execution role"), although the `ssmmessages:CreateDataChannel` permission alone does cut it.
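
To compare an existing task role policy against that documented list, a small dependency-free sketch may help. It assumes the policy document has the JSON shape shown above and only looks at `Effect` and `Action`; it ignores `Resource`, `Condition`, `NotAction`, and any denies coming from other policies. The name `missingExecActions` is made up for illustration.

```typescript
// The four actions from the policy quoted above.
const REQUIRED_ACTIONS: string[] = [
  "ssmmessages:CreateControlChannel",
  "ssmmessages:CreateDataChannel",
  "ssmmessages:OpenControlChannel",
  "ssmmessages:OpenDataChannel",
];

interface Statement {
  Effect: "Allow" | "Deny";
  Action: string | string[];
  Resource?: string | string[];
}
interface PolicyDocument { Version?: string; Statement: Statement[] }

// IAM action patterns are case-insensitive and may contain `*` and `?`.
function actionMatches(pattern: string, action: string): boolean {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = "^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$";
  return new RegExp(source, "i").test(action);
}

// Hypothetical helper: which required actions does no Allow statement cover?
function missingExecActions(policy: PolicyDocument): string[] {
  const allowedPatterns: string[] = [];
  for (const statement of policy.Statement) {
    if (statement.Effect !== "Allow") continue;
    const actions = Array.isArray(statement.Action)
      ? statement.Action
      : [statement.Action];
    for (const action of actions) allowedPatterns.push(action);
  }
  return REQUIRED_ACTIONS.filter(
    (required) => !allowedPatterns.some((p) => actionMatches(p, required)),
  );
}
```

Since `ssmmessages:CreateDataChannel` alone was reported to be enough in practice, treat the result as a comparison against the documented list, not as a verdict on whether ECS Exec will work.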

The managed policies

arn:aws:iam::aws:policy/AmazonSSMFullAccess
arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
arn:aws:iam::aws:policy/AmazonSSMManagedEC2InstanceDefaultPolicy
arn:aws:iam::aws:policy/AWSCloud9SSMInstanceProfile

all contain the necessary permissions; `AWSCloud9SSMInstanceProfile` is the most minimal of them.


1
votes

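I realized that my parent organization was restricting the ssmmessages permissions. After they were whitelisted, the problem went away once a new task had started.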


0
votes

You need to enable the SSM permissions, i.e. create a policy that allows execute-command and attach it to your ECS task instance role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssmmessages:CreateControlChannel",
                "ssmmessages:CreateDataChannel",
                "ssmmessages:OpenControlChannel",
                "ssmmessages:OpenDataChannel"
            ],
            "Resource": "*"
        }
    ]
}