ECS tasks stuck in PENDING when updating through CloudFormation

Question (votes: 1, answers: 1)

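I have a problem deploying an ECS cluster. The build works fine, but when the task is updated through CloudFormation, the ECSService spins up 6 new tasks that stay in PENDING while the 6 old tasks remain RUNNING. Sometimes it starts draining the old tasks and the deployment then succeeds, but sometimes none of the old tasks are drained and the ECSService just stays stuck in UPDATE_IN_PROGRESS. How can I troubleshoot something like this?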

Here is my stack template:

AWSTemplateFormatVersion: '2010-09-09'
Resources:
  ElasticLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      SecurityGroups:
      - !Ref 'ELBSecurityGroup'
      Subnets:
      - !Ref 'InstanceSubnet'
      - !Ref 'SecondarySubnet'
      Scheme: internet-facing
  RedirectLoadBalancerListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    DependsOn: ECSServiceRole
    Properties:
      DefaultActions:
      - Type: forward
        TargetGroupArn: !Ref 'ECSTG'
      LoadBalancerArn: !Ref 'ElasticLoadBalancer'
      Port: '80'
      Protocol: HTTP
  RedirectLoadBalancerListenerRule:
    Type: AWS::ElasticLoadBalancingV2::ListenerRule
    DependsOn: RedirectLoadBalancerListener
    Properties:
      Actions:
      - Type: forward
        TargetGroupArn: !Ref 'ECSTG'
      Conditions:
      - Field: path-pattern
        Values:
        - /
      ListenerArn: !Ref 'RedirectLoadBalancerListener'
      Priority: '1'
  LoadBalancerListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    DependsOn: ECSServiceRole
    Properties:
      Certificates:
      - CertificateArn: !Ref 'SSLCertificateId'
      DefaultActions:
      - Type: forward
        TargetGroupArn: !Ref 'ECSTG'
      LoadBalancerArn: !Ref 'ElasticLoadBalancer'
      Port: '443'
      Protocol: HTTPS
  LoadBalancerListenerRule:
    Type: AWS::ElasticLoadBalancingV2::ListenerRule
    DependsOn: LoadBalancerListener
    Properties:
      Actions:
      - Type: forward
        TargetGroupArn: !Ref 'ECSTG'
      Conditions:
      - Field: path-pattern
        Values:
        - /
      ListenerArn: !Ref 'LoadBalancerListener'
      Priority: '1'
  ECSTG:
    DependsOn: ElasticLoadBalancer
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      HealthCheckIntervalSeconds: 6
      HealthCheckPath: /api/ping
      HealthCheckProtocol: HTTP
      HealthCheckTimeoutSeconds: 5
      HealthyThresholdCount: 2
      Port: 80
      Protocol: HTTP
      UnhealthyThresholdCount: 5
      VpcId: !Ref 'VPCId'
      TargetGroupAttributes:
      - Key: deregistration_delay.timeout_seconds
        Value: '20'
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: AppSecurityGroup
      SecurityGroupIngress:
      - IpProtocol: '-1'
        FromPort: '-1'
        ToPort: '-1'
        SourceSecurityGroupId: !Ref 'ELBSecurityGroup'
      VpcId: !Ref 'VPCId'
  Route53Entry:
    Type: AWS::Route53::RecordSetGroup
    Properties:
      HostedZoneName: !Join ['', [!Ref 'Route53HostedZone', .]]
      Comment: Zone apex alias targeted to myELB LoadBalancer.
      RecordSets:
      - Name: !Join [., [!Ref 'ApplicationHost', !Ref 'Route53HostedZone']]
        Type: A
        AliasTarget:
          HostedZoneId: !GetAtt [ElasticLoadBalancer, CanonicalHostedZoneID]
          DNSName: !GetAtt [ElasticLoadBalancer, DNSName]
  ELBSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: ELBSecurityGroup
      SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: '443'
        ToPort: '443'
        CidrIp: 0.0.0.0/0
      - IpProtocol: tcp
        FromPort: '80'
        ToPort: '80'
        CidrIp: 0.0.0.0/0
      VpcId: !Ref 'VPCId'
  CloudWatchAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      ActionsEnabled: true
      AlarmActions:
      - arn:aws:sns:us-east-1:6xxxxxxx:instance-alarm
      ComparisonOperator: LessThanOrEqualToThreshold
      Dimensions:
      - Name: LoadBalancer
        Value: !GetAtt [ElasticLoadBalancer, LoadBalancerFullName]
      - Name: TargetGroup
        Value: !GetAtt [ECSTG, TargetGroupFullName]
      EvaluationPeriods: 5
      MetricName: HealthyHostCount
      Namespace: AWS/ApplicationELB
      Period: 60
      Statistic: Maximum
      Threshold: 0
  LowOnCreditAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      ActionsEnabled: true
      AlarmActions:
      - arn:aws:sns:us-east-1:6xxxxxx:instance-alarm
      ComparisonOperator: LessThanThreshold
      Dimensions:
      - Name: AutoScalingGroupName
        Value: !Ref 'AutoScalingGroup'
      EvaluationPeriods: 1
      MetricName: CPUCreditBalance
      Namespace: AWS/EC2
      Period: 300
      Statistic: Average
      Threshold: 15
  Database:
    Type: AWS::RDS::DBInstance
    Properties:
      AllocatedStorage: '5'
      DBInstanceClass: db.t2.micro
      Engine: postgres
      BackupRetentionPeriod: 35
      EngineVersion: 9.5.2
      DBName: !If [RestoreDB, '', ekdb]
      MasterUsername: !Ref 'DBUser'
      MasterUserPassword: !Ref 'DBPassword'
      DBSecurityGroups:
      - !Ref 'DatabaseSecurityGroup'
      DBSubnetGroupName: !Ref 'DatabaseSubnetGroup'
      DBSnapshotIdentifier: !Ref 'DBSnapshot'
    DeletionPolicy: Snapshot
  DatabaseSecurityGroup:
    Type: AWS::RDS::DBSecurityGroup
    Properties:
      GroupDescription: DatabaseSecurityGroup
      DBSecurityGroupIngress:
      - EC2SecurityGroupId: !Ref 'AppSecurityGroup'
      EC2VpcId: !Ref 'VPCId'
  Redis:
    Type: AWS::ElastiCache::CacheCluster
    Properties:
      CacheNodeType: cache.t2.micro
      Engine: redis
      EngineVersion: 2.8.24
      NumCacheNodes: 1
      VpcSecurityGroupIds:
      - !Ref 'RedisSecurityGroup'
      CacheSubnetGroupName: !Ref 'RedisSubnetGroup'
  RedisSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: RedisSecurityGroup
      SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: '6379'
        ToPort: '6379'
        SourceSecurityGroupId: !Ref 'AppSecurityGroup'
      VpcId: !Ref 'VPCId'
  FrontendUser:
    Type: AWS::IAM::User
    Properties:
      Groups:
      - SynapseAppUsers
  BackendUser:
    Type: AWS::IAM::User
    Properties:
      Groups:
      - SynapseAppUsers
  FrontendUserAccessKey:
    Type: AWS::IAM::AccessKey
    Properties:
      UserName: !Ref 'FrontendUser'
  BackendUserAccessKey:
    Type: AWS::IAM::AccessKey
    Properties:
      UserName: !Ref 'BackendUser'
  S3BucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: !Ref 'S3Bucket'
      PolicyDocument:
        Statement:
        - Action: s3:GetObject
          Effect: Allow
          Resource: !Sub 'arn:aws:s3:::${S3Bucket}/*'
          Principal:
            AWS:
            - !GetAtt 'FrontendUser.Arn'
            - !GetAtt 'BackendUser.Arn'
        - Action: s3:PutObject
          Effect: Allow
          Resource: !Sub 'arn:aws:s3:::${S3Bucket}/*'
          Principal:
            AWS:
            - !GetAtt 'BackendUser.Arn'
        - Action: s3:PutObjectAcl
          Effect: Allow
          Resource: !Sub 'arn:aws:s3:::${S3Bucket}/*'
          Principal:
            AWS:
            - !GetAtt 'BackendUser.Arn'
        - Action:
          - s3:PutObjectAcl
          - s3:PutObject
          - s3:GetObject
          - s3:DeleteObject
          Effect: Allow
          Resource: !Sub 'arn:aws:s3:::${S3Bucket}/*'
          Principal:
            AWS:
            - arn:aws:iam::6xxxxxxx:user/filestack-v3-policy
  S3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: AuthenticatedRead
      CorsConfiguration:
        CorsRules:
        - AllowedHeaders:
          - '*'
          AllowedMethods:
          - GET
          - PUT
          - POST
          AllowedOrigins:
          - '*'
          ExposedHeaders:
          - ETag
          MaxAge: 3000
    DeletionPolicy: Retain
  AppIamRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - ec2.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: /
      Policies:
      - PolicyName: app-iam-role
        PolicyDocument:
          Statement:
          - Effect: Allow
            Action:
            - ecs:*
            - ecr:*
            - sns:*
            - logs:*
            Resource: '*'
          - Effect: Allow
            Action:
            - s3:PutObject
            - s3:GetObject
            - s3:PutObjectAcl
            - s3:DeleteObject
            Resource: !GetAtt [S3Bucket, Arn]
  AppInstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Path: /
      Roles:
      - !Ref 'AppIamRole'
  LaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      AssociatePublicIpAddress: true
      ImageId: !FindInMap [AWSRegionToAMI, !Ref 'AWS::Region', AMIID]
      InstanceType: !If [IsExclusive, t2.medium, m4.large]
      IamInstanceProfile: !Ref 'AppInstanceProfile'
      SecurityGroups:
      - !Ref 'AppSecurityGroup'
      UserData: !Base64
        Fn::Join:
        - ''
        - - '#!/bin/bash -xe

            '
          - echo ECS_CLUSTER=
          - !Ref 'ECSCluster'
          - ' >> /etc/ecs/ecs.config

            '
          - 'yum install -y aws-cfn-bootstrap

            '
          - '/opt/aws/bin/cfn-signal -e $? '
          - '         --stack '
          - !Ref 'AWS::StackName'
          - '         --resource AutoScalingGroup '
          - '         --region '
          - !Ref 'AWS::Region'
  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchConfigurationName: !Ref 'LaunchConfig'
      MinSize: 1
      MaxSize: 2
      DesiredCapacity: !If [IsExclusive, 1, 2]
      VPCZoneIdentifier:
      - !Ref 'InstanceSubnet'
      HealthCheckGracePeriod: 600
      HealthCheckType: ELB
    CreationPolicy:
      ResourceSignal:
        Timeout: PT15M
    UpdatePolicy:
      AutoScalingReplacingUpdate:
        WillReplace: 'true'
  DatabaseSubnetGroup:
    Type: AWS::RDS::DBSubnetGroup
    Properties:
      DBSubnetGroupDescription: Subnet Group for database
      SubnetIds:
      - !Ref 'SecondarySubnet'
      - !Ref 'InstanceSubnet'
  RedisSubnetGroup:
    Type: AWS::ElastiCache::SubnetGroup
    Properties:
      Description: Subnet Group for Redis
      SubnetIds:
      - !Ref 'SecondarySubnet'
      - !Ref 'InstanceSubnet'
  ECSCluster:
    Type: AWS::ECS::Cluster
  ECSService:
    DependsOn:
    - RedirectLoadBalancerListener
    - LoadBalancerListener
    - AutoScalingGroup
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref 'ECSCluster'
      DesiredCount: !If [IsExclusive, 2, 6]
      Role: !Ref 'ECSServiceRole'
      TaskDefinition: !Ref 'TaskDefinition'
      LoadBalancers:
      - ContainerName: nginx
        ContainerPort: '80'
        TargetGroupArn: !Ref 'ECSTG'
  ECSServiceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - ecs.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: /
      Policies:
      - PolicyName: ecs-service
        PolicyDocument:
          Statement:
          - Effect: Allow
            Action:
            - elasticloadbalancing:DeregisterInstancesFromLoadBalancer
            - elasticloadbalancing:DeregisterTargets
            - elasticloadbalancing:Describe*
            - elasticloadbalancing:RegisterInstancesWithLoadBalancer
            - elasticloadbalancing:RegisterTargets
            - ec2:Describe*
            - ec2:AuthorizeSecurityGroupIngress
            Resource: '*'
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      ContainerDefinitions:
      - Name: frontend
        Memory: '256'
        MemoryReservation: '32'
        Image: !Sub '6xxxxxxx0.dkr.ecr.us-east-1.amazonaws.com/frontend:${ImageTag}'
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-group: !Ref 'ECSLogGroup'
            awslogs-region: !Ref 'AWS::Region'
            awslogs-stream-prefix: '[frontend]'
      - Name: backend
        Memory: '1024'
        MemoryReservation: '256'
        Links:
        - xray-daemon
        Environment:
        - Name: NODE_ENV
          Value: prod
        - Name: AWS_XRAY_DAEMON_ADDRESS
          Value: "xray-daemon:2000"
        - Name: APPLICATION_URL
          Value: !Sub 'https://${ApplicationHost}.${Route53HostedZone}'
        - Name: ACCOUNTS_TOKEN
          Value: !Ref AccountsToken
        - Name: ACCOUNTS_URL
          Value: !Ref 'AccountsUrl'
        - Name: HEAP_APPLICATION_ID
          Value: '3901275559'
        - Name: HUBSPOT_API_KEY
          Value: !Ref 'HubspotApiKey'
        - Name: USER_POOL
          Value: !Ref 'UserPool'
        - Name: POOL_CLIENTS
          Value: !Ref 'PoolClients'
        - Name: JWKS
          Value: !Ref 'JWKS'
        - Name: DATABASE_URL
          Value: !Sub ['postgresql://${DBUser}:${DBPassword}@${Address}:${Port}/ekdb',
            {Address: !GetAtt [Database, Endpoint.Address], Port: !GetAtt [Database,
                Endpoint.Port]}]
        - Name: REDIS_URL
          Value: !Sub ['redis://${Address}:${Port}/', {Address: !GetAtt [Redis, RedisEndpoint.Address],
              Port: !GetAtt [Redis, RedisEndpoint.Port]}]
        - Name: S3_FRONTEND_USER_ACCESS_KEY_ID
          Value: !Ref 'FrontendUserAccessKey'
        - Name: S3_FRONTEND_USER_SECRET
          Value: !GetAtt [FrontendUserAccessKey, SecretAccessKey]
        - Name: S3_BACKEND_USER_ACCESS_KEY_ID
          Value: !Ref 'BackendUserAccessKey'
        - Name: S3_BACKEND_USER_SECRET
          Value: !GetAtt [BackendUserAccessKey, SecretAccessKey]
        - Name: S3_BUCKET_NAME
          Value: !Ref 'S3Bucket'
        - Name: UPLOAD_STRATEGY
          Value: S3
        - Name: ACCOUNT_ID
          Value: !Ref 'AccountId'
        - Name: CHECK_ACCOUNT_ID
          Value: !Ref 'CheckAccountId'
        - Name: SNS_TOPIC_ARN
          Value: !Ref 'SNSTopicArn'
        Image: !Sub '6xxxxxx.dkr.ecr.us-east-1.amazonaws.com/backend:${ImageTag}'
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-group: !Ref 'ECSLogGroup'
            awslogs-region: !Ref 'AWS::Region'
            awslogs-stream-prefix: '[backend]'
      - Name: nginx
        Memory: '256'
        MemoryReservation: '32'
        Links:
        - frontend
        - backend
        - pdf_viewer
        - preview
        Image: !Sub '67xxxxxx.dkr.ecr.us-east-1.amazonaws.com/nginx:${ImageTag}'
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-group: !Ref 'ECSLogGroup'
            awslogs-region: !Ref 'AWS::Region'
            awslogs-stream-prefix: '[nginx]'
        PortMappings:
        - ContainerPort: 80
      - Name: pdf_viewer
        Memory: '256'
        MemoryReservation: '32'
        Image: !Sub '6xxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/pdf_viewer:${ImageTag}'
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-group: !Ref 'ECSLogGroup'
            awslogs-region: !Ref 'AWS::Region'
            awslogs-stream-prefix: '[pdf_viewer]'
      - Name: preview
        Memory: '256'
        MemoryReservation: '32'
        Image: !Sub '6xxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/preview:${ImageTag}'
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-group: !Ref 'ECSLogGroup'
            awslogs-region: !Ref 'AWS::Region'
            awslogs-stream-prefix: '[preview]'
      - Name: xray-daemon
        Memory: '256'
        MemoryReservation: '32'
        Image: 'amazon/aws-xray-daemon'
        PortMappings:
        - ContainerPort: 2000
          HostPort: 0
          Protocol: "udp"
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-group: !Ref 'ECSLogGroup'
            awslogs-region: !Ref 'AWS::Region'
            awslogs-stream-prefix: '[xray-daemon]'
  ECSLogGroup:
    Type: AWS::Logs::LogGroup
Parameters:
  CheckAccountId:
    Type: String
    Description: Should user's account id be checked while logging in to the instance?
    Default: 'yes'
  Route53HostedZone:
    Type: String
  SSLCertificateId:
    Type: String
    Description: Pass SSL id from AWS Certificate Manager to pass to ELB
  ApplicationHost:
    Type: String
    Description: 'Host to be applied as follows: {host}.{Route53HostedZone}'
  DBUser:
    Type: String
    Description: Username that the database should be accessible with
  DBPassword:
    Type: String
    Description: Password that the database user should have
  HtpasswdEntry:
    Type: String
    Description: This is the file that should be htpasswd entry file
  DBSnapshot:
    Type: String
    Description: Database Snapshot ID if you want to restore DB from snapshot
    Default: ''
  VPCId:
    Type: String
    Description: VPC Id to associate instance to. Pass this if you want to hide the
      instances behind pre-existing VPC
    Default: vpc-355a6b51
  InstanceSubnet:
    Type: String
    Description: Subnet on which the instance should be set up. Required if VPCId
      is set
    Default: subnet-beb826c8
  SecondarySubnet:
    Type: String
    Description: Subnet on which the RDS and ElastiCache group will be set up as well.
      Required if VPCId is set
    Default: subnet-04e39239
  AccountId:
    Type: String
    Description: AccountId. used to filter out users from Auth0
  AccountsUrl:
    Type: String
    Description: Accounts url eg. https://app.getsynapse.com/
  SNSTopicArn:
    Type: String
    Description: ARN of SNS Topic that will be used to communicate between different
      parts of the infrastructure
  HubspotApiKey:
    Type: String
    Description: Hubspot api key
  UserPool:
    Type: String
    Description: Cognito UserPool
  PoolClients:
    Type: String
    Description: Cognito PoolClients
  JWKS:
    Type: String
    Description: Cognito JWKS
  ImageTag:
    Type: String
    Description: Tag of docker images
  AccountsToken:
    Type: String
    Description: Token used for authenticating with Accounts
Conditions:
  RestoreDB: !Not [!Equals [!Ref 'DBSnapshot', '']]
  IsExclusive: !Not [!Equals [!Ref 'AccountId', N/a]]
Outputs:
  InstanceURL:
    Value: !Join ['', ["https://", !Ref 'ApplicationHost', ., !Ref 'Route53HostedZone']]
Mappings:
  AWSRegionToAMI:
    us-east-1:
      AMIID: ami-a7a242da
    us-east-2:
      AMIID: ami-b86a5ddd
    us-west-1:
      AMIID: none
    us-west-2:
      AMIID: none
    eu-west-1:
      AMIID: none
    eu-central-1:
      AMIID: none
    ap-northeast-1:
      AMIID: none
    ap-southeast-1:
      AMIID: none
    ap-southeast-2:
      AMIID: none
amazon-web-services amazon-cloudformation amazon-iam amazon-ecs
1 Answer (1 vote)

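Based on the comments, the problem appears to be related to the MaximumPercent and MinimumHealthyPercent parameters and their default values of 200 and 100, respectively.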

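  • MaximumPercent: If the service uses the rolling update (ECS) deployment type, the maximum percent parameter represents the upper limit on the number of tasks in the service that are allowed in the RUNNING or PENDING state during a deployment, as a percentage of the desired count.

  • MinimumHealthyPercent: If the service uses the rolling update (ECS) deployment type, the minimum healthy percent represents the lower limit on the number of tasks in the service that must remain in the RUNNING state during a deployment, as a percentage of the desired count.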

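With the defaults of 200 and 100, a service with a desired count of 6 tasks must keep all 6 old tasks RUNNING and may have up to 12 tasks (6 old plus 6 new) RUNNING or PENDING at once during a deployment. That appears to be more than the container instances can accommodate, so the new tasks stay in PENDING.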

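One suggested solution is to change the values to 150 and 50. For 6 tasks, ECS then has to keep only 3 tasks RUNNING and may have at most 9 tasks RUNNING or PENDING, so it can stop 3 old tasks to free capacity and run 6 tasks in total (3 new and 3 old) side by side until the deployment completes.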
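
In the template from the question, the suggested values could be set through the DeploymentConfiguration property of the ECSService resource. This is only a sketch: it assumes the rest of the resource stays as posted, and 150 and 50 are the values suggested above, not ones verified against this cluster's actual capacity.

```yaml
  ECSService:
    Type: AWS::ECS::Service
    DependsOn:
    - RedirectLoadBalancerListener
    - LoadBalancerListener
    - AutoScalingGroup
    Properties:
      Cluster: !Ref 'ECSCluster'
      DesiredCount: !If [IsExclusive, 2, 6]
      # Assumed values taken from the suggestion above; tune them to the
      # capacity of your container instances.
      DeploymentConfiguration:
        MaximumPercent: 150         # at most 9 tasks RUNNING or PENDING when DesiredCount is 6
        MinimumHealthyPercent: 50   # at least 3 tasks must stay RUNNING when DesiredCount is 6
      Role: !Ref 'ECSServiceRole'
      TaskDefinition: !Ref 'TaskDefinition'
      LoadBalancers:
      - ContainerName: nginx
        ContainerPort: '80'
        TargetGroupArn: !Ref 'ECSTG'
```

Lowering MinimumHealthyPercent trades some serving capacity during the rollout for the ability to deploy without extra instances. If full capacity is required throughout a deployment, the alternative is to give the cluster enough headroom to hold the additional tasks.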
