Python 3.8.10 Docker image on AWS Batch: connection timeout errors from urllib3 and requests


I get a connection timeout error when I try to run a Python 3.8.10 Docker image on AWS Batch. This is the error message:

urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f809b43d430>: Failed to establish a new connection: [Errno 110] Connection timed out
...
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='hooks.slack.com', port=443): Max retries exceeded with url: /services/XXXXXX/YYYYY/ZZZZZZZZ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f809b43d430>: Failed to establish a new connection: [Errno 110] Connection timed out'))

I am using the following Dockerfile:

FROM python:3.8.10

WORKDIR /code

COPY ./requirements.txt /code/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

COPY ./app /code/app

CMD ["python", "./app/train.py"]

My ComputeEnvironment is in public subnets, and the security group currently allows all TLS traffic in and out. How can I fix this? What am I missing?

Relevant CloudFormation template:

Resources:
  # VPC
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16 # 16 means that our VPC will be able to host 65536 IPs approximately.
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: service-vpc-${sls:stage}
    
  # Security Group
  # Creating a Security Group is not strictly necessary, but I think it is useful to have at least a default one attached to the VPC. This security group allows SSH, HTTP and HTTPS access from the custom IP configured by the sgCidr parameter.
  DMZSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: service-dmz-sg-${sls:stage}
      GroupDescription: DMZ Security Group to allow Access to SSH
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - Description: Allow SSH
          IpProtocol: tcp
          FromPort: "22"
          ToPort: "22"
          CidrIp: 0.0.0.0/0 # Custom IP to configure the Security Group to allow access only from the specified IP that you can define, by default is 0.0.0.0/0 which means open to the internet.
        - Description: Allow Http
          IpProtocol: tcp
          FromPort: "80"
          ToPort: "80"
          CidrIp: 0.0.0.0/0 # Custom IP to configure the Security Group to allow access only from the specified IP that you can define, by default is 0.0.0.0/0 which means open to the internet.
        - Description: Allow Https
          IpProtocol: tcp
          FromPort: "443"
          ToPort: "443"
          CidrIp: 0.0.0.0/0 # Custom IP to configure the Security Group to allow access only from the specified IP that you can define, by default is 0.0.0.0/0 which means open to the internet.

  # Internet Gateway
  InternetGateway:
    Type: AWS::EC2::InternetGateway
    DependsOn: VPC
    Properties:
      Tags:
        - Key: Name
          Value: service-internet-gateway-${sls:stage}
  AttachGateway:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway

  # NAT Gateway
  ElasticIPAddress:
    Type: AWS::EC2::EIP
    Properties:
      Domain: VPC
  NATGateway:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt ElasticIPAddress.AllocationId
      SubnetId: !Ref PublicSubnetA
      Tags:
        - Key: Name
          Value: service-nat-gateway-${sls:stage}

  # Subnets
  PublicSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.1.0/24
      AvailabilityZone: !Select [0, !GetAZs ""] # Get the first AZ in the list
      MapPublicIpOnLaunch: True
      Tags:
        - Key: Name
          Value: service-subnet-public-a-${sls:stage}
  PublicSubnetB:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.2.0/24
      AvailabilityZone: !Select [1, !GetAZs ""] # Get the second AZ in the list
      MapPublicIpOnLaunch: True
      Tags:
        - Key: Name
          Value: service-subnet-public-b-${sls:stage}
  PrivateSubnetC:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [0, !GetAZs ""] # Get the first AZ in the list
      CidrBlock: 10.0.3.0/24
      Tags:
        - Key: Name
          Value: service-subnet-private-c-${sls:stage}
  PrivateSubnetD:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [1, !GetAZs ""] # Get the second AZ in the list
      CidrBlock: 10.0.4.0/24
      Tags:
        - Key: Name
          Value: service-subnet-private-d-${sls:stage}

  # Some route tables for our subnets:
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: service-routetable-public-${sls:stage}
  PrivateRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: service-routetable-private-${sls:stage}

  # Public route table has direct routing to IGW:
  PublicRoute:
    Type: AWS::EC2::Route
    DependsOn: AttachGateway
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway

  # Private route table can access web via NAT:
  PrivateRoute:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      # Route traffic through the NAT Gateway:
      NatGatewayId: !Ref NATGateway

  # Attach the public subnets to public route tables,
  # and attach the private subnets to private route tables:
  PublicSubnetARouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetA
      RouteTableId: !Ref PublicRouteTable
  PublicSubnetBRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetB
      RouteTableId: !Ref PublicRouteTable
  PrivateSubnetCRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnetC
      RouteTableId: !Ref PrivateRouteTable
  PrivateSubnetDRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnetD
      RouteTableId: !Ref PrivateRouteTable

  BERecommenderTrainingSecurityGroup:
    Type: "AWS::EC2::SecurityGroup"
    Properties:
      GroupDescription: service-be-recommender-training-security-group-${sls:stage}
      VpcId: !Ref VPC
      SecurityGroupEgress: # Allow outgoing TCP traffic on ports 80 (HTTP) and 443 (HTTPS) only
        - IpProtocol: "tcp"
          FromPort: 80
          ToPort: 80
          CidrIp: "0.0.0.0/0"
        - IpProtocol: "tcp"
          FromPort: 443
          ToPort: 443
          CidrIp: "0.0.0.0/0"
  BERecommenderTrainingComputeEnvironment:
    Type: AWS::Batch::ComputeEnvironment
    Properties:
      Type: MANAGED
      #ServiceRole: arn:aws:iam::111122223333:role/aws-service-role/batch.amazonaws.com/AWSServiceRoleForBatch
      ComputeEnvironmentName: service-be-recommender-training-compute-environment-${sls:stage}
      ComputeResources:
        Type: FARGATE_SPOT
        MaxvCpus: 256
        SecurityGroupIds:
          - Fn::GetAtt:
              - VPC
              - DefaultSecurityGroup
          - !Ref BERecommenderTrainingSecurityGroup
        Subnets:
          - !Ref PublicSubnetA
          - !Ref PublicSubnetB
        #InstanceRole: ecsInstanceRole
  BERecommenderTrainingJobQueue:
    Type: AWS::Batch::JobQueue
    Properties:
      JobQueueName: service-be-recommender-training-job-queue-${sls:stage}
      Priority: 100
      ComputeEnvironmentOrder:
        - Order: 1
          ComputeEnvironment: !Ref BERecommenderTrainingComputeEnvironment
  BERecommenderTrainingExecutionRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - "ecs-tasks.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Policies:
        - PolicyName: BERecommenderTrainingExecutionRolePolicyECR
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - ecr:GetAuthorizationToken
                  - ecr:BatchCheckLayerAvailability
                  - ecr:GetDownloadUrlForLayer
                  - ecr:BatchGetImage
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: "*"
  BERecommenderTrainingJobRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - "ecs-tasks.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Policies:
        - PolicyName: BERecommenderTrainingJobRolePolicyS3
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Action:
                  - s3:ListBucket
                Effect: Allow
                Resource:
                  - arn:aws:s3:::service-s3-data-${sls:stage}
              - Action:
                  - s3:GetObject
                  - s3:PutObject
                  - s3:DeleteObject
                Effect: Allow
                Resource:
                  - arn:aws:s3:::service-s3-data-${sls:stage}/*
  BERecommenderTrainingJobDefinition:
    Type: AWS::Batch::JobDefinition
    Properties:
      JobDefinitionName: service-be-recommender-training-job-definition-${sls:stage}
      Type: container
      PlatformCapabilities:
        - FARGATE
      ContainerProperties:
        Image: !GetAtt BERecommenderTrainingRepository.RepositoryUri
        ResourceRequirements: # See https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-batch-jobdefinition-resourcerequirement.html
          - Type: MEMORY
            Value: !If [IsProd, "30720", "512"]
          - Type: VCPU
            Value: !If [IsProd, "4", "0.25"]
        ExecutionRoleArn: !GetAtt BERecommenderTrainingExecutionRole.Arn
        JobRoleArn: !GetAtt BERecommenderTrainingJobRole.Arn
        Environment:
          - Name: "STAGE"
            Value: !If [IsProd, "prod", "dev"]
  BERecommenderTrainingEventRuleRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - "events.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Policies:
        - PolicyName: BERecommenderTrainingEventRulePolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - batch:SubmitJob
                  - ecr:GetAuthorizationToken
                  - ecr:BatchCheckLayerAvailability
                  - ecr:GetDownloadUrlForLayer
                  - ecr:BatchGetImage
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: "*"
  BERecommenderTrainingEventRule:
    Type: AWS::Events::Rule
    Properties:
      Name: service-be-recommender-training-event-rule-${sls:stage}
      ScheduleExpression: cron(0 * ? * * *)
      State: !If [IsProd, ENABLED, DISABLED]
      Targets:
        - Id: service-be-recommender-training-batch-job-${sls:stage}
          Arn: !Ref BERecommenderTrainingJobQueue
          RoleArn: !GetAtt BERecommenderTrainingEventRuleRole.Arn
          BatchParameters:
            JobDefinition: !Ref BERecommenderTrainingJobDefinition
            JobName: service-be-recommender-training-batch-job
python ssl networking aws-security-group aws-batch
1 Answer

I found a way to solve this problem:

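I had to switch the compute environment from the public subnets to the private subnets, which have a NAT gateway attached. It now connects to the internet successfully!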
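
The answer does not show the changed template, so the fragment below is only a sketch of what the fix might look like against the template in the question. It assumes the compute environment is pointed at PrivateSubnetC and PrivateSubnetD, which the question's PrivateRouteTable already routes through NATGateway; everything else is copied from the question unchanged.

```yaml
  BERecommenderTrainingComputeEnvironment:
    Type: AWS::Batch::ComputeEnvironment
    Properties:
      Type: MANAGED
      ComputeEnvironmentName: service-be-recommender-training-compute-environment-${sls:stage}
      ComputeResources:
        Type: FARGATE_SPOT
        MaxvCpus: 256
        SecurityGroupIds:
          - Fn::GetAtt:
              - VPC
              - DefaultSecurityGroup
          - !Ref BERecommenderTrainingSecurityGroup
        Subnets:
          # Assumption: the private subnets from the question, whose route
          # table sends 0.0.0.0/0 through NATGateway.
          - !Ref PrivateSubnetC
          - !Ref PrivateSubnetD
```

A likely reason the public subnets did not work, although the answer does not state it: Fargate tasks started by AWS Batch do not receive a public IP address by default, and MapPublicIpOnLaunch on the subnet does not change that. Without a public IP, traffic sent through the internet gateway cannot complete, which would match the connection timeout in the question. An alternative that is not part of the original answer, and untested against this template, would be to stay in the public subnets and request a public IP in the job definition:

```yaml
  BERecommenderTrainingJobDefinition:
    Type: AWS::Batch::JobDefinition
    Properties:
      # ... other properties as in the question ...
      ContainerProperties:
        # ... other container properties as in the question ...
        # Assumption: alternative to the NAT approach, not from the answer.
        NetworkConfiguration:
          AssignPublicIp: ENABLED
```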
