我一直在关注这个 指南 营造 Kubernetes
集群 CloudFormation
但是 NodeGroup
从来没有加入集群,我也从来没有收到关于为什么没有加入的错误或解释。
我可以看到自动缩放组和 EC2
机被创造出来,但 EKS
报告说没有节点组。
如果我通过web管理工具手动创建一个新的节点组,它可以工作,但它分配了不同的节点组。security groups
. 它有一个 launch template
而非 launch configuration
.
同 AMI
,同 IAM
角色,同样的机器类型...
我在这两方面都很陌生 CloudFormation
和 EKS
,我现在不知道该如何继续查找问题所在。
下面是模板。
Description: >
Kubernetes cluster
Parameters:
EnvironmentName:
Description: An environment name that will be prefixed to resource names
Type: String
KeyName:
Description: The EC2 Key Pair to allow SSH access to the instances
Type: AWS::EC2::KeyPair::KeyName
VpcBlock:
Type: String
Default: 192.168.0.0/16
Description: The CIDR range for the VPC. This should be a valid private (RFC 1918) CIDR range.
Subnet01Block:
Type: String
Default: 192.168.64.0/18
Description: CidrBlock for subnet 01 within the VPC
Subnet02Block:
Type: String
Default: 192.168.128.0/18
Description: CidrBlock for subnet 02 within the VPC
Subnet03Block:
Type: String
Default: 192.168.192.0/18
Description: CidrBlock for subnet 03 within the VPC. This is used only if the region has more than 2 AZs.
NodeInstanceType:
Description: EC2 instance type for the node instances
Type: String
NodeImageId:
Type: AWS::EC2::Image::Id
Description: AMI id for the node instances.
NodeAutoScalingGroupMinSize:
Type: Number
Description: Minimum size of Node Group ASG.
Default: 1
NodeAutoScalingGroupMaxSize:
Type: Number
Description: Maximum size of Node Group ASG. Set to at least 1 greater than NodeAutoScalingGroupDesiredCapacity.
Default: 3
NodeAutoScalingGroupDesiredCapacity:
Type: Number
Description: Desired capacity of Node Group ASG.
Default: 3
BootstrapArguments:
Description: Arguments to pass to the bootstrap script. See files/bootstrap.sh in https://github.com/awslabs/amazon-eks-ami
Default: ""
Type: String
Resources:
VPC:
Type: AWS::EC2::VPC
Properties:
CidrBlock: !Ref VpcBlock
EnableDnsSupport: true
EnableDnsHostnames: true
Tags:
- Key: Environment
Value: !Ref EnvironmentName
InternetGateway:
Type: "AWS::EC2::InternetGateway"
Properties:
Tags:
- Key: Environment
Value: !Ref EnvironmentName
VPCGatewayAttachment:
Type: "AWS::EC2::VPCGatewayAttachment"
Properties:
InternetGatewayId: !Ref InternetGateway
VpcId: !Ref VPC
RouteTable:
Type: AWS::EC2::RouteTable
Properties:
VpcId: !Ref VPC
Tags:
- Key: Environment
Value: !Ref EnvironmentName
Route:
DependsOn: VPCGatewayAttachment
Type: AWS::EC2::Route
Properties:
RouteTableId: !Ref RouteTable
DestinationCidrBlock: 0.0.0.0/0
GatewayId: !Ref InternetGateway
Subnet01:
Type: AWS::EC2::Subnet
Properties:
AvailabilityZone: !Select [ 0, !GetAZs '' ]
CidrBlock: !Ref Subnet01Block
VpcId: !Ref VPC
MapPublicIpOnLaunch: true
Tags:
- Key: Environment
Value: !Ref EnvironmentName
Subnet02:
Type: AWS::EC2::Subnet
Metadata:
Comment: Subnet 02
Properties:
AvailabilityZone: !Select [ 1, !GetAZs '' ]
CidrBlock: !Ref Subnet02Block
VpcId: !Ref VPC
MapPublicIpOnLaunch: true
Tags:
- Key: Environment
Value: !Ref EnvironmentName
Subnet03:
Type: AWS::EC2::Subnet
Metadata:
Comment: Subnet 03
Properties:
AvailabilityZone: !Select [ 2, !GetAZs '' ]
CidrBlock: !Ref Subnet03Block
VpcId: !Ref VPC
MapPublicIpOnLaunch: true
Tags:
- Key: Environment
Value: !Ref EnvironmentName
Subnet01RouteTableAssociation:
Type: AWS::EC2::SubnetRouteTableAssociation
Properties:
SubnetId: !Ref Subnet01
RouteTableId: !Ref RouteTable
Subnet02RouteTableAssociation:
Type: AWS::EC2::SubnetRouteTableAssociation
Properties:
SubnetId: !Ref Subnet02
RouteTableId: !Ref RouteTable
Subnet03RouteTableAssociation:
Type: AWS::EC2::SubnetRouteTableAssociation
Properties:
SubnetId: !Ref Subnet03
RouteTableId: !Ref RouteTable
ControlPlaneSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: Cluster communication with worker nodes
VpcId: !Ref VPC
ClusterRole:
Type: AWS::IAM::Role
Properties:
RoleName: !Sub ${EnvironmentName}KubernetesClusterRole
AssumeRolePolicyDocument:
Statement:
- Effect: Allow
Principal:
Service: eks.amazonaws.com
Action: sts:AssumeRole
ManagedPolicyArns:
- arn:aws:iam::aws:policy/AmazonEKSServicePolicy
- arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
Tags:
- Key: Environment
Value: !Ref EnvironmentName
Cluster:
Type: AWS::EKS::Cluster
Properties:
Name: !Sub ${EnvironmentName}KubernetesCluster
RoleArn: !GetAtt ClusterRole.Arn
ResourcesVpcConfig:
SecurityGroupIds:
- !Ref ControlPlaneSecurityGroup
SubnetIds:
- !Ref Subnet01
- !Ref Subnet02
- !Ref Subnet03
NodeRole:
Type: AWS::IAM::Role
Properties:
RoleName: !Sub ${EnvironmentName}KubernetesNodeRole
AssumeRolePolicyDocument:
Statement:
- Effect: Allow
Principal:
Service: ec2.amazonaws.com
Action: sts:AssumeRole
ManagedPolicyArns:
- arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
- arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
- arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
- arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess
Path: /
Tags:
- Key: Environment
Value: !Ref EnvironmentName
NodeInstanceProfile:
Type: AWS::IAM::InstanceProfile
Properties:
Path: "/"
Roles:
- !Ref NodeRole
NodeSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: Security group for all nodes in the cluster
VpcId: !Ref VPC
Tags:
- Key: !Sub "kubernetes.io/cluster/${EnvironmentName}KubernetesCluster"
Value: 'owned'
- Key: Environment
Value: !Ref EnvironmentName
NodeSecurityGroupIngress:
Type: AWS::EC2::SecurityGroupIngress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow node to communicate with each other
GroupId: !Ref NodeSecurityGroup
SourceSecurityGroupId: !Ref NodeSecurityGroup
IpProtocol: '-1'
FromPort: 0
ToPort: 65535
NodeSecurityGroupFromControlPlaneIngress:
Type: AWS::EC2::SecurityGroupIngress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow worker Kubelets and pods to receive communication from the cluster control plane
GroupId: !Ref NodeSecurityGroup
SourceSecurityGroupId: !Ref ControlPlaneSecurityGroup
IpProtocol: tcp
FromPort: 1025
ToPort: 65535
ControlPlaneEgressToNodeSecurityGroup:
Type: AWS::EC2::SecurityGroupEgress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow the cluster control plane to communicate with worker Kubelet and pods
GroupId: !Ref ControlPlaneSecurityGroup
DestinationSecurityGroupId: !Ref NodeSecurityGroup
IpProtocol: tcp
FromPort: 1025
ToPort: 65535
NodeSecurityGroupFromControlPlaneOn443Ingress:
Type: AWS::EC2::SecurityGroupIngress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow pods running extension API servers on port 443 to receive communication from cluster control plane
GroupId: !Ref NodeSecurityGroup
SourceSecurityGroupId: !Ref ControlPlaneSecurityGroup
IpProtocol: tcp
FromPort: 443
ToPort: 443
ControlPlaneEgressToNodeSecurityGroupOn443:
Type: AWS::EC2::SecurityGroupEgress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow the cluster control plane to communicate with pods running extension API servers on port 443
GroupId: !Ref ControlPlaneSecurityGroup
DestinationSecurityGroupId: !Ref NodeSecurityGroup
IpProtocol: tcp
FromPort: 443
ToPort: 443
ClusterControlPlaneSecurityGroupIngress:
Type: AWS::EC2::SecurityGroupIngress
DependsOn: NodeSecurityGroup
Properties:
Description: Allow pods to communicate with the cluster API Server
GroupId: !Ref ControlPlaneSecurityGroup
SourceSecurityGroupId: !Ref NodeSecurityGroup
IpProtocol: tcp
ToPort: 443
FromPort: 443
NodeGroup:
Type: AWS::AutoScaling::AutoScalingGroup
Properties:
DesiredCapacity: !Ref NodeAutoScalingGroupDesiredCapacity
LaunchConfigurationName: !Ref NodeLaunchConfig
MinSize: !Ref NodeAutoScalingGroupMinSize
MaxSize: !Ref NodeAutoScalingGroupMaxSize
VPCZoneIdentifier:
- !Ref Subnet01
- !Ref Subnet02
- !Ref Subnet03
Tags:
- Key: Name
Value: !Sub "${EnvironmentName}KubernetesCluster-Node"
PropagateAtLaunch: 'true'
- Key: !Sub 'kubernetes.io/cluster/${EnvironmentName}KubernetesCluster'
Value: 'owned'
PropagateAtLaunch: 'true'
UpdatePolicy:
AutoScalingRollingUpdate:
MaxBatchSize: '1'
MinInstancesInService: !Ref NodeAutoScalingGroupDesiredCapacity
PauseTime: 'PT5M'
NodeLaunchConfig:
Type: AWS::AutoScaling::LaunchConfiguration
Properties:
AssociatePublicIpAddress: 'true'
IamInstanceProfile: !Ref NodeInstanceProfile
ImageId: !Ref NodeImageId
InstanceType: !Ref NodeInstanceType
KeyName: !Ref KeyName
SecurityGroups:
- !Ref NodeSecurityGroup
BlockDeviceMappings:
- DeviceName: /dev/xvda
Ebs:
VolumeSize: 20
VolumeType: gp2
DeleteOnTermination: true
UserData:
Fn::Base64:
!Sub |
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh ${EnvironmentName}KubernetesCluster ${BootstrapArguments}
/opt/aws/bin/cfn-signal --exit-code $? \
--stack ${AWS::StackName} \
--resource NodeGroup \
--region ${AWS::Region}
Outputs:
KubernetesClusterName:
Description: Cluster name
Value: !Ref Cluster
Export:
Name: KubernetesClusterName
KubernetesClusterEndpoint:
Description: Cluster endpoint
Value: !GetAtt Cluster.Endpoint
Export:
Name: KubernetesClusterEndpoint
KubernetesNodeInstanceProfile:
Description: The name of the IAM profile for K8
Value: !GetAtt NodeInstanceProfile.Arn
Export:
Name: KubernetesNodeInstanceProfileArn
有两种方法可以将Worker节点添加到你的EKS集群中。
从你的模板中可以看出,你现在使用的是第一种方法。重要的是,你需要等到EKS集群准备好并处于激活状态时,再启动工作节点。您可以通过使用 取决于属性. 如果这还不能解决您的问题,请查看云计算初始化日志 (varlogcloud-init-output.log),检查加入集群时发生了什么。
如果您想使用托管节点组,只需删除 AutoScaling Group 和 LaunchConfiguration 并使用此类型。https:/docs.aws.amazon.comAWSCloudFormationlatestUserGuideaws-resource-eks-nodegroup.html。这样做的好处是,AWS会帮你在账户中创建所需的资源(AutoScaling Group和LaunchTemplate),你可以在AWS Console中看到节点组。