Terraform - editing an EKS managed node group triggers RBAC permission errors


Using Terraform I deployed a Kubernetes cluster on AWS (EKS), and everything went smoothly. The problem shows up whenever I try to change a node group or create a new one.

Background:

I used the following code:

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.15.3"

  cluster_name    = var.cluster_name
  cluster_version = "1.27"

  # EKS Addons
  cluster_addons = {
    
    coredns = {
      most_recent = true  # To ensure access to the latest settings provided
    }

    kube-proxy = {
      most_recent = true  # To ensure access to the latest settings provided
    }


    vpc-cni = {
      # Specify the VPC CNI addon should be deployed before compute to ensure
      # the addon is configured before data plane compute resources are created
      before_compute = true
      most_recent    = true  # To ensure access to the latest settings provided
      configuration_values = jsonencode({
      })
    }
  }

  vpc_id                         = module.vpc.vpc_id
  subnet_ids                     = module.vpc.private_subnets
  cluster_endpoint_public_access = true

  # Calico needs VXLAN communication between nodes
  node_security_group_additional_rules = {

    ingress_self_all = {
      description = "Node to node all ports/protocols"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }

    ingress_cluster_to_node_all = {
      description                   = "API Server to nodes all ports/protocols"
      protocol                      = "-1"
      from_port                     = 0
      to_port                       = 0
      type                          = "ingress"
      source_cluster_security_group = true
    }

    egress_all = {
      description      = "Node all egress"
      protocol         = "-1"
      from_port        = 0
      to_port          = 0
      type             = "egress"
      cidr_blocks      = ["0.0.0.0/0"]
      ipv6_cidr_blocks = ["::/0"]
    }
  }

  eks_managed_node_group_defaults = {
    ami_type = "AL2_x86_64"
  }

  eks_managed_node_groups = {

    default_nodes_groups = {
      name = "node-group-1"

      instance_types = ["t3.small"]

      min_size     = 3
      max_size     = 3
      desired_size = 3
    }

  }

  manage_aws_auth_configmap = true
  aws_auth_roles = [
    {
      rolearn  = module.eks_admins_iam_role.iam_role_arn
      username = module.eks_admins_iam_role.iam_role_name
      groups   = ["system:masters"]
    },
  ]
}

The VPC is very basic; a minimal sketch of roughly what I used follows (the name, AZs and CIDRs here are placeholders):
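# Sketch only: name, AZs and CIDRs are placeholders
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.1.1"

  name = "${var.cluster_name}-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  # Nodes live in the private subnets and reach the internet through NAT
  enable_nat_gateway = true
  single_nat_gateway = true

  # Subnet tags used by the AWS Load Balancer Controller for subnet discovery
  public_subnet_tags = {
    "kubernetes.io/role/elb" = 1
  }
  private_subnet_tags = {
    "kubernetes.io/role/internal-elb" = 1
  }
}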

For the IAM role tied to the RBAC group system:masters I followed this very detailed reference: reference link. Basically, I:

  1. Created an IAM policy granting the appropriate eks and iam permissions
  2. Created an assumable role carrying that policy (the role referenced in the aws_auth_roles section above; see the sketch after this list)
  3. Created a policy that lets users assume that role
  4. Created a group with this assume-role policy and the admin users (for kubectl access... for Terraform access through the Kubernetes provider, the role from step 2 is enough)
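
For steps 1 and 2, a rough sketch of what the reference builds, assuming the iam-policy and iam-assumable-role submodules of terraform-aws-modules/iam (policy contents and names are placeholders):

data "aws_caller_identity" "current" {}

# Step 1: policy granting the EKS permissions the admins need (sketch)
module "allow_eks_access_iam_policy" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-policy"
  version = "5.28.0"

  name          = "allow-eks-access"
  create_policy = true

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action   = ["eks:DescribeCluster"]
        Effect   = "Allow"
        Resource = "*"
      }
    ]
  })
}

# Step 2: assumable role carrying the policy; this is the role mapped
# to system:masters in aws_auth_roles above
module "eks_admins_iam_role" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-assumable-role"
  version = "5.28.0"

  create_role       = true
  role_name         = "eks-admin"
  role_requires_mfa = false

  custom_role_policy_arns = [module.allow_eks_access_iam_policy.arn]

  # Any principal in this account may assume the role
  # (subject to the user-side policy from step 3)
  trusted_role_arns = [
    "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"
  ]
}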

Then, with the following, Terraform can interact with the cluster during creation/editing:

data "aws_eks_cluster_auth" "default" {
  name = var.cluster_name

  depends_on = [ module.eks.eks_managed_node_groups, ]
}

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.default.token
}   

provider "helm" {
  kubernetes {
    host                   = module.eks.cluster_endpoint
    cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
    token                  = data.aws_eks_cluster_auth.default.token
  }
}
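
For reference, the EKS module's documentation also shows an exec-based variant of this provider block, where a token is generated on demand through the aws CLI on every Kubernetes API call instead of once per run (sketch; it assumes the aws CLI is available where Terraform runs):

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # A fresh token is requested for each call, so it cannot go stale mid-run
    args = ["eks", "get-token", "--cluster-name", var.cluster_name]
  }
}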

Finally, I added Calico and the AWS Load Balancer Controller:

# Calico addon to exploit Network Security Policies in EKS
module "kubernetes_addons_calico" {
  count  = var.calico_network_policies_enabled ? 1 : 0
  source = "github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons?ref=v4.32.1"

  eks_cluster_id = module.eks.cluster_name

  enable_calico = true

  calico_helm_config = {
    name       = "calico"                                # (Required) Release name.
    repository = "https://docs.projectcalico.org/charts" # (Optional) Repository URL where to locate the requested chart.
    chart      = "tigera-operator"                       # (Required) Chart name to be installed.
    version    = "v3.26.1"                               # (Optional) Specify the exact chart version to install.
    namespace  = "tigera-operator"                       # (Optional) The namespace to install the release into.
    values = [
      <<-EOT
          installation:
            kubernetesProvider: EKS
        EOT
    ]
  }

}
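
For context, this is the kind of object the Calico install is there to enforce; a hypothetical example (the "app" namespace is a placeholder):

# Hypothetical default-deny policy; Calico is the engine that enforces it
resource "kubernetes_network_policy" "default_deny_ingress" {
  metadata {
    name      = "default-deny-ingress"
    namespace = "app" # placeholder namespace
  }

  spec {
    # Empty selector matches every pod in the namespace
    pod_selector {}
    policy_types = ["Ingress"]
  }
}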

module "lb_role" {
  source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "5.28.0"

  role_name                              = "${var.cluster_name}-load-balancer-controller"
  attach_load_balancer_controller_policy = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-load-balancer-controller"]
    }
  }
}

# Deploy the AWS Load Balancer Controller
# it creates also the service-role
resource "helm_release" "lb_controller" {
  name       = "aws-load-balancer-controller"
  chart      = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  version    = "1.5.5"
  namespace  = "kube-system"

  set {
    name  = "clusterName"
    value = var.cluster_name
  }

  set {
    name  = "rbac.create"
    value = "true"
  }

  set {
    name  = "serviceAccount.create"
    value = "true"
  }

  set {
    name  = "serviceAccount.name"
    value = "aws-load-balancer-controller"
  }

  set {
    name  = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
    value = module.lb_role.iam_role_arn
  }
}
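
To show how the controller gets consumed later (not part of the failing plan), here is a hypothetical Ingress that the controller would turn into an ALB; names and namespace are placeholders:

resource "kubernetes_ingress_v1" "example" {
  metadata {
    name      = "example"
    namespace = "app" # placeholder
    annotations = {
      "alb.ingress.kubernetes.io/scheme"      = "internet-facing"
      "alb.ingress.kubernetes.io/target-type" = "ip"
    }
  }

  spec {
    # Handled by the AWS Load Balancer Controller deployed above
    ingress_class_name = "alb"

    rule {
      http {
        path {
          path      = "/"
          path_type = "Prefix"
          backend {
            service {
              name = "example-service" # placeholder backend Service
              port {
                number = 80
              }
            }
          }
        }
      }
    }
  }
}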

The problem:

Everything works fine, and I manage to do everything on the cluster. Nevertheless, if I try to add nodes to a node group, or to create a new node group, I get the following errors:

│ Error: query: failed to query with labels: secrets is forbidden: User "system:anonymous" cannot list resource "secrets" in API group "" in the namespace "tigera-operator"
│ 
│   with module.k8s_control_plane.module.kubernetes_addons_calico[0].module.calico[0].module.helm_addon.helm_release.addon[0],
│   on .terraform/modules/k8s_control_plane.kubernetes_addons_calico/modules/kubernetes-addons/helm-addon/main.tf line 1, in resource "helm_release" "addon":
│    1: resource "helm_release" "addon" {
│ 
╵
╷
│ Error: configmaps "aws-auth" is forbidden: User "system:anonymous" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
│ 
│   with module.k8s_control_plane.module.eks.kubernetes_config_map_v1_data.aws_auth[0],
│   on .terraform/modules/k8s_control_plane.eks/main.tf line 553, in resource "kubernetes_config_map_v1_data" "aws_auth":
│  553: resource "kubernetes_config_map_v1_data" "aws_auth" {
│ 
╵
╷
│ Warning: Argument is deprecated
│ 
│   with module.k8s_control_plane.module.eks.aws_eks_addon.before_compute["vpc-cni"],
│   on .terraform/modules/k8s_control_plane.eks/main.tf line 420, in resource "aws_eks_addon" "before_compute":
│  420:   resolve_conflicts        = try(each.value.resolve_conflicts, "OVERWRITE")
│ 
│ The "resolve_conflicts" attribute can't be set to "PRESERVE" on initial
│ resource creation. Use "resolve_conflicts_on_create" and/or
│ "resolve_conflicts_on_update" instead
╵
╷
│ Error: query: failed to query with labels: secrets is forbidden: User "system:anonymous" cannot list resource "secrets" in API group "" in the namespace "kube-system"
│ 
│   with module.k8s_control_plane.helm_release.lb_controller,
│   on ../../../modules/eks-ctrl-plane/aws-lb.tf line 26, in resource "helm_release" "lb_controller":
│   26: resource "helm_release" "lb_controller" {
│ 
╵
╷
│ Warning: Argument is deprecated
│ 
│   with module.k8s_control_plane.module.eks.aws_eks_addon.this["kube-proxy"],
│   on .terraform/modules/k8s_control_plane.eks/main.tf line 392, in resource "aws_eks_addon" "this":
│  392:   resolve_conflicts        = try(each.value.resolve_conflicts, "OVERWRITE")
│ 
│ The "resolve_conflicts" attribute can't be set to "PRESERVE" on initial
│ resource creation. Use "resolve_conflicts_on_create" and/or
│ "resolve_conflicts_on_update" instead
╵
╷
│ Warning: Argument is deprecated
│ 
│   with module.k8s_control_plane.module.eks.aws_eks_addon.this["coredns"],
│   on .terraform/modules/k8s_control_plane.eks/main.tf line 392, in resource "aws_eks_addon" "this":
│  392:   resolve_conflicts        = try(each.value.resolve_conflicts, "OVERWRITE")
│ 
│ The "resolve_conflicts" attribute can't be set to "PRESERVE" on initial
│ resource creation. Use "resolve_conflicts_on_create" and/or
│ "resolve_conflicts_on_update" instead
╵
Operation failed: failed running terraform plan (exit 1)

Note that the warnings also appeared in previously successful runs, and I'm fairly sure they are harmless (I include them for completeness).

The two main changes I tried (not together) are:

    default_nodes_groups = {
      name = "node-group-1"

      instance_types = ["t3.small"]

      min_size     = 3
      max_size     = 5
      desired_size = 5
    }

and:

eks_managed_node_groups = {

    default_nodes_groups = {
      name = "node-group-1"

      instance_types = ["t3.small"]

      min_size     = 3
      max_size     = 3
      desired_size = 3
    }

    extra_nodes_groups = {
      name = "node-group-2"

      instance_types = ["t3.small"]

      min_size     = 1
      max_size     = 2
      desired_size = 1
    }

  }

With either change, everything fails in the same way.

I have little idea what the problem could be. I tried updating things, and I went back and forth trying to understand why these changes are executed as the system:anonymous user, but I really don't get it. Also, why does this change need to query Secrets in the cluster at all, and why are those queries performed some other way instead of through the Kubernetes provider, which has all the permissions?

I assumed Kubernetes operations were executed through the Kubernetes provider with system:masters RBAC permissions (indeed I can update all the addons/resources, create/destroy them, and so on), but it seems changes to node groups use something else. I checked the Terraform user's permissions, but it already has the full administrator policy, so if the change were executed through its IAM permissions it could basically do anything on EKS.

My only hypothesis is that somehow, only during edits (not creation... those queries are strange), the EKS module talks to the Kubernetes API as the Terraform user (rather than through the Kubernetes provider with the assumed role that carries RBAC system:masters), which could explain system:anonymous... but even if that is the problem, why would changing the cluster behave differently from creating it in the first place?

Does anyone know what could be going wrong, and how to fix it so I can manually scale nodes up and down from Terraform?

kubernetes terraform terraform-provider-aws amazon-eks rbac