无法连接Kubernetes的ETCD

问题描述 投票:0回答:2

我已经不小心耗尽/取消了 Kubernetes 中的所有节点(甚至是主节点),现在我尝试通过连接到 ETCD 并手动更改其中的一些密钥来将其恢复。我成功地进入了etcd容器:

$ docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS               NAMES
8fbcb67da963        quay.io/coreos/etcd:v3.3.10                "/usr/local/bin/etcd"    17 hours ago        Up 17 hours                             etcd1
a0d6426df02a        cd48205a40f0                               "kube-controller-man…"   17 hours ago        Up 17 hours                             k8s_kube-controller-manager_kube-controller-manager-node1_kube-system_0441d7804a7366fd957f8b402008efe5_16
5fa8e47441a0        6bed756ced73                               "kube-scheduler --au…"   17 hours ago        Up 17 hours                             k8s_kube-scheduler_kube-scheduler-node1_kube-system_6f33d7866b72ca1b13c79edd42fa8dc6_14
2c8e07cf499f        gcr.io/google_containers/pause-amd64:3.1   "/pause"                 17 hours ago        Up 17 hours                             k8s_POD_kube-scheduler-node1_kube-system_6f33d7866b72ca1b13c79edd42fa8dc6_3
2ca43282ea1c        gcr.io/google_containers/pause-amd64:3.1   "/pause"                 17 hours ago        Up 17 hours                             k8s_POD_kube-controller-manager-node1_kube-system_0441d7804a7366fd957f8b402008efe5_3
9473644a3333        gcr.io/google_containers/pause-amd64:3.1   "/pause"                 17 hours ago        Up 17 hours                             k8s_POD_kube-apiserver-node1_kube-system_93ff1a9840f77f8b2b924a85815e17fe_3

然后我跑:

docker exec -it 8fbcb67da963 /bin/sh

然后我尝试运行以下命令:

ETCDCTL_API=3 etcdctl --endpoints https://172.16.16.111:2379 --cacert /etc/ssl/etcd/ssl/ca.pem --key /etc/ssl/etcd/ssl/member-node1-key.pem --cert /etc/ssl/etcd/ssl/member-node1.pem get / --prefix=true -w json --debug

这是我得到的结果:

ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
ETCDCTL_CERT=/etc/ssl/etcd/ssl/member-node1.pem
ETCDCTL_COMMAND_TIMEOUT=5s
ETCDCTL_DEBUG=true
ETCDCTL_DIAL_TIMEOUT=2s
ETCDCTL_DISCOVERY_SRV=
ETCDCTL_ENDPOINTS=[https://172.16.16.111:2379]
ETCDCTL_HEX=false
ETCDCTL_INSECURE_DISCOVERY=true
ETCDCTL_INSECURE_SKIP_TLS_VERIFY=false
ETCDCTL_INSECURE_TRANSPORT=true
ETCDCTL_KEEPALIVE_TIME=2s
ETCDCTL_KEEPALIVE_TIMEOUT=6s
ETCDCTL_KEY=/etc/ssl/etcd/ssl/member-node1-key.pem
ETCDCTL_USER=
ETCDCTL_WRITE_OUT=json
INFO: 2020/06/24 15:44:07 ccBalancerWrapper: updating state and picker called by balancer: IDLE, 0xc420246c00
INFO: 2020/06/24 15:44:07 dialing to target with scheme: ""
INFO: 2020/06/24 15:44:07 could not get resolver for scheme: ""
INFO: 2020/06/24 15:44:07 balancerWrapper: is pickfirst: false
INFO: 2020/06/24 15:44:07 balancerWrapper: got update addr from Notify: [{172.16.16.111:2379 <nil>}]
INFO: 2020/06/24 15:44:07 ccBalancerWrapper: new subconn: [{172.16.16.111:2379 0  <nil>}]
INFO: 2020/06/24 15:44:07 balancerWrapper: handle subconn state change: 0xc4201708d0, CONNECTING
INFO: 2020/06/24 15:44:07 ccBalancerWrapper: updating state and picker called by balancer: CONNECTING, 0xc420246c00
Error: context deadline exceeded

这是我的etcd.env:

# Environment file for etcd v3.3.10
ETCD_DATA_DIR=/var/lib/etcd
ETCD_ADVERTISE_CLIENT_URLS=https://172.16.16.111:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://172.16.16.111:2380
ETCD_INITIAL_CLUSTER_STATE=existing
ETCD_METRICS=basic
ETCD_LISTEN_CLIENT_URLS=https://172.16.16.111:2379,https://127.0.0.1:2379
ETCD_ELECTION_TIMEOUT=5000
ETCD_HEARTBEAT_INTERVAL=250
ETCD_INITIAL_CLUSTER_TOKEN=k8s_etcd
ETCD_LISTEN_PEER_URLS=https://172.16.16.111:2380
ETCD_NAME=etcd1
ETCD_PROXY=off
ETCD_INITIAL_CLUSTER=etcd1=https://172.16.16.111:2380,etcd2=https://172.16.16.112:2380,etcd3=https://172.16.16.113:2380
ETCD_AUTO_COMPACTION_RETENTION=8
ETCD_SNAPSHOT_COUNT=10000

# TLS settings
ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_CERT_FILE=/etc/ssl/etcd/ssl/member-node1.pem
ETCD_KEY_FILE=/etc/ssl/etcd/ssl/member-node1-key.pem
ETCD_CLIENT_CERT_AUTH=true

ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_PEER_CERT_FILE=/etc/ssl/etcd/ssl/member-node1.pem
ETCD_PEER_KEY_FILE=/etc/ssl/etcd/ssl/member-node1-key.pem
ETCD_PEER_CLIENT_CERT_AUTH=True

更新1:

这是我的 kubeadm-config.yaml:

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.16.111
  bindPort: 6443
certificateKey: d73faece88f86e447eea3ca38f7b07e0a1f0bbb886567fee3b8cf8848b1bf8dd
nodeRegistration:
  name: node1
  taints: []
  criSocket: /var/run/dockershim.sock
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: cluster.local
etcd:
  external:
      endpoints:
      - https://172.16.16.111:2379
      - https://172.16.16.112:2379
      - https://172.16.16.113:2379
      caFile: /etc/ssl/etcd/ssl/ca.pem
      certFile: /etc/ssl/etcd/ssl/node-node1.pem
      keyFile: /etc/ssl/etcd/ssl/node-node1-key.pem
dns:
  type: CoreDNS
  imageRepository: docker.io/coredns
  imageTag: 1.6.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.233.0.0/18
  podSubnet: 10.233.64.0/18
kubernetesVersion: v1.16.6
controlPlaneEndpoint: 172.16.16.111:6443
certificatesDir: /etc/kubernetes/ssl
imageRepository: gcr.io/google-containers
apiServer:
  extraArgs:
    anonymous-auth: "True"
    authorization-mode: Node,RBAC
    bind-address: 0.0.0.0
    insecure-port: "0"
    apiserver-count: "1"
    endpoint-reconciler-type: lease
    service-node-port-range: 30000-32767
    kubelet-preferred-address-types: "InternalDNS,InternalIP,Hostname,ExternalDNS,ExternalIP"
    profiling: "False"
    request-timeout: "1m0s"
    enable-aggregator-routing: "False"
    storage-backend: etcd3
    runtime-config: 
    allow-privileged: "true"
  extraVolumes:
  - name: usr-share-ca-certificates
    hostPath: /usr/share/ca-certificates
    mountPath: /usr/share/ca-certificates
    readOnly: true
  certSANs:
  - kubernetes
  - kubernetes.default
  - kubernetes.default.svc
  - kubernetes.default.svc.cluster.local
  - 10.233.0.1
  - localhost
  - 127.0.0.1
  - node1
  - lb-apiserver.kubernetes.local
  - 172.16.16.111
  - node1.cluster.local
  timeoutForControlPlane: 5m0s
controllerManager:
  extraArgs:
    node-monitor-grace-period: 40s
    node-monitor-period: 5s
    pod-eviction-timeout: 5m0s
    node-cidr-mask-size: "24"
    profiling: "False"
    terminated-pod-gc-threshold: "12500"
    bind-address: 0.0.0.0
    configure-cloud-routes: "false"
scheduler:
  extraArgs:
    bind-address: 0.0.0.0
  extraVolumes:
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
bindAddress: 0.0.0.0
clientConnection:
 acceptContentTypes: 
 burst: 10
 contentType: application/vnd.kubernetes.protobuf
 kubeconfig: 
 qps: 5
clusterCIDR: 10.233.64.0/18
configSyncPeriod: 15m0s
conntrack:
 maxPerCore: 32768
 min: 131072
 tcpCloseWaitTimeout: 1h0m0s
 tcpEstablishedTimeout: 24h0m0s
enableProfiling: False
healthzBindAddress: 0.0.0.0:10256
hostnameOverride: node1
iptables:
 masqueradeAll: False
 masqueradeBit: 14
 minSyncPeriod: 0s
 syncPeriod: 30s
ipvs:
 excludeCIDRs: []
 minSyncPeriod: 0s
 scheduler: rr
 syncPeriod: 30s
 strictARP: False
metricsBindAddress: 127.0.0.1:10249
mode: ipvs
nodePortAddresses: []
oomScoreAdj: -999
portRange: 
udpIdleTimeout: 250ms
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDNS:
- 169.254.25.10

更新2:

/etc/kubernetes/manigests/kube-apiserver.yaml 的内容:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=172.16.16.111
    - --allow-privileged=true
    - --anonymous-auth=True
    - --apiserver-count=1
    - --authorization-mode=Node,RBAC
    - --bind-address=0.0.0.0
    - --client-ca-file=/etc/kubernetes/ssl/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-aggregator-routing=False
    - --enable-bootstrap-token-auth=true
    - --endpoint-reconciler-type=lease
    - --etcd-cafile=/etc/ssl/etcd/ssl/ca.pem
    - --etcd-certfile=/etc/ssl/etcd/ssl/node-node1.pem
    - --etcd-keyfile=/etc/ssl/etcd/ssl/node-node1-key.pem
    - --etcd-servers=https://172.16.16.111:2379,https://172.16.16.112:2379,https://172.16.16.113:2379
    - --insecure-port=0
    - --kubelet-client-certificate=/etc/kubernetes/ssl/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/ssl/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalDNS,InternalIP,Hostname,ExternalDNS,ExternalIP
    - --profiling=False
    - --proxy-client-cert-file=/etc/kubernetes/ssl/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/ssl/front-proxy-client.key
    - --request-timeout=1m0s
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/ssl/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --runtime-config=
    - --secure-port=6443
    - --service-account-key-file=/etc/kubernetes/ssl/sa.pub
    - --service-cluster-ip-range=10.233.0.0/18
    - --service-node-port-range=30000-32767
    - --storage-backend=etcd3
    - --tls-cert-file=/etc/kubernetes/ssl/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/ssl/apiserver.key
    image: gcr.io/google-containers/kube-apiserver:v1.16.6
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 172.16.16.111
        path: /healthz
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-apiserver
    resources:
      requests:
        cpu: 250m
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/ca-certificates
      name: etc-ca-certificates
      readOnly: true
    - mountPath: /etc/ssl/etcd/ssl
      name: etcd-certs-0
      readOnly: true
    - mountPath: /etc/kubernetes/ssl
      name: k8s-certs
      readOnly: true
    - mountPath: /usr/local/share/ca-certificates
      name: usr-local-share-ca-certificates
      readOnly: true
    - mountPath: /usr/share/ca-certificates
      name: usr-share-ca-certificates
      readOnly: true
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /etc/ssl/etcd/ssl
      type: DirectoryOrCreate
    name: etcd-certs-0
  - hostPath:
      path: /etc/kubernetes/ssl
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /usr/local/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-local-share-ca-certificates
  - hostPath:
      path: /usr/share/ca-certificates
      type: ""
    name: usr-share-ca-certificates
status: {}

我使用kubespray安装集群。

如何连接到etcd?任何帮助将不胜感激。

docker kubernetes kubectl etcd
2个回答
2
投票

发生这种

context deadline exceeded
通常是因为

  1. 使用错误的证书。您可以使用对等证书而不是客户端证书。您需要检查 Kubernetes API 服务器参数,该参数将告诉您客户端证书位于何处,因为 Kubernetes API 服务器是 ETCD 的客户端。然后,您可以在节点的

    etcdctl
    命令中使用这些相同的证书。

  2. etcd 集群不再运行,因为对等成员已关闭。


0
投票

@Arghya Sadhu 提供了正确的解释。

现在,对于通过

kubeadm
并使用默认选项创建集群的人们来说,这里是从
kubectl
访问 etcd 的正确完整命令:

k exec -n kube-system etcd-<master_node> -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/ser
ver.crt --key /etc/kubernetes/pki/etcd/server.key <etcd_subcommand>

事实上,证书会在集群创建时复制到此位置的 etcd 节点,以便通过

etcdctl
进行本地访问。

© www.soinside.com 2019 - 2024. All rights reserved.