kubernetes ingress-nginx | Readiness probe failed

Question

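Hi all, I am close to finishing the installation of a brand-new cluster, but I have run into a strange problem.

I deployed ingress-nginx both from the manifests and from the helm chart, and both give me the same result: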

kubectl get po 
nginx-ingress-dx6bg             0/1     Running             3 (26s ago)     3m44s   10.244.4.118   node-2   <none>           <none>
nginx-ingress-gqkhz             0/1     Running             3 (29s ago)     3m47s   10.244.3.16    node-1   <none>           <none>
nginx-ingress-dx6bg             0/1     Error               3 (86s ago)     4m44s   10.244.4.118   node-2   <none>           <none>
nginx-ingress-gqkhz             0/1     Error               3 (89s ago)     4m47s   10.244.3.16    node-1   <none>           <none>
nginx-ingress-dx6bg             0/1     CrashLoopBackOff    3 (12s ago)     4m56s   10.244.4.118   node-2   <none>           <none>
nginx-ingress-gqkhz             0/1     CrashLoopBackOff    3 (13s ago)     4m59s   10.244.3.16    node-1   <none>           <none>
nginx-ingress-gqkhz             0/1     Running             4 (44s ago)     5m30s   10.244.3.16    node-1   <none>           <none>
nginx-ingress-dx6bg             0/1     Running             4 (51s ago)     5m35s   10.244.4.118   node-2   <none>           <none>
nginx-ingress-b9fcfbb59-hwjc8   0/1     Running             6 (2m49s ago)   12m     10.244.4.116   node-2   <none>           <none>

Describing the pod shows that the problem is with the readiness probe:

kd po -n nginx-ingress nginx-ingress-b9fcfbb59-hwjc8
Name:             nginx-ingress-b9fcfbb59-hwjc8
Namespace:        nginx-ingress
Priority:         0
Service Account:  nginx-ingress
Node:             node-2/192.168.17.15
Start Time:       Thu, 08 Feb 2024 17:09:37 +0100
Labels:           app=nginx-ingress
                  app.kubernetes.io/name=nginx-ingress
                  app.kubernetes.io/version=3.4.2
                  app.nginx.org/version=1.25.3
                  pod-template-hash=b9fcfbb59
Annotations:      <none>
Status:           Running
SeccompProfile:   RuntimeDefault
IP:               10.244.4.116
IPs:
  IP:           10.244.4.116
Controlled By:  ReplicaSet/nginx-ingress-b9fcfbb59
Containers:
  nginx-ingress:
    Container ID:  containerd://57299408237d9d8b1b7be67ac12d6999640ff2249305c8d289a78a58fe6b38c9
    Image:         nginx/nginx-ingress:3.4.2
    Image ID:      docker.io/nginx/nginx-ingress@sha256:4b97f1d3466c804d51abbdeb84f2c7c3ea00d6a937a320d62a4cf9d6b447d6ad
    Ports:         80/TCP, 443/TCP, 8081/TCP, 9113/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Args:
      -nginx-configmaps=$(POD_NAMESPACE)/nginx-config
    State:          Running
      Started:      Thu, 08 Feb 2024 17:17:51 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Thu, 08 Feb 2024 17:15:30 +0100
      Finished:     Thu, 08 Feb 2024 17:16:30 +0100
    Ready:          False
    Restart Count:  5
    Requests:
      cpu:      100m
      memory:   128Mi
    Readiness:  http-get http://:readiness-port/nginx-ready delay=0s timeout=1s period=1s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:  nginx-ingress (v1:metadata.namespace)
      POD_NAME:       nginx-ingress-b9fcfbb59-hwjc8 (v1:metadata.name)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vlfd8 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-vlfd8:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                      From               Message
  ----     ------     ----                     ----               -------
  Normal   Scheduled  8m57s                    default-scheduler  Successfully assigned nginx-ingress/nginx-ingress-b9fcfbb59-hwjc8 to node-2
  Normal   Pulling    8m57s                    kubelet            Pulling image "nginx/nginx-ingress:3.4.2"
  Normal   Pulled     8m35s                    kubelet            Successfully pulled image "nginx/nginx-ingress:3.4.2" in 21.588s (21.589s including waiting)
  Normal   Created    8m35s                    kubelet            Created container nginx-ingress
  Normal   Started    8m35s                    kubelet            Started container nginx-ingress
  Warning  Unhealthy  3m56s (x250 over 8m34s)  kubelet            Readiness probe failed: Get "http://10.244.4.116:8081/nginx-ready": dial tcp 10.244.4.116:8081: connect: connection refused

Following a known issue documented by NGINX, I told helm to increase the reload timeout, but that made no difference.

helm install nginx-ingress-controller nginx-stable/nginx-ingress  --set rbac.create=true --set controller."nodeSelector\.kubernetes\.io/hostname"=node-2 --set nginxReloadTimeout=20000

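Do you have any suggestions? Ideally something that does not require resetting the entire cluster?

On a different cluster it works fine.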

Answer 1

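Purely from a deployment perspective, first rule out a resource problem. Describe the stopped replicas nginx-ingress-gqkhz or nginx-ingress-dx6bg and check for errors. I would also suggest scaling down to 1 or 2 replicas and seeing whether the container starts. The readiness probe failure on its own does not tell you much.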
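
The checks suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the namespace nginx-ingress and the Deployment name nginx-ingress are inferred from the describe output in the question, and the gqkhz and dx6bg pods may live in another namespace or belong to a DaemonSet, which cannot be scaled this way. KUBECTL and NS are hypothetical override variables, so the commands can be dry-run with echo.

```shell
# Assumed values inferred from the question; adjust to your cluster.
KUBECTL="${KUBECTL:-kubectl}"
NS="${NS:-nginx-ingress}"

# Describe one pod, then list the events recorded for that pod only.
inspect_pod() {
  "$KUBECTL" describe pod -n "$NS" "$1"
  "$KUBECTL" get events -n "$NS" --field-selector "involvedObject.name=$1"
}

# Scale the (assumed) Deployment to the given replica count.
scale_ingress() {
  "$KUBECTL" scale deployment nginx-ingress -n "$NS" --replicas="$1"
}
```

For example, run inspect_pod nginx-ingress-gqkhz first, then scale_ingress 1, and watch whether the single remaining pod becomes ready.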

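Also, on a container that shows as Running, read the logs (kubectl logs podname containername). That may give you some information.

Although I see CrashLoopBackOff on some replicas, I would rule out a network problem, since some replicas have pulled the image.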
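
As a minimal sketch of that log check: because the container keeps restarting (exit code 255 in the describe output), the logs of the previous, crashed instance are often more informative than those of the current one. The namespace is assumed from the question, and KUBECTL and NS are hypothetical override variables.

```shell
# Assumed values inferred from the question; adjust to your cluster.
KUBECTL="${KUBECTL:-kubectl}"
NS="${NS:-nginx-ingress}"

# Print the last lines of the current container instance, then of the
# previous (terminated) instance of the same container.
pod_logs() {
  "$KUBECTL" logs -n "$NS" "$1" --tail=50
  "$KUBECTL" logs -n "$NS" "$1" --previous --tail=50
}
```

For example: pod_logs nginx-ingress-b9fcfbb59-hwjc8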
