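I have a Spring Boot 3.2.1 application that I run on Minikube inside WSL, using the Docker driver. The application is very small and starts in just a few seconds. When it runs on localhost, the readiness probe is reachable at http://localhost:8080/actuator/health/readiness and returns a valid status.
However, when I deploy the service to minikube, I notice that "describe pod" contains: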
Liveness probe failed: Get "http://10.244.0.28:8080/actuator/health/liveness":
dial tcp 10.244.0.28:8080: connect: connection refused
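This causes the Pod to restart several times in a row; only after a few minutes do the restarts stop and the Pod stabilizes. The Pod logs show no errors, only the standard output seen when Spring Boot starts. These are my manifests: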
apiVersion: v1
kind: Service
metadata:
  name: my-portal-gateway
  labels:
    helm.sh/chart: my-portal-gateway-0.1.0
    app.kubernetes.io/name: my-portal-gateway
    app.kubernetes.io/instance: my-portal-gateway
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/name: my-portal-gateway
    app.kubernetes.io/instance: my-portal-gateway
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 8080
---
# Source: my-portal-gateway/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-portal-gateway
  labels:
    helm.sh/chart: my-portal-gateway-0.1.0
    app.kubernetes.io/name: my-portal-gateway
    app.kubernetes.io/instance: my-portal-gateway
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: my-portal-gateway
      app.kubernetes.io/instance: my-portal-gateway
  template:
    metadata:
      labels:
        helm.sh/chart: my-portal-gateway-0.1.0
        app.kubernetes.io/name: my-portal-gateway
        app.kubernetes.io/instance: my-portal-gateway
        app.kubernetes.io/version: "1.0.0"
        app.kubernetes.io/managed-by: Helm
    spec:
      containers:
        - name: my-portal-gateway
          image: "myrepo/my-portal-gateway:1.0.0"
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 20
            failureThreshold: 3
            successThreshold: 1
            timeoutSeconds: 5
My application.yaml file in Spring Boot:
server:
  address: 0.0.0.0
management:
  endpoints:
    web:
      exposure:
        include:
          - health
          - info
  endpoint:
    health:
      group:
        readiness:
          include: readinessProbe
      probes:
        enabled: true
How can I solve the Pod restart problem? Thanks.
I finally found a solution: in addition to the existing initialDelaySeconds on the readiness probe, add initialDelaySeconds to the liveness probe as well.
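A minimal sketch of that change, as a fragment of the container spec in the Deployment from the question. The value of 60 is an assumption that simply mirrors the readiness probe; the answer does not state the exact delay, so it should be tuned to how long the application takes to start inside the cluster. For context, when these fields are omitted Kubernetes defaults initialDelaySeconds to 0, periodSeconds to 10 and failureThreshold to 3, so a container whose port is not yet open can be restarted within roughly the first half minute.

```yaml
# Fragment of spec.template.spec.containers[0] in the Deployment.
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  # Assumed value, copied from the readiness probe; adjust as needed.
  initialDelaySeconds: 60
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 60
  periodSeconds: 20
  failureThreshold: 3
  successThreshold: 1
  timeoutSeconds: 5
```

Kubernetes also offers a startupProbe, which holds off liveness and readiness checks until it first succeeds. That may be worth considering if the startup time varies a lot, though it is not what was used here.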