Failed to pull and unpack image "registry.k8s.io/pause:3.6" when updating a Kubernetes pod

Problem description

When I update images on my Kubernetes cluster (v1.28.3), the pods fail to fetch the pause image:

Warning  FailedCreatePodSandBox  10m  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "registry.k8s.io/pause:3.6": failed to pull image "registry.k8s.io/pause:3.6": failed to pull and unpack image "registry.k8s.io/pause:3.6": failed to resolve reference "registry.k8s.io/pause:3.6": failed to do request: Head "https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6": dial tcp 142.251.8.82:443: i/o timeout
Warning  FailedCreatePodSandBox  6m20s  kubelet  Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image "registry.k8s.io/pause:3.6": failed to pull image "registry.k8s.io/pause:3.6": failed to pull and unpack image "registry.k8s.io/pause:3.6": failed to resolve reference "registry.k8s.io/pause:3.6": failed to do request: Head "https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6": dial tcp 173.194.174.82:443: i/o timeout
Warning  FailedCreatePodSandBox  3m23s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "registry.k8s.io/pause:3.6": failed to pull image "registry.k8s.io/pause:3.6": failed to pull and unpack image "registry.k8s.io/pause:3.6": failed to resolve reference "registry.k8s.io/pause:3.6": failed to do request: Head "https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6": dial tcp 173.194.174.82:443: i/o timeout
Warning  FailedCreatePodSandBox  2m38s (x4 over 8m30s)  kubelet  Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image "registry.k8s.io/pause:3.6": failed to pull image "registry.k8s.io/pause:3.6": failed to pull and unpack image "registry.k8s.io/pause:3.6": failed to resolve reference "registry.k8s.io/pause:3.6": failed to do request: Head "https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6": dial tcp 142.251.8.82:443: i/o timeout
Warning  FailedCreatePodSandBox  115s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "registry.k8s.io/pause:3.6": failed to pull image "registry.k8s.io/pause:3.6": failed to pull and unpack image "registry.k8s.io/pause:3.6": failed to resolve reference "registry.k8s.io/pause:3.6": failed to do request: Head "https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6": dial tcp 64.233.188.82:443: i/o timeout
Warning  FailedCreatePodSandBox  28s (x9 over 12m)  kubelet  Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image "registry.k8s.io/pause:3.6": failed to pull image "registry.k8s.io/pause:3.6": failed to pull and unpack image "registry.k8s.io/pause:3.6": failed to resolve reference "registry.k8s.io/pause:3.6": failed to do request: Head "https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6": dial tcp 64.233.188.82:443: i/o timeout

I have already tried pulling the image from a mirror and retagging it, like this:

ctr -n=k8s.io image pull k8s.m.daocloud.io/pause:3.6
ctr -n=k8s.io images tag k8s.m.daocloud.io/pause:3.6 registry.k8s.io/pause:3.6

This works for a while, but the next time a pod is updated the error comes back. What should I do to fix this permanently?

kubernetes containers
1 Answer

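The root cause appears to be disk usage on the /var partition. It may normally have enough free space (check with df -h), but during an upgrade, after the new-version images have been loaded and before the old ones are removed, its usage can climb above 80%. That is the main source of the problem.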
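One way to see how close /var is to that level is to read the usage percentage from df and compare it with a limit. The following is a minimal sketch: the helper names usage_percent and over_threshold are made up for illustration, and the limit of 80 simply mirrors the figure mentioned above.

```shell
#!/bin/sh
# usage_percent PATH: print the used-space percentage of the filesystem holding PATH.
usage_percent() {
  # df -P prints one POSIX-format line per filesystem; column 5 is the
  # capacity in percent, e.g. "42%". Strip the "%" and print the number.
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# over_threshold USED LIMIT: succeed when USED is at or above LIMIT.
over_threshold() {
  [ "$1" -ge "$2" ]
}

used=$(usage_percent /var)
if over_threshold "$used" 80; then
  echo "/var is at ${used}%: image garbage collection may kick in"
else
  echo "/var is at ${used}%"
fi
```

Running this on a node during an upgrade, rather than afterwards, is what would show the temporary spike described above.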

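As a result, Kubernetes image garbage collection may "clean up" images that it identifies as unused, including the pause image, which is required but not counted as active.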

According to this Medium blog post by Richard Durso:

To free up disk space, use the command crictl rmi --prune. It removes every image that is not currently in use, and it may even purge the Kubernetes pause container image.
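Before and after pruning, it can help to check whether the pause image is still cached on the node. A small sketch, assuming crictl is installed and configured for the node's container runtime; has_pause_image is a hypothetical helper, not part of crictl:

```shell
#!/bin/sh
# has_pause_image: read an image listing on stdin and succeed if any
# line mentions the registry.k8s.io/pause repository.
has_pause_image() {
  grep -Fq 'registry.k8s.io/pause'
}

# Only query the runtime when crictl is actually available.
if command -v crictl >/dev/null 2>&1; then
  if crictl images 2>/dev/null | has_pause_image; then
    echo "pause image is cached on this node"
  else
    echo "pause image not found; the next sandbox creation will have to pull it"
  fi
fi
```

If the check reports the image as missing right after a prune, that would be consistent with the behaviour described in the blog post.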

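The relevant kubelet flags are defined as:

--image-gc-high-threshold: the percentage of disk usage that triggers image garbage collection. Defaults to 85%.

--image-gc-low-threshold: the percentage of disk usage that image garbage collection tries to free down to. Defaults to 80%.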
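The same thresholds can also be set in the kubelet configuration file through the imageGCHighThresholdPercent and imageGCLowThresholdPercent fields of KubeletConfiguration. The sketch below only writes an example fragment to a temporary file. The values 90 and 85 are illustrative assumptions, not recommendations, and where the real configuration lives (often /var/lib/kubelet/config.yaml on kubeadm-built nodes) depends on your setup.

```shell
#!/bin/sh
# Write an illustrative KubeletConfiguration fragment to a scratch file.
fragment=$(mktemp)
cat > "$fragment" <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Image garbage collection starts once disk usage reaches this percentage.
# (90 is an assumed example value; the default is 85.)
imageGCHighThresholdPercent: 90
# Garbage collection tries to free images until usage is back at this percentage.
# (85 is an assumed example value; the default is 80.)
imageGCLowThresholdPercent: 85
EOF

# Show the fragment so it can be reviewed before merging it by hand.
cat "$fragment"
```

After merging such values into the node's actual kubelet configuration, the kubelet has to be restarted for them to take effect. The low threshold must stay below the high one.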

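Configuring Kubernetes garbage collection should be the primary way to keep disk space under control. However, you can still occasionally use the CRI purge script to check which images are cached and run it manually when needed, for example during node maintenance before applying Ubuntu patches.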

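You can also try setting imagePullPolicy to IfNotPresent (spec.containers[].imagePullPolicy: IfNotPresent, the value is case-sensitive) so that a container image is pulled from the registry only when it is not already present on the node. This lets the node make use of its local image cache. Note that this field applies to the pod's own containers; the pause (sandbox) image is pulled by the container runtime.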

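To show where the imagePullPolicy field sits in a pod spec, here is a minimal sketch that writes a manifest to a temporary file. The pod name, container name, and image are hypothetical placeholders chosen for illustration only.

```shell
#!/bin/sh
# Write an example Pod manifest that uses the local image cache when possible.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demo               # hypothetical pod name
spec:
  containers:
    - name: app            # hypothetical container name
      image: nginx:1.25    # hypothetical image
      # Pull only when the image is not already present on the node.
      imagePullPolicy: IfNotPresent
EOF

# Print the manifest for review.
cat "$manifest"
# To create the pod on a cluster: kubectl apply -f "$manifest"
```

The policy is set per container, so each entry under spec.containers needs its own imagePullPolicy line.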