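I am trying to set up an AWS ECS Fargate deployment. I can run the container without a container health check, but I also want the container health check to run. I have tried every option I could think of, with no luck.
I tried to validate the container health check using the following AWS-recommended commands from the listed URL.
I tried both of the commands above, but neither worked as expected. Please help me find a valid container health check command.
Below is my Dockerfile: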
FROM centos:latest
RUN yum update -y
RUN yum install httpd httpd-tools curl -y
EXPOSE 80
CMD ["/usr/sbin/httpd", "-D", "FOREGROUND"]
HEALTHCHECK CMD curl --fail http://localhost:80/ || exit 1
FROM microsoft/dotnet:2.1-aspnetcore-runtime AS base
WORKDIR /app
EXPOSE 80
FROM microsoft/dotnet:2.1-sdk AS build
WORKDIR /DockerDemoApi
COPY ./DockerDemoApi.csproj DockerDemoApi/
RUN dotnet restore DockerDemoApi/DockerDemoApi.csproj
COPY . .
WORKDIR /DockerDemoApi
RUN dotnet build DockerDemoApi.csproj -c Release -o /app
FROM build AS publish
RUN dotnet publish DockerDemoApi.csproj -c Release -o /app
FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "DockerDemoApi.dll"]
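I added the curl command to my container and it works there. However, if I use the same command as the health check in the AWS task definition, it fails.
Task definition JSON: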
{
"ipcMode": null,
"executionRoleArn": "arn:aws:iam::xxxx:role/ecsTaskExecutionRole",
"containerDefinitions": [{
"dnsSearchDomains": null,
"logConfiguration": {
"logDriver": "awslogs",
"secretOptions": null,
"options": {
"awslogs-group": "/ecs/mall-health-check-task",
"awslogs-region": "ap-south-1",
"awslogs-stream-prefix": "ecs"
}
},
"entryPoint": [],
"portMappings": [
{
"hostPort": 80,
"protocol": "tcp",
"containerPort": 80
}
],
"command": [],
"linuxParameters": null,
"cpu": 256,
"environment": [],
"resourceRequirements": null,
"ulimits": null,
"dnsServers": null,
"mountPoints": [],
"workingDirectory": null,
"secrets": null,
"dockerSecurityOptions": null,
"memory": null,
"memoryReservation": 512,
"volumesFrom": [],
"stopTimeout": null,
"image": "xxxx.dkr.ecr.ap-south-1.amazonaws.com/autoaml/api/dev/alpine:latest",
"startTimeout": null,
"dependsOn": null,
"disableNetworking": null,
"interactive": null,
"healthCheck": null,
"essential": true,
"links": [],
"hostname": null,
"extraHosts": null,
"pseudoTerminal": null,
"user": null,
"readonlyRootFilesystem": null,
"dockerLabels": null,
"systemControls": null,
"privileged": null,
"name": "sample-app"
}
],
"placementConstraints": [],
"memory": "512",
"taskRoleArn": "arn:aws:iam::xxxx:role/ecsTaskExecutionRole",
"compatibilities": [
"EC2",
"FARGATE"
],
"taskDefinitionArn": "arn:aws:ecs:ap-south-1:xxx:task-definition/mall-health-check-task:9",
"family": "mall-health-check-task",
"requiresAttributes": [{
"targetId": null,
"targetType": null,
"value": null,
"name": "ecs.capability.execution-role-ecr-pull"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "ecs.capability.task-eni"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.ecr-auth"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.task-iam-role"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "ecs.capability.execution-role-awslogs"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
},
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
}
],
"pidMode": null,
"requiresCompatibilities": [
"FARGATE"
],
"networkMode": "awsvpc",
"cpu": "256",
"revision": 9,
"status": "ACTIVE",
"proxyConfiguration": null,
"volumes": []
}
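The documentation says the following:
When registering a task definition in the AWS Management Console, use a comma-separated list of commands, which is automatically converted to a string after the task definition is created. An example input for a health check could be: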
CMD-SHELL, curl -f http://localhost/ || exit 1
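When registering a task definition using the AWS Management Console JSON panel, the AWS CLI, or the API, you should enclose the list of commands in brackets. An example input for a health check could be: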
[ "CMD-SHELL", "curl -f http://localhost/ || exit 1" ]
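To make the difference between the two input styles concrete, here is a minimal Python sketch that turns the console-style string into the bracketed JSON form. The function name `console_to_json_command` is made up for illustration, and the sketch assumes the `CMD-SHELL` form only, where everything after the first comma is one shell string.

```python
import json
from typing import List


def console_to_json_command(console_input: str) -> List[str]:
    """Convert console-style "CMD-SHELL, <shell string>" into a JSON-style list.

    Only the first comma is treated as a separator, because the shell
    string itself may legitimately contain commas.
    """
    prefix, _, command = console_input.partition(",")
    return [prefix.strip(), command.strip()]


command = console_to_json_command("CMD-SHELL, curl -f http://localhost/ || exit 1")
print(json.dumps(command))
# prints: ["CMD-SHELL", "curl -f http://localhost/ || exit 1"]
```

The printed list is what would go into the `command` field of `healthCheck` when you edit the JSON directly.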
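Have you verified your health check command? I mean, http://127.0.0.0 is valid, right? Check that your container returns a successful response when you hit http://127.0.0.0 (with no port).
Below is a sample task definition. It starts a Tomcat server in the container and checks its health (localhost:8080).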
{
"ipcMode": null,
"executionRoleArn": "arn:aws:iam::accountid:role/taskExecutionRole",
"containerDefinitions": [
{
"dnsSearchDomains": null,
"logConfiguration": {
"logDriver": "awslogs",
"secretOptions": null,
"options": {
"awslogs-group": "/test/test-task",
"awslogs-region": "us-east-2",
"awslogs-stream-prefix": "test"
}
},
"entryPoint": null,
"portMappings": [
{
"hostPort": 8080,
"protocol": "tcp",
"containerPort": 8080
}
],
"command": null,
"linuxParameters": null,
"cpu": 0,
"environment": [],
"resourceRequirements": null,
"ulimits": null,
"dnsServers": null,
"mountPoints": [],
"workingDirectory": null,
"secrets": null,
"dockerSecurityOptions": null,
"memory": null,
"memoryReservation": null,
"volumesFrom": [],
"stopTimeout": null,
"image": "tomcat",
"startTimeout": null,
"dependsOn": null,
"disableNetworking": false,
"interactive": null,
"healthCheck": {
"retries": 3,
"command": [
"CMD-SHELL",
"curl -f http://localhost:8080/ || exit 1"
],
"timeout": 5,
"interval": 30,
"startPeriod": null
},
"essential": true,
"links": null,
"hostname": null,
"extraHosts": null,
"pseudoTerminal": null,
"user": null,
"readonlyRootFilesystem": null,
"dockerLabels": null,
"systemControls": null,
"privileged": null,
"name": "tomcat"
}
],
"memory": "1024",
"taskRoleArn": "arn:aws:iam::accountid:role/taskExecutionRole",
"family": "test-task",
"pidMode": null,
"requiresCompatibilities": [
"FARGATE"
],
"networkMode": "awsvpc",
"cpu": "512",
"proxyConfiguration": null,
"volumes": []
}
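If you generate the `healthCheck` block from a script, a small helper can catch out-of-range values before you register the task definition. This is only a sketch: `build_health_check` is a hypothetical helper, and the ranges in it (interval 5 to 300 seconds, timeout 2 to 60 seconds, retries 1 to 10, startPeriod 0 to 300 seconds) are the limits as I understand them from the ECS documentation, so double-check them against the current docs.

```python
def build_health_check(url, interval=30, timeout=5, retries=3, start_period=None):
    """Return a healthCheck dict shaped like the one in the task definition above."""
    values = {"interval": interval, "timeout": timeout, "retries": retries}
    # Assumed ECS limits; verify against the current documentation.
    limits = {"interval": (5, 300), "timeout": (2, 60), "retries": (1, 10)}
    for name, (low, high) in limits.items():
        if not low <= values[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {values[name]}")

    # "|| exit 1" makes a failed curl report the container as unhealthy.
    health_check = {"command": ["CMD-SHELL", f"curl -f {url} || exit 1"]}
    health_check.update(values)

    if start_period is not None:
        if not 0 <= start_period <= 300:
            raise ValueError(f"startPeriod must be between 0 and 300, got {start_period}")
        health_check["startPeriod"] = start_period
    return health_check


print(build_health_check("http://localhost:8080/"))
```

With the defaults this produces the same command, interval, timeout, and retries as the Tomcat example.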
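Does the Docker image you are using have the curl package installed?
From your screenshot, it looks like you are using the httpd:2.4 Docker image directly. If so, curl is not part of that image.
You need to build your own Docker image using httpd:2.4 as the base. Below is a sample Dockerfile that adds curl to the image.
Example: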
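FROM httpd:2.4
# Install curl so the health check command can run inside the container
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*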
Then build the image and push it to your Docker Hub account or a private Docker registry.
docker build -t my-apache2 .
docker run -dit --name my-running-app -p 80:80 my-apache2
With the image above, you should now be able to run the health check command.
https://hub.docker.com/_/httpd
https://github.com/docker-library/httpd/blob/master/2.4/Dockerfile
I don't know why, but changing http://localhost to http://127.0.0.1 (the full URL, not just 127.0.0.1) solved the problem.
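One possible explanation, which is a guess and not something confirmed in this thread, is that `localhost` resolves to the IPv6 address `::1` inside the container while the server only listens on IPv4. If that is the cause, rewriting the URL to the IPv4 loopback address avoids the lookup entirely. A minimal sketch, with `force_ipv4_loopback` as a made-up helper name:

```python
from urllib.parse import urlsplit, urlunsplit


def force_ipv4_loopback(url):
    """Replace a "localhost" host with 127.0.0.1, keeping scheme, port, and path.

    URLs with any other host are returned unchanged. Userinfo
    (user:password@) is not preserved by this sketch.
    """
    parts = urlsplit(url)
    if parts.hostname != "localhost":
        return url
    netloc = "127.0.0.1" if parts.port is None else f"127.0.0.1:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


print(force_ipv4_loopback("http://localhost:8080/health"))
# prints: http://127.0.0.1:8080/health
```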
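I followed the advice here, and it solved my health check problem.
I faced the same problem and found a solution for my use case.
Setup: three containers in one task definition.
Health checks are declared in an ecs-params.yml file: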
version: 1
task_definition:
  task_execution_role: ecsTaskExecutionRole
  ecs_network_mode: awsvpc
  task_size:
    mem_limit: 2GB
    cpu_limit: 1024
  services:
    nginx-sidecar:
      healthcheck:
        test: curl -f http://localhost || exit 0
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s
    <service 2>:
      healthcheck:
        test: curl -f http://localhost:3023 || exit 0
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s
    <service 3>:
      healthcheck:
        test: ["CMD", "curl", "-f", "http://localhost:3019/health"]
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s
Make sure curl is available in your Docker image and that you can also call it locally.
My Dockerfile:
FROM node:14.17-alpine
RUN apk add --update curl
You can include either of the following commands in ecs-params.yml for the health check:
test: curl -f http://localhost || exit 0
test: ["CMD", "curl", "-f", "http://localhost"]
Both worked in my use case. I hope this helps, since none of the other answers worked for me.
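One thing worth knowing about the two commands: with `|| exit 0` the shell exits with status 0 even when curl fails, so that form reports the container as healthy no matter what, while `|| exit 1` passes the failure through. The sketch below shows the shell behaviour only. It uses `false` as a stand-in for a failing curl call and assumes a POSIX `sh` is on the PATH.

```python
import subprocess


def check_exit_code(shell_command):
    """Run a command through sh and return its exit status."""
    return subprocess.run(["sh", "-c", shell_command]).returncode


# "false" always fails, standing in for a curl call that could not reach the app.
print(check_exit_code("false || exit 0"))  # prints 0: the failure is masked
print(check_exit_code("false || exit 1"))  # prints 1: the failure is reported
```

So if you want the health check to actually detect a dead endpoint, the `|| exit 1` variant or the plain `CMD` list form is probably the safer choice.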
According to your task definition:
"healthCheck": null,
You need to define it there, not in the Dockerfile.
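As far as I know, ECS only monitors health checks declared in the task definition and does not pick up a `HEALTHCHECK` instruction baked into the image. A rough Python sketch of filling in the `null` field for the `sample-app` container follows. The helper `with_health_check` is hypothetical, and registering the patched definition (for example with `aws ecs register-task-definition`) is left out.

```python
import copy
import json


def with_health_check(task_definition, container_name, health_check):
    """Return a copy of the task definition with healthCheck set on one container."""
    patched = copy.deepcopy(task_definition)
    for container in patched["containerDefinitions"]:
        if container["name"] == container_name:
            container["healthCheck"] = health_check
            return patched
    raise KeyError(f"no container named {container_name!r}")


# Trimmed-down stand-in for the task definition posted in the question.
task_definition = {
    "family": "mall-health-check-task",
    "containerDefinitions": [{"name": "sample-app", "healthCheck": None}],
}

patched = with_health_check(
    task_definition,
    "sample-app",
    {
        "command": ["CMD-SHELL", "curl -f http://localhost/ || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
    },
)
print(json.dumps(patched["containerDefinitions"][0]["healthCheck"], indent=2))
```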
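I ran into a similar problem, and the cause was the Docker image platform itself.
I was building a minimal Docker image based on Alpine Linux on an Apple M1.
The AWS ELB health check worked, but the container health check always failed with status UNKNOWN.
In my case, I solved it by building the Docker image for linux/amd64.
For future reference: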
docker buildx build --platform=linux/amd64 ...
I'm not sure whether you are facing a similar problem.