I have the following situation: a Kubernetes cluster running a Celery application that uses RabbitMQ for communication. I have one pod containing the Celery tasks, but the tasks use 3 different queues.
I need two instances of this pod, each with a different startup command. These are the deployments:
First deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker-forecast-model-training-deploy
  labels:
    role: worker-forecast-model-training-service
spec:
  replicas: 0
  selector:
    matchLabels:
      role: worker-forecast-model-training-service
      tier: web-service
  template:
    metadata:
      labels:
        role: worker-forecast-model-training-service
        tier: web-service
    spec:
      containers:
        - name: worker-forecast-model-training
          image: prueba-celery-keda
          imagePullPolicy: IfNotPresent
          command:
            - "celery"
          args: [
            "-A",
            "app.worker",
            "worker",
            "--without-gossip",
            "--without-mingle",
            "--without-heartbeat",
            "-l",
            "info",
            "--pool",
            "solo",
            "-Q",
            "training-forecast-dev,training-forecast-solo-dev"
          ]
          env:
            - name: C_FORCE_ROOT
              value: "True"
          resources:
            requests:
              memory: "80Mi"
              cpu: "80m"
            limits:
              memory: "11000Mi"
              cpu: "2"
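For reference, the `command`/`args` split in the first deployment is equivalent to running the worker as a single shell command (it assumes the `app.worker` module from the manifest is on the container's path):

```shell
# Same invocation as the command/args above, flattened into one line
celery -A app.worker worker \
  --without-gossip --without-mingle --without-heartbeat \
  -l info --pool solo \
  -Q training-forecast-dev,training-forecast-solo-dev
```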
Second deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker-forecast-model-training-prefork-deploy
  labels:
    role: worker-forecast-model-training-prefork-service
spec:
  replicas: 0
  selector:
    matchLabels:
      role: worker-forecast-model-training-prefork-service
      tier: web-service
  template:
    metadata:
      labels:
        role: worker-forecast-model-training-prefork-service
        tier: web-service
    spec:
      containers:
        - name: worker-forecast-model-training-prefork
          image: prueba-celery-keda-prefork
          imagePullPolicy: IfNotPresent
          command:
            - "celery"
          args: [
            "-A",
            "app.worker",
            "worker",
            "--without-gossip",
            "--without-mingle",
            "--without-heartbeat",
            "-l",
            "info",
            "--pool",
            "prefork",
            "-Q",
            "training-forecast-dev,training-forecast-prefork-dev"
          ]
          env:
            - name: C_FORCE_ROOT
              value: "True"
          resources:
            requests:
              memory: "80Mi"
              cpu: "80m"
            limits:
              memory: "11000Mi"
              cpu: "2"
The only things that change are the startup command (the --pool, which is not relevant to Celery in this case) and the queues each worker listens to. As you can see, both share the training-forecast-dev queue, and apart from that each one has its own queue.
So I also have two ScaledObjects:
First KEDA ScaledObject:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-forecast-model-deploy
spec:
  scaleTargetRef:
    name: worker-forecast-model-training-deploy
  pollingInterval: 10
  cooldownPeriod: 28800
  idleReplicaCount: 0
  minReplicaCount: 1
  maxReplicaCount: 1
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 60
          policies:
            - type: Percent
              value: 20
              periodSeconds: 1800
  triggers:
    - type: rabbitmq
      metadata:
        host: amqp://default_user:[email protected]:5672//
        queueName: training-forecast-dev
        mode: QueueLength
        value: "1"
    - type: rabbitmq
      metadata:
        host: amqp://default_user:[email protected]:5672//
        queueName: training-forecast-solo-dev
        mode: QueueLength
        value: "1"
Second KEDA ScaledObject:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-forecast-model-training-prefork-deploy
spec:
  scaleTargetRef:
    name: worker-forecast-model-training-prefork-deploy
  pollingInterval: 10
  cooldownPeriod: 28800
  idleReplicaCount: 0
  minReplicaCount: 1
  maxReplicaCount: 1
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 60
          policies:
            - type: Percent
              value: 20
              periodSeconds: 1800
  triggers:
    - type: rabbitmq
      metadata:
        host: amqp://default_user:[email protected]:5672//
        queueName: training-forecast-dev
        mode: QueueLength
        value: "1"
    - type: rabbitmq
      metadata:
        host: amqp://default_user:[email protected]:5672//
        queueName: training-forecast-prefork-dev
        mode: QueueLength
        value: "1"
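While debugging this, it can help to look at what KEDA actually created for these two ScaledObjects. A sketch, assuming kubectl access to the cluster and KEDA v2, which names the generated HPA `keda-hpa-<scaledobject-name>`:

```shell
# List both ScaledObjects and the HPAs KEDA generated for them
kubectl get scaledobjects
kubectl get hpa

# Inspect the current metric values and scaling events of the first one
kubectl describe hpa keda-hpa-worker-forecast-model-deploy
```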
Expected behavior
I need that, if a task arrives at training-forecast-solo-dev or training-forecast-prefork-dev, the corresponding pod is brought up (this works correctly). I also need both pods to be brought up when a task arrives at training-forecast-dev.
Current behavior
When a task arrives at training-forecast-dev, only one pod is brought up. I assumed that, since there are two ScaledObjects, both would be activated and the end result would be both pods running, but even when several tasks arrive at training-forecast-dev, only one pod ever comes up.
You are using RabbitMQ queues, so it makes sense that only one pod is created: the message is consumed and deleted from the queue once it has been consumed. I would try using different queue names instead of a shared training-forecast-dev, for example training-forecast-dev2.
If you want the same message to be placed in both training-forecast-dev and training-forecast-dev2, you can look at a publish/subscribe fanout example, such as this.