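I'm trying to add Liveness and Readiness Probes to MongoDB in OpenShift. Without them, Mongo deploys perfectly every time. I have to create a single-node replica set because my backup method requires it, so I need to run Mongo as a replica set. I don't know if it matters, but I'm using entrypoint.sh in my Dockerfile instead of CMD.
I create my replica set like this: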
"rs.initiate(
{
_id: 'rs0',
members: [
{ _id: 0, host: '(openshift-service-name):27017'},
]
});"
Then I add the Liveness and Readiness Probes in the Helm chart:
livenessProbe:
  exec:
    command: [ "mongosh", "--host", "localhost", "--port", "27017", "-u", "admin", "-p", "'admin'", "--authenticationDatabase", "'admin'", "--eval", "'db.getSiblingDB(\"admin\").runCommand({ replSetGetStatus: 1 }).ok ? 0 : 2'" ]
  initialDelaySeconds: 70
  periodSeconds: 60
  failureThreshold: 3
  timeoutSeconds: 20
readinessProbe:
  exec:
    command: [ "mongosh", "--host", "localhost", "--port", "27017", "-u", "admin", "-p", "'admin'", "--authenticationDatabase", "'admin'", "--eval", "'db.runCommand({ ping: 1 }).ok ? 0 : 2'" ]
  initialDelaySeconds: 70
  periodSeconds: 60
  failureThreshold: 3
  timeoutSeconds: 20
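The interesting part is that the liveness check with the configuration above also "breaks" the database, although it always runs properly on the first start. If I kill the pod, however, the database is gone.
The Readiness Probe, on the other hand, is impossible to run: the container stays permanently in the not-ready state, whatever command I provide. Even a plain echo does not work. I tried removing the initial delay, setting it to 10 seconds, using wider timeouts, and so on.
I believe these are the most important logs: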
{"t":{"$date":"2024-01-18T06:15:34.306+00:00"},"s":"I", "c":"CONTROL", "id":20711, "ctx":"LogicalSessionCacheReap","msg":"Failed to reap transaction table","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2024-01-18T06:15:34.306+00:00"},"s":"I", "c":"SHARDING", "id":7012500, "ctx":"QueryAnalysisConfigurationsRefresher","msg":"Failed to refresh query analysis configurations, will try again at the next interval","attr":{"error":"PrimarySteppedDown: No primary exists currently"}}
{"t":{"$date":"2024-01-18T06:15:34.307+00:00"},"s":"I", "c":"NETWORK", "id":5693100, "ctx":"ReplCoord-0","msg":"Asio socket.set_option failed with std::system_error","attr":{"note":"connect (sync) TCP fast open","option":{"level":6,"name":30,"data":"01 00 00 00"},"error":{"what":"set_option: Protocol not available","message":"Protocol not available","category":"asio.system","value":92}}}
{"t":{"$date":"2024-01-18T06:15:34.466+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":400}}
{"t":{"$date":"2024-01-18T06:15:36.800+00:00"},"s":"I", "c":"REPL", "id":21394, "ctx":"ReplCoord-0","msg":"This node is not a member of the config"}
{"t":{"$date":"2024-01-18T06:15:36.800+00:00"},"s":"I", "c":"REPL", "id":21358, "ctx":"ReplCoord-0","msg":"Replica set state transition","attr":{"newState":"REMOVED","oldState":"STARTUP"}}
{"t":{"$date":"2024-01-18T06:16:14.306+00:00"},"s":"I", "c":"SHARDING", "id":7012500, "ctx":"QueryAnalysisConfigurationsRefresher","msg":"Failed to refresh query analysis configurations, will try again at the next interval","attr":{"error":"PrimarySteppedDown: No primary exists currently"}}
You need to wait a bit until the replica set is up. Use a script like this:
rs.initiate(
{
_id: 'rs0',
members: [
{ _id: 0, host: '(openshift-service-name):27017' },
]
}
);
while (! db.hello().isWritablePrimary ) sleep(1000);
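The loop above waits forever if the node never becomes primary, which can hide the failure from the entrypoint script. A minimal sketch of a bounded wait follows. The helper name `waitForPrimary` and its parameters are hypothetical, not part of mongosh, and the hello and sleep functions are passed in so the same code can be tried outside the shell.

```javascript
// Hypothetical helper (not a mongosh built-in): poll until the node reports
// itself as writable primary, giving up after maxAttempts polls.
//   hello       - function returning a hello-style reply, e.g. () => db.hello()
//   pause       - function that blocks for the given milliseconds, e.g. sleep
//   maxAttempts - how many times to poll before giving up
//   delayMs     - pause between polls
// Returns the attempt number on success, or -1 if the node never became primary.
function waitForPrimary(hello, pause, maxAttempts, delayMs) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    let reply = null;
    try {
      reply = hello();
    } catch (err) {
      // A failed call is treated as "not ready yet" and retried.
      reply = null;
    }
    if (reply && reply.isWritablePrimary === true) {
      return attempt;
    }
    if (attempt < maxAttempts) {
      pause(delayMs);
    }
  }
  return -1;
}
```

In mongosh, after `rs.initiate(...)`, this could presumably be called as `waitForPrimary(() => db.hello(), sleep, 60, 1000)`, with the script exiting with a non-zero code when the result is -1. The values 60 and 1000 are only examples.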