Strange behavior of a nats.io cluster in Docker Swarm


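I have a test swarm made up of a single node, and I am trying to deploy a NATS cluster behind the swarm load balancer. I created a seed service with one instance that initializes the cluster, and a two-replica service containing the servers I expose externally. All services share a custom overlay network named public_network. When I connect a client to the cluster and use request/reply, I see strange behavior: one call succeeds, the next one times out, and the pattern repeats. The manager starts as expected. The first server connects to the manager and establishes a route; the second server connects to the manager but cannot establish a route to the first server, so the mesh is incomplete.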

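The problem I see is that both servers connect to the manager from the same IP, which is not the container's own IP but appears to belong to the lb-public_network endpoint. The second server tries to connect to that IP and fails because the connection is refused.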

I would appreciate some help.

Here is the compose file I use for deployment (through Portainer):

version: '3.2'

services:
  #seed server not exposed
  manager:
    image: nats:alpine
    restart: unless-stopped
    command: --http_port 8222 --cluster_name nats_cluster --cluster nats://0.0.0.0:6222 -D -no_advertise
    ports:
      - 8222:8222
    networks:
    - public

  #Exposed pool 
  cluster:
    image: nats:alpine
    restart: unless-stopped
    command: --cluster_name nats_cluster --cluster nats://0.0.0.0:6222 --routes=nats://manager:6222 -D -no_advertise 
    deploy:
      replicas: 2
    ports:
      - 4222:4222
    networks:
    - public
    depends_on:
    - manager

networks:
  public:
    external:
      name: public_network

And the various logs:

Manager container:

`[1] 2023/04/03 16:30:57.889294 [INF] Starting nats-server
[1] 2023/04/03 16:30:57.889391 [INF]   Version:  2.9.15
[1] 2023/04/03 16:30:57.889427 [INF]   Git:      [b91fa85]
[1] 2023/04/03 16:30:57.889446 [DBG]   Go build: go1.19.6
[1] 2023/04/03 16:30:57.889462 [INF]   Cluster:  nats_cluster
[1] 2023/04/03 16:30:57.889479 [INF]   Name:     NDS6ZANTINQRRIANQMXYTTZLSB6VGRCS5QYO3RZRJ5ICVBW6MC7XBV4D
[1] 2023/04/03 16:30:57.889516 [INF]   ID:       NDS6ZANTINQRRIANQMXYTTZLSB6VGRCS5QYO3RZRJ5ICVBW6MC7XBV4D
[1] 2023/04/03 16:30:57.889549 [DBG] Created system account: "$SYS"
[1] 2023/04/03 16:30:57.890053 [INF] Starting http monitor on 0.0.0.0:8222
[1] 2023/04/03 16:30:57.890199 [INF] Listening for client connections on 0.0.0.0:4222
[1] 2023/04/03 16:30:57.890228 [DBG] Get non local IPs for "0.0.0.0"
[1] 2023/04/03 16:30:57.890426 [DBG]   ip=10.200.8.227
[1] 2023/04/03 16:30:57.890488 [DBG]   ip=172.18.0.4
[1] 2023/04/03 16:30:57.890549 [DBG]   ip=10.200.0.191
[1] 2023/04/03 16:30:57.890572 [INF] Server is ready
[1] 2023/04/03 16:30:57.890742 [DBG] maxprocs: Leaving GOMAXPROCS=2: CPU quota undefined
[1] 2023/04/03 16:30:57.890830 [INF] Cluster name is nats_cluster
[1] 2023/04/03 16:30:57.890895 [INF] Listening for route connections on 0.0.0.0:6222
[1] 2023/04/03 16:32:32.454950 [INF] 10.200.8.5:45176 - rid:4 - Route connection created
[1] 2023/04/03 16:32:32.455398 [DBG] 10.200.8.5:45176 - rid:4 - Registering remote route "NCMHXM6GT5PVNBFFDNQVRZSK7FVBDUXLWUILL3RLG2B6NMZRTT4G3IBC"
[1] 2023/04/03 16:32:32.455427 [DBG] 10.200.8.5:45176 - rid:4 - Sent local subscriptions to route
[1] 2023/04/03 16:32:33.512476 [DBG] 10.200.8.5:45176 - rid:4 - Router Ping Timer
[1] 2023/04/03 16:32:36.002113 [INF] 10.200.8.5:49014 - rid:5 - Route connection created
[1] 2023/04/03 16:32:36.002584 [DBG] 10.200.8.5:49014 - rid:5 - Registering remote route "NCAGFLAATEBXYYJKG5SWWWOLPWTHEGK3E6FEB6PWL4I7FCA37Z5NNI6W"
[1] 2023/04/03 16:32:36.002607 [DBG] 10.200.8.5:49014 - rid:5 - Sent local subscriptions to route
[1] 2023/04/03 16:32:37.178918 [DBG] 10.200.8.5:49014 - rid:5 - Router Ping Timer`

Cluster container 1:

[1] 2023/04/03 16:32:35.999629 [INF] Starting nats-server
[1] 2023/04/03 16:32:35.999674 [INF]   Version:  2.9.15
[1] 2023/04/03 16:32:35.999677 [INF]   Git:      [b91fa85]
[1] 2023/04/03 16:32:35.999679 [DBG]   Go build: go1.19.6
[1] 2023/04/03 16:32:35.999681 [INF]   Cluster:  nats_cluster
[1] 2023/04/03 16:32:35.999684 [INF]   Name:     NCAGFLAATEBXYYJKG5SWWWOLPWTHEGK3E6FEB6PWL4I7FCA37Z5NNI6W
[1] 2023/04/03 16:32:35.999686 [INF]   ID:       NCAGFLAATEBXYYJKG5SWWWOLPWTHEGK3E6FEB6PWL4I7FCA37Z5NNI6W
[1] 2023/04/03 16:32:35.999706 [DBG] Created system account: "$SYS"
[1] 2023/04/03 16:32:36.000084 [INF] Listening for client connections on 0.0.0.0:4222
[1] 2023/04/03 16:32:36.000092 [DBG] Get non local IPs for "0.0.0.0"
[1] 2023/04/03 16:32:36.000218 [DBG]   ip=10.200.8.11
[1] 2023/04/03 16:32:36.000258 [DBG]   ip=172.18.0.7
[1] 2023/04/03 16:32:36.000284 [DBG]   ip=10.200.0.226
[1] 2023/04/03 16:32:36.000296 [INF] Server is ready
[1] 2023/04/03 16:32:36.000426 [DBG] maxprocs: Leaving GOMAXPROCS=2: CPU quota undefined
[1] 2023/04/03 16:32:36.000821 [INF] Cluster name is nats_cluster
[1] 2023/04/03 16:32:36.000870 [INF] Listening for route connections on 0.0.0.0:6222
[1] 2023/04/03 16:32:36.001805 [DBG] Trying to connect to route on manager:6222 (10.200.8.148:6222)
[1] 2023/04/03 16:32:36.002363 [DBG] 10.200.8.148:6222 - rid:4 - Route connect msg sent
[1] 2023/04/03 16:32:36.002409 [INF] 10.200.8.148:6222 - rid:4 - Route connection created
[1] 2023/04/03 16:32:36.002999 [DBG] 10.200.8.148:6222 - rid:4 - Registering remote route "NDS6ZANTINQRRIANQMXYTTZLSB6VGRCS5QYO3RZRJ5ICVBW6MC7XBV4D"
[1] 2023/04/03 16:32:36.003029 [DBG] 10.200.8.148:6222 - rid:4 - Sent local subscriptions to route

Cluster container 2:

[1] 2023/04/03 16:32:32.451270 [INF] Starting nats-server
[1] 2023/04/03 16:32:32.452089 [INF]   Version:  2.9.15
[1] 2023/04/03 16:32:32.452093 [INF]   Git:      [b91fa85]
[1] 2023/04/03 16:32:32.452096 [DBG]   Go build: go1.19.6
[1] 2023/04/03 16:32:32.452098 [INF]   Cluster:  nats_cluster
[1] 2023/04/03 16:32:32.452101 [INF]   Name:     NCMHXM6GT5PVNBFFDNQVRZSK7FVBDUXLWUILL3RLG2B6NMZRTT4G3IBC
[1] 2023/04/03 16:32:32.452105 [INF]   ID:       NCMHXM6GT5PVNBFFDNQVRZSK7FVBDUXLWUILL3RLG2B6NMZRTT4G3IBC
[1] 2023/04/03 16:32:32.452131 [DBG] Created system account: "$SYS"
[1] 2023/04/03 16:32:32.453006 [INF] Listening for client connections on 0.0.0.0:4222
[1] 2023/04/03 16:32:32.453018 [DBG] Get non local IPs for "0.0.0.0"
[1] 2023/04/03 16:32:32.453229 [DBG]   ip=10.200.0.225
[1] 2023/04/03 16:32:32.453526 [DBG]   ip=172.18.0.6
[1] 2023/04/03 16:32:32.453569 [DBG]   ip=10.200.8.10
[1] 2023/04/03 16:32:32.453586 [INF] Server is ready
[1] 2023/04/03 16:32:32.453719 [DBG] maxprocs: Leaving GOMAXPROCS=2: CPU quota undefined
[1] 2023/04/03 16:32:32.453725 [INF] Cluster name is nats_cluster
[1] 2023/04/03 16:32:32.453752 [INF] Listening for route connections on 0.0.0.0:6222
[1] 2023/04/03 16:32:32.454610 [DBG] Trying to connect to route on manager:6222 (10.200.8.148:6222)
[1] 2023/04/03 16:32:32.455126 [DBG] 10.200.8.148:6222 - rid:4 - Route connect msg sent
[1] 2023/04/03 16:32:32.455161 [INF] 10.200.8.148:6222 - rid:4 - Route connection created
[1] 2023/04/03 16:32:32.455602 [DBG] 10.200.8.148:6222 - rid:4 - Registering remote route "NDS6ZANTINQRRIANQMXYTTZLSB6VGRCS5QYO3RZRJ5ICVBW6MC7XBV4D"
[1] 2023/04/03 16:32:32.455626 [DBG] 10.200.8.148:6222 - rid:4 - Sent local subscriptions to route
[1] 2023/04/03 16:32:33.588019 [DBG] 10.200.8.148:6222 - rid:4 - Router Ping Timer
[1] 2023/04/03 16:32:36.002829 [DBG] Trying to connect to route on 10.200.8.5:6222 (10.200.8.5:6222)
[1] 2023/04/03 16:32:36.003020 [ERR] Error trying to connect to route (attempt 1): dial tcp 10.200.8.5:6222: connect: connection refused
[1] 2023/04/03 16:34:33.588613 [DBG] 10.200.8.148:6222 - rid:4 - Router Ping Timer

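I expected everything to work. I find it strange that each cluster container shows up with the lb IP address instead of its own. I thought the lb was a mechanism for incoming traffic, not outgoing traffic.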
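
To make the observation concrete, here is a minimal Python sketch that cross-checks the addresses in the logs above. The address sets are copied by hand from the "Get non local IPs" debug lines, the two peer addresses come from the route lines, and the helper name `classify` is made up for illustration. It only compares strings; it does not query Docker or NATS.

```python
# Addresses each server reported under "Get non local IPs" in the logs above.
local_ips = {
    "manager": {"10.200.8.227", "172.18.0.4", "10.200.0.191"},
    "cluster1": {"10.200.8.11", "172.18.0.7", "10.200.0.226"},
    "cluster2": {"10.200.0.225", "172.18.0.6", "10.200.8.10"},
}

# Addresses that show up as route peers in the same logs.
route_peers = {
    "source seen by manager for both inbound routes": "10.200.8.5",
    "address that manager:6222 resolved to": "10.200.8.148",
}


def classify(ip, local_ips):
    """Return the container that reported this IP, or None if none did."""
    for name, ips in local_ips.items():
        if ip in ips:
            return name
    return None


# Map each route peer address to the container that owns it, if any.
owners = {label: classify(ip, local_ips) for label, ip in route_peers.items()}

for label, ip in route_peers.items():
    owner = owners[label]
    print(f"{label}: {ip} -> {owner or 'not reported by any container'}")
```

Neither peer address is reported by any of the three containers, which matches the description above: the routes appear to go through swarm-side addresses rather than the containers' own. Whether those addresses really belong to the load balancer endpoint is an assumption that would still need to be checked with `docker network inspect public_network`.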

docker-swarm nats.io