Why doesn't clustering on k8s via redis-ha work?

Question

I am trying to create a Redis cluster for use with Node.js (ioredis / cluster), but it does not seem to work.

This is on GKE, v1.11.8-gke.6.

I am doing exactly what the redis-ha docs say:

 ~  helm install --set replicas=3 --name redis-test stable/redis-ha  
NAME:   redis-test
LAST DEPLOYED: Fri Apr 26 00:13:31 2019
NAMESPACE: yt
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME                           DATA  AGE
redis-test-redis-ha-configmap  3     0s
redis-test-redis-ha-probes     2     0s

==> v1/Pod(related)
NAME                          READY  STATUS    RESTARTS  AGE
redis-test-redis-ha-server-0  0/2    Init:0/1  0         0s

==> v1/Role
NAME                 AGE
redis-test-redis-ha  0s

==> v1/RoleBinding
NAME                 AGE
redis-test-redis-ha  0s

==> v1/Service
NAME                            TYPE       CLUSTER-IP   EXTERNAL-IP  PORT(S)             AGE
redis-test-redis-ha             ClusterIP  None         <none>       6379/TCP,26379/TCP  0s
redis-test-redis-ha-announce-0  ClusterIP  10.7.244.34  <none>       6379/TCP,26379/TCP  0s
redis-test-redis-ha-announce-1  ClusterIP  10.7.251.35  <none>       6379/TCP,26379/TCP  0s
redis-test-redis-ha-announce-2  ClusterIP  10.7.252.94  <none>       6379/TCP,26379/TCP  0s

==> v1/ServiceAccount
NAME                 SECRETS  AGE
redis-test-redis-ha  1        0s

==> v1/StatefulSet
NAME                        READY  AGE
redis-test-redis-ha-server  0/3    0s


NOTES:
Redis can be accessed via port 6379 and Sentinel can be accessed via port 26379 on the following DNS name from within your cluster:
redis-test-redis-ha.yt.svc.cluster.local

To connect to your Redis server:
1. Run a Redis pod that you can use as a client:

   kubectl exec -it redis-test-redis-ha-server-0 sh -n yt

2. Connect using the Redis CLI:

  redis-cli -h redis-test-redis-ha.yt.svc.cluster.local

 ~  k get pods | grep redis-test                                         
redis-test-redis-ha-server-0           2/2       Running   0          1m
redis-test-redis-ha-server-1           2/2       Running   0          1m
redis-test-redis-ha-server-2           2/2       Running   0          54s
 ~  kubectl exec -it redis-test-redis-ha-server-0 sh -n yt
Defaulting container name to redis.
Use 'kubectl describe pod/redis-test-redis-ha-server-0 -n yt' to see all of the containers in this pod.
/data $ redis-cli -h redis-test-redis-ha.yt.svc.cluster.local
redis-test-redis-ha.yt.svc.cluster.local:6379> set test key
(error) READONLY You can't write against a read only replica.

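But in the end, only one of the pods I randomly connect to is writable. I checked the logs of several containers and everything looks fine. I tried running cluster info in redis-cli, but everywhere I get ERR This instance has cluster support disabled.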

Logs:

 ~  k logs pod/redis-test-redis-ha-server-0  redis
1:C 25 Apr 2019 20:13:43.604 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 25 Apr 2019 20:13:43.604 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 25 Apr 2019 20:13:43.604 # Configuration loaded
1:M 25 Apr 2019 20:13:43.606 * Running mode=standalone, port=6379.
1:M 25 Apr 2019 20:13:43.606 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 25 Apr 2019 20:13:43.606 # Server initialized
1:M 25 Apr 2019 20:13:43.606 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 25 Apr 2019 20:13:43.627 * DB loaded from disk: 0.021 seconds
1:M 25 Apr 2019 20:13:43.627 * Ready to accept connections
1:M 25 Apr 2019 20:14:11.801 * Replica 10.7.251.35:6379 asks for synchronization
1:M 25 Apr 2019 20:14:11.801 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'c2827ffe011d774db005a44165bac67a7e7f7d85', my replication IDs are '8311a1ca896e97d5487c07f2adfd7d4ef924f36b' and '0000000000000000000000000000000000000000')
1:M 25 Apr 2019 20:14:11.802 * Delay next BGSAVE for diskless SYNC
1:M 25 Apr 2019 20:14:17.825 * Starting BGSAVE for SYNC with target: replicas sockets
1:M 25 Apr 2019 20:14:17.825 * Background RDB transfer started by pid 55
55:C 25 Apr 2019 20:14:17.826 * RDB: 0 MB of memory used by copy-on-write
1:M 25 Apr 2019 20:14:17.926 * Background RDB transfer terminated with success
1:M 25 Apr 2019 20:14:17.926 # Slave 10.7.251.35:6379 correctly received the streamed RDB file.
1:M 25 Apr 2019 20:14:17.926 * Streamed RDB transfer with replica 10.7.251.35:6379 succeeded (socket). Waiting for REPLCONF ACK from slave to enable streaming
1:M 25 Apr 2019 20:14:18.828 * Synchronization with replica 10.7.251.35:6379 succeeded
1:M 25 Apr 2019 20:14:42.711 * Replica 10.7.252.94:6379 asks for synchronization
1:M 25 Apr 2019 20:14:42.711 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'c2827ffe011d774db005a44165bac67a7e7f7d85', my replication IDs are 'af453adde824b2280ba66adb40cc765bf390e237' and '0000000000000000000000000000000000000000')
1:M 25 Apr 2019 20:14:42.711 * Delay next BGSAVE for diskless SYNC
1:M 25 Apr 2019 20:14:48.976 * Starting BGSAVE for SYNC with target: replicas sockets
1:M 25 Apr 2019 20:14:48.977 * Background RDB transfer started by pid 125
125:C 25 Apr 2019 20:14:48.978 * RDB: 0 MB of memory used by copy-on-write
1:M 25 Apr 2019 20:14:49.077 * Background RDB transfer terminated with success
1:M 25 Apr 2019 20:14:49.077 # Slave 10.7.252.94:6379 correctly received the streamed RDB file.
1:M 25 Apr 2019 20:14:49.077 * Streamed RDB transfer with replica 10.7.252.94:6379 succeeded (socket). Waiting for REPLCONF ACK from slave to enable streaming
1:M 25 Apr 2019 20:14:49.761 * Synchronization with replica 10.7.252.94:6379 succeeded
 ~  k logs pod/redis-test-redis-ha-server-1 redis 
1:C 25 Apr 2019 20:14:11.780 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 25 Apr 2019 20:14:11.781 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 25 Apr 2019 20:14:11.781 # Configuration loaded
1:S 25 Apr 2019 20:14:11.786 * Running mode=standalone, port=6379.
1:S 25 Apr 2019 20:14:11.791 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:S 25 Apr 2019 20:14:11.791 # Server initialized
1:S 25 Apr 2019 20:14:11.791 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:S 25 Apr 2019 20:14:11.792 * DB loaded from disk: 0.001 seconds
1:S 25 Apr 2019 20:14:11.792 * Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
1:S 25 Apr 2019 20:14:11.792 * Ready to accept connections
1:S 25 Apr 2019 20:14:11.792 * Connecting to MASTER 10.7.244.34:6379
1:S 25 Apr 2019 20:14:11.792 * MASTER <-> REPLICA sync started
1:S 25 Apr 2019 20:14:11.792 * Non blocking connect for SYNC fired the event.
1:S 25 Apr 2019 20:14:11.793 * Master replied to PING, replication can continue...
1:S 25 Apr 2019 20:14:11.799 * Trying a partial resynchronization (request c2827ffe011d774db005a44165bac67a7e7f7d85:6006176).
1:S 25 Apr 2019 20:14:17.824 * Full resync from master: af453adde824b2280ba66adb40cc765bf390e237:722
1:S 25 Apr 2019 20:14:17.824 * Discarding previously cached master state.
1:S 25 Apr 2019 20:14:17.852 * MASTER <-> REPLICA sync: receiving streamed RDB from master
1:S 25 Apr 2019 20:14:17.853 * MASTER <-> REPLICA sync: Flushing old data
1:S 25 Apr 2019 20:14:17.853 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 25 Apr 2019 20:14:17.853 * MASTER <-> REPLICA sync: Finished with success

What am I missing, or is there a better way to do clustering?

Answer 1

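Not the best solution, but I figured I could just use Sentinel instead of looking for another way (or maybe there is no other way). It is supported in most languages, so it should not be very hard (except for redis-cli; I could not figure out how to query the Sentinel server with it).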
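On querying Sentinel directly: Sentinel answers the SENTINEL get-master-addr-by-name command on its own port (26379 in the chart notes above), so something like `redis-cli -h <sentinel-host> -p 26379 sentinel get-master-addr-by-name mymaster` should return the current master. The sketch below shows the same exchange at the wire level, assuming the master group is named `mymaster` and that all arguments are ASCII. The helper names `encodeCommand` and `parseMasterAddr` are hypothetical, and the TCP socket that would carry these bytes is left out.

```typescript
interface MasterAddr {
  host: string;
  port: number;
}

// Encode a command as a RESP array of bulk strings, the request format
// that Redis and Sentinel read from their TCP ports.
// Assumption: arguments are ASCII, so string length equals byte length.
function encodeCommand(args: string[]): string {
  const parts = args.map(a => `$${a.length}\r\n${a}\r\n`);
  return `*${args.length}\r\n${parts.join('')}`;
}

// Parse the reply to SENTINEL get-master-addr-by-name.
// A known master comes back as a two-element array: [host, port].
// Anything else (for example a null reply) is treated as "unknown master".
function parseMasterAddr(reply: string): MasterAddr | null {
  const lines = reply.split('\r\n');
  if (lines[0] !== '*2') {
    return null;
  }
  // Layout: ["*2", "$<len>", host, "$<len>", port, ""]
  const host = lines[2];
  const port = Number(lines[4]);
  if (!host || isNaN(port) || port <= 0) {
    return null;
  }
  return { host, port };
}

// Bytes to send to any Sentinel on port 26379.
const request = encodeCommand(['SENTINEL', 'get-master-addr-by-name', 'mymaster']);
```

A Sentinel-aware client such as ioredis performs this lookup itself, so the sketch is mainly useful for debugging which pod Sentinel currently considers the master.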

This is how I got it working with ioredis (Node.js; sorry if you are not familiar with ES6 syntax):

import * as IORedis from 'ioredis';
import Redis from 'ioredis';
import { redisHost, redisPassword, redisPort } from './config';

export function getRedisConfig(): IORedis.RedisOptions {
  // I'm not sure how to set this properly.
  // ioredis/cluster automatically resolves all pods by hostname, but Sentinel mode does not,
  // so I have to specify all pods explicitly
  // (or resolve them all by hostname).
  return {
    sentinels: process.env.REDIS_CLUSTER.split(',').map(d => {
      const [host, port = 26379] = d.split(':');

      return { host, port: Number(port) };
    }),
    name: process.env.REDIS_MASTER_NAME || 'mymaster',
    ...(redisPassword ? { password: redisPassword } : {}),
  };
}

export async function initializeRedis() {
  if (process.env.REDIS_CLUSTER) {
    const cluster = new Redis(getRedisConfig());

    return cluster;
  }

  // For dev environment
  const client = new Redis(redisPort, redisHost);

  if (redisPassword) {
    await client.auth(redisPassword);
  }

  return client;
}
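One thing that may be worth adding: during a Sentinel failover, an open connection can briefly point at a pod that has just been demoted, and writes then fail with the same READONLY error shown in the question. ioredis has a `reconnectOnError` option for this case. The sketch below is a handler for it; the function name `reconnectOnReadonly` is hypothetical, and the wiring in the trailing comment assumes the `getRedisConfig()` function from above.

```typescript
// Return value follows the ioredis `reconnectOnError` contract:
//   false     -> leave the connection alone
//   true or 1 -> reconnect
//   2         -> reconnect and resend the command that failed
function reconnectOnReadonly(err: Error): boolean | 1 | 2 {
  // Redis prefixes the error with READONLY when a write hits a replica.
  if (err.message.indexOf('READONLY') !== -1) {
    return 2;
  }
  return false;
}

// Hypothetical wiring with the Sentinel config above:
// new Redis({ ...getRedisConfig(), reconnectOnError: reconnectOnReadonly });
```

On reconnect, ioredis asks Sentinel for the master again, so the resent command should land on the newly promoted pod.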

In the environment:

env:
  - name: REDIS_CLUSTER
    value: redis-redis-ha-server-1.redis-redis-ha.yt.svc.cluster.local:26379,redis-redis-ha-server-0.redis-redis-ha.yt.svc.cluster.local:26379,redis-redis-ha-server-2.redis-redis-ha.yt.svc.cluster.local:26379
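Typing that list by hand is error-prone, so it could also be generated. The sketch below builds the same comma-separated value from the naming pattern visible in the helm output above, assuming the chart names the headless service `<release>-redis-ha` and the pods `<release>-redis-ha-server-<n>`. The function name `sentinelAddresses` and its parameters are hypothetical.

```typescript
// Build the REDIS_CLUSTER value from the StatefulSet naming pattern.
// Assumption: per-pod DNS names follow
//   <pod>.<headless-service>.<namespace>.svc.cluster.local
function sentinelAddresses(
  release: string,
  namespace: string,
  replicas: number,
  port: number = 26379, // Sentinel port from the chart notes
): string {
  const service = `${release}-redis-ha`;
  const hosts: string[] = [];
  for (let i = 0; i < replicas; i++) {
    hosts.push(`${service}-server-${i}.${service}.${namespace}.svc.cluster.local:${port}`);
  }
  return hosts.join(',');
}

// Example for a release named "redis" in namespace "yt" with 3 replicas.
const redisCluster = sentinelAddresses('redis', 'yt', 3);
```

The pods are listed in index order here; the order should not matter to a Sentinel client, which only needs one reachable Sentinel to find the master.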

You may want to protect it with a password.

Answer 2

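If you are trying to solve for HA and are using GKE + GCP, it is worth exploring a conversation with Redis Labs about Redis Enterprise. It supports HA natively, is deeply integrated with GCP, and works well with GKE.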

Obviously this becomes a build-versus-buy debate, and I work for Redis Labs, so I am biased. That said, if you are stuck, there is little harm in exploring this option.
