Redis Sentinel (Sentinel Mode)
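(1) Set up a Redis master and slave
Configure two instances on ports 6379 (master) and 6381 (slave), and start both.
For the detailed configuration, see 《Redis快速部署(主备)》 (Redis Quick Deployment: Master/Slave).
(2) Configure the sentinel
A Redis sentinel is really just a special Redis instance; it is started with the configuration file sentinel.conf.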
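#1. Port the sentinel listens on
port 26379
#2. Address of the master to monitor. The last number is the quorum: how many sentinels must agree that the master is unreachable before it is marked objectively down and a failover can be started
sentinel monitor master1 127.0.0.1 6379 1
#3. Optional. If Redis is password-protected, the password must be set here as well; otherwise the sentinel cannot connect to Redis
sentinel auth-pass master1 <password>
#4. Optional. A known slave. If omitted, the sentinel discovers slaves automatically; the sentinel rewrites this entry itself while it is running
sentinel known-slave master1 127.0.0.1 6381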
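#5. How long, in milliseconds, the master must fail to give a valid reply before this sentinel marks it subjectively down (the sentinel probes it about once per second)
sentinel down-after-milliseconds master1 5000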
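#6. Optional. Coordinates multiple sentinels: if another sentinel has already attempted a failover, wait this long before trying again. The default of 3 minutes is usually fine
sentinel failover-timeout master1 180000
#7. Coordinates multiple sentinels: the number increases each time a sentinel performs a failover. Simply start counting from 1
sentinel config-epoch master1 1
#8. Optional. Other known sentinels. If omitted, the sentinel discovers them automatically
sentinel known-sentinel master1 127.0.0.1 26380 a5ce2add75d868d47ce8e7cae9a744f2c900d67d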
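#9. Sets failover-timeout a second time, to 1 minute instead of the 3-minute default. This is the same directive as in item 6, so keep only one of the two lines. It bounds the failover procedure; the time the master must be unreachable before it is considered down is down-after-milliseconds (item 5)
sentinel failover-timeout master1 60000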
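The configuration above sets failover-timeout for master1 twice, which is easy to miss in a longer file. As a rough illustration, the Python sketch below scans the text of a sentinel.conf for directives that appear more than once for the same master. The function name find_duplicate_directives, the SINGLE_VALUED set and the trimmed SAMPLE_CONF are assumptions made for this example; they are not part of Redis.

```python
from collections import defaultdict

# Directives expected at most once per master. known-slave and
# known-sentinel may legitimately repeat, so they are not checked.
# (This set is an assumption chosen for the example.)
SINGLE_VALUED = {
    "monitor",
    "auth-pass",
    "down-after-milliseconds",
    "failover-timeout",
    "config-epoch",
}


def find_duplicate_directives(conf_text):
    """Return {(directive, master): [values, ...]} for directives set more than once."""
    seen = defaultdict(list)
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        # Only look at lines shaped like: sentinel <directive> <master> <args...>
        if len(parts) < 4 or parts[0] != "sentinel":
            continue
        directive, master, args = parts[1], parts[2], parts[3:]
        if directive in SINGLE_VALUED:
            seen[(directive, master)].append(" ".join(args))
    return {key: values for key, values in seen.items() if len(values) > 1}


# Trimmed copy of the configuration shown above (hypothetical sample input).
SAMPLE_CONF = """\
port 26379
sentinel monitor master1 127.0.0.1 6379 1
sentinel down-after-milliseconds master1 5000
sentinel failover-timeout master1 180000
sentinel config-epoch master1 1
sentinel failover-timeout master1 60000
"""

duplicates = find_duplicate_directives(SAMPLE_CONF)
for (directive, master), values in duplicates.items():
    print(f"{directive} for {master} is set {len(values)} times: {values}")
```

The sketch only reports the repetition; it deliberately makes no claim about which of the two values the sentinel ends up using.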
(3) Start the sentinel
[ui@localhost src]$ ./redis-server ../sentinel.conf --sentinel &
Check the Redis processes that are currently running:
[ui@localhost src]$ ps -ef |grep 'redis'
500 8337 3880 0 00:07 pts/1 00:00:06 ./redis-server *:26379 [sentinel]
500 8413 3880 0 00:28 pts/1 00:00:01 ./redis-server 127.0.0.1:6379
500 8469 3880 0 00:41 pts/1 00:00:00 ./redis-server 127.0.0.1:6381
500 8485 3940 0 00:44 pts/3 00:00:00 ./redis-cli -p 6381
500 8494 3880 0 00:46 pts/1 00:00:00 grep redis
(4) Connect with a client and check that Redis works
- Connect to the master first and test reads and writes; both work.
[ui@localhost src]$ ./redis-cli -p 6379
127.0.0.1:6379> set foo 123
OK
127.0.0.1:6379> get foo
"123"
127.0.0.1:6379> exit
- Then connect to the slave and test reads and writes; the slave can read but cannot write.
[ui@localhost src]$ ./redis-cli -p 6381
127.0.0.1:6381> get foo
"123"
127.0.0.1:6381> set foo 12333
(error) READONLY You can't write against a read only slave.
(5) Test failover
- Kill the master process manually:
[ui@localhost src]$ kill 8413
- Check the sentinel's log output. It has started a failover and promoted the former slave to master:
8337:X 06 Jul 00:47:54.989 # +sdown master master1 127.0.0.1 6379
8337:X 06 Jul 00:47:54.989 # +odown master master1 127.0.0.1 6379 #quorum 1/1
8337:X 06 Jul 00:47:54.989 # +new-epoch 238
8337:X 06 Jul 00:47:54.989 # +try-failover master master1 127.0.0.1 6379
8337:X 06 Jul 00:47:55.042 # +vote-for-leader a5ce2add75d868d47ce8e7cae9a744f2c900d67d 238
8337:X 06 Jul 00:47:55.042 # +elected-leader master master1 127.0.0.1 6379
8337:X 06 Jul 00:47:55.042 # +failover-state-select-slave master master1 127.0.0.1 6379
8337:X 06 Jul 00:47:55.100 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ master1 127.0.0.1 6379
8337:X 06 Jul 00:47:55.100 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ master1 127.0.0.1 6379
8337:X 06 Jul 00:47:55.172 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ master1 127.0.0.1 6379
8337:X 06 Jul 00:47:55.686 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ master1 127.0.0.1 6379
8337:X 06 Jul 00:47:55.686 # +failover-state-reconf-slaves master master1 127.0.0.1 6379
8337:X 06 Jul 00:47:55.693 # +failover-end master master1 127.0.0.1 6379
8337:X 06 Jul 00:47:55.693 # +switch-master master1 127.0.0.1 6379 127.0.0.1 6381
8337:X 06 Jul 00:47:55.693 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ master1 127.0.0.1 6381
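The +switch-master entry is the one that records the actual change of master: it carries the master name, the old address and the new address. A minimal Python sketch for pulling that information out of log lines could look like the following. The names SWITCH_MASTER, parse_switch_master and LOG_LINES are made up for this example, and the two sample lines are copied from the log above.

```python
import re

# Layout of the event as it appears in the log above:
#   +switch-master <master-name> <old-ip> <old-port> <new-ip> <new-port>
SWITCH_MASTER = re.compile(r"\+switch-master (\S+) (\S+) (\d+) (\S+) (\d+)")


def parse_switch_master(line):
    """Return the old and new master address from a log line, or None if it is another event."""
    match = SWITCH_MASTER.search(line)
    if match is None:
        return None
    name, old_ip, old_port, new_ip, new_port = match.groups()
    return {
        "master": name,
        "old": (old_ip, int(old_port)),
        "new": (new_ip, int(new_port)),
    }


# Two lines taken from the sentinel log shown above.
LOG_LINES = [
    "8337:X 06 Jul 00:47:55.693 # +failover-end master master1 127.0.0.1 6379",
    "8337:X 06 Jul 00:47:55.693 # +switch-master master1 127.0.0.1 6379 127.0.0.1 6381",
]

events = [event for event in map(parse_switch_master, LOG_LINES) if event]
for event in events:
    print(event["master"], "moved from", event["old"], "to", event["new"])
```

Only the +switch-master layout is handled here; other events use different field orders and would need their own patterns.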
- Test reads and writes on 6381 again. 6381 has now been promoted from slave to master, so both reads and writes succeed. (The READONLY error on the first line is presumably left over from the session before the failover.)
127.0.0.1:6381> set foo 12333
(error) READONLY You can't write against a read only slave.
127.0.0.1:6381> get foo
"123"
127.0.0.1:6381> set foo 999
OK
- Try to connect to the original master on 6379; it has been killed:
[ui@localhost src]$ ./redis-cli -p 6379
Could not connect to Redis at 127.0.0.1:6379: Connection refused
Could not connect to Redis at 127.0.0.1:6379: Connection refused
- After restarting 6379, check the sentinel log output: 6379 is now treated as a slave of 6381.
8337:X 06 Jul 00:54:28.058 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ master1 127.0.0.1 6381
8337:X 06 Jul 00:54:37.995 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ master1 127.0.0.1 6381
- Connect to 6379 again and test reads and writes, which confirms that 6379 has indeed become a slave.
[ui@localhost src]$ ./redis-cli -p 6379
127.0.0.1:6379> get foo
"999"
127.0.0.1:6379> set foo 1234
(error) READONLY You can't write against a read only slave.
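Because the master's address changes after a failover, a client should not hard-code 6379; it can ask a sentinel for the current master with the SENTINEL get-master-addr-by-name command. The Python sketch below shows one way to do that with only the standard library, by encoding the command in the Redis wire protocol (RESP) and decoding the two-element reply. The function names are invented for this example, sample_reply is a hypothetical reply rather than captured output, and the single recv call assumes the short reply arrives in one piece.

```python
import socket


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_addr_reply(reply):
    """Decode a two-element RESP array into (host, port); None for any other reply."""
    lines = reply.split(b"\r\n")
    if lines[0] != b"*2" or len(lines) < 5:
        return None  # e.g. a null reply when the master name is unknown
    # lines looks like: [b"*2", b"$9", b"127.0.0.1", b"$4", b"6381", b""]
    return lines[2].decode(), int(lines[4])


def get_master_addr(sentinel_host, sentinel_port, master_name, timeout=1.0):
    """Ask one sentinel for the current master address (needs a running sentinel)."""
    with socket.create_connection((sentinel_host, sentinel_port), timeout=timeout) as sock:
        sock.sendall(encode_command("SENTINEL", "get-master-addr-by-name", master_name))
        # Assumption: the whole reply fits in a single recv.
        return parse_addr_reply(sock.recv(4096))


# Offline demonstration of the encoding and decoding steps.
request = encode_command("SENTINEL", "get-master-addr-by-name", "master1")
sample_reply = b"*2\r\n$9\r\n127.0.0.1\r\n$4\r\n6381\r\n"  # hypothetical reply
print(request)
print(parse_addr_reply(sample_reply))
```

With the sentinel from this walkthrough running, get_master_addr("127.0.0.1", 26379, "master1") would be the call to try. In a real application a maintained client library with Sentinel support is the better choice; the raw-socket version is only meant to show what happens on the wire.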
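(6) Notes on the sentinel log:
- 1. Subjectively down (SDOWN) is the down judgment that a single Sentinel instance makes about a server on its own.
- 2. Objectively down (ODOWN) is the down judgment reached when multiple Sentinel instances have each marked the same server as SDOWN and have confirmed this with one another through the SENTINEL is-master-down-by-addr command.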
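The two judgments can be sketched as two small checks. This is a simplified Python model under stated assumptions, not Redis's implementation: is_sdown and is_odown are hypothetical helper names, times are plain millisecond counters, and each entry in sdown_reports stands for one sentinel's own SDOWN verdict (including the local one).

```python
def is_sdown(last_valid_reply_ms, now_ms, down_after_ms):
    """One sentinel's local verdict: no valid reply for longer than down-after-milliseconds."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_odown(sdown_reports, quorum):
    """Master-only verdict: at least `quorum` sentinels report the master as SDOWN."""
    return sum(1 for down in sdown_reports if down) >= quorum


# Setup from this walkthrough: one sentinel, quorum 1, down-after-milliseconds 5000.
local_verdict = is_sdown(last_valid_reply_ms=0, now_ms=5001, down_after_ms=5000)
print(local_verdict)
print(is_odown([local_verdict], quorum=1))

# Hypothetical three-sentinel setup with quorum 2: one SDOWN report is not enough.
print(is_odown([True, False, False], quorum=2))
```

With a single sentinel and a quorum of 1, SDOWN and ODOWN are reached together, which matches the `#quorum 1/1` entry in the failover log earlier in this walkthrough.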
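Notes:
- The objectively down condition applies only to masters. For any other kind of Redis instance, a Sentinel does not need to negotiate before judging it down, so slaves and other Sentinels never reach the objectively down condition.
- As soon as one Sentinel finds that a master has entered the objectively down state, that Sentinel may be elected by the other Sentinels to carry out an automatic failover of the failed master.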
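Reaching ODOWN and being allowed to run the failover are separate thresholds. As described in the Redis Sentinel documentation, the quorum only governs failure detection, while the sentinel that performs the failover must be elected with the votes of a majority of all sentinels (and at least as many votes as the quorum, if the quorum is larger). A minimal Python sketch of that rule follows; failover_authorized is a hypothetical helper name and the vote counts are made-up inputs.

```python
def failover_authorized(votes_for_candidate, total_sentinels, quorum):
    """True if the candidate has both a majority of all sentinels and at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return votes_for_candidate >= max(majority, quorum)


# Single sentinel with quorum 1, as in this walkthrough: its own vote is a majority.
print(failover_authorized(votes_for_candidate=1, total_sentinels=1, quorum=1))

# Hypothetical five-sentinel deployment with quorum 2:
print(failover_authorized(votes_for_candidate=2, total_sentinels=5, quorum=2))  # quorum met, no majority
print(failover_authorized(votes_for_candidate=3, total_sentinels=5, quorum=2))  # majority reached
```

This is why a single-sentinel setup like the one above can fail over on its own, and also why it offers no protection if that one sentinel goes down; deployments with several sentinels on separate machines are generally recommended.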