10. Redis Sentinel (High Availability)
10.1 Introduction to Redis Clustering
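A master-slave architecture cannot switch the master and slave roles automatically. When the master becomes unusable because of a Redis service failure, a host power loss, disk damage, or a similar problem, there is no automatic failover (promotion of a slave to master), and the environment configuration has to be changed by hand to switch to a slave Redis server. The architecture also cannot scale the parallel write performance of the Redis service horizontally. Once a single Redis server can no longer meet the write demand of the business, a solution is needed for these two core problems: 1. seamless switching of the master and slave roles, so that the business does not notice and is not affected; 2. dynamic horizontal scaling of Redis servers, so that several servers accept writes in parallel and higher concurrency is reached.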
Ways to implement a Redis cluster:
Client-side sharding
Proxy-based sharding
Redis Cluster
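10.2 How Sentinel Works
10.2.1 Sentinel Architecture and Failover
Sentinel nodes and Redis nodes are independent of each other. One set of sentinels can monitor several Redis master-slave groups, which are told apart by master-name
This is similar to MHA for MySQL
In general, one set of sentinels monitors one Redis master-slave group, and the sentinels run on the same servers as the Redis nodes rather than on dedicated servers of their own
Redis Sentinel and client applications
The application connects to a sentinel node, obtains the addresses of the live master and slave nodes from it, and then connects to those nodes
Failover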
The Sentinel process monitors the state of the master server in a Redis deployment. When the master fails, Sentinel can switch the master and slave roles to keep the system highly available. It has been bundled with Redis since version 2.6, and Sentinel mode became stable in version 2.8. Redis 2.8 or later is recommended for production.
Sentinel is a distributed system: several sentinel processes can run in one architecture, and one sentinel system can manage several master-slave replication groups, told apart by master name. These processes use gossip protocols to receive information about whether the master is offline, and agreement protocols to decide whether to perform an automatic failover and which slave to promote to master.
Each sentinel process periodically sends messages to the other sentinels, the master, and the slaves to confirm that they are alive. If a peer does not respond within the configured time (this is configurable), the sentinel provisionally treats it as offline. This is called "subjectively down" (SDOWN): subjective because each member holds its own view, which may or may not match the views of the others.
Subjectively down has a counterpart, objectively down. When enough sentinels in the group (at least the configured quorum) have judged the master SDOWN and have exchanged that judgment with the SENTINEL is-master-down-by-addr command, the resulting verdict that the master is offline is "objectively down" (ODOWN): objective because it no longer depends on the view of a single sentinel. ODOWN applies only to masters.
A voting algorithm then picks one of the remaining slave nodes and promotes it to master, the related configuration is rewritten automatically, and the failover is carried out
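The two thresholds involved can be sketched in a few lines of Python. This is a simplified model, not Sentinel's implementation: the function names `is_odown` and `failover_authorized` are made up for illustration. It assumes the documented Sentinel behaviour that a master is flagged ODOWN once at least `quorum` sentinels report it down, while a sentinel may only lead the failover after collecting votes from a majority of all known sentinels (and never fewer than `quorum`).

```python
def is_odown(sdown_reports: int, quorum: int) -> bool:
    """True when enough sentinels report SDOWN to flag the master ODOWN."""
    return sdown_reports >= quorum


def failover_authorized(votes: int, total_sentinels: int, quorum: int) -> bool:
    """A would-be leader needs a majority of all sentinels and at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return votes >= max(majority, quorum)


if __name__ == "__main__":
    # Three sentinels with quorum 2 (illustrative numbers).
    print(is_odown(sdown_reports=1, quorum=2))
    print(is_odown(sdown_reports=2, quorum=2))
    print(failover_authorized(votes=2, total_sentinels=3, quorum=2))
```

Under this model a low quorum makes failure detection more sensitive, but it cannot by itself authorize a failover when most sentinels are unreachable.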
Sentinel solves the automatic switching of master and slave roles, but it does not solve the performance bottleneck of a single master. In this respect it is similar to MHA for MySQL
It provides automatic master-slave failover, but there is still only one master; several masters cannot accept writes at the same time
To save resources, sentinel nodes can be deployed on the same servers as the Redis nodes
In Sentinel mode, clients connect to the sentinel nodes to find out which Redis node is the master and which are slaves. When the master fails, the sentinels agree on promoting a slave to master, and the client needs no changes. Sentinel is only a configuration provider, not a proxy: the client connects to it just to obtain the addresses of the live master and slaves, and then connects to those nodes directly
The Redis data nodes in a Sentinel deployment behave like ordinary Redis instances; read/write splitting has to be implemented by the client program
Once the master's failure is confirmed, the sentinels elect one of themselves to carry out the failover and promote a slave to master
The number of sentinel nodes should be at least 3 and preferably odd, so that votes do not end in a tie
Before Redis 3.0, Sentinel mode was the usual choice. Redis 3.0 introduced the Cluster feature, which supports larger production environments
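The advice to run at least three sentinels, preferably an odd number, follows from simple arithmetic. The sketch below assumes that a failover needs agreement from a majority of all sentinels; `tolerated_failures` is an illustrative helper, not part of Redis.

```python
def tolerated_failures(total_sentinels: int) -> int:
    """Number of sentinels that can fail while a majority is still reachable."""
    majority = total_sentinels // 2 + 1
    return total_sentinels - majority


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} sentinels: majority {n // 2 + 1}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

In this model 3 and 4 sentinels both tolerate one failure, and 5 and 6 both tolerate two, so an even count adds a node without adding fault tolerance.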
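10.2.2 The Three Periodic Tasks in Sentinel
1. Every 10 seconds, each sentinel runs the info command against the master and the slaves to discover the nodes in the group and to determine the master-slave relationships
2. Every 2 seconds, each sentinel exchanges information through a channel on the master node (publish/subscribe), using the __sentinel__:hello channel to share its "view" of the nodes and its own information
3. Every second, each sentinel pings the other sentinels and all Redis nodes to check that they are alive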
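To make the second task more concrete, here is a minimal parser for one hello message. It is a sketch under an assumption: that the payload is eight comma-separated fields (sentinel ip, sentinel port, sentinel run id, current epoch, master name, master ip, master port, master config epoch), which is the layout used by Redis 5 as far as the source code shows; check it against your own version. The `Hello` type, `parse_hello`, and the sample payload are illustrative and not captured from a real server.

```python
from typing import NamedTuple


class Hello(NamedTuple):
    sentinel_ip: str
    sentinel_port: int
    sentinel_runid: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str) -> Hello:
    """Split one hello payload into its eight comma-separated fields."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}")
    s_ip, s_port, runid, epoch, m_name, m_ip, m_port, m_epoch = parts
    return Hello(s_ip, int(s_port), runid, int(epoch),
                 m_name, m_ip, int(m_port), int(m_epoch))


if __name__ == "__main__":
    # Hand-built sample payload (assumed layout, illustrative values).
    sample = ("10.0.0.83,26379,eb357a630f40ff86a6f11cefc8860283e269dd25,0,"
              "mymaster,10.0.0.82,6379,0")
    hello = parse_hello(sample)
    print(hello.sentinel_ip, hello.master_name, hello.master_ip)
```

A sentinel that receives such a message learns both that a peer exists (the first three fields) and what that peer believes the current master to be (the last four).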
10.3 Implementing Sentinel
Lab environment: three Redis servers in a master-slave setup. To save resources, each Redis server also runs a sentinel node
Master: 10.0.0.82; sentinel
Slave 1: 10.0.0.83; sentinel
Slave 2: 10.0.0.84; sentinel
10.3.1 Building Master-Slave Replication from Scratch
10.0.0.82 master node
bind 0.0.0.0
requirepass redis
masterauth redis # Configure masterauth on the master too when building replication: if the master goes down and later recovers, it has to sync from the new master, and without this setting it cannot rejoin the master-slave group
[00:27:29 root@master ~]#redis-cli -a redis info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.83,port=6379,state=online,offset=168,lag=0
slave1:ip=10.0.0.84,port=6379,state=online,offset=168,lag=0
master_replid:d9f80eaacd824a6f6e9620154acb661cb713e40e
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:168
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:168
10.0.0.83 slave-1
bind 0.0.0.0
requirepass redis
masterauth redis
replicaof 10.0.0.82 6379
[00:27:28 root@slave-1 ~]#redis-cli -a redis info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:slave
master_host:10.0.0.82
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:168
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:d9f80eaacd824a6f6e9620154acb661cb713e40e
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:168
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:168
10.0.0.84 slave-2
bind 0.0.0.0
requirepass redis
masterauth redis
replicaof 10.0.0.82 6379
[00:27:28 root@slave-2 ~]#redis-cli -a redis info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:slave
master_host:10.0.0.82
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:168
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:d9f80eaacd824a6f6e9620154acb661cb713e40e
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:168
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:127
repl_backlog_histlen:42
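A quick way to confirm that both slaves are attached and caught up is to parse the `info replication` text instead of reading it by eye. This is a minimal sketch that assumes the output format shown above; `parse_slaves` is a hypothetical helper, and the sample text is an excerpt of the master's output from this section.

```python
def parse_slaves(info_text: str) -> list:
    """Return one dict per slaveN line of an `info replication` dump."""
    slaves = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Slave entries look like: slave0:ip=...,port=...,state=...,offset=...,lag=...
        # Keys such as slave_priority or slave_repl_offset are skipped.
        if sep and key.startswith("slave") and key[5:].isdigit():
            fields = dict(item.split("=", 1) for item in value.split(","))
            slaves.append(fields)
    return slaves


SAMPLE = """\
role:master
connected_slaves:2
slave0:ip=10.0.0.83,port=6379,state=online,offset=168,lag=0
slave1:ip=10.0.0.84,port=6379,state=online,offset=168,lag=0
master_repl_offset:168
"""

if __name__ == "__main__":
    for s in parse_slaves(SAMPLE):
        print(s["ip"], s["state"], "offset", s["offset"])
```

Comparing each slave's `offset` with `master_repl_offset` gives a rough measure of replication lag.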
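10.3.2 Setting Up Sentinel
- Install the software
Sentinel ships in the redis package, so running sentinel on a separate server requires installing redis there as well. In this example Redis and sentinel share a server, so nothing more needs to be installed
- Edit the configuration file
Notes on the configuration file: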
/etc/redis-sentinel.conf
[root@master ~]# grep "^[^#]" /etc/redis-sentinel.conf
bind 0.0.0.0
port 26379 #default sentinel listening port
daemonize no
pidfile /var/run/redis-sentinel.pid
logfile ""
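dir /tmp #working directory
sentinel monitor mymaster 127.0.0.1 6379 2 #mymaster is the name of a master-slave group and tells several groups apart; change mymaster and the master's IP address as needed
#address and port of the master server in the mymaster group
#2 is the quorum: the number of sentinels that must consider the master down before a failover is triggered. It is usually set to more than half of all sentinel nodes (the total is usually an odd number >= 3, such as 3, 5, or 7). For example, with 3 sentinels, 3/2 = 1.5, rounded up to 2. It is the basis for marking the master ODOWN (objectively down)
sentinel auth-pass mymaster 123456 #password of the master in the mymaster group; this line must come after the sentinel monitor line above
sentinel down-after-milliseconds mymaster 30000 #time after which a node in the mymaster group is considered SDOWN (subjectively down), in milliseconds; the default is 30000 = 30 seconds, and 3000 = 3 seconds is suggested here
sentinel parallel-syncs mymaster 1 #number of slaves that sync from the new master at the same time after a failover; a smaller number lengthens the total sync time but reduces the load on the new master
sentinel failover-timeout mymaster 180000 #timeout for repointing all slaves to the new master, in milliseconds
sentinel deny-scripts-reconfig yes #forbid changing the notification-script and client-reconfig-script settings at runtime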
logfile /var/log/redis/sentinel.log
Modify the sentinel configuration file
bind 0.0.0.0
sentinel monitor mymaster 10.0.0.82 6379 2
sentinel auth-pass mymaster redis
sentinel down-after-milliseconds mymaster 3000 #lower the SDOWN timeout to 3 seconds; the default of 30 seconds is too long
The configuration file is identical on every sentinel node, so edit it on one node and copy it to the others
[root@master ~]# scp /etc/redis-sentinel.conf 10.0.0.83:/etc
[root@master ~]# scp /etc/redis-sentinel.conf 10.0.0.84:/etc
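Because the node-independent part of the file is the same everywhere, it can also be generated. The sketch below uses only directives that appear in this section; `render_sentinel_conf` is a hypothetical helper, and deriving the quorum as "more than half of the sentinels" follows the rule of thumb given above rather than any Redis requirement.

```python
def render_sentinel_conf(master_name: str, master_ip: str, master_port: int,
                         sentinel_count: int, password: str,
                         down_after_ms: int = 3000) -> str:
    """Build the node-independent part of a sentinel configuration file."""
    quorum = sentinel_count // 2 + 1  # more than half of the sentinels
    lines = [
        "bind 0.0.0.0",
        "port 26379",
        f"sentinel monitor {master_name} {master_ip} {master_port} {quorum}",
        # auth-pass must come after the monitor line that defines the name
        f"sentinel auth-pass {master_name} {password}",
        f"sentinel down-after-milliseconds {master_name} {down_after_ms}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_sentinel_conf("mymaster", "10.0.0.82", 6379,
                               sentinel_count=3, password="redis"), end="")
```

Generating the file before the first start also avoids copying a file that already contains node-specific lines written by a running sentinel.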
Service file
/usr/lib/systemd/system/redis-sentinel.service
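- Start the sentinel service. Always edit the sentinel configuration file before starting the service. Once started, sentinel adds a lot of information to its configuration file, such as myid (which identifies a node among the sentinels), so make sure the node-specific information of each sentinel is different.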
systemctl enable --now redis-sentinel
[01:01:13 root@master ~]#grep -E ^[a-zA-Z] /etc/redis-sentinel.conf
port 26379
daemonize no
pidfile "/var/run/redis-sentinel.pid"
logfile "/var/log/redis/sentinel.log"
dir "/tmp"
sentinel myid bb1518fa88dcfc481ed510d94e3ba28c54e864d2 # after all sentinel nodes have started, make sure each node has a different myid
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 10.0.0.82 6379 2
sentinel down-after-milliseconds mymaster 3000
sentinel auth-pass mymaster redis
sentinel config-epoch mymaster 0
protected-mode no
supervised systemd
sentinel leader-epoch mymaster 0
sentinel known-replica mymaster 10.0.0.84 6379 # these two lines are the slaves of the mymaster group that the sentinel has discovered
sentinel known-replica mymaster 10.0.0.83 6379
sentinel known-sentinel mymaster 10.0.0.83 26379 eb357a630f40ff86a6f11cefc8860283e269dd25 # these two lines are the other sentinels this sentinel has learned about; every sentinel configuration names the same master-slave group and master ip/port, so each sentinel can discover the others through that master
sentinel known-sentinel mymaster 10.0.0.84 26379 564e45cf8b5e5060cc2ed4fee5f14a24966b1fc1
sentinel current-epoch 0
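Checking that no two sentinels share a myid is easy to script. This sketch assumes the configuration text of each node has already been fetched (how is left open); `extract_myid` and `duplicate_myids` are illustrative names, and the IDs in the example are the ones that appear in this lab.

```python
import re
from collections import Counter
from typing import Optional

# A sentinel run id is written as 40 hexadecimal characters.
MYID_RE = re.compile(r"^sentinel myid ([0-9a-f]{40})\b", re.MULTILINE)


def extract_myid(conf_text: str) -> Optional[str]:
    """Return the myid from a sentinel configuration, or None if absent."""
    match = MYID_RE.search(conf_text)
    return match.group(1) if match else None


def duplicate_myids(confs: dict) -> list:
    """Given {host: config text}, list every myid used by more than one host."""
    counts = Counter(extract_myid(text) for text in confs.values())
    return [myid for myid, n in counts.items() if myid is not None and n > 1]


if __name__ == "__main__":
    confs = {
        "10.0.0.82": "port 26379\nsentinel myid bb1518fa88dcfc481ed510d94e3ba28c54e864d2\n",
        "10.0.0.83": "port 26379\nsentinel myid eb357a630f40ff86a6f11cefc8860283e269dd25\n",
        "10.0.0.84": "port 26379\nsentinel myid 564e45cf8b5e5060cc2ed4fee5f14a24966b1fc1\n",
    }
    print(duplicate_myids(confs))
```

A non-empty result usually means a configuration file was copied after sentinel had already written its myid into it.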
[root@master ~]# ss -ntl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 0.0.0.0:26379 0.0.0.0:*
LISTEN 0 128 0.0.0.0:6379 0.0.0.0:*
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 [::]:26379 [::]:*
LISTEN 0 128 [::]:22 [::]:*
Sentinel log
[01:05:19 root@master ~]#cat /var/log/redis/sentinel.log
1900:X 08 Mar 2021 01:01:09.909 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1900:X 08 Mar 2021 01:01:09.909 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=1900, just started
1900:X 08 Mar 2021 01:01:09.909 # Configuration loaded
1900:X 08 Mar 2021 01:01:09.909 * supervised by systemd, will signal readiness
1900:X 08 Mar 2021 01:01:09.936 * Running mode=sentinel, port=26379.
1900:X 08 Mar 2021 01:01:09.936 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1900:X 08 Mar 2021 01:01:09.939 # Sentinel ID is bb1518fa88dcfc481ed510d94e3ba28c54e864d2
1900:X 08 Mar 2021 01:01:09.939 # +monitor master mymaster 10.0.0.82 6379 quorum 2
1900:X 08 Mar 2021 01:01:09.941 * +slave slave 10.0.0.83:6379 10.0.0.83 6379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:01:09.951 * +slave slave 10.0.0.84:6379 10.0.0.84 6379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:01:11.941 * +sentinel sentinel eb357a630f40ff86a6f11cefc8860283e269dd25 10.0.0.83 26379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:01:11.943 * +sentinel sentinel 564e45cf8b5e5060cc2ed4fee5f14a24966b1fc1 10.0.0.84 26379 @ mymaster 10.0.0.82 6379
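When sentinel is first started, timing differences between nodes may produce an SDOWN, which can be ignored; as long as there is no ODOWN, sentinel is generally fine
To inspect sentinel with redis-cli, connect to port 26379. Every sentinel should report the same information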
[01:12:31 root@master ~]#redis-cli -p 26379
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.82:6379,slaves=2,sentinels=3
[01:01:24 root@slave-1 ~]#redis-cli -p 26379
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.82:6379,slaves=2,sentinels=3
[01:13:38 root@slave-2 ~]#redis-cli -p 26379
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.82:6379,slaves=2,sentinels=3
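The `master0` line is the part of this output worth checking automatically. Below is a minimal sketch that assumes the line format shown above; `parse_master_line` and `is_healthy` are hypothetical helpers, and the expected counts are passed in by the caller.

```python
def parse_master_line(line: str) -> dict:
    """Parse 'master0:name=...,status=...,address=ip:port,slaves=N,sentinels=N'."""
    _, _, value = line.strip().partition(":")
    fields = dict(item.split("=", 1) for item in value.split(","))
    ip, _, port = fields["address"].rpartition(":")
    return {
        "name": fields["name"],
        "status": fields["status"],
        "ip": ip,
        "port": int(port),
        "slaves": int(fields["slaves"]),
        "sentinels": int(fields["sentinels"]),
    }


def is_healthy(info: dict, expected_slaves: int, expected_sentinels: int) -> bool:
    """True when the master is ok and all expected peers are visible."""
    return (info["status"] == "ok"
            and info["slaves"] >= expected_slaves
            and info["sentinels"] >= expected_sentinels)


if __name__ == "__main__":
    line = "master0:name=mymaster,status=ok,address=10.0.0.82:6379,slaves=2,sentinels=3"
    info = parse_master_line(line)
    print(info["name"], info["ip"], info["port"], is_healthy(info, 2, 3))
```

Running the same check against every sentinel is a simple way to confirm that they all agree on the current master.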
Sentinel does not support commands such as keys and select, because it is not a Redis data service
127.0.0.1:26379> keys *
(error) ERR unknown command `keys`, with args beginning with: `*`,
127.0.0.1:26379> select 1
(error) ERR unknown command `select`, with args beginning with: `1`,
10.3.3 Simulating a Master Failure and Failover
- Stop the Redis service on the master and watch the log file of each sentinel node
[root@master ~]# systemctl stop redis
Sentinel log on the Redis master node
[01:20:15 root@master ~]#tail -f /var/log/redis/sentinel.log
1900:X 08 Mar 2021 01:20:50.087 # +sdown master mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:50.160 # +odown master mymaster 10.0.0.82 6379 #quorum 2/2
1900:X 08 Mar 2021 01:20:50.160 # +new-epoch 1
1900:X 08 Mar 2021 01:20:50.160 # +try-failover master mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:50.162 # +vote-for-leader bb1518fa88dcfc481ed510d94e3ba28c54e864d2 1
1900:X 08 Mar 2021 01:20:50.196 # 564e45cf8b5e5060cc2ed4fee5f14a24966b1fc1 voted for bb1518fa88dcfc481ed510d94e3ba28c54e864d2 1
1900:X 08 Mar 2021 01:20:50.198 # eb357a630f40ff86a6f11cefc8860283e269dd25 voted for bb1518fa88dcfc481ed510d94e3ba28c54e864d2 1
1900:X 08 Mar 2021 01:20:50.223 # +elected-leader master mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:50.223 # +failover-state-select-slave master mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:50.279 # +selected-slave slave 10.0.0.84:6379 10.0.0.84 6379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:50.279 * +failover-state-send-slaveof-noone slave 10.0.0.84:6379 10.0.0.84 6379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:50.339 * +failover-state-wait-promotion slave 10.0.0.84:6379 10.0.0.84 6379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:51.200 # +promoted-slave slave 10.0.0.84:6379 10.0.0.84 6379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:51.200 # +failover-state-reconf-slaves master mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:51.264 * +slave-reconf-sent slave 10.0.0.83:6379 10.0.0.83 6379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:52.207 * +slave-reconf-inprog slave 10.0.0.83:6379 10.0.0.83 6379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:52.207 * +slave-reconf-done slave 10.0.0.83:6379 10.0.0.83 6379 @ mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:52.261 # +failover-end master mymaster 10.0.0.82 6379
1900:X 08 Mar 2021 01:20:52.261 # +switch-master mymaster 10.0.0.82 6379 10.0.0.84 6379 # the master switch is complete; the new master is 10.0.0.84
1900:X 08 Mar 2021 01:20:52.262 * +slave slave 10.0.0.83:6379 10.0.0.83 6379 @ mymaster 10.0.0.84 6379 # 10.0.0.83 now points at the new master 10.0.0.84
1900:X 08 Mar 2021 01:20:52.262 * +slave slave 10.0.0.82:6379 10.0.0.82 6379 @ mymaster 10.0.0.84 6379 # the failed master is also recorded as a slave of the new master
1900:X 08 Mar 2021 01:20:55.266 # +sdown slave 10.0.0.82:6379 10.0.0.82 6379 @ mymaster 10.0.0.84 6379 # the old master is still down, so it is reported as sdown
Sentinel logs on the Redis slave nodes
[01:24:14 root@slave-1 ~]#tail -f /var/log/redis/sentinel.log
1891:X 08 Mar 2021 01:20:49.431 # +sdown master mymaster 10.0.0.82 6379
1891:X 08 Mar 2021 01:20:49.543 # +new-epoch 1
1891:X 08 Mar 2021 01:20:49.548 # +vote-for-leader bb1518fa88dcfc481ed510d94e3ba28c54e864d2 1
1891:X 08 Mar 2021 01:20:50.559 # +odown master mymaster 10.0.0.82 6379 #quorum 3/2
1891:X 08 Mar 2021 01:20:50.559 # Next failover delay: I will not start a failover before Mon Mar 8 01:26:49 2021
1891:X 08 Mar 2021 01:20:50.626 # +config-update-from sentinel bb1518fa88dcfc481ed510d94e3ba28c54e864d2 10.0.0.82 26379 @ mymaster 10.0.0.82 6379
1891:X 08 Mar 2021 01:20:50.627 # +switch-master mymaster 10.0.0.82 6379 10.0.0.84 6379
1891:X 08 Mar 2021 01:20:50.627 * +slave slave 10.0.0.83:6379 10.0.0.83 6379 @ mymaster 10.0.0.84 6379
1891:X 08 Mar 2021 01:20:50.627 * +slave slave 10.0.0.82:6379 10.0.0.82 6379 @ mymaster 10.0.0.84 6379
1891:X 08 Mar 2021 01:20:53.678 # +sdown slave 10.0.0.82:6379 10.0.0.82 6379 @ mymaster 10.0.0.84 6379
[01:24:17 root@slave-2 ~]#tail -f /var/log/redis/sentinel.log
1889:X 08 Mar 2021 01:20:49.492 # +sdown master mymaster 10.0.0.82 6379
1889:X 08 Mar 2021 01:20:49.552 # +new-epoch 1
1889:X 08 Mar 2021 01:20:49.562 # +vote-for-leader bb1518fa88dcfc481ed510d94e3ba28c54e864d2 1
1889:X 08 Mar 2021 01:20:49.576 # +odown master mymaster 10.0.0.82 6379 #quorum 3/2
1889:X 08 Mar 2021 01:20:49.576 # Next failover delay: I will not start a failover before Mon Mar 8 01:26:50 2021
1889:X 08 Mar 2021 01:20:50.632 # +config-update-from sentinel bb1518fa88dcfc481ed510d94e3ba28c54e864d2 10.0.0.82 26379 @ mymaster 10.0.0.82 6379
1889:X 08 Mar 2021 01:20:50.632 # +switch-master mymaster 10.0.0.82 6379 10.0.0.84 6379
1889:X 08 Mar 2021 01:20:50.632 * +slave slave 10.0.0.83:6379 10.0.0.83 6379 @ mymaster 10.0.0.84 6379
1889:X 08 Mar 2021 01:20:50.632 * +slave slave 10.0.0.82:6379 10.0.0.82 6379 @ mymaster 10.0.0.84 6379
1889:X 08 Mar 2021 01:20:53.679 # +sdown slave 10.0.0.82:6379 10.0.0.82 6379 @ mymaster 10.0.0.84 6379
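The failover logs above follow a regular shape, so the master change can be extracted mechanically. This is a minimal sketch that assumes the line layout seen in these Redis 5.0.3 logs (`pid:X date time marker event details`); `parse_event` and `master_switches` are illustrative names, and the sample lines are copied from the slave-1 log.

```python
import re

# "<pid>:X <day> <Mon> <year> <hh:mm:ss.mmm> <#|*> <+event|-event> <details>"
LOG_RE = re.compile(
    r"^(?P<pid>\d+):X (?P<ts>\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"[#*] (?P<event>[+-][\w-]+) (?P<details>.*)$"
)


def parse_event(line: str):
    """Return (event, detail words) for an event line, or None for other lines."""
    m = LOG_RE.match(line.strip())
    return (m.group("event"), m.group("details").split()) if m else None


def master_switches(lines):
    """Yield (master_name, old_addr, new_addr) for every +switch-master event."""
    for line in lines:
        parsed = parse_event(line)
        if parsed and parsed[0] == "+switch-master":
            name, old_ip, old_port, new_ip, new_port = parsed[1][:5]
            yield name, (old_ip, int(old_port)), (new_ip, int(new_port))


if __name__ == "__main__":
    log = [
        "1891:X 08 Mar 2021 01:20:49.431 # +sdown master mymaster 10.0.0.82 6379",
        "1891:X 08 Mar 2021 01:20:50.559 # Next failover delay: I will not start a failover before Mon Mar 8 01:26:49 2021",
        "1891:X 08 Mar 2021 01:20:50.627 # +switch-master mymaster 10.0.0.82 6379 10.0.0.84 6379",
    ]
    for name, old, new in master_switches(log):
        print(name, old, "->", new)
```

Lines that are not events, such as the "Next failover delay" notice, simply do not match and are skipped.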
- Check the replication information on each node, and the Redis configuration files
10.0.0.82
# the Redis service on the failed original master is stopped, so it cannot be queried
[root@master ~]# redis-cli -a redis
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
Could not connect to Redis at 127.0.0.1:6379: Connection refused
10.0.0.83 slave-1
[01:29:30 root@slave-1 ~]#redis-cli -a redis
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> info replication
# Replication
role:slave
master_host:10.0.0.84 # the master now points at the new node 10.0.0.84
master_port:6379
master_link_status:up # the link is up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:341599
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:819ff842c904343d0b0702f81bfda242e56e8164
master_replid2:b78e3f5790d472bb7ff3a48d358df241055f9247
master_repl_offset:341599
second_repl_offset:233756
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:341599
10.0.0.84 new master
[01:29:45 root@slave-2 ~]#redis-cli -a redis
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> info replication
# Replication
role:master # the promoted node now has the master role
connected_slaves:1 # only slave-1 for now, because the original master has not been repaired yet
slave0:ip=10.0.0.83,port=6379,state=online,offset=340388,lag=0
master_replid:819ff842c904343d0b0702f81bfda242e56e8164
master_replid2:b78e3f5790d472bb7ff3a48d358df241055f9247
master_repl_offset:340521
second_repl_offset:233756
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:340521
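- Check whether the configuration file on each node has been modified
Before the failover:
The original master 10.0.0.82 has no replicaof setting
The replicaof setting on every slave points at the original master, 10.0.0.82
After the first failover:
In the configuration file of the newly promoted node 10.0.0.84, the replicaof entry is removed. If a separate replicaof line was added by hand, that line is deleted and the default commented-out line is left in place; if the commented-out default line itself was edited, that line is deleted. Here replicaof was written as a new line, so after the switch only the default commented-out line remains
In the configuration file of the other slave (10.0.0.83), replicaof is changed to point at the IP of the new master, 10.0.0.84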
slave-1 10.0.0.83
[01:35:25 root@slave-1 ~]#grep "replicaof 10.0.0" /etc/redis.conf
replicaof 10.0.0.84 6379
Once the failed original master (10.0.0.82) is repaired and its service restarted, it becomes a slave of the new master, and a replicaof line pointing at the new master's IP is appended as the last line of its configuration file
Each sentinel pings all the other sentinels and all master and slave nodes once per second to check that they are alive. If a slave goes down, the sentinels only mark it SDOWN and do no ODOWN handling, so separate monitoring is needed to watch the slaves and to repair slave failures
1891:X 08 Mar 2021 01:42:43.013 # +sdown slave 10.0.0.83:6379 10.0.0.83 6379 @ mymaster 10.0.0.84 6379 # Redis on slave 10.0.0.83 was stopped by hand
1891:X 08 Mar 2021 01:43:16.123 * +reboot slave 10.0.0.83:6379 10.0.0.83 6379 @ mymaster 10.0.0.84 6379 # after Redis was started again by hand
If the replication setup itself is faulty (for example a wrong password, or a bind address that was never changed), a misconfigured slave is not repointed to the new master after the promotion. The only option is to fix its Redis configuration file by hand, point replicaof at the new master by hand, and then restart the Redis service
A later failure:
Stop the Redis service on the new master 10.0.0.84
systemctl stop redis
The log shows that the new master is now 10.0.0.83
1889:X 08 Mar 2021 01:57:21.013 # +switch-master mymaster 10.0.0.84 6379 10.0.0.83 6379
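In the configuration file of the new master 10.0.0.83, the replicaof entry is removed
In the configuration file of the slave 10.0.0.82, replicaof is changed to point at the IP of the new master
Once the failed master 10.0.0.84 is repaired and restarted, it becomes a slave of the new master, and a replicaof line pointing at the new master's IP is added to its configuration file
During a failover the sentinel configuration file is rewritten as well, not only the Redis configuration file
Taking slave-2 (10.0.0.84) as an example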
sentinel leader-epoch mymaster 2
sentinel known-replica mymaster 10.0.0.84 6379 # after two failovers 10.0.0.83 is the new master, so 10.0.0.84 and 10.0.0.82 are listed as replicas
sentinel known-replica mymaster 10.0.0.82 6379
sentinel known-sentinel mymaster 10.0.0.83 26379 eb357a630f40ff86a6f11cefc8860283e269dd25 # the two sentinel nodes other than 10.0.0.84
sentinel known-sentinel mymaster 10.0.0.82 26379 bb1518fa88dcfc481ed510d94e3ba28c54e864d2
sentinel current-epoch 2
10.3.4 Switching the Master Manually
At this point 10.0.0.83 is the newly promoted master, and 10.0.0.82 and 10.0.0.84 are slaves
A manual switch of the master can be issued from any sentinel node
Here it is run on 10.0.0.84
[02:04:02 root@slave-2 ~]#redis-cli -p 26379
127.0.0.1:26379> sentinel failover mymaster
The new master is now 10.0.0.82
1891:X 08 Mar 2021 02:13:19.894 # +switch-master mymaster 10.0.0.83 6379 10.0.0.82 6379
Check the sentinel information on 10.0.0.82
[01:59:19 root@master ~]#redis-cli -p 26379
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.82:6379,slaves=2,sentinels=3
A manual switch does not take the current master offline; it only changes which node is the master. The former master automatically becomes a slave, and the replicaof field in the Redis configuration files is rewritten automatically in this case too
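10.3.5 How Applications Connect to Redis Through Sentinel
In a Sentinel deployment, the application should connect to sentinel first to obtain the master and slave addresses and then connect to Redis, instead of connecting to a fixed Redis address directly
Connection process:
- The client application code lists all the sentinels and picks one that is reachable
- The client sends SENTINEL get-master-addr-by-name <master-slave group name> to the chosen sentinel
- The sentinel returns the address of the master in that master-slave group
- The client application sends the role command to that master to confirm its role
- The client subscribes to the relevant sentinel channels to learn about master changes and reconnects to the new master automatically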
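The first three steps can be sketched without any Redis library. This is only an illustration of the lookup order, not how a real client is implemented: `resolve_master` and `fake_ask` are hypothetical, the `ask` callable stands in for a real SENTINEL get-master-addr-by-name call, and the role check and channel subscription are left out.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def resolve_master(sentinels: Iterable[Address], master_name: str,
                   ask: Callable[[Address, str], Optional[Address]]) -> Address:
    """Try each sentinel in turn until one returns the master's address."""
    for sentinel in sentinels:
        try:
            master = ask(sentinel, master_name)
        except OSError:
            continue  # this sentinel is unreachable, try the next one
        if master is not None:
            return master
    raise RuntimeError(f"no sentinel could resolve {master_name!r}")


if __name__ == "__main__":
    # Stand-in for a network call: the first sentinel is "down".
    def fake_ask(sentinel: Address, name: str) -> Optional[Address]:
        if sentinel[0] == "10.0.0.82":
            raise ConnectionRefusedError("sentinel down")
        return ("10.0.0.84", 6379)

    sentinels = [("10.0.0.82", 26379), ("10.0.0.83", 26379), ("10.0.0.84", 26379)]
    print(resolve_master(sentinels, "mymaster", fake_ask))
```

Listing every sentinel in the client matters for exactly this reason: the lookup still succeeds when one of them is unreachable.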
Python implementation:
- Prepare an extra server, CentOS 7 (10.0.0.187), and install the dependencies
[02:28:16 root@python ~]#yum -y install python3 python3-redis
- Prepare the test script
[02:40:04 root@python ~]#vim sentinel_test.py +8
#!/usr/bin/python3
import redis
from redis.sentinel import Sentinel
# Connect to the sentinel servers (IP addresses; host names also work if DNS is available)
sentinel = Sentinel([('10.0.0.82', 26379),
                     ('10.0.0.83', 26379),
                     ('10.0.0.84', 26379)], socket_timeout = 0.5)
redis_auth_pass = 'redis'
# mymaster is the name of the master-slave group configured in sentinel; it is the default name, so change it to match your setup
# Get the address of the master in the group
master = sentinel.discover_master('mymaster')
print(master)
# Get the addresses of the slaves
slave = sentinel.discover_slaves('mymaster')
print(slave)
# Get a connection to the master and write data
master = sentinel.master_for('mymaster', socket_timeout=0.5, password = redis_auth_pass, db = 0)
w_ret = master.set('name','david')
# Get a connection to a slave and read the data back (slaves are picked round-robin by default)
slave = sentinel.slave_for('mymaster', socket_timeout = 0.5, password = redis_auth_pass, db = 0)
r_ret = slave.get('name')
print(r_ret)
# Output: b'david' (bytes, because decode_responses is not set)
- Run it and check the result
[02:40:03 root@python ~]#./sentinel_test.py
('10.0.0.82', 6379) # master node
[('10.0.0.84', 6379), ('10.0.0.83', 6379)] # slave nodes
b'david' # value stored under name