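3. Sentinel Mechanism
In "4.1 Topics Every Cluster Design Has to Face" we raised the problem of a single point of failure, and in "4.2 Master-Replica Replication" we described using replicas as backups of the master so that as little data as possible is lost. That leaves some open questions. If the master goes down, how do we find out? Does an operator have to promote a replica to master by hand every time? And if the master's address is hard-coded in application code, does every affected project have to be redeployed? That is clearly unreasonable, which is why the Sentinel mechanism exists.
The Sentinel mechanism (Redis Sentinel) has two main tasks:
- Monitoring: periodically send heartbeat checks to the monitored Redis instances to see whether they are working properly.
- Automatic failover: when the master Redis goes down, Sentinel promotes a replica to master and gives clients the address of the new master.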
3.1 Sentinel Startup and Initialization
When a Sentinel starts, it performs the following steps:
- Initialize the server
  This does not fully initialize a Redis server the way "3.1.2.1 Redis Server Startup Flow" describes; for example, it does not load RDB or AOF files.
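- Replace the ordinary Redis server code with Sentinel code
  A Redis server loads a command table at startup, and the table a Sentinel loads differs from the normal one. The table described here contains only seven commands: PING, SENTINEL, INFO, SUBSCRIBE, UNSUBSCRIBE, PSUBSCRIBE, and PUNSUBSCRIBE (later Redis releases extend the Sentinel table with a few more, such as ROLE and AUTH). Some other code is swapped out as well.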
- Initialize the Sentinel state
  This step initializes a sentinelState struct, shown below:

      /* Main state. */
      struct sentinelState {
          char myid[CONFIG_RUN_ID_SIZE+1]; /* This sentinel ID. */
          uint64_t current_epoch;          /* Current epoch. */
          dict *masters;      /* Dictionary of master sentinelRedisInstances.
                                 Key is the instance name, value is the
                                 sentinelRedisInstance structure pointer. */
          int tilt;           /* Are we in TILT mode? */
          int running_scripts;    /* Number of scripts in execution right now. */
          mstime_t tilt_start_time;   /* When TILT started. */
          mstime_t previous_time;     /* Last time we ran the time handler. */
          list *scripts_queue;        /* Queue of user scripts to execute. */
          char *announce_ip;  /* IP addr that is gossiped to other sentinels if
                                 not NULL. */
          int announce_port;  /* Port that is gossiped to other sentinels if
                                 non zero. */
          unsigned long simfailure_flags; /* Failures simulation. */
          int deny_scripts_reconfig; /* Allow SENTINEL SET ... to change script
                                        paths at runtime? */
      } sentinel;
- Initialize the list of monitored masters from the given configuration file
  The masters dictionary in the sentinelState struct above records every master that the Sentinel monitors. The key is the monitored master's name, and the value is a Redis server instance stored as a sentinelRedisInstance, shown below:

      typedef struct sentinelRedisInstance {
          int flags;      /* See SRI_... defines */
          char *name;     /* Master name from the point of view of this sentinel. */
          char *runid;    /* Run ID of this instance, or unique ID if is a Sentinel.*/
          uint64_t config_epoch;  /* Configuration epoch. */
          sentinelAddr *addr; /* Master host. */
          instanceLink *link; /* Link to the instance, may be shared for Sentinels. */
          mstime_t last_pub_time;   /* Last time we sent hello via Pub/Sub. */
          mstime_t last_hello_time; /* Only used if SRI_SENTINEL is set. Last time
                                       we received a hello from this Sentinel
                                       via Pub/Sub. */
          mstime_t last_master_down_reply_time; /* Time of last reply to
                                                   SENTINEL is-master-down command. */
          mstime_t s_down_since_time; /* Subjectively down since time. */
          mstime_t o_down_since_time; /* Objectively down since time. */
          mstime_t down_after_period; /* Consider it down after that period. */
          mstime_t info_refresh;  /* Time at which we received INFO output from it. */
          dict *renamed_commands; /* Commands renamed in this instance:
                                     Sentinel will use the alternative commands
                                     mapped on this table to send things like
                                     SLAVEOF, CONFIG, INFO, ... */

          /* Role and the first time we observed it.
           * This is useful in order to delay replacing what the instance reports
           * with our own configuration. We need to always wait some time in order
           * to give a chance to the leader to report the new configuration before
           * we do silly things. */
          int role_reported;
          mstime_t role_reported_time;
          mstime_t slave_conf_change_time; /* Last time slave master addr changed. */

          /* Master specific. */
          dict *sentinels;    /* Other sentinels monitoring the same master. */
          dict *slaves;       /* Slaves for this master instance. */
          unsigned int quorum;/* Number of sentinels that need to agree on failure. */
          int parallel_syncs; /* How many slaves to reconfigure at same time. */
          char *auth_pass;    /* Password to use for AUTH against master & replica. */
          char *auth_user;    /* Username for ACLs AUTH against master & replica. */

          /* Slave specific. */
          mstime_t master_link_down_time; /* Slave replication link down time. */
          int slave_priority; /* Slave priority according to its INFO output. */
          mstime_t slave_reconf_sent_time; /* Time at which we sent SLAVE OF <new> */
          struct sentinelRedisInstance *master; /* Master instance if it's slave. */
          char *slave_master_host;    /* Master host as reported by INFO */
          int slave_master_port;      /* Master port as reported by INFO */
          int slave_master_link_status; /* Master link status as reported by INFO */
          unsigned long long slave_repl_offset; /* Slave replication offset. */

          /* Failover */
          char *leader;       /* If this is a master instance, this is the runid of
                                 the Sentinel that should perform the failover. If
                                 this is a Sentinel, this is the runid of the Sentinel
                                 that this Sentinel voted as leader. */
          uint64_t leader_epoch; /* Epoch of the 'leader' field. */
          uint64_t failover_epoch; /* Epoch of the currently started failover. */
          int failover_state; /* See SENTINEL_FAILOVER_STATE_* defines. */
          mstime_t failover_state_change_time;
          mstime_t failover_start_time;   /* Last failover attempt start time. */
          mstime_t failover_timeout;      /* Max time to refresh failover state. */
          mstime_t failover_delay_logged; /* For what failover_start_time value we
                                             logged the failover delay. */
          struct sentinelRedisInstance *promoted_slave; /* Promoted slave instance. */

          /* Scripts executed to notify admin or reconfigure clients: when they
           * are set to NULL no script is executed. */
          char *notification_script;
          char *client_reconfig_script;
          sds info; /* cached INFO output */
      } sentinelRedisInstance;
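- Create network connections to the master
  The Sentinel opens two asynchronous network connections to each monitored master: a command connection for sending commands to the Redis server and receiving replies, and a subscription connection for subscribing to the __sentinel__:hello channel.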
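
The restricted command table from the second step can be pictured with a small lookup. This is a minimal sketch, not the Redis implementation: the names sentinel_commands and sentinel_supports are hypothetical, and the table holds only the seven commands listed in the text.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the Sentinel command table: only the seven
 * commands named in the text. */
static const char *sentinel_commands[] = {
    "PING", "SENTINEL", "INFO", "SUBSCRIBE",
    "UNSUBSCRIBE", "PSUBSCRIBE", "PUNSUBSCRIBE",
};

/* Case-insensitive string equality, written out to stay within ISO C. */
static int equal_ignore_case(const char *a, const char *b) {
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Return 1 if `name` is in the table, 0 otherwise. */
int sentinel_supports(const char *name) {
    size_t n = sizeof(sentinel_commands) / sizeof(sentinel_commands[0]);
    for (size_t i = 0; i < n; i++) {
        if (equal_ignore_case(sentinel_commands[i], name)) return 1;
    }
    return 0;
}
```

With this table, sentinel_supports("ping") is true while sentinel_supports("SET") is false, which is why a process running in Sentinel mode rejects ordinary data commands.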
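3.2 Monitoring Flow
By default, a Sentinel sends the INFO command to the master once every ten seconds and parses the reply to learn the master's current state. The reply also carries information about the master's replicas, and the Sentinel builds a sentinelRedisInstance struct for each replica to hold it.
Whenever a new replica is discovered, the Sentinel also opens a command connection and a subscription connection to it, and likewise sends it INFO once every ten seconds.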
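Besides INFO, the Sentinel sends a PUBLISH command to the __sentinel__:hello channel of every monitored server once every two seconds. Every Sentinel subscribed on that server receives the message, and this is also how Sentinels learn of one another's existence. The sentinels dictionary in sentinelRedisInstance is where the other Sentinels are stored. Sentinels also open command connections to each other to exchange information, but they do not open subscription connections to each other.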
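
The two periodic tasks described so far can be sketched as a check that runs on every timer tick. This is an illustration under assumptions rather than Redis's own scheduler: instanceTimers, commands_due, and the DUE_ flags are hypothetical names, and the periods are simply the ten-second and two-second values given in the text.

```c
typedef long long mstime_t;         /* milliseconds */

#define INFO_PERIOD_MS    10000     /* INFO once every ten seconds */
#define PUBLISH_PERIOD_MS  2000     /* hello PUBLISH once every two seconds */

/* Hypothetical per-instance bookkeeping for this sketch. */
typedef struct {
    mstime_t last_info_time;        /* when INFO was last sent */
    mstime_t last_pub_time;         /* when the hello PUBLISH was last sent */
} instanceTimers;

enum { DUE_INFO = 1, DUE_PUBLISH = 2 };

/* Return a bitmask of the commands that are due at `now`, and record
 * `now` as the send time for each command that is due. */
int commands_due(instanceTimers *t, mstime_t now) {
    int due = 0;
    if (now - t->last_info_time >= INFO_PERIOD_MS) {
        due |= DUE_INFO;
        t->last_info_time = now;
    }
    if (now - t->last_pub_time >= PUBLISH_PERIOD_MS) {
        due |= DUE_PUBLISH;
        t->last_pub_time = now;
    }
    return due;
}
```

Keeping one timestamp per command lets a single tick handler drive both schedules without separate timers.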
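Once per second, the Sentinel sends PING to every instance it has a command connection to (masters, replicas, and other Sentinels). If an instance returns no valid reply within down-after-milliseconds, the Sentinel marks it as subjectively down, meaning that this Sentinel on its own considers the server unavailable. When the instance is a master, the Sentinel then asks the other Sentinels that monitor it; once enough of them agree that it cannot serve requests (the configured quorum, counting the asking Sentinel itself), the Sentinel marks the master as objectively down. Being marked down does not mean the instance is no longer monitored: the Sentinel keeps checking it, and if it answers PING again it is marked online again. If it was a master that failed, then after a new master has been chosen the old master is treated as a replica of the new one.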
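
The two down states can be written as two small predicates. This is a sketch under assumptions, not Redis's code: the function names and the others_agree array are made up for illustration, while down_after_period and quorum mirror the fields of sentinelRedisInstance shown in 3.1.

```c
typedef long long mstime_t;         /* milliseconds */

/* Subjectively down: no valid PING reply for longer than down_after_period. */
int is_subjectively_down(mstime_t now, mstime_t last_valid_reply,
                         mstime_t down_after_period) {
    return (now - last_valid_reply) > down_after_period;
}

/* Objectively down: this Sentinel already sees the master as down, and the
 * number of Sentinels that agree (itself included) reaches quorum.
 * others_agree[i] is 1 if the i-th other Sentinel reported the master down. */
int is_objectively_down(int self_sdown, const int *others_agree,
                        int n_others, unsigned int quorum) {
    if (!self_sdown) return 0;
    unsigned int agree = 1;         /* count ourselves */
    for (int i = 0; i < n_others; i++) {
        if (others_agree[i]) agree++;
    }
    return agree >= quorum;
}
```

The first predicate is a purely local judgment; the second needs answers from other Sentinels, which is what separates "subjective" from "objective".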
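When the master is unavailable, Sentinel performs a failover. A master is usually monitored by several Sentinels, though, and they cannot all carry out the failover, so they first elect a leader Sentinel to do the job. To run the election, each Sentinel sends the other Sentinels a SENTINEL is-master-down-by-addr command asking for their vote; the Sentinel that collects votes from more than half of the Sentinels becomes the leader and executes the failover. The election is modeled on the leader election of the Raft algorithm.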
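
The election can be sketched in two parts: how a Sentinel answers a vote request, and how a candidate decides whether it has won. This is a minimal illustration, not the Redis source. The names voterState, request_vote, and wins_election are hypothetical, and the rule that a Sentinel grants at most one vote per epoch (first request wins) is an assumption borrowed from Raft-style elections rather than something the text spells out.

```c
#include <string.h>

#define RUN_ID_LEN 40

/* Hypothetical voter state: which candidate this Sentinel voted for,
 * and in which epoch. */
typedef struct {
    unsigned long long leader_epoch;
    char leader[RUN_ID_LEN + 1];
} voterState;

/* Handle a vote request. The first candidate asking in a newer epoch gets
 * the vote. Returns the run ID this Sentinel votes for in req_epoch, or
 * NULL if the request is for an epoch that has already passed. */
const char *request_vote(voterState *v, unsigned long long req_epoch,
                         const char *candidate) {
    if (req_epoch > v->leader_epoch) {
        v->leader_epoch = req_epoch;
        strncpy(v->leader, candidate, RUN_ID_LEN);
        v->leader[RUN_ID_LEN] = '\0';
    }
    return (req_epoch == v->leader_epoch) ? v->leader : NULL;
}

/* A candidate wins with votes from more than half of all Sentinels. */
int wins_election(int votes, int total_sentinels) {
    return votes >= total_sentinels / 2 + 1;
}
```

With three Sentinels, two votes are enough to win. Redis additionally requires the vote count to reach the master's configured quorum, which this sketch leaves out.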
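The failover has two steps. First, a new master is chosen from the healthy replicas. Candidates are ranked by replica priority first; among replicas with the same priority, the one with the largest replication offset wins, and any remaining tie is broken by run ID. Some data may still be lost here, which is unavoidable (see "4.1 Topics Every Cluster Design Has to Face"). The leader promotes the chosen replica by sending it SLAVEOF NO ONE. Second, the other replicas are sent SLAVEOF pointing at the new master.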
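
Choosing the replica to promote can be sketched as a filter followed by a comparison. This is an illustration under assumptions, not the Redis source: replicaInfo, better_candidate, and select_replica are hypothetical names, and the healthy flag stands in for the various liveness checks Sentinel applies. The ordering used here (priority, then replication offset, then run ID) follows the Redis Sentinel documentation; the field names follow sentinelRedisInstance where possible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down view of a replica. */
typedef struct {
    const char *runid;
    int healthy;                    /* 0 if down or disconnected */
    int slave_priority;             /* lower is preferred; 0 = never promote */
    unsigned long long slave_repl_offset;
} replicaInfo;

/* Return 1 if a is a better promotion candidate than b. */
static int better_candidate(const replicaInfo *a, const replicaInfo *b) {
    if (a->slave_priority != b->slave_priority)
        return a->slave_priority < b->slave_priority;
    if (a->slave_repl_offset != b->slave_repl_offset)
        return a->slave_repl_offset > b->slave_repl_offset;
    return strcmp(a->runid, b->runid) < 0;
}

/* Pick the replica to promote, or NULL if none is eligible. */
const replicaInfo *select_replica(const replicaInfo *replicas, size_t n) {
    const replicaInfo *best = NULL;
    for (size_t i = 0; i < n; i++) {
        const replicaInfo *r = &replicas[i];
        if (!r->healthy || r->slave_priority == 0) continue;
        if (best == NULL || better_candidate(r, best)) best = r;
    }
    return best;
}
```

Preferring the largest offset among equal-priority replicas picks the one that has received the most data from the old master, which keeps the unavoidable loss as small as possible.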
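Food for thought:
We normally deploy three Sentinels. Suppose one Sentinel fails at the same moment as the master. A leader Sentinel still has to be elected to handle the failover, and the two surviving Sentinels are exactly a majority of three, so the election succeeds only if both vote for the same candidate. If the vote splits one to one, nobody wins that round and the Sentinels retry in a later epoch, which delays the failover but does not prevent it (assuming the quorum is set to 2 or less). The real limit is reached when a second Sentinel is lost: a single Sentinel can never collect a majority of three, so no failover happens and the Redis service stays unavailable for writes until a Sentinel or the master comes back.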