Architecture diagram
Failover process
(1) Save the binary log events (binlog events) from the crashed master;
(2) Identify the slave that holds the most recent updates;
(3) Apply the differential relay log to the other slaves;
(4) Apply the binary log events saved from the master;
(5) Promote one slave to be the new master;
(6) Point the other slaves at the new master and resume replication;
(7) Bring up the VIP address on the new master so that front-end requests reach it.
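Step (2) can be illustrated with a small shell sketch. It assumes that the Master_Log_File and Read_Master_Log_Pos values from SHOW SLAVE STATUS have already been collected for each slave into lines of the form "host file pos". That input format and the function name latest_slave are made up for this illustration, and MHA's own selection logic is more involved.

```shell
# Hypothetical helper: reads "host master_log_file read_master_log_pos" lines
# on stdin and prints the host that has read furthest into the master's binlog.
latest_slave() {
  # Sort by binlog file name, then numerically by position; the last line wins.
  LC_ALL=C sort -t ' ' -k2,2 -k3,3n | tail -n 1 | cut -d ' ' -f 1
}

printf '172.17.0.4 master-bin.000003 1200\n172.17.0.5 master-bin.000003 900\n' | latest_slave
# prints 172.17.0.4
```

The comparison on the file name works only because binlog file names carry a zero-padded sequence number.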
Introduction
MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability. It was developed by Yoshinori Matsunobu at the Japanese company DeNA (he later joined Facebook) and handles failover and master promotion in a MySQL high-availability environment. During a MySQL failover, MHA can complete the switch automatically within 0 to 30 seconds, and it preserves data consistency as far as possible while doing so.
MHA consists of two parts:
- MHA Manager (management node)
MHA Manager can be deployed on a dedicated machine to manage several master-slave clusters, or on one of the slave nodes.
- MHA Node (data node)
MHA Node runs on every MySQL server.
MHA Manager periodically probes the master of the cluster. When the master fails, it automatically promotes the slave with the most recent data to be the new master and then repoints all other slaves to it. The whole failover process is transparent to the application.
MHA currently mainly supports a one-master, multiple-slave architecture. A replication cluster must contain at least three database servers: one master, one standby master, and one additional slave.
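Limitations
- Needs at least three servers - one master and two slaves: one acts as master, one as standby master, and one as an ordinary slave.
- Depends on SSH - during automatic failover, MHA tries to save the binary logs from the crashed master so that as little data as possible is lost, but this is not always feasible. For example, if the master has a hardware failure or cannot be reached over ssh, MHA cannot save the binary logs; it still fails over, and the most recent data is lost.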
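Configuration
Based on Docker
Server configuration
- Operating system: Ubuntu 14.04
- mha-manager - 172.17.0.2
- mha_master - 172.17.0.3
- mha_slave01 (standby master) - 172.17.0.4
- mha_slave02 - 172.17.0.5
1. MySQL master-slave replication
2. MHA configuration
- Install MHA
- MHA comes as a manager package and a node package; the commands below download both from https://downloads.mariadb.com/MHA/
- On mha-manager, install both the manager (mha4mysql-manager-0.56.tar.gz) and the node (mha4mysql-node-0.56.tar.gz)
- On mha_master, mha_slave01 and mha_slave02, install the node (mha4mysql-node-0.56.tar.gz)
The commands are as follows:
mha-manager (172.17.0.2)
- Download the packages and unpack them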
root@mha-manager:~# wget https://downloads.mariadb.com/MHA/mha4mysql-manager-0.56.tar.gz
root@mha-manager:~# wget https://downloads.mariadb.com/MHA/mha4mysql-node-0.56.tar.gz
root@mha-manager:~# tar -xzvf mha4mysql-node-0.56.tar.gz
root@mha-manager:~# tar -xzvf mha4mysql-manager-0.56.tar.gz
- First install MHA Node
root@mha-manager:~# cd mha4mysql-node-0.56
root@mha-manager:~/mha4mysql-node-0.56# perl Makefile.PL
root@mha-manager:~/mha4mysql-node-0.56# make && make install
If perl Makefile.PL prints output like this:
[Core Features]
- DBI ...missing.
- DBD::mysql ...missing.
==> Auto-install the 2 mandatory module(s) from CPAN? [y] y
and make then prints the following:
[ERROR] Expected to read at least 204133 lines, but /root/.cpanplus/02packages.details.txt.gz contains only 190253 lines!
*** Installing DBI...
*** Could not find a version 0 or above for DBI; skipping.
*** Installing DBD::mysql...
*** Could not find a version 0 or above for DBD::mysql; skipping.
*** Module::AutoInstall installation finished.
then install libdbd-mysql-perl manually:
root@mha-manager:~/mha4mysql-node-0.56# apt install libdbd-mysql-perl
and run make && make install again.
- Install MHA Manager
root@mha-manager:~# cd mha4mysql-manager-0.56
root@mha-manager:~/mha4mysql-manager-0.56# perl Makefile.PL
This prints:
*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI ...loaded. (1.63)
- DBD::mysql ...loaded. (4.025)
- Time::HiRes ...loaded. (1.9725)
- Config::Tiny ...missing.
- Log::Dispatch ...missing.
- Parallel::ForkManager ...missing.
- MHA::NodeConst ...loaded. (0.56)
==> Auto-install the 3 mandatory module(s) from CPAN? [y]
Abort at this prompt and install the missing dependencies from packages instead:
root@mha-manager:~/mha4mysql-manager-0.56# apt install libconfig-tiny-perl
root@mha-manager:~/mha4mysql-manager-0.56# apt install liblog-dispatch-perl
root@mha-manager:~/mha4mysql-manager-0.56# apt install libparallel-forkmanager-perl
Then run perl Makefile.PL again. The output is now:
*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI ...loaded. (1.63)
- DBD::mysql ...loaded. (4.025)
- Time::HiRes ...loaded. (1.9725)
- Config::Tiny ...loaded. (2.20)
- Log::Dispatch ...loaded. (2.41)
- Parallel::ForkManager ...loaded. (1.06)
- MHA::NodeConst ...loaded. (0.56)
*** Module::AutoInstall configuration finished.
Checking if your kit is complete...
Looks good
Writing Makefile for mha4mysql::manager
Writing MYMETA.yml and MYMETA.json
After that, the build runs without problems:
root@mha-manager:~/mha4mysql-manager-0.56# make && make install
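The installation provides the following tools:
Manager tools
masterha_check_ssh : checks MHA's SSH configuration.
masterha_check_repl : checks MySQL replication.
masterha_manager : starts MHA.
masterha_check_status : reports the current running state of MHA.
masterha_master_monitor : monitors whether the master is down.
masterha_master_switch : controls failover (automatic or manual).
masterha_conf_host : adds or removes server entries in the configuration.
Node tools
save_binary_logs : saves and copies the master's binary logs.
apply_diff_relay_logs : identifies differential relay log events and applies them to the other slaves.
filter_mysqlbinlog : strips unnecessary ROLLBACK events (MHA no longer uses this tool).
purge_relay_logs : purges relay logs (without blocking the SQL thread).
The Node tools are normally invoked by MHA Manager's scripts and need no manual operation.
Note: to minimize data loss when the master goes down because of hardware damage, it is recommended (but not required) to configure semi-synchronous replication, available since MySQL 5.5, alongside MHA. How semi-synchronous replication works is left for the reader to look up.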
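For reference, semi-synchronous replication in MySQL 5.5 through 5.7 is enabled through two plugins, one for the master role and one for the slave role. The sketch below only prints the SQL; the function name semisync_sql is hypothetical, and the output is meant to be piped into the mysql client on the relevant server.

```shell
# Hypothetical helper: print the SQL that enables semi-synchronous replication.
# $1 is "master" or "slave" (plugin names as shipped with MySQL 5.5-5.7).
semisync_sql() {
  role="$1"
  echo "INSTALL PLUGIN rpl_semi_sync_${role} SONAME 'semisync_${role}.so';"
  echo "SET GLOBAL rpl_semi_sync_${role}_enabled = 1;"
}

# Example (not run here): semisync_sql master | mysql -uroot -p
```

On a slave, the I/O thread has to be restarted (STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;) before the setting takes effect, and SET GLOBAL does not survive a server restart, so the same settings also belong in my.cnf. Because any slave may become master under MHA, it is common to load both plugins on every node.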
- Install MHA Node
Run the following on mha_master, mha_slave01 and mha_slave02 (shown here for mha_master, with the node package already downloaded and unpacked as above):
root@mha_master:~# cd mha4mysql-node-0.56
root@mha_master:~/mha4mysql-node-0.56# apt install libdbd-mysql-perl
root@mha_master:~/mha4mysql-node-0.56# perl Makefile.PL
root@mha_master:~/mha4mysql-node-0.56# make && make install
Once all three machines are done, the configuration can begin.
- Configure MHA
- Set up passwordless ssh login between the four machines
- Create a key pair with ssh-keygen, accepting the defaults by pressing Enter; this generates id_rsa.pub under .ssh/
root@mha-manager:~# ssh-keygen -t rsa
Copy the contents of id_rsa.pub into /root/.ssh/authorized_keys on the other three machines, creating the file if it does not exist. Repeat the same procedure on the other three machines so that all four can log in to each other without a password.
If you log in as root, edit the ssh configuration file (vi /etc/ssh/sshd_config), set PermitRootLogin to yes, and restart ssh.
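Copying the key by hand to three machines, four times over, is easy to get wrong. A minimal sketch using ssh-copy-id, assuming it is installed and that password login for root still works at this point; the function name push_key and the DRY_RUN switch are made up for this example. By default it only prints the commands.

```shell
# Hosts from the server list above.
HOSTS="172.17.0.2 172.17.0.3 172.17.0.4 172.17.0.5"

# Print the ssh-copy-id commands by default; set DRY_RUN=0 to actually run them.
push_key() {
  for h in $HOSTS; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@$h"
    else
      ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$h"
    fi
  done
}

push_key
```

Run it on each of the four machines after ssh-keygen. ssh-copy-id appends the public key to the remote authorized_keys file and asks for the password once per host.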
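- Configure MHA Manager
- Create the data directory and the configuration directory
root@mha-manager:~# pwd
/root
root@mha-manager:~# mkdir -p /usr/local/masterha/app1
root@mha-manager:~# mkdir /etc/masterha
root@mha-manager:~# cp mha4mysql-manager-0.56/samples/conf/app1.cnf /etc/masterha/
- The MHA configuration file (/etc/masterha/app1.cnf). Note that manager_workdir and manager_log below point to /var/log/masterha/app1, so that directory must also exist before the manager is started with its output redirected there.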
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
master_binlog_dir=/var/lib/mysql
# master_ip_failover_script=/data/perl/master_ip_failover
# master_ip_online_change_script=/data/perl/master_ip_failover
# report_script=/data/perl/send_report
user=root
password=root
remote_workdir=/tmp
repl_user=repl_user
repl_password=123456
ssh_user=root

[server1]
hostname=172.17.0.3
port=3306

[server2]
hostname=172.17.0.4
port=3306
candidate_master=1
check_repl_delay=0

[server3]
hostname=172.17.0.5
port=3306
Parameter reference:
| Parameter | Meaning |
| --- | --- |
| manager_workdir | Working directory of the manager |
| manager_log | Log file of the manager |
| master_binlog_dir | Directory where the master stores its binlogs, so that MHA can find them |
| master_ip_failover_script | Switch script run during automatic failover |
| master_ip_online_change_script | Switch script run during a manual switchover |
| password | Password of the MySQL root user |
| user | Monitoring user (root here) |
| ping_interval | Interval between pings sent to the master; the default is 3 seconds, and failover starts automatically after three attempts without a response |
| remote_workdir | Where binlogs are saved on the remote MySQL host when a switch happens |
| repl_password | Password of the replication user |
| repl_user | Name of the replication user in the replication setup |
| report_script | Script that sends an alert after a switch |
| shutdown_script | Script that shuts down the failed host after a failure (mainly to prevent split brain; not used here) |
| ssh_user | Login user for ssh |
| hostname | Host name or IP |
| port | Port number |
| candidate_master | Marks the host as a candidate master; when set, this slave is promoted on failover even if it is not the slave with the most recent events in the cluster |
| check_repl_delay | By default, MHA does not pick a slave as the new master if it is more than 100 MB of relay logs behind the master, because recovering that slave takes a long time. With check_repl_delay=0, MHA ignores replication delay when choosing the new master. This is useful on a host with candidate_master=1, since it ensures the candidate becomes the new master |
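The user, password, repl_user and repl_password values only work if matching MySQL accounts exist. The sketch below prints GRANT statements for them; the function name mha_grants_sql and the host pattern 172.17.0.% are assumptions made for this example, and granting ALL PRIVILEGES to a remote root account is acceptable only in a throwaway Docker test like this one.

```shell
# Hypothetical helper: print GRANT statements matching the user/password and
# repl_user/repl_password values in app1.cnf. $1 is the allowed host pattern.
mha_grants_sql() {
  net="$1"
  [ -n "$net" ] || net='172.17.0.%'   # assumed Docker bridge subnet
  echo "GRANT ALL PRIVILEGES ON *.* TO 'root'@'${net}' IDENTIFIED BY 'root';"
  echo "GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'${net}' IDENTIFIED BY '123456';"
}

# Example (not run here): mha_grants_sql | mysql -uroot -p
```

The replication account should exist on every server that could be promoted, not only on the current master. GRANT ... IDENTIFIED BY works on MySQL 5.7 but was removed in MySQL 8.0, where CREATE USER has to be issued first.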
- Test the environment
- Check the ssh connections from MHA Manager to all MHA Nodes
root@mha-manager:~# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Sat Nov 4 05:30:41 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Nov 4 05:30:41 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Sat Nov 4 05:30:41 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Sat Nov 4 05:30:41 2017 - [info] Starting SSH connection tests..
Sat Nov 4 05:30:43 2017 - [debug]
Sat Nov 4 05:30:42 2017 - [debug] Connecting via SSH from root@172.17.0.4(172.17.0.4:22) to root@172.17.0.3(172.17.0.3:22)..
Warning: Permanently added '172.17.0.4' (ECDSA) to the list of known hosts.
Warning: Permanently added '172.17.0.3' (ECDSA) to the list of known hosts.
Sat Nov 4 05:30:43 2017 - [debug] ok.
Sat Nov 4 05:30:43 2017 - [debug] Connecting via SSH from root@172.17.0.4(172.17.0.4:22) to root@172.17.0.5(172.17.0.5:22)..
Warning: Permanently added '172.17.0.5' (ECDSA) to the list of known hosts.
Sat Nov 4 05:30:43 2017 - [debug] ok.
Sat Nov 4 05:30:44 2017 - [debug]
Sat Nov 4 05:30:41 2017 - [debug] Connecting via SSH from root@172.17.0.3(172.17.0.3:22) to root@172.17.0.4(172.17.0.4:22)..
Warning: Permanently added '172.17.0.3' (ECDSA) to the list of known hosts.
Warning: Permanently added '172.17.0.4' (ECDSA) to the list of known hosts.
Sat Nov 4 05:30:43 2017 - [debug] ok.
Sat Nov 4 05:30:43 2017 - [debug] Connecting via SSH from root@172.17.0.3(172.17.0.3:22) to root@172.17.0.5(172.17.0.5:22)..
Warning: Permanently added '172.17.0.5' (ECDSA) to the list of known hosts.
Sat Nov 4 05:30:44 2017 - [debug] ok.
Sat Nov 4 05:30:44 2017 - [debug]
Sat Nov 4 05:30:43 2017 - [debug] Connecting via SSH from root@172.17.0.5(172.17.0.5:22) to root@172.17.0.3(172.17.0.3:22)..
Warning: Permanently added '172.17.0.5' (ECDSA) to the list of known hosts.
Warning: Permanently added '172.17.0.3' (ECDSA) to the list of known hosts.
Sat Nov 4 05:30:43 2017 - [debug] ok.
Sat Nov 4 05:30:43 2017 - [debug] Connecting via SSH from root@172.17.0.5(172.17.0.5:22) to root@172.17.0.4(172.17.0.4:22)..
Warning: Permanently added '172.17.0.4' (ECDSA) to the list of known hosts.
Sat Nov 4 05:30:44 2017 - [debug] ok.
Sat Nov 4 05:30:44 2017 - [info] All SSH connection tests passed successfully.
All nodes pass the ssh check.
- Check the state of the whole replication environment
root@mha-manager:~# masterha_check_repl --conf=/etc/masterha/app1.cnf
Sat Nov 4 05:46:50 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Nov 4 05:46:50 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Sat Nov 4 05:46:50 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Sat Nov 4 05:46:50 2017 - [info] MHA::MasterMonitor version 0.56.
Sat Nov 4 05:46:50 2017 - [info] Dead Servers:
Sat Nov 4 05:46:50 2017 - [info] Alive Servers:
Sat Nov 4 05:46:50 2017 - [info] 172.17.0.3(172.17.0.3:3306)
Sat Nov 4 05:46:50 2017 - [info] 172.17.0.4(172.17.0.4:3306)
Sat Nov 4 05:46:50 2017 - [info] 172.17.0.5(172.17.0.5:3306)
Sat Nov 4 05:46:50 2017 - [info] Alive Slaves:
Sat Nov 4 05:46:50 2017 - [info] 172.17.0.4(172.17.0.4:3306) Version=5.7.18 (oldest major version between slaves) log-bin:disabled
Sat Nov 4 05:46:50 2017 - [info] GTID ON
Sat Nov 4 05:46:50 2017 - [info] Replicating from 172.17.0.3(172.17.0.3:3306)
Sat Nov 4 05:46:50 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Nov 4 05:46:50 2017 - [info] 172.17.0.5(172.17.0.5:3306) Version=5.7.18 (oldest major version between slaves) log-bin:disabled
Sat Nov 4 05:46:50 2017 - [info] GTID ON
Sat Nov 4 05:46:50 2017 - [info] Replicating from 172.17.0.3(172.17.0.3:3306)
Sat Nov 4 05:46:50 2017 - [info] Current Alive Master: 172.17.0.3(172.17.0.3:3306)
Sat Nov 4 05:46:50 2017 - [info] Checking slave configurations..
Sat Nov 4 05:46:50 2017 - [info] read_only=1 is not set on slave 172.17.0.4(172.17.0.4:3306).
Sat Nov 4 05:46:50 2017 - [warning] relay_log_purge=0 is not set on slave 172.17.0.4(172.17.0.4:3306).
Sat Nov 4 05:46:50 2017 - [warning] log-bin is not set on slave 172.17.0.4(172.17.0.4:3306). This host can not be a master.
Sat Nov 4 05:46:50 2017 - [info] read_only=1 is not set on slave 172.17.0.5(172.17.0.5:3306).
Sat Nov 4 05:46:50 2017 - [warning] relay_log_purge=0 is not set on slave 172.17.0.5(172.17.0.5:3306).
Sat Nov 4 05:46:50 2017 - [warning] log-bin is not set on slave 172.17.0.5(172.17.0.5:3306). This host can not be a master.
Sat Nov 4 05:46:50 2017 - [info] Checking replication filtering settings..
Sat Nov 4 05:46:50 2017 - [info] binlog_do_db= , binlog_ignore_db=
Sat Nov 4 05:46:50 2017 - [info] Replication filtering check ok.
Sat Nov 4 05:46:50 2017 - [error][/usr/local/share/perl/5.18.2/MHA/MasterMonitor.pm, ln341] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf
Sat Nov 4 05:46:50 2017 - [error][/usr/local/share/perl/5.18.2/MHA/MasterMonitor.pm, ln401] Error happend on checking configurations. at /usr/local/bin/masterha_check_repl line 48.
Sat Nov 4 05:46:50 2017 - [error][/usr/local/share/perl/5.18.2/MHA/MasterMonitor.pm, ln500] Error happened on monitoring servers.
Sat Nov 4 05:46:50 2017 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
One error message stands out:
None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf
As mentioned earlier, MHA needs at least three machines: one master and two slaves, one of which must serve as the standby master. The error means that no slave is able to become master; the warnings in the output give the reason, namely that log-bin is not enabled on either slave. my.cnf has to be adjusted accordingly.
In addition, MySQL's automatic relay log purging must be turned off, i.e. relay_log_purge=0, because MHA relies on the relay logs when promoting a slave to master and needs to manage them itself. The final my.cnf on the hot-standby slave looks like this:
root@mha_slave01:/# cat /etc/mysql/my.cnf
[mysqld]
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
server-id = 3
log-bin = master-bin
log-bin-index = master-bin.index
relay_log_purge = 0
relay-log-index = slave-relay-bin.index
relay-log = slave-relay-bin
gtid-mode = ON
log-slave-updates
enforce-gtid-consistency
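With relay_log_purge=0 the relay logs are no longer removed automatically, so they have to be cleaned up some other way; the purge_relay_logs tool listed among the Node tools exists for this. The sketch below prints an /etc/cron.d entry for it. The function name, the schedule, the log file path and the /usr/local/bin install location (where masterha_check_repl ended up in the output above) are assumptions for this example.

```shell
# Hypothetical helper: print an /etc/cron.d entry that runs purge_relay_logs
# once a day at the given hour ($1), using the credentials from app1.cnf.
purge_cron_line() {
  hour="$1"
  echo "0 ${hour} * * * root /usr/local/bin/purge_relay_logs --user=root --password=root --disable_relay_log_purge --workdir=/tmp >> /var/log/purge_relay_logs.log 2>&1"
}

purge_cron_line 5
```

Giving each slave a different hour keeps them from purging at the same moment, so at least one slave still holds the relay logs needed for recovery.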
Check the replication environment again:
root@mha-manager:~# masterha_check_repl --conf=/etc/masterha/app1.cnf
Sat Nov 4 05:57:14 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Nov 4 05:57:14 2017 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Sat Nov 4 05:57:14 2017 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Sat Nov 4 05:57:14 2017 - [info] MHA::MasterMonitor version 0.56.
Sat Nov 4 05:57:14 2017 - [info] Dead Servers:
Sat Nov 4 05:57:14 2017 - [info] Alive Servers:
Sat Nov 4 05:57:14 2017 - [info] 172.17.0.3(172.17.0.3:3306)
Sat Nov 4 05:57:14 2017 - [info] 172.17.0.4(172.17.0.4:3306)
Sat Nov 4 05:57:14 2017 - [info] 172.17.0.5(172.17.0.5:3306)
Sat Nov 4 05:57:14 2017 - [info] Alive Slaves:
Sat Nov 4 05:57:14 2017 - [info] 172.17.0.4(172.17.0.4:3306) Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Sat Nov 4 05:57:14 2017 - [info] GTID ON
Sat Nov 4 05:57:14 2017 - [info] Replicating from 172.17.0.3(172.17.0.3:3306)
Sat Nov 4 05:57:14 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Nov 4 05:57:14 2017 - [info] 172.17.0.5(172.17.0.5:3306) Version=5.7.18 (oldest major version between slaves) log-bin:disabled
Sat Nov 4 05:57:14 2017 - [info] GTID ON
Sat Nov 4 05:57:14 2017 - [info] Replicating from 172.17.0.3(172.17.0.3:3306)
Sat Nov 4 05:57:14 2017 - [info] Current Alive Master: 172.17.0.3(172.17.0.3:3306)
Sat Nov 4 05:57:14 2017 - [info] Checking slave configurations..
Sat Nov 4 05:57:14 2017 - [info] read_only=1 is not set on slave 172.17.0.4(172.17.0.4:3306).
Sat Nov 4 05:57:14 2017 - [info] read_only=1 is not set on slave 172.17.0.5(172.17.0.5:3306).
Sat Nov 4 05:57:14 2017 - [warning] relay_log_purge=0 is not set on slave 172.17.0.5(172.17.0.5:3306).
Sat Nov 4 05:57:14 2017 - [warning] log-bin is not set on slave 172.17.0.5(172.17.0.5:3306). This host can not be a master.
Sat Nov 4 05:57:14 2017 - [info] Checking replication filtering settings..
Sat Nov 4 05:57:14 2017 - [info] binlog_do_db= , binlog_ignore_db=
Sat Nov 4 05:57:14 2017 - [info] Replication filtering check ok.
Sat Nov 4 05:57:14 2017 - [info] GTID is supported. Skipping all SSH and Node package checking.
Sat Nov 4 05:57:14 2017 - [info] Checking SSH publickey authentication settings on the current master..
Sat Nov 4 05:57:14 2017 - [info] HealthCheck: SSH to 172.17.0.3 is reachable.
Sat Nov 4 05:57:14 2017 - [info]
172.17.0.3 (current master)
 +--172.17.0.4
 +--172.17.0.5
Sat Nov 4 05:57:14 2017 - [info] Checking replication health on 172.17.0.4..
Sat Nov 4 05:57:14 2017 - [info] ok.
Sat Nov 4 05:57:14 2017 - [info] Checking replication health on 172.17.0.5..
Sat Nov 4 05:57:14 2017 - [info] ok.
Sat Nov 4 05:57:14 2017 - [warning] master_ip_failover_script is not defined.
Sat Nov 4 05:57:14 2017 - [warning] shutdown_script is not defined.
Sat Nov 4 05:57:14 2017 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
The environment test is complete.
- Start MHA Manager
root@mha-manager:~# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /var/log/masterha/app1/manager.log 2>&1 &
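Optional startup parameters:
--remove_dead_master_conf : after a failover, the old master's entry is removed from the configuration file.
--manager_log : location of the log file.
--ignore_last_failover : by default, if MHA detects consecutive outages less than 8 hours apart, it does not fail over again; the restriction exists to avoid a ping-pong effect. This parameter tells MHA to ignore the file left behind by the previous failover. By default, after a failover MHA creates app1.failover.complete in the manager working directory (manager_workdir, /var/log/masterha/app1 in the configuration above), and the next failover is refused while that file exists unless it was deleted manually after the first failover. Passing --ignore_last_failover avoids that manual step.
Check that MHA Manager is monitoring normally: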
root@mha-manager:~# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:4649) is running(0:PING_OK), master:172.17.0.3
MHA Manager is now up and running.
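For scripting, the master address can be pulled out of that status line. A minimal sketch, assuming the line always has the shape shown above; the function name current_master is made up for this example.

```shell
# Hypothetical helper: extract the master address from a masterha_check_status
# line of the form shown above; prints nothing when the pattern does not match.
current_master() {
  sed -n 's/.*is running(0:PING_OK), master:\(.*\)$/\1/p'
}

echo 'app1 (pid:4649) is running(0:PING_OK), master:172.17.0.3' | current_master
# prints 172.17.0.3

# Example (not run here):
#   masterha_check_status --conf=/etc/masterha/app1.cnf | current_master
```

An empty result can be treated as "manager not running or not healthy" by a monitoring script.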
- Verification
- Test whether failover happens automatically when the master (172.17.0.3) goes down
Stop mha_master:
$ docker stop mha_master
mha_master
The following report is produced:
----- Failover Report -----
app1: MySQL Master failover 172.17.0.3 to 172.17.0.4 succeeded
Master 172.17.0.3 is down!
Check MHA Manager logs at mha-manager:/var/log/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Selected 172.17.0.4 as a new master.
172.17.0.4: OK: Applying all logs succeeded.
172.17.0.5: OK: Slave started, replicating from 172.17.0.4.
172.17.0.4: Resetting slave info succeeded.
Master failover to 172.17.0.4(172.17.0.4:3306) completed successfully.
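This shows that the failover succeeded. MHA Manager exits once the failover is done.
Note:
As soon as a failover has happened, the Manager process exits, so the test cannot simply be repeated. First repair the failed database and add it back into the MHA environment with CHANGE MASTER. Then make sure app1.failover.complete does not exist, or pass --ignore_last_failover to ignore it, before starting the Manager process again.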
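The recovery can be sketched as follows, assuming the old master (172.17.0.3) has been repaired and restarted and that replication uses GTIDs, as the "GTID ON" lines in the check output suggest. The function name rejoin_sql is hypothetical; the account comes from repl_user and repl_password in app1.cnf.

```shell
# Hypothetical helper: print the SQL that turns the repaired old master into
# a slave of the new master ($1), using GTID auto-positioning.
rejoin_sql() {
  new_master="$1"
  echo "CHANGE MASTER TO MASTER_HOST='${new_master}', MASTER_PORT=3306, MASTER_USER='repl_user', MASTER_PASSWORD='123456', MASTER_AUTO_POSITION=1;"
  echo "START SLAVE;"
}

rejoin_sql 172.17.0.4

# On the manager, before restarting masterha_manager (not run here):
#   rm -f /var/log/masterha/app1/app1.failover.complete
```

Pipe the output into the mysql client on the repaired server, then start the manager again with the same command as before, optionally adding --ignore_last_failover instead of deleting the file.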