前言
搭建完成 MHA 環(huán)境篮绿,然后模擬 master 故障汉额,驗證是否正確切換成了新的 master
參考1
搭建思路
- 搭建好一主二從的復(fù)制結(jié)構(gòu)
- 配置各個服務(wù)器間的 SSH 免登陸
- 在各個服務(wù)器中都安裝 mha-node国葬,在 服務(wù)器4
slave2
上安裝 mha-manager
- 使用 MHA 提供的腳本對 SSH 免登陸的配置心包、集群的復(fù)制狀態(tài)進(jìn)行驗證跪腹,看是否正確畏梆。
- 如果驗證通過,啟動 manager 坏挠。
- 測試一下芍躏,把 master 停掉,看是否自動選出了新的 master
詳細(xì)過程
因為資源有限降狠,現(xiàn)在使用3臺服務(wù)器進(jìn)行搭建
192.168.1.11 作為 master
192.168.1.12 作為 slave1
192.168.1.13 作為 slave2对竣,同時把 MHA manager 也安裝到這臺服務(wù)器
配置 SSH 免登陸
- 在每臺服務(wù)器上都執(zhí)行以下命令
ssh-keygen
(執(zhí)行后會有多個輸入提示,不用輸入任何內(nèi)容榜配,全部直接回車即可) - 免登陸授權(quán)
ssh-copy-id -i /root/.ssh/id_rsa -p 22 root@192.168.1.11
ssh-copy-id -i /root/.ssh/id_rsa -p 22 root@192.168.1.12
ssh-copy-id -i /root/.ssh/id_rsa -p 22 root@192.168.1.13
驗證 ssh 192.168.0.x
搭建環(huán)境
- 主節(jié)點(diǎn)配置 安裝可參考:mysql搭建手冊
vi /etc/my.cnf
[mysqld]
log-bin=mysql-bin
server-id=11
gtid_mode=on
#開啟gtid, 必須主從全開
enforce_gtid_consistency=1
log_slave_updates=1
#開啟半同步復(fù)制 否則自動切換主從的時候會報主鍵錯誤
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
loose_rpl_semi_sync_master_timeout = 5000
#需要同步的數(shù)據(jù)庫
binlog-do-db=test
##需要忽略的數(shù)據(jù)庫
binlog-ignore-db=information_schema
binlog-ignore-db=mysql
binlog-ignore-db=performance_schema
binlog-ignore-db=sys
- 重啟
service mysqld restart
- 在主服務(wù)器上建立帳戶并授權(quán)slave:
mysql -uroot -proot
use mysql;
#mha測試賬號否纬,可以也使用root
CREATE USER 'mha'@'%' IDENTIFIED BY '123456';
#在所有Node上創(chuàng)建Manage監(jiān)控用戶
grant all privileges on *.* to 'mha'@'192.168.1.%' identified by '123456';
flush privileges;
#創(chuàng)建同步賬號
CREATE USER 'repl'@'%' IDENTIFIED BY '123456';
#給同步賬號賦權(quán)
grant replication slave,replication client on *.* to 'repl'@'192.168.1.%' identified by "123456";
//grant ALL PRIVILEGES on *.* to repl@"%" identified by "123456";
#生效
flush privileges;
#查看指定用戶,IP權(quán)限
show grants for repl@'192.168.1.%';
- 登錄主服務(wù)器的mysql蛋褥,查詢master的狀態(tài)
show master status;
+------------------+----------+--------------+-------------------------------------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+-------------------------------------------------+-------------------+
| mysql-bin.000002 | 151 | test | information_schema,mysql,performance_schema,sys | |
+------------------+----------+--------------+-------------------------------------------------+-------------------+
1 row in set (0.00 sec)
- 在 master 上查詢?nèi)罩灸夸?br> show variables like 'log_bin_basename%';
從節(jié)點(diǎn)配置
- 兩個從節(jié)點(diǎn)的server-id需要改一下
vi /etc/my.cnf
server-id=12
server-id=13
[mysqld]
log-bin=mysql-bin
server-id=12
gtid_mode=on
#開啟gtid, 必須主從全開
enforce_gtid_consistency=1
log_slave_updates=1
#開啟半同步復(fù)制 否則自動切換主從的時候會報主鍵錯誤
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
loose_rpl_semi_sync_master_timeout = 5000
#需要同步的數(shù)據(jù)庫
binlog-do-db=test
##需要忽略的數(shù)據(jù)庫
binlog-ignore-db=information_schema
binlog-ignore-db=mysql
binlog-ignore-db=performance_schema
binlog-ignore-db=sys
#從服務(wù)器临燃,要加上relay_log_purge=0,不加的話烙心,會報出warning,relay_log_purge=0 is not set on slave
relay_log_purge=0
- 重啟
service mysqld restart
- 192.168.0.12上執(zhí)行
mysql -uroot -proot
use mysql;
#mha測試賬號谬俄,可以也使用root
CREATE USER 'mha'@'%' IDENTIFIED BY '123456';
#在所有Node上創(chuàng)建Manage監(jiān)控用戶
grant all privileges on *.* to 'mha'@'192.168.1.%' identified by '123456';
flush privileges;
#從服務(wù)的id Master_Server_Id: 0
change master to master_auto_position=0;
#指定master
change master to master_host='192.168.1.11',master_user='repl',master_password='123456',master_log_file='mysql-bin.000009',master_log_pos=191;
flush privileges;
#啟動復(fù)制
start slave;
#從節(jié)點(diǎn)上運(yùn)行
#set global read_only=1;
- 查看狀態(tài)
show slave status \G;
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.11
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 151
Relay_Log_File: localhost-relay-bin.000002
Relay_Log_Pos: 361
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
如上,當(dāng)IO和SQL線程的狀態(tài)均為Yes弃理,則表示主從已實(shí)現(xiàn)同步了!
- 192.168.0.13上執(zhí)行(MHA-manager管理節(jié)點(diǎn))
mysql -uroot -proot
use mysql;
#mha測試賬號屎蜓,可以也使用root
CREATE USER 'mha'@'%' IDENTIFIED BY '123456';
grant ALL PRIVILEGES on *.* to mha@"%" Identified by "123456";
flush privileges;
#從服務(wù)的id Master_Server_Id: 0
change master to master_auto_position=0;
#指定master
change master to master_host='192.168.1.11',master_user='repl',master_password='123456',master_log_file='mysql-bin.000003',master_log_pos=151;
#啟動復(fù)制
start slave;
#從節(jié)點(diǎn)上運(yùn)行
#set global read_only=1;
- 查看狀態(tài)
show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.11
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 151
Relay_Log_File: localhost-relay-bin.000002
Relay_Log_Pos: 361
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
安裝 MHA
創(chuàng)建安裝目錄
- Node服務(wù)器安裝
mkdir -p /usr/local/mha - manage服務(wù)器安裝
mkdir -p /usr/local/mha/ha1/fail_script
mkdir -p /usr/local/mha/ha1/workdir
/usr/local/mha:程序安裝目錄
/usr/local/mha/ha1:用于區(qū)別每一個mha方案痘昌,當(dāng)前方案ha1
/usr/local/mha/ha1/fail_script:方案ha1的failover腳本保存路徑
/usr/local/mha/ha1/workdir:方案ha1的的日志和failover產(chǎn)生的binlog保存路徑
安裝 MHA-node
-
在每臺服務(wù)器上都執(zhí)行以下命令
安裝epel源(所有節(jié)點(diǎn))
yum -y install perl-DBD-MySQL ncftp perl-DBI.x86
若遇到如圖報錯則:yum -y install perl-DBI.x*
下載需翻墻:
cd /usr/local/src/
wget https://raw.githubusercontent.com/linyue515/mysql-master-ha/master/mha4mysql-node-0.57-0.el7.noarch.rpm
#安裝并在安裝過程中顯示正在安裝的文件信息及安裝進(jìn)度
rpm -ivh mha4mysql-node-0.57-0.el7.noarch.rpm
安裝 MHA-manager
在 slave2上執(zhí)行命令
wget https://raw.githubusercontent.com/linyue515/mysql-master-ha/master/mha4mysql-manager-0.57-0.el7.noarch.rpm
rpm -ivh mha4mysql-manager-0.57-0.el7.noarch.rpm
- 錯誤:依賴檢測失敗:
perl(Config::Tiny) 被 mha4mysql-manager-0.57-0.el7.noarch 需要
perl(Log::Dispatch) 被 mha4mysql-manager-0.57-0.el7.noarch 需要
perl(Log::Dispatch::File) 被 mha4mysql-manager-0.57-0.el7.noarch 需要
perl(Log::Dispatch::Screen) 被 mha4mysql-manager-0.57-0.el7.noarch 需要
perl(Parallel::ForkManager) 被 mha4mysql-manager-0.57-0.el7.noarch 需要
- yum安裝炬转,先更新yum源
yum -y update
yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
沒有可用軟件包 perl-Log-Dispatch辆苔。沒有可用軟件包 perl-Parallel-ForkManager。
注意:這兩步的安裝方式為centos 系統(tǒng)扼劈,在其他系統(tǒng)中驻啤,需要自己調(diào)整安裝方法
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
yum -y install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager
yum install -y rrdtool perl-rrdtool rrdtool-devel perl-Params-Validate
創(chuàng)建 MHA 配置文件
- slave2管理節(jié)點(diǎn)上,位置可以自定義
mkdir /usr/local/mha
vi /usr/local/mha/mha.cnf
配置文件添加如下:
[server default]
# mha監(jiān)聽用戶
hostname=192.168.1.13
user=mha
password=123456
ssh_user=root
# 配置主從復(fù)制時創(chuàng)建的復(fù)制用戶
repl_user=repl
repl_password=123456
#監(jiān)控mater荐吵,ping的頻率
ping_interval=1
# 在各臺服務(wù)器上創(chuàng)建目錄mkdir 如:/usr/local/mha
manager_workdir=/usr/local/mha
# 日志位置
manager_log=/usr/local/mha/manager.log
remote_workdir=/usr/local/mha
master_binlog_dir=/usr/local/mysql/data/mysql
#master_ip_failover:自動切換時vip管理的腳本
#master_ip_online_change:手動切換使用的腳本
#power_manager:故障發(fā)生后關(guān)閉主機(jī)的腳本
#send_report:發(fā)送報警的腳本骑冗。
master_ip_failover_script=/usr/local/mha/ha1/fail_script/master_ip_failover
master_ip_online_change_script=/usr/local/mha/ha1/fail_script/master_ip_online_change
secondary_check_script=/usr/bin/masterha_secondary_check -s 192.168.1.12 -s 192.168.1.13
report_script=/usr/local/mha/ha1/fail_script/send_report
shutdown_script=""
[server1]
hostname=192.168.1.11
port=3306
[server2]
hostname=192.168.1.12
port=3306
candidate_master=1
check_repl_delay=0
[server3]
hostname=192.168.1.13
port=3306
ignore_fail=1
no_master=1
基本配置部分
- mnager監(jiān)控用的mysql赊瞬,mha用戶
一般設(shè)置為root用戶
user=mha
password=123456 - ssh登入用戶名
ssh_user=root - 配置主從復(fù)制時創(chuàng)建的復(fù)制用戶,每個node服務(wù)器都需要存在
repl_user=repl
repl_password=123456 - mha_manager項目的主目錄
前面創(chuàng)建的mha放腳本的目錄
manager_workdir=/usr/local/mha - mha_manager記錄日志
manager_log=/usr/local/mha/manager.log - master服務(wù)器上查詢 show variables like 'log_bin_basename%';
例如:值為 /data/mysql/mysql-bin贼涩,需要的是 /data/mysql
master_binlog_dir=/usr/local/mysql/data/mysql - 監(jiān)控mater巧涧,ping的頻率
ping_interval=1 - node服務(wù)器在發(fā)生master切換時,binlog保持的路徑遥倦,每個node都會在該目錄下保存一份差異的binlog谤绳,除非沒有差異。
remote_workdir=/usr/local/mha - 設(shè)置為候選master袒哥,如果設(shè)置該參數(shù)以后缩筛,發(fā)生主從切換以后將會將此從庫提升為主庫,即使這個主庫不是集群中事件最新的slave
candidate_master=1 - 默認(rèn)情況下如果一個slave落后master 100M的relay logs的話堡称,MHA將不會選擇該slave作為一個新的master瞎抛,因為對于這個slave的恢復(fù)需要花費(fèi)很長時間,通過設(shè)置check_repl_delay=0,MHA觸發(fā)切換在選擇一個新master的時候?qū)雎詮?fù)制延時粮呢,這個參數(shù)對于設(shè)置了candidate_master=1的主機(jī)非常有用婿失,因為這個候選主在切換的過程中一定是新的master
check_repl_delay=0 - 如果不加上該參數(shù),當(dāng)該slave主機(jī)故障了啄寡,mha將無法啟動豪硅,加上該參數(shù)會忽略該主機(jī)是否正常,在mha啟動的時候加上參數(shù)--ignore_fail_on_start
ignore_fail=1 - 不將該主機(jī)轉(zhuǎn)換為master
no_master=1
高可用配置部分
- mha在線自動failover時處理VIP的配置文件
master_ip_failover_script=/usr/local/mha/ha1/fail_script/master_ip_failover - 在線手動執(zhí)行master切換時VIP的處理文件
master_ip_online_change_script=/usr/local/mha/ha1/fail_script/master_ip_online_change - 一旦MHA到新master之間的網(wǎng)絡(luò)出現(xiàn)問題挺物,manager會嘗試從backup登入到masger
secondary_check_script=/usr/bin/masterha_secondary_check -s 192.168.1.12 -s 192.168.1.13 - 發(fā)生切換后執(zhí)行的報警腳本
report_script=/usr/local/mha/ha1/fail_script/send_report - 故障后關(guān)閉master主機(jī)的腳本(主要是使用keepalive做VIP時會出現(xiàn)腦裂導(dǎo)致VIP頻繁切換所以會將故障的master關(guān)閉)
shutdown_script=""
ha1腳本
- master_ip_failover
VIP的配置可以使用keepalived也可以寫腳本懒浮,keepalived對網(wǎng)絡(luò)的要求很高否則容易腦裂砚著,在我前面搭建雙主環(huán)境講過keepalived的搭建方法舌镶,我這里使用腳本的方式否灾。
需要修改VIP惩阶、網(wǎng)卡ens33
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.0.100/24';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
- master_ip_online_change
perl腳本
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
my $vip = '192.168.0.100/24'; # Virtual IP
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = 0;
my (
$command, $orig_master_is_new_slave, $orig_master_host,
$orig_master_ip, $orig_master_port, $orig_master_user,
$orig_master_password, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password,
);
GetOptions(
'command=s' => \$command,
'orig_master_is_new_slave' => \$orig_master_is_new_slave,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'orig_master_user=s' => \$orig_master_user,
'orig_master_password=s' => \$orig_master_password,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
);
exit &main();
sub main {
#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $orig_master_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $new_master_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $orig_master_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
- shell腳本
#/bin/bash
#source /root/.bash_profile
vip=`echo '192.168.0.100/24'` # Virtual IP
key=`echo '1'`
command=`echo "$1" | awk -F = '{print $2}'`
orig_master_host=`echo "$2" | awk -F = '{print $2}'`
new_master_host=`echo "$7" | awk -F = '{print $2}'`
stop_vip=`echo "ssh root@$orig_master_host /sbin/ifconfig ens33:$key down"`
start_vip=`echo "ssh root@$new_master_host /sbin/ifconfig ens33:$key $vip"`
if [ $command = 'stop' ]
then
echo -e "\n\n\n***************************************************************\n"
echo -e "Disabling the VIP - $vip on old master: $orig_master_host\n"
$stop_vip
if [ $? -eq 0 ]
then
echo "Disabled the VIP successfully"
else
echo "Disabled the VIP failed"
fi
echo -e "***************************************************************\n\n\n\n"
fi
if [ $command = 'start' -o $command = 'status' ]
then
echo -e "\n\n\n***************************************************************\n"
echo -e "Enabling the VIP - $vip on new master: $new_master_host \n"
$start_vip
if [ $? -eq 0 ]
then
echo "Enabled the VIP successfully"
else
echo "Enabled the VIP failed"
fi
echo -e "***************************************************************\n\n\n\n"
fi
- send_report
#!/usr/bin/perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;
#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp='1019533934@qq.com';
my $mail_from='xxxx';
my $mail_user='xxxxx';
my $mail_pass='xxxxx';
my $mail_to=['xxxx','xxxx'];
GetOptions(
'orig_master_host=s' => \$dead_master_host,
'new_master_host=s' => \$new_master_host,
'new_slave_hosts=s' => \$new_slave_hosts,
'subject=s' => \$subject,
'body=s' => \$body,
);
mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
sub mailToContacts {
my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
open my $DEBUG, "> /tmp/monitormail.log"
or die "Can't open the debug file:$!\n";
my $sender = new Mail::Sender {
ctype => 'text/plain; charset=utf-8',
encoding => 'utf-8',
smtp => $smtp,
from => $mail_from,
auth => 'LOGIN',
TLS_allowed => '0',
authid => $user,
authpwd => $passwd,
to => $mail_to,
subject => $subject,
debug => $DEBUG
};
$sender->MailMsg(
{ msg => $msg,
debug => $DEBUG
}
) or print $Mail::Sender::Error;
return 1;
}
# Do whatever you want here
exit 0;
- 修改用戶稠炬,默認(rèn)監(jiān)控用戶是mha
cd /usr/bin
vi masterha_secondary_check - 給上面腳本賦權(quán):
chmod -R 755 /usr/local/mha/ha1/fail_script
master手動添加虛擬VIP
注意:需要手動先在master服務(wù)器上面添加VIP
yum install -y net-tools.x86_64
查看網(wǎng)卡:
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:c4:82:82 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.11/24 brd 192.168.0.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fec4:8282/64 scope link
valid_lft forever preferred_lft forever
添加虛擬VIP:
ifconfig ens33:1 192.168.1.100/24
ifconfig ens33:1 192.168.1.100/24 down
配置relay_log的清除方式(在每個Node上)
- 所有Node的cnf配置文件加上
relay_log_purge=0
MHA在發(fā)生切換的過程中,從庫的恢復(fù)過程中依賴于relay log的相關(guān)信息冕象,所以這里要將relay log的自動清除設(shè)置為OFF膀估,采用手動清除relay log的方式饼记。
在默認(rèn)情況下低斋,從服務(wù)器上的中繼日志會在SQL線程執(zhí)行完畢后被自動刪除巴比。但是在MHA環(huán)境中,這些中繼日志在恢復(fù)其他從服務(wù)器時可能會被用到礁遵,因此需要禁用中繼日志的自動刪除功能轻绞。定期清除中繼日志需要考慮到復(fù)制延時的問題。在ext3的文件系統(tǒng)下佣耐,刪除大的文件需要一定的時間政勃,會導(dǎo)致嚴(yán)重的復(fù)制延時。為了避免復(fù)制延時兼砖,需要暫時為中繼日志創(chuàng)建硬鏈接奸远,因為在linux系統(tǒng)中通過硬鏈接刪除大文件速度會很快。
提示:在mysql數(shù)據(jù)庫中讽挟,刪除大表時懒叛,通常也采用建立硬鏈接的方式
MHA節(jié)點(diǎn)中包含了pure_relay_logs命令工具,它可以為中繼日志創(chuàng)建硬鏈接耽梅,執(zhí)行SET GLOBAL relay_log_purge=1,等待幾秒鐘以便SQL線程切換到新的中繼日志薛窥,再執(zhí)行SET GLOBAL relay_log_purge=0。
pure_relay_logs腳本參數(shù)如下所示:
--user mysql 用戶名
--password mysql 密碼
--port 端口號
--workdir 指定創(chuàng)建relay log的硬鏈接的位置,默認(rèn)是/var/tmp诅迷,由于系統(tǒng)不同分區(qū)創(chuàng)建硬鏈接文件會失敗佩番,故需要執(zhí)行硬鏈接具體位置,成功執(zhí)行腳本后罢杉,硬鏈接的中繼日志文件被刪除
--disable_relay_log_purge 默認(rèn)情況下趟畏,如果relay_log_purge=1,腳本會什么都不清理滩租,自動退出赋秀,通過設(shè)定這個參數(shù),當(dāng)relay_log_purge=1的情況下會將relay_log_purge設(shè)置為0持际。清理relay log之后沃琅,最后將參數(shù)設(shè)置為OFF。
在每臺slave Node上創(chuàng)建
vim /usr/local/mha/purge_relay_log.sh
#!/bin/bash
user=root
passwd=root
#確保用戶和密碼能通過127.0.0.1登入(本地的IP)
host='127.0.0.1'
port=3306
work_dir='/usr/local/mysql/data/mysql'
purge='/usr/bin/purge_relay_logs'
$purge --user=$user --password=$passwd --host=$host --disable_relay_log_purge --port=$port --workdir=$work_dir >> /usr/local/mha/purge_relay_logs.log 2>&1
- 授權(quán)
chmod u+x /usr/local/mha/purge_relay_log.sh
cd /usr/local/mha
./purge_relay_log.sh
2019-06-23 12:07:51: purge_relay_logs script started.
relay_log_purge is enabled. Disabling..
Found relay_log.info: /usr/local/mysql/data/mysql/relay-log.info
Opening /usr/local/mysql/data/mysql/localhost-relay-bin.000001 ..
Opening /usr/local/mysql/data/mysql/localhost-relay-bin.000002 ..
Executing SET GLOBAL relay_log_purge=1; FLUSH LOGS; sleeping a few seconds so that SQL thread can delete older relay log files (if it keeps up); SET GLOBAL relay_log_purge=0; .. ok.
2019-06-23 12:07:54: All relay log purging operations succeeded.
將腳本加入到os定時任務(wù)中
啟動 manager
check檢查
- 檢查SSH配置
masterha_check_ssh --conf=/usr/local/mha/mha.cnf - 檢查復(fù)制
masterha_check_repl --conf=/usr/local/mha/mha.cnf - 檢查狀態(tài)
masterha_check_status --conf=/usr/local/mha/mha.cnf
必需保證所有的檢查都通過 - 錯誤
在驗證時蜘欲,我遇到過這個錯誤:Can't exec "mysqlbinlog" ......
解決方法是在所有服務(wù)器上執(zhí)行:
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql
注意:加粗位置益眉,需要根據(jù)自己環(huán)境修改路徑 - 驗證成功
MySQL Replication Health is OK.
- 啟動MHA
nohup masterha_manager --conf=/usr/local/mha/mha.cnf --ignore_fail_on_start --ignore_last_failover < /dev/null > /usr/local/mha/start.log 2>&1 & - 查看啟動日志
tail -f manager.log
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.0.100/24===
Checking the Status of the script.. OK
Sun Jun 23 09:44:20 2019 - [info] OK.
Sun Jun 23 09:44:20 2019 - [warning] shutdown_script is not defined.
Sun Jun 23 09:44:20 2019 - [info] Set master ping interval 1 seconds.
Sun Jun 23 09:44:20 2019 - [info] Set secondary check script: /usr/bin/masterha_secondary_check -s backup -s master --user=root --master_host=master --master_ip=192.168.0.11 --master_port=3306
Sun Jun 23 09:44:20 2019 - [info] Starting ping health check on 192.168.0.11(192.168.0.11:3306)..
Sun Jun 23 09:44:20 2019 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
- 狀態(tài)檢查
ps -ef|grep masterha_manager
masterha_check_status --conf=/usr/local/mha/mha.cnf
mha (pid:29238) is running(0:PING_OK), master:192.168.0.11
- 處理故障master,將其配置為從庫chang到新的master姥份,可以從manager.log找到change語句郭脂。
grep "CHANGE MASTER TO MASTER" /usr/local/mha/manager.log | tail -1 - 停掉MHA監(jiān)控
masterha_stop --conf=/usr/local/mha/mha.cnf
故障轉(zhuǎn)移驗證
驗證MHA
驗證的方式是先停掉 master,因為之前的配置文件中澈歉,把 slave1 作為了候選人展鸡,那么就到 slave2 上查看 master 的 IP 是否變?yōu)榱?slave1 的 IP
- 停掉 master,在 master(192.168.0.11) 上把 mysql 停掉
service mysqld stop - 查看 MHA 日志埃难,上面的配置文件中指定了日志位置為 /usr/local/mha/manager.log
tail -f /usr/local/mha/manager.log
Started automated(non-interactive) failover.
Selected 192.168.0.12(192.168.0.12:3306) as a new master.
192.168.0.12(192.168.0.12:3306): OK: Applying all logs succeeded.
192.168.0.13(192.168.0.13:3306): OK: Slave started, replicating from 192.168.0.12(192.168.0.12:3306)
192.168.0.12(192.168.0.12:3306): Resetting slave info succeeded.
Master failover to 192.168.0.12(192.168.0.12:3306) completed successfully.
-
刪除fail文件(非必需)
由于啟動mha的時候加上了--ignore_last_failover參數(shù)莹弊,所以不刪除failower生成的文件也能啟動,否則需要刪除failower生成的文件“mha.failover.complete”涡尘。
rm -rf /usr/local/mha/mha.failover.complete
檢查新的slave
登陸 slave(192.168.0.12)的mysql忍弛,查看 slave 狀態(tài)
show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.12
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000006
Read_Master_Log_Pos: 473
Relay_Log_File: localhost-relay-bin.000002
Relay_Log_Pos: 408
Relay_Master_Log_File: mysql-bin.000006
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
可以看到 master 的 IP 現(xiàn)在為 12,說明 MHA 已經(jīng)把 slave1 提升為了新的 master考抄,IO線程和SQL線程也正確運(yùn)行细疚,MHA 搭建成功。
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.1.100/24===
Checking the Status of the script.. OK
Sun Jun 23 13:52:00 2019 - [info] OK.
Sun Jun 23 13:52:00 2019 - [warning] shutdown_script is not defined.
Sun Jun 23 13:52:00 2019 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
failover處理
dead master處理階段
down掉主服務(wù)器的VIP
/usr/local/mha/ha1/fail_script/master_ip_failover --orig_master_host=192.168.1.11 --orig_master_ip=192.168.1.100 --orig_master_port=3306 --command=stopssh --ssh_user=root
- 問題1:
slave replicates is not defined in the configuration file!
MySQL徹底清除slave信息
stop slave;
reset slave all;
- 問題2:
Tue Feb 23 13:47:18 2016 - [info] read_only=1 is not set on slave db
Tue Feb 23 13:47:18 2016 - [warning] relay_log_purge=0 is not set on slave db
解決辦法:
備庫執(zhí)行
set global read_only=1;
set global relay_log_purge=0;
keepalived配置
配置keepalived的配置文件川梅,在master上配置(192.168.1.11)
vi /etc/keepalived/keepalived.conf
global_defs {
notification_email {
saltstack@163.com
}
notification_email_from dba@dbserver.com
smtp_server 127.0.0.1 smtp_connect_timeout 30 router_id MySQL-HA
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51 priority 150 advert_int 1 nopreempt
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.1.100
}
}
其中router_id MySQL HA表示設(shè)定keepalived組的名稱疯兼,將192.168.1.100這個虛擬ip綁定到該主機(jī)的eth1網(wǎng)卡上,并且設(shè)置了狀態(tài)為backup模式贫途,將keepalived的模式設(shè)置為非搶占模式(nopreempt)吧彪,priority 150表示設(shè)置的優(yōu)先級為150。下面的配置略有不同丢早,但是都是一個意思来氧。
在候選master上配置(192.168.1.12)
vi /etc/keepalived/keepalived.conf
global_defs {
notification_email {
saltstack@163.com
}
notification_email_from dba@dbserver.com
smtp_server 127.0.0.1 smtp_connect_timeout 30 router_id MySQL-HA
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51 priority 120 advert_int 1 nopreempt
authentication {
auth_type PASS
auth_pass 1111 }
virtual_ipaddress {
192.168.1.100
}
}
/etc/init.d/keepalived start ; tail -f /var/log/messages