mysql高可用集群 MHA 搭建實(shí)例

前言

搭建完成 MHA 環(huán)境篮绿,然后模擬 master 故障汉额,驗證是否正確切換成了新的 master
參考1

搭建思路
  1. 搭建好一主二從的復(fù)制結(jié)構(gòu)
  2. 配置各個服務(wù)器間的 SSH 免登陸
  3. 在各個服務(wù)器中都安裝 mha-node国葬,在 服務(wù)器4slave2 上安裝 mha-manager
    image.png
  • 使用 MHA 提供的腳本對 SSH 免登陸的配置心包、集群的復(fù)制狀態(tài)進(jìn)行驗證跪腹,看是否正確畏梆。
  • 如果驗證通過,啟動 manager 坏挠。
  • 測試一下芍躏,把 master 停掉,看是否自動選出了新的 master

詳細(xì)過程

因為資源有限降狠,現(xiàn)在使用3臺服務(wù)器進(jìn)行搭建
192.168.1.11 作為 master
192.168.1.12 作為 slave1
192.168.1.13 作為 slave2对竣,同時把 MHA manager 也安裝到這臺服務(wù)器

配置 SSH 免登陸

  1. 在每臺服務(wù)器上都執(zhí)行以下命令
    ssh-keygen
    (執(zhí)行后會有多個輸入提示,不用輸入任何內(nèi)容榜配,全部直接回車即可)
  2. 免登陸授權(quán)
ssh-copy-id -i /root/.ssh/id_rsa -p 22 root@192.168.1.11
ssh-copy-id -i /root/.ssh/id_rsa -p 22 root@192.168.1.12
ssh-copy-id -i /root/.ssh/id_rsa -p 22 root@192.168.1.13

驗證 ssh 192.168.0.x

搭建環(huán)境

  1. 主節(jié)點(diǎn)配置 安裝可參考:mysql搭建手冊
    vi /etc/my.cnf
[mysqld]
log-bin=mysql-bin
server-id=11
gtid_mode=on
#開啟gtid, 必須主從全開
enforce_gtid_consistency=1
log_slave_updates=1
#開啟半同步復(fù)制 否則自動切換主從的時候會報主鍵錯誤
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
loose_rpl_semi_sync_master_timeout = 5000
#需要同步的數(shù)據(jù)庫
binlog-do-db=test
##需要忽略的數(shù)據(jù)庫
binlog-ignore-db=information_schema
binlog-ignore-db=mysql
binlog-ignore-db=performance_schema
binlog-ignore-db=sys
  • 重啟
    service mysqld restart
  1. 在主服務(wù)器上建立帳戶并授權(quán)slave:
mysql -uroot -proot  
use mysql;
#mha測試賬號否纬,可以也使用root
CREATE USER 'mha'@'%' IDENTIFIED BY '123456';
#在所有Node上創(chuàng)建Manage監(jiān)控用戶
grant all privileges on *.* to 'mha'@'192.168.1.%' identified  by '123456';
flush privileges;
#創(chuàng)建同步賬號
CREATE USER 'repl'@'%' IDENTIFIED BY '123456';
#給同步賬號賦權(quán)
grant replication slave,replication client on *.* to 'repl'@'192.168.1.%' identified by "123456";
//grant ALL PRIVILEGES on *.* to repl@"%" identified by "123456";
#生效
flush privileges;
#查看指定用戶,IP權(quán)限
show grants for repl@'192.168.1.%'; 
  1. 登錄主服務(wù)器的mysql蛋褥,查詢master的狀態(tài)
    show master status;
+------------------+----------+--------------+-------------------------------------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB                                | Executed_Gtid_Set |
+------------------+----------+--------------+-------------------------------------------------+-------------------+
| mysql-bin.000002 |      151 | test         | information_schema,mysql,performance_schema,sys |                   |
+------------------+----------+--------------+-------------------------------------------------+-------------------+
1 row in set (0.00 sec)

  1. 在 master 上查詢?nèi)罩灸夸?br> show variables like 'log_bin_basename%';
image.png

從節(jié)點(diǎn)配置

  1. 兩個從節(jié)點(diǎn)的server-id需要改一下
    vi /etc/my.cnf
    server-id=12
    server-id=13
[mysqld]
log-bin=mysql-bin
server-id=12
gtid_mode=on
#開啟gtid, 必須主從全開
enforce_gtid_consistency=1
log_slave_updates=1
#開啟半同步復(fù)制 否則自動切換主從的時候會報主鍵錯誤
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
loose_rpl_semi_sync_master_timeout = 5000
#需要同步的數(shù)據(jù)庫
binlog-do-db=test
##需要忽略的數(shù)據(jù)庫
binlog-ignore-db=information_schema
binlog-ignore-db=mysql
binlog-ignore-db=performance_schema
binlog-ignore-db=sys
#從服務(wù)器临燃,要加上relay_log_purge=0,不加的話烙心,會報出warning,relay_log_purge=0 is not set on slave
relay_log_purge=0
  • 重啟
    service mysqld restart
  1. 192.168.0.12上執(zhí)行
mysql -uroot -proot  
use mysql;
#mha測試賬號谬俄,可以也使用root
CREATE USER 'mha'@'%' IDENTIFIED BY '123456';
#在所有Node上創(chuàng)建Manage監(jiān)控用戶
grant all privileges on *.* to 'mha'@'192.168.1.%' identified  by '123456';
flush privileges;
#從服務(wù)的id Master_Server_Id: 0
change master to master_auto_position=0;
#指定master 
change  master to master_host='192.168.1.11',master_user='repl',master_password='123456',master_log_file='mysql-bin.000009',master_log_pos=191;
flush privileges;
#啟動復(fù)制
start slave;  
#從節(jié)點(diǎn)上運(yùn)行
#set global read_only=1;
  • 查看狀態(tài)
    show slave status \G;
 Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.11
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 151
               Relay_Log_File: localhost-relay-bin.000002
                Relay_Log_Pos: 361
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

如上,當(dāng)IO和SQL線程的狀態(tài)均為Yes弃理,則表示主從已實(shí)現(xiàn)同步了!

  1. 192.168.0.13上執(zhí)行(MHA-manager管理節(jié)點(diǎn))
mysql -uroot -proot  
use mysql;
#mha測試賬號屎蜓,可以也使用root
CREATE USER 'mha'@'%' IDENTIFIED BY '123456';
grant ALL PRIVILEGES on *.* to mha@"%" Identified by "123456";
flush privileges;
#從服務(wù)的id Master_Server_Id: 0
change master to master_auto_position=0;
#指定master
change  master to master_host='192.168.1.11',master_user='repl',master_password='123456',master_log_file='mysql-bin.000003',master_log_pos=151;
#啟動復(fù)制
start slave;  
#從節(jié)點(diǎn)上運(yùn)行
#set global read_only=1;
  • 查看狀態(tài)
    show slave status \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.11
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 151
               Relay_Log_File: localhost-relay-bin.000002
                Relay_Log_Pos: 361
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

安裝 MHA

創(chuàng)建安裝目錄
  1. Node服務(wù)器安裝
    mkdir -p /usr/local/mha
  2. manage服務(wù)器安裝
    mkdir -p /usr/local/mha/ha1/fail_script
    mkdir -p /usr/local/mha/ha1/workdir

/usr/local/mha:程序安裝目錄
/usr/local/mha/ha1:用于區(qū)別每一個mha方案痘昌,當(dāng)前方案ha1
/usr/local/mha/ha1/fail_script:方案ha1的failover腳本保存路徑
/usr/local/mha/ha1/workdir:方案ha1的的日志和failover產(chǎn)生的binlog保存路徑

安裝 MHA-node
  1. 在每臺服務(wù)器上都執(zhí)行以下命令
    安裝epel源(所有節(jié)點(diǎn))
    yum -y install perl-DBD-MySQL ncftp perl-DBI.x86


    image.png
  2. 若遇到如圖報錯則:yum -y install perl-DBI.x*
    下載需翻墻:

cd /usr/local/src/
wget https://raw.githubusercontent.com/linyue515/mysql-master-ha/master/mha4mysql-node-0.57-0.el7.noarch.rpm
#安裝并在安裝過程中顯示正在安裝的文件信息及安裝進(jìn)度
rpm -ivh mha4mysql-node-0.57-0.el7.noarch.rpm 
安裝 MHA-manager

在 slave2上執(zhí)行命令

wget https://raw.githubusercontent.com/linyue515/mysql-master-ha/master/mha4mysql-manager-0.57-0.el7.noarch.rpm
rpm -ivh mha4mysql-manager-0.57-0.el7.noarch.rpm
  • 錯誤:依賴檢測失敗:

    perl(Config::Tiny) 被 mha4mysql-manager-0.57-0.el7.noarch 需要
    perl(Log::Dispatch) 被 mha4mysql-manager-0.57-0.el7.noarch 需要
    perl(Log::Dispatch::File) 被 mha4mysql-manager-0.57-0.el7.noarch 需要
    perl(Log::Dispatch::Screen) 被 mha4mysql-manager-0.57-0.el7.noarch 需要
    perl(Parallel::ForkManager) 被 mha4mysql-manager-0.57-0.el7.noarch 需要

  1. yum安裝炬转,先更新yum源
yum -y update
yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

沒有可用軟件包 perl-Log-Dispatch辆苔。沒有可用軟件包 perl-Parallel-ForkManager。
注意:這兩步的安裝方式為centos 系統(tǒng)扼劈,在其他系統(tǒng)中驻啤,需要自己調(diào)整安裝方法

yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
yum -y install  perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch   perl-Parallel-ForkManager  
yum install -y rrdtool perl-rrdtool rrdtool-devel perl-Params-Validate

創(chuàng)建 MHA 配置文件

  1. slave2管理節(jié)點(diǎn)上,位置可以自定義
    mkdir /usr/local/mha
    vi /usr/local/mha/mha.cnf
    配置文件添加如下:
[server default]
# mha監(jiān)聽用戶
hostname=192.168.1.13
user=mha
password=123456
ssh_user=root
# 配置主從復(fù)制時創(chuàng)建的復(fù)制用戶
repl_user=repl
repl_password=123456
#監(jiān)控mater荐吵,ping的頻率
ping_interval=1
# 在各臺服務(wù)器上創(chuàng)建目錄mkdir 如:/usr/local/mha
manager_workdir=/usr/local/mha
# 日志位置
manager_log=/usr/local/mha/manager.log
remote_workdir=/usr/local/mha
master_binlog_dir=/usr/local/mysql/data/mysql

#master_ip_failover:自動切換時vip管理的腳本
#master_ip_online_change:手動切換使用的腳本
#power_manager:故障發(fā)生后關(guān)閉主機(jī)的腳本
#send_report:發(fā)送報警的腳本骑冗。
master_ip_failover_script=/usr/local/mha/ha1/fail_script/master_ip_failover
master_ip_online_change_script=/usr/local/mha/ha1/fail_script/master_ip_online_change
secondary_check_script=/usr/bin/masterha_secondary_check  -s 192.168.1.12 -s 192.168.1.13  
report_script=/usr/local/mha/ha1/fail_script/send_report
shutdown_script=""
[server1]
hostname=192.168.1.11
port=3306
[server2]
hostname=192.168.1.12
port=3306
candidate_master=1
check_repl_delay=0
[server3]
hostname=192.168.1.13
port=3306
ignore_fail=1
no_master=1
基本配置部分
  1. mnager監(jiān)控用的mysql赊瞬,mha用戶 一般設(shè)置為root用戶
    user=mha
    password=123456
  2. ssh登入用戶名
    ssh_user=root
  3. 配置主從復(fù)制時創(chuàng)建的復(fù)制用戶,每個node服務(wù)器都需要存在
    repl_user=repl
    repl_password=123456
  4. mha_manager項目的主目錄 前面創(chuàng)建的mha放腳本的目錄
    manager_workdir=/usr/local/mha
  5. mha_manager記錄日志
    manager_log=/usr/local/mha/manager.log
  6. master服務(wù)器上查詢 show variables like 'log_bin_basename%';
    例如:值為 /data/mysql/mysql-bin贼涩,需要的是 /data/mysql
    master_binlog_dir=/usr/local/mysql/data/mysql
  7. 監(jiān)控mater巧涧,ping的頻率
    ping_interval=1
  8. node服務(wù)器在發(fā)生master切換時,binlog保持的路徑遥倦,每個node都會在該目錄下保存一份差異的binlog谤绳,除非沒有差異。
    remote_workdir=/usr/local/mha
  9. 設(shè)置為候選master袒哥,如果設(shè)置該參數(shù)以后缩筛,發(fā)生主從切換以后將會將此從庫提升為主庫,即使這個主庫不是集群中事件最新的slave
    candidate_master=1
  10. 默認(rèn)情況下如果一個slave落后master 100M的relay logs的話堡称,MHA將不會選擇該slave作為一個新的master瞎抛,因為對于這個slave的恢復(fù)需要花費(fèi)很長時間,通過設(shè)置check_repl_delay=0,MHA觸發(fā)切換在選擇一個新master的時候?qū)雎詮?fù)制延時粮呢,這個參數(shù)對于設(shè)置了candidate_master=1的主機(jī)非常有用婿失,因為這個候選主在切換的過程中一定是新的master
    check_repl_delay=0
  11. 如果不加上該參數(shù),當(dāng)該slave主機(jī)故障了啄寡,mha將無法啟動豪硅,加上該參數(shù)會忽略該主機(jī)是否正常,在mha啟動的時候加上參數(shù)--ignore_fail_on_start
    ignore_fail=1
  12. 不將該主機(jī)轉(zhuǎn)換為master
    no_master=1
高可用配置部分
  1. mha在線自動failover時處理VIP的配置文件
    master_ip_failover_script=/usr/local/mha/ha1/fail_script/master_ip_failover
  2. 在線手動執(zhí)行master切換時VIP的處理文件
    master_ip_online_change_script=/usr/local/mha/ha1/fail_script/master_ip_online_change
  3. 一旦MHA到新master之間的網(wǎng)絡(luò)出現(xiàn)問題挺物,manager會嘗試從backup登入到masger
    secondary_check_script=/usr/bin/masterha_secondary_check -s 192.168.1.12 -s 192.168.1.13
  4. 發(fā)生切換后執(zhí)行的報警腳本
    report_script=/usr/local/mha/ha1/fail_script/send_report
  5. 故障后關(guān)閉master主機(jī)的腳本(主要是使用keepalive做VIP時會出現(xiàn)腦裂導(dǎo)致VIP頻繁切換所以會將故障的master關(guān)閉)
    shutdown_script=""

ha1腳本

  1. master_ip_failover
    VIP的配置可以使用keepalived也可以寫腳本懒浮,keepalived對網(wǎng)絡(luò)的要求很高否則容易腦裂砚著,在我前面搭建雙主環(huán)境講過keepalived的搭建方法舌镶,我這里使用腳本的方式否灾。
    需要修改VIP惩阶、網(wǎng)卡ens33
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

my $vip = '192.168.0.100/24';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {

        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
     return 0  unless  ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
  1. master_ip_online_change
    perl腳本
#!/usr/bin/env perl  
use strict;
use warnings FATAL =>'all';

use Getopt::Long;

my $vip = '192.168.0.100/24';  # Virtual IP  
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = 0;

my (
  $command,              $orig_master_is_new_slave, $orig_master_host,
  $orig_master_ip,       $orig_master_port,         $orig_master_user,
  $orig_master_password, $new_master_host,
  $new_master_ip,        $new_master_port,          $new_master_user,
  $new_master_password,
);
GetOptions(
  'command=s'                => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s'       => \$orig_master_host,
  'orig_master_ip=s'         => \$orig_master_ip,
  'orig_master_port=i'       => \$orig_master_port,
  'orig_master_user=s'       => \$orig_master_user,
  'orig_master_password=s'   => \$orig_master_password,
  'new_master_host=s'        => \$new_master_host,
  'new_master_ip=s'          => \$new_master_ip,
  'new_master_port=i'        => \$new_master_port,
  'new_master_user=s'        => \$new_master_user,
  'new_master_password=s'    => \$new_master_password,
);

exit &main();

sub main {

#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";  

if ( $command eq "stop" || $command eq "stopssh" ) {

        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.  
        # If you manage master ip address at global catalog database,  
        # invalidate orig_master_ip here.  
        my $exit_code = 1;
        eval {
            print "\n\n\n***************************************************************\n";
            print "Disabling the VIP - $vip on old master: $orig_master_host\n";
            print "***************************************************************\n\n\n\n";
&stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
}
elsif ( $command eq "start" ) {

        # all arguments are passed.  
        # If you manage master ip address at global catalog database,  
        # activate new_master_ip here.  
        # You can also grant write access (create user, set read_only=0, etc) here.  
my $exit_code = 10;
        eval {
            print "\n\n\n***************************************************************\n";
            print "Enabling the VIP - $vip on new master: $new_master_host \n";
            print "***************************************************************\n\n\n\n";
&start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
}

elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        `ssh $orig_master_user\@$orig_master_host \" $ssh_start_vip \"`;
        exit 0;
}
else {
&usage();
        exit 1;
}
}

# A simple system call that enable the VIP on the new master  
sub start_vip() {
`ssh $new_master_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master  
sub stop_vip() {
`ssh $orig_master_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
  1. shell腳本
#/bin/bash  
#source /root/.bash_profile  
  
vip=`echo '192.168.0.100/24'`  # Virtual IP  
key=`echo '1'`  
  
command=`echo "$1" | awk -F = '{print $2}'`  
orig_master_host=`echo "$2" | awk -F = '{print $2}'`  
new_master_host=`echo "$7" | awk -F = '{print $2}'`    
  
stop_vip=`echo "ssh root@$orig_master_host /sbin/ifconfig  ens33:$key  down"`  
start_vip=`echo "ssh root@$new_master_host /sbin/ifconfig  ens33:$key  $vip"`  
  
if [ $command = 'stop' ]  
   then  
   echo -e "\n\n\n***************************************************************\n"  
   echo -e "Disabling the VIP - $vip on old master: $orig_master_host\n"  
   $stop_vip  
   if [ $? -eq 0 ]  
      then  
      echo "Disabled the VIP successfully"  
   else  
      echo "Disabled the VIP failed"  
   fi  
   echo -e "***************************************************************\n\n\n\n"  
fi  
  
if [ $command = 'start' -o $command = 'status' ]  
   then  
   echo -e "\n\n\n***************************************************************\n"  
   echo -e "Enabling the VIP - $vip on new master: $new_master_host \n"  
   $start_vip  
   if [ $? -eq 0 ]  
      then  
      echo "Enabled the VIP successfully"  
   else  
      echo "Enabled the VIP failed"  
   fi  
   echo -e "***************************************************************\n\n\n\n"  
fi 
  1. send_report
#!/usr/bin/perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;

#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp='1019533934@qq.com';
my $mail_from='xxxx';
my $mail_user='xxxxx';
my $mail_pass='xxxxx';
my $mail_to=['xxxx','xxxx'];
GetOptions(
  'orig_master_host=s' => \$dead_master_host,
  'new_master_host=s'  => \$new_master_host,
  'new_slave_hosts=s'  => \$new_slave_hosts,
  'subject=s'          => \$subject,
  'body=s'             => \$body,
);

mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);

sub mailToContacts {
    my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
    open my $DEBUG, "> /tmp/monitormail.log"
        or die "Can't open the debug      file:$!\n";
    my $sender = new Mail::Sender {
        ctype       => 'text/plain; charset=utf-8',
        encoding    => 'utf-8',
        smtp        => $smtp,
        from        => $mail_from,
        auth        => 'LOGIN',
        TLS_allowed => '0',
        authid      => $user,
        authpwd     => $passwd,
        to          => $mail_to,
        subject     => $subject,
        debug       => $DEBUG
    };

    $sender->MailMsg(
        {   msg   => $msg,
            debug => $DEBUG
        }
    ) or print $Mail::Sender::Error;
    return 1;
}
# Do whatever you want here
exit 0;
  • 修改用戶稠炬,默認(rèn)監(jiān)控用戶是mha
    cd /usr/bin
    vi masterha_secondary_check
  • 給上面腳本賦權(quán):
    chmod -R 755 /usr/local/mha/ha1/fail_script
master手動添加虛擬VIP

注意:需要手動先在master服務(wù)器上面添加VIP
yum install -y net-tools.x86_64
查看網(wǎng)卡:
ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:c4:82:82 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.11/24 brd 192.168.0.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fec4:8282/64 scope link 
       valid_lft forever preferred_lft forever

添加虛擬VIP:
ifconfig ens33:1 192.168.1.100/24
ifconfig ens33:1 192.168.1.100/24 down

配置relay_log的清除方式(在每個Node上)

  1. 所有Node的cnf配置文件加上
    relay_log_purge=0
    MHA在發(fā)生切換的過程中,從庫的恢復(fù)過程中依賴于relay log的相關(guān)信息冕象,所以這里要將relay log的自動清除設(shè)置為OFF膀估,采用手動清除relay log的方式饼记。
    在默認(rèn)情況下低斋,從服務(wù)器上的中繼日志會在SQL線程執(zhí)行完畢后被自動刪除巴比。但是在MHA環(huán)境中,這些中繼日志在恢復(fù)其他從服務(wù)器時可能會被用到礁遵,因此需要禁用中繼日志的自動刪除功能轻绞。定期清除中繼日志需要考慮到復(fù)制延時的問題。在ext3的文件系統(tǒng)下佣耐,刪除大的文件需要一定的時間政勃,會導(dǎo)致嚴(yán)重的復(fù)制延時。為了避免復(fù)制延時兼砖,需要暫時為中繼日志創(chuàng)建硬鏈接奸远,因為在linux系統(tǒng)中通過硬鏈接刪除大文件速度會很快。
    提示:在mysql數(shù)據(jù)庫中讽挟,刪除大表時懒叛,通常也采用建立硬鏈接的方式
    MHA節(jié)點(diǎn)中包含了pure_relay_logs命令工具,它可以為中繼日志創(chuàng)建硬鏈接耽梅,執(zhí)行SET GLOBAL relay_log_purge=1,等待幾秒鐘以便SQL線程切換到新的中繼日志薛窥,再執(zhí)行SET GLOBAL relay_log_purge=0。
    pure_relay_logs腳本參數(shù)如下所示:
--user mysql                      用戶名
--password mysql                  密碼
--port                            端口號
--workdir                         指定創(chuàng)建relay log的硬鏈接的位置,默認(rèn)是/var/tmp诅迷,由于系統(tǒng)不同分區(qū)創(chuàng)建硬鏈接文件會失敗佩番,故需要執(zhí)行硬鏈接具體位置,成功執(zhí)行腳本后罢杉,硬鏈接的中繼日志文件被刪除
--disable_relay_log_purge         默認(rèn)情況下趟畏,如果relay_log_purge=1,腳本會什么都不清理滩租,自動退出赋秀,通過設(shè)定這個參數(shù),當(dāng)relay_log_purge=1的情況下會將relay_log_purge設(shè)置為0持际。清理relay log之后沃琅,最后將參數(shù)設(shè)置為OFF。

在每臺slave Node上創(chuàng)建
vim /usr/local/mha/purge_relay_log.sh

#!/bin/bash
user=root
passwd=root  
#確保用戶和密碼能通過127.0.0.1登入(本地的IP)
host='127.0.0.1'
port=3306
work_dir='/usr/local/mysql/data/mysql'
purge='/usr/bin/purge_relay_logs'
$purge --user=$user --password=$passwd --host=$host --disable_relay_log_purge --port=$port --workdir=$work_dir >> /usr/local/mha/purge_relay_logs.log 2>&1
  • 授權(quán)
    chmod u+x /usr/local/mha/purge_relay_log.sh
    cd /usr/local/mha
    ./purge_relay_log.sh
2019-06-23 12:07:51: purge_relay_logs script started.
 relay_log_purge is enabled. Disabling..
 Found relay_log.info: /usr/local/mysql/data/mysql/relay-log.info
 Opening /usr/local/mysql/data/mysql/localhost-relay-bin.000001 ..
 Opening /usr/local/mysql/data/mysql/localhost-relay-bin.000002 ..
 Executing SET GLOBAL relay_log_purge=1; FLUSH LOGS; sleeping a few seconds so that SQL thread can delete older relay log files (if it keeps up); SET GLOBAL relay_log_purge=0; .. ok.
2019-06-23 12:07:54: All relay log purging operations succeeded.

將腳本加入到os定時任務(wù)中


image

啟動 manager

check檢查
  • 檢查SSH配置
    masterha_check_ssh --conf=/usr/local/mha/mha.cnf
  • 檢查復(fù)制
    masterha_check_repl --conf=/usr/local/mha/mha.cnf
  • 檢查狀態(tài)
    masterha_check_status --conf=/usr/local/mha/mha.cnf
    必需保證所有的檢查都通過
  • 錯誤
    在驗證時蜘欲,我遇到過這個錯誤:Can't exec "mysqlbinlog" ......
    解決方法是在所有服務(wù)器上執(zhí)行:
    ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
    ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql
    注意:加粗位置益眉,需要根據(jù)自己環(huán)境修改路徑
  • 驗證成功
    MySQL Replication Health is OK.
  1. 啟動MHA
    nohup masterha_manager --conf=/usr/local/mha/mha.cnf --ignore_fail_on_start --ignore_last_failover < /dev/null > /usr/local/mha/start.log 2>&1 &
  2. 查看啟動日志
    tail -f manager.log
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.0.100/24===
Checking the Status of the script.. OK 
Sun Jun 23 09:44:20 2019 - [info]  OK.
Sun Jun 23 09:44:20 2019 - [warning] shutdown_script is not defined.
Sun Jun 23 09:44:20 2019 - [info] Set master ping interval 1 seconds.
Sun Jun 23 09:44:20 2019 - [info] Set secondary check script: /usr/bin/masterha_secondary_check -s backup -s master --user=root --master_host=master --master_ip=192.168.0.11 --master_port=3306
Sun Jun 23 09:44:20 2019 - [info] Starting ping health check on 192.168.0.11(192.168.0.11:3306)..
Sun Jun 23 09:44:20 2019 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
  1. 狀態(tài)檢查
    ps -ef|grep masterha_manager
    masterha_check_status --conf=/usr/local/mha/mha.cnf
mha (pid:29238) is running(0:PING_OK), master:192.168.0.11
  1. 處理故障master,將其配置為從庫chang到新的master姥份,可以從manager.log找到change語句郭脂。
    grep "CHANGE MASTER TO MASTER" /usr/local/mha/manager.log | tail -1
  2. 停掉MHA監(jiān)控
    masterha_stop --conf=/usr/local/mha/mha.cnf

故障轉(zhuǎn)移驗證


image.png

驗證MHA

驗證的方式是先停掉 master,因為之前的配置文件中澈歉,把 slave1 作為了候選人展鸡,那么就到 slave2 上查看 master 的 IP 是否變?yōu)榱?slave1 的 IP

  1. 停掉 master,在 master(192.168.0.11) 上把 mysql 停掉
    service mysqld stop
  2. 查看 MHA 日志埃难,上面的配置文件中指定了日志位置為 /usr/local/mha/manager.log
    tail -f /usr/local/mha/manager.log
Started automated(non-interactive) failover.
Selected 192.168.0.12(192.168.0.12:3306) as a new master.
192.168.0.12(192.168.0.12:3306): OK: Applying all logs succeeded.
192.168.0.13(192.168.0.13:3306): OK: Slave started, replicating from 192.168.0.12(192.168.0.12:3306)
192.168.0.12(192.168.0.12:3306): Resetting slave info succeeded.
Master failover to 192.168.0.12(192.168.0.12:3306) completed successfully.
  1. 刪除fail文件(非必需)
    由于啟動mha的時候加上了--ignore_last_failover參數(shù)莹弊,所以不刪除failower生成的文件也能啟動,否則需要刪除failower生成的文件“mha.failover.complete”涡尘。
    rm -rf /usr/local/mha/mha.failover.complete


    image.png
  2. 檢查新的slave
    登陸 slave(192.168.0.12)的mysql忍弛,查看 slave 狀態(tài)
    show slave status \G

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.12
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000006
          Read_Master_Log_Pos: 473
               Relay_Log_File: localhost-relay-bin.000002
                Relay_Log_Pos: 408
        Relay_Master_Log_File: mysql-bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

可以看到 master 的 IP 現(xiàn)在為 12,說明 MHA 已經(jīng)把 slave1 提升為了新的 master考抄,IO線程和SQL線程也正確運(yùn)行细疚,MHA 搭建成功。

IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.1.100/24===

Checking the Status of the script.. OK 
Sun Jun 23 13:52:00 2019 - [info]  OK.
Sun Jun 23 13:52:00 2019 - [warning] shutdown_script is not defined.
Sun Jun 23 13:52:00 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

failover處理

dead master處理階段

down掉主服務(wù)器的VIP

/usr/local/mha/ha1/fail_script/master_ip_failover --orig_master_host=192.168.1.11 --orig_master_ip=192.168.1.100 --orig_master_port=3306 --command=stopssh --ssh_user=root
  • 問題1:

slave replicates is not defined in the configuration file!

MySQL徹底清除slave信息
stop slave;
reset slave all;

  • 問題2:

Tue Feb 23 13:47:18 2016 - [info] read_only=1 is not set on slave db
Tue Feb 23 13:47:18 2016 - [warning] relay_log_purge=0 is not set on slave db

解決辦法:
備庫執(zhí)行
set global read_only=1;
set global relay_log_purge=0;

keepalived配置

配置keepalived的配置文件川梅,在master上配置(192.168.1.11)
vi /etc/keepalived/keepalived.conf

global_defs {
     notification_email {
     saltstack@163.com
   }
   notification_email_from dba@dbserver.com
   smtp_server 127.0.0.1 smtp_connect_timeout 30 router_id MySQL-HA
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51 priority 150 advert_int 1 nopreempt

    authentication {
    auth_type PASS
    auth_pass 1111 
}

    virtual_ipaddress { 
    192.168.1.100
     }
}

其中router_id MySQL HA表示設(shè)定keepalived組的名稱疯兼,將192.168.1.100這個虛擬ip綁定到該主機(jī)的eth1網(wǎng)卡上,并且設(shè)置了狀態(tài)為backup模式贫途,將keepalived的模式設(shè)置為非搶占模式(nopreempt)吧彪,priority 150表示設(shè)置的優(yōu)先級為150。下面的配置略有不同丢早,但是都是一個意思来氧。

在候選master上配置(192.168.1.12)

vi /etc/keepalived/keepalived.conf

global_defs {
     notification_email {
     saltstack@163.com
   }
   notification_email_from dba@dbserver.com
   smtp_server 127.0.0.1 smtp_connect_timeout 30 router_id MySQL-HA
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51 priority 120 advert_int 1 nopreempt

    authentication {
    auth_type PASS
    auth_pass 1111 }

    virtual_ipaddress { 
  192.168.1.100
   }
} 

/etc/init.d/keepalived start ; tail -f /var/log/messages

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
  • 序言:七十年代末,一起剝皮案震驚了整個濱河市,隨后出現(xiàn)的幾起案子啦扬,更是在濱河造成了極大的恐慌,老刑警劉巖凫碌,帶你破解...
    沈念sama閱讀 207,113評論 6 481
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件扑毡,死亡現(xiàn)場離奇詭異,居然都是意外死亡盛险,警方通過查閱死者的電腦和手機(jī),發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 88,644評論 2 381
  • 文/潘曉璐 我一進(jìn)店門,熙熙樓的掌柜王于貴愁眉苦臉地迎上來噪径,“玉大人阵赠,你說我怎么就攤上這事『追龋” “怎么了惯驼?”我有些...
    開封第一講書人閱讀 153,340評論 0 344
  • 文/不壞的土叔 我叫張陵,是天一觀的道長递瑰。 經(jīng)常有香客問我祟牲,道長,這世上最難降的妖魔是什么抖部? 我笑而不...
    開封第一講書人閱讀 55,449評論 1 279
  • 正文 為了忘掉前任说贝,我火速辦了婚禮,結(jié)果婚禮上慎颗,老公的妹妹穿的比我還像新娘乡恕。我一直安慰自己,他們只是感情好俯萎,可當(dāng)我...
    茶點(diǎn)故事閱讀 64,445評論 5 374
  • 文/花漫 我一把揭開白布傲宜。 她就那樣靜靜地躺著,像睡著了一般讯屈。 火紅的嫁衣襯著肌膚如雪蛋哭。 梳的紋絲不亂的頭發(fā)上,一...
    開封第一講書人閱讀 49,166評論 1 284
  • 那天涮母,我揣著相機(jī)與錄音谆趾,去河邊找鬼。 笑死叛本,一個胖子當(dāng)著我的面吹牛沪蓬,可吹牛的內(nèi)容都是我干的。 我是一名探鬼主播来候,決...
    沈念sama閱讀 38,442評論 3 401
  • 文/蒼蘭香墨 我猛地睜開眼跷叉,長吁一口氣:“原來是場噩夢啊……” “哼!你這毒婦竟也來了?” 一聲冷哼從身側(cè)響起云挟,我...
    開封第一講書人閱讀 37,105評論 0 261
  • 序言:老撾萬榮一對情侶失蹤梆砸,失蹤者是張志新(化名)和其女友劉穎,沒想到半個月后园欣,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體帖世,經(jīng)...
    沈念sama閱讀 43,601評論 1 300
  • 正文 獨(dú)居荒郊野嶺守林人離奇死亡,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點(diǎn)故事閱讀 36,066評論 2 325
  • 正文 我和宋清朗相戀三年沸枯,在試婚紗的時候發(fā)現(xiàn)自己被綠了日矫。 大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
    茶點(diǎn)故事閱讀 38,161評論 1 334
  • 序言:一個原本活蹦亂跳的男人離奇死亡绑榴,死狀恐怖哪轿,靈堂內(nèi)的尸體忽然破棺而出,到底是詐尸還是另有隱情翔怎,我是刑警寧澤窃诉,帶...
    沈念sama閱讀 33,792評論 4 323
  • 正文 年R本政府宣布,位于F島的核電站姓惑,受9級特大地震影響褐奴,放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜于毙,卻給世界環(huán)境...
    茶點(diǎn)故事閱讀 39,351評論 3 307
  • 文/蒙蒙 一敦冬、第九天 我趴在偏房一處隱蔽的房頂上張望。 院中可真熱鬧唯沮,春花似錦脖旱、人聲如沸。這莊子的主人今日做“春日...
    開封第一講書人閱讀 30,352評論 0 19
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽。三九已至币旧,卻和暖如春践险,著一層夾襖步出監(jiān)牢的瞬間,已是汗流浹背吹菱。 一陣腳步聲響...
    開封第一講書人閱讀 31,584評論 1 261
  • 我被黑心中介騙來泰國打工巍虫, 沒想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留,地道東北人鳍刷。 一個月前我還...
    沈念sama閱讀 45,618評論 2 355
  • 正文 我出身青樓占遥,卻偏偏與公主長得像,于是被迫代替她去往敵國和親输瓜。 傳聞我的和親對象是個殘疾皇子瓦胎,可洞房花燭夜當(dāng)晚...
    茶點(diǎn)故事閱讀 42,916評論 2 344

推薦閱讀更多精彩內(nèi)容