MySQL Group Replication

Group Replication

[TOC]

About Group Replication

  • Asynchronous replication
  • Semisynchronous replication
  • Group Replication

Group Replication is group-based replication. It is not synchronous replication, but it is eventually synchronized. More precisely, transactions are delivered to all group members in the same order, but they are not applied synchronously: once a transaction is accepted for commit, each member applies it at its own pace.
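The "same order, own pace" behavior can be sketched as follows. This is a minimal illustration, not MGR code: an agreed total order of transactions is delivered to every member, and each member drains its queue at its own speed, so intermediate states differ but never diverge in order.

```python
# Illustrative sketch: transactions are delivered to every member in one
# agreed total order, but each member applies them at its own pace.
from collections import deque

class Member:
    def __init__(self, name):
        self.name = name
        self.queue = deque()   # delivered, not yet applied
        self.state = []        # applied transactions

    def deliver(self, txn):
        self.queue.append(txn)

    def apply_some(self, n):
        # each member applies at its own speed (n per tick)
        for _ in range(min(n, len(self.queue))):
            self.state.append(self.queue.popleft())

members = [Member("s1"), Member("s2"), Member("s3")]
log = ["T1", "T2", "T3", "T4"]       # the agreed total order
for txn in log:
    for m in members:                # the same order is delivered to everyone
        m.deliver(txn)

members[0].apply_some(4)             # s1 is fast
members[1].apply_some(2)             # s2 lags
members[2].apply_some(1)             # s3 lags more

# Intermediate states differ, but every applied state is a prefix of the log.
print([m.state for m in members])
```

Each member's applied state is always a prefix of the shared log, which is why the group is "eventually synchronized" even though members commit at different speeds.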

Asynchronous Replication

<img src="https://dev.mysql.com/doc/refman/5.7/en/images/async-replication-diagram.png" style="zoom:50%" />

Semisynchronous Replication

<img src="https://dev.mysql.com/doc/refman/5.7/en/images/semisync-replication-diagram.png" style="zoom:50%" />

Group Replication

<img src="https://dev.mysql.com/doc/refman/5.7/en/images/gr-replication-diagram.png" style="zoom:50%" />

This diagram does not show the Paxos messages and related details

Group Replication Requirements

  • InnoDB
  • Primary Key
  • IPv4 (IPv6 is not supported)
  • Network Performance
  • log_bin = binlog
  • log_slave_updates = ON
  • binlog_format = ROW
  • GTID
  • master_info_repository = TABLE, relay_log_info_repository = TABLE
  • transaction_write_set_extraction = XXHASH64
  • Multi-threaded Appliers (slave_parallel_type, slave_parallel_workers, slave_preserve_commit_order)
  • The root@localhost account must exist in the current environment; INSTALL PLUGIN uses root@localhost to create and check _gr_user@localhost

Group Replication Limitations

  • binlog_checksum = NONE
  • Gap Locks : the certification process does not take gap locks into account, because gap lock information is not available outside InnoDB; see Gap Locks for details
    • Unless your application depends on REPEATABLE READ, READ COMMITTED is recommended: InnoDB takes no gap locks under READ COMMITTED, so InnoDB's own conflict detection works together with Group Replication's distributed conflict detection
  • Table Locks and Named Locks : the certification process does not take table locks (Section 13.3.5, "LOCK TABLES and UNLOCK TABLES Syntax") or named locks (GET_LOCK()) into account
  • Savepoints Not Supported (supported as of 5.7.19)
  • SERIALIZABLE Isolation Level is not supported in multi-primary mode
  • Concurrent DDL versus DML Operations : in multi-primary mode, executing DDL and DML statements against the same object concurrently on different members can cause data inconsistency
  • Cascading foreign key constraints are not supported in multi-primary mode: foreign-key cascades can cause conflicts that multi-primary conflict detection cannot see, leading to inconsistent data among group members. Single-primary mode is not affected.
  • Very Large Transactions : an individual transaction that produces a GTID payload so large that it cannot be copied between group members over the network within roughly 5 seconds can cause failures in group communication. To avoid this, keep transactions as small as possible; for example, split files loaded with LOAD DATA INFILE into smaller chunks.

Frequently Asked Questions

  1. What is the maximum number of members in a MySQL Group Replication group?

    9

  2. How do the members of a MySQL Group Replication group communicate?

    Over peer-to-peer TCP connections, configured with group_replication_local_address; the address must be reachable by all members and is used only for internal communication and message passing within the group

  3. What is the group_replication_bootstrap_group parameter for?

    It makes a member create the group and act as its initial seed; the second member to join asks the bootstrapping member to change the configuration dynamically so that it can be added to the group.

    It is used in two situations:

    • when creating and bootstrapping a Group Replication group
    • when shutting down and restarting the whole group
  4. How do I configure the recovery process?

    The group recovery channel can be preconfigured with a CHANGE MASTER TO statement

  5. Can Group Replication scale out write load?

    Writes can be scaled out by spreading <u>conflict-free</u> transactions across different members of the group.

  6. For the same workload, does it need more CPU and bandwidth than simple replication?

    Yes. Group members interact with each other constantly; larger groups need more bandwidth than smaller ones (9 > 3), and keeping the group synchronized and passing messages consumes more CPU and memory.

  7. Can it be deployed across a WAN?

    Yes, but the network between members must be reliable. High performance requires low latency and high bandwidth: packet loss causes retransmissions and higher end-to-end latency, so both throughput and latency suffer.

    For high-latency, narrow-bandwidth networks, Paxos has been adapted with message compression and by packing multiple local transactions into a single packet; see Section 17.9.7.2, "Message Compression"

    When the round-trip time (RTT) between group members is 2 seconds or more, problems can occur because the built-in failure detection mechanism may be triggered incorrectly


  8. Can a member rejoin the group automatically after a temporary connection problem?

    That depends on the connection problem:

    • If the problem is transient and the reconnection is fast enough that failure detection never notices, the member is not removed
    • If the problem lasts long enough for the failure detector to notice, the member is removed; once connectivity recovers, it must be rejoined manually (or by a script)
  9. When is a member removed from the group?

    When a member becomes unresponsive (crash or network problem), the failure is detected and the other members remove it from the group configuration, creating a new configuration that does not include it

  10. What happens when a member lags noticeably?

    There is no policy for automatically removing a noticeably lagging member from the group; <u>find the cause of the lag and fix it, or remove the member from the group</u>, otherwise "flow control" is triggered and the whole group slows down as well. (Flow control is configurable.)

  11. When a problem is suspected in the group, is a specific member responsible for triggering the reconfiguration?

    No.

    Any member may suspect a problem. All members must (automatically) reach agreement that a given member has failed; one member is then responsible for triggering the reconfiguration that removes the failed member from the group, and which member that is cannot be configured.

  12. Can I use Group Replication for sharding?

  13. How do I use Group Replication with SELinux?

  14. How do I use Group Replication with iptables?

  15. How do I recover the relay log for a replication channel used by a group member?

A Separate Communication Mechanism

GR reuses the slave channel infrastructure, but only its applier thread, to execute Binlog Events; the channel is not used to transport Binlog Events.

It uses neither asynchronous replication's Binlog Event transport nor the MySQL server port. Instead, it creates a dedicated TCP port for communication, through which the Group Replication plugins of the MySQL servers connect to each other, pairwise

Multi-threaded Execution of Binlog Events

The GR plugin automatically creates a channel, group_replication_applier, to execute the Binlog Events it receives; when the server joins a group, the GR plugin automatically starts the applier thread of the group_replication_applier channel

-- Manually control this channel's applier thread
START SLAVE SQL_THREAD FOR CHANNEL 'group_replication_applier';
STOP SLAVE SQL_THREAD FOR CHANNEL 'group_replication_applier';

Primary-Key-Based Parallel Execution

SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = N;
SET GLOBAL slave_preserve_commit_order = ON;

GR's LOGICAL_CLOCK is not the same algorithm as asynchronous replication's: GR computes the logical timestamps for its concurrency decisions from primary keys, which yields far better parallelism than the lock-based logical timestamps of asynchronous replication

Characteristics of primary-key-based parallel replication

  • If two transactions update the same row, they must be executed in order; otherwise they can run concurrently
  • DDL cannot run concurrently with any transaction: it must wait until all preceding transactions have finished, and subsequent transactions must wait until the DDL has finished
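The two rules above can be sketched as a simple predicate. This is an illustrative simplification, not the server's scheduler: a transaction's "write set" is assumed to be the set of (schema, table, primary key) tuples it updated, and DDL is modeled as conflicting with everything.

```python
# Illustrative sketch (not the server's implementation): decide whether two
# transactions may be applied in parallel based on their primary-key write sets.
def can_run_concurrently(txn_a, txn_b):
    # A DDL statement conflicts with every other transaction.
    if txn_a.get("ddl") or txn_b.get("ddl"):
        return False
    # DML transactions conflict only if they touched the same row (same PK).
    return txn_a["write_set"].isdisjoint(txn_b["write_set"])

t1 = {"write_set": {("db1", "t1", 1), ("db1", "t1", 2)}}
t2 = {"write_set": {("db1", "t1", 2)}}            # overlaps t1 on PK 2
t3 = {"write_set": {("db1", "t2", 7)}}            # different table and rows
ddl = {"ddl": True, "write_set": set()}

print(can_run_concurrently(t1, t2))   # False: same row updated
print(can_run_concurrently(t1, t3))   # True: disjoint write sets
print(can_run_concurrently(t3, ddl))  # False: DDL serializes with everything
```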

Why configure slave_preserve_commit_order

During parallel apply, whether two transactions' Binlog Events came from the same session does not matter: as long as they satisfy the rules above they may be scheduled concurrently. Transactions from the same session can therefore run in parallel, and a later transaction could end up committed before an earlier one. To guarantee that transactions from the same session commit in order, this parameter must be configured so that the commit order on the applier matches that of the source MySQL server

Advantages of the Paxos Protocol

  • No split-brain [questionable: the original primary-replica environment could split-brain, P363] ???
  • Good redundancy: the Binlog is guaranteed to be replicated to more than half of the members, so data is not lost as long as no more than half of the members fail at the same time
  • A transaction's Binlog Events are not written to the local Binlog file or committed until they have been transmitted to a majority of members, so a crashed server cannot hold data that does not exist on the online members of the group; after a restart it can rejoin the group without any special handling
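The majority rule in the last bullet can be written down directly. This is a sketch of the assumed semantics, not plugin internals: a member may write and commit a transaction's Binlog Events only once a majority of the group has acknowledged receiving them.

```python
# Sketch of the majority (quorum) commit rule described above.
def majority(group_size):
    # Smallest number of members that is strictly more than half.
    return group_size // 2 + 1

def can_commit(group_size, acks):
    # acks = members (including the originator) that received the events
    return acks >= majority(group_size)

print(majority(5))       # 3 members form a majority in a 5-member group
print(can_commit(5, 2))  # False: not yet safe to commit
print(can_commit(5, 3))  # True: the events survive any minority failure
```

Because a committed transaction is always held by a majority, any surviving majority after a crash still contains every committed change, which is why a restarted member can rejoin without special handling.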

Service Modes

  • Single-primary mode (the default)
  • Multi-primary mode
-- Enable multi-primary mode
SET GLOBAL group_replication_single_primary_mode = OFF;

To use multi-primary mode, set this variable to OFF before joining the group. The service mode cannot be switched online: all members must leave the group, the group must be re-bootstrapped in the desired mode, and the other members added back

Single-Primary Mode

  • Primary Member
  • Secondary Member

Automatic Election && Failover

  • The member that bootstraps the group is automatically elected Primary Member
  • Failover: group_replication_member_weight (added in 5.7.20) assigns a member weight in single-primary mode. The member with the highest weight becomes the new Primary; if weights are equal, the UUIDs of all online members are sorted and the smallest becomes the Primary Member. Replication continues normally, but note that clients must look up the new Primary Member's UUID and connect to it
# Query the Primary Member's UUID from any member
show global status like 'group_replication_primary_member';

or

SELECT * FROM performance_schema.global_status WHERE variable_name = 'group_replication_primary_member';
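The election rule described above (highest weight first, then smallest UUID on a tie) can be sketched as a one-liner; this is an assumed simplification of the server's internal logic, using made-up example UUIDs.

```python
# Sketch of the single-primary election rule: highest
# group_replication_member_weight wins; ties broken by smallest UUID.
def elect_primary(members):
    # members: list of (uuid, weight) for all ONLINE members
    return min(members, key=lambda m: (-m[1], m[0]))[0]

online = [
    ("e99a9946-6619-11e7-9b07-70e28406ebea", 50),
    ("817b0415-661a-11e7-842a-782bcb479deb", 50),
    ("c745233b-6614-11e7-a738-40f2e91dc960", 40),
]
# Two members share the top weight (50): the smallest UUID wins.
print(elect_primary(online))  # 817b0415-661a-11e7-842a-782bcb479deb
```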

Automatic Read/Write Switching

A member joins in "read-only" mode by default; only the member elected Primary Member switches to "read-write" mode

SET GLOBAL super_read_only = 1;
SET GLOBAL super_read_only = 0;

Drawbacks

  • After a failover, clients must use the UUID to determine which member is the Primary Member

Multi-Primary Mode

Auto-Increment Columns

-- Stock MySQL auto-increment variables
SET GLOBAL auto_increment_offset = N;
SET GLOBAL auto_increment_increment = N;

# Group Replication auto-increment step; the default is 7, and the maximum group size is 9
SET GLOBAL group_replication_auto_increment_increment = N;

Notes:
a. If the server-ids are 1, 2 and 3, no extra configuration is needed; if they are not 1, 2 and 3, then auto_increment_increment and auto_increment_offset must be configured

If auto_increment_increment and auto_increment_offset are not configured, the values of group_replication_auto_increment_increment and server-id are applied automatically to auto_increment_increment and auto_increment_offset

b. Set auto_increment_increment to a value greater than or equal to the member count, preferably greater, because that makes adding nodes later easier; changing the increment when scaling out is troublesome
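The way increment and offset partition the ID space can be shown with a few lines of arithmetic. This is an illustration of the documented behavior (values generated are offset, offset+increment, offset+2*increment, ...), assuming the default increment of 7 and offset = server_id.

```python
# Sketch: auto_increment_increment / auto_increment_offset give each member
# a disjoint arithmetic progression of IDs, so members never collide.
def ids(offset, increment, count):
    # MySQL generates offset, offset+increment, offset+2*increment, ...
    return [offset + i * increment for i in range(count)]

s1 = ids(offset=1, increment=7, count=4)  # server_id = 1
s2 = ids(offset=2, increment=7, count=4)  # server_id = 2
print(s1)                  # [1, 8, 15, 22]
print(s2)                  # [2, 9, 16, 23]
print(set(s1) & set(s2))   # set(): the sequences never collide
```

With increment 7, up to 7 members get disjoint sequences; this is why the step should stay at least as large as the (current or planned) member count.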

Advantages

  • When one member fails, only part of the connections are affected, so the impact on applications is small
  • When shutting down a MySQL node, its connections can first be migrated smoothly to other machines, so there is no momentary outage
  • Good performance [to be evaluated] ???

Drawbacks

  • The auto-increment step must be larger than the member count, to avoid trouble when scaling out later
  • The SERIALIZABLE isolation level is not supported; on a single node it is implemented with locks
  • Cascading foreign-key operations are not supported
# When TRUE, the two unsupported cases above raise errors; in single-primary mode it must be OFF
group_replication_enforce_update_everywhere_checks = TRUE
  • The problem of concurrent DDL execution

In multi-primary replication, conflict detection identifies conflicting transactions, which are then rolled back; DDL in 5.7 is not atomic and cannot be rolled back, so GR performs no conflict detection for DDL. If a DDL statement and a conflicting statement run on different members, the data can become inconsistent.

<u>Therefore, before executing DDL, all potentially conflicting transactions must be moved to a single server</u>

  • DBAs must guard against split-brain during maintenance ???

When taking node s3 down for maintenance, s3 is removed from DNS first; but while stop group_replication is executing, DNS caching means you must check whether applications with long-lived or short-lived connections keep connecting to this instance. (Ordinary users are not affected; mind user privileges.)

# As a super user, check read_only and run stop group_replication: super_read_only goes ON -> OFF, while read_only stays ON
localhost.(none)>show variables like '%read_only%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| innodb_read_only | OFF   |
| read_only        | ON    |
| super_read_only  | ON    |
| tx_read_only     | OFF   |
+------------------+-------+
4 rows in set (0.00 sec)

localhost.(none)>stop group_replication;
Query OK, 0 rows affected (8.15 sec)

localhost.(none)>show variables like '%read_only%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| innodb_read_only | OFF   |
| read_only        | ON    |
| super_read_only  | OFF   |
| tx_read_only     | OFF   |
+------------------+-------+
4 rows in set (0.00 sec)

# Log in as an ordinary user to test
mysql -utest_user -ptest_user -hlocalhost -S /tmp/mysql6666.sock test

mysql> insert into t_n values(1,'test1');
ERROR 1290 (HY000): The MySQL server is running with the --read-only option so it cannot execute this statement

# Test as a super user
mysqlha_login.sh 6666

localhost.(none)>use test;
Database changed
localhost.test>insert into t_n values(1,'test1');
Query OK, 1 row affected (0.00 sec)

# For how to rejoin Group Replication, see "Inconsistent transactions among group members"

Thoughts on Multi-Primary Mode

  • Write requests can be spread across multiple members

  • Control DDL: when DDL runs, route all write requests to a single MySQL server [somewhat complex to implement]

  • A compromise: run multi-primary mode as if it were single-primary

    • Compared with single-primary mode, this removes the failover primary switch
    • It avoids the DDL conflict problem and prevents split-brain
    • One GR cluster serves multiple applications; different applications use different data, so there are no conflicts
-- Session 1: A Member
BEGIN
INSERT INTO t1 VALUES(1);

-- Session 2: B Member
TRUNCATE t1;

-- Session 1:
COMMIT;

----> the two sessions' transactions execute in different orders on the two members

-- SESSION 1:
INSERT INTO t1 VALUES(1);
TRUNCATE t1;

-- SESSION 2:
TRUNCATE t1;
INSERT INTO t1 VALUES(1);

Configuring Group Replication

Required Configuration

Replication user
SET SQL_LOG_BIN=0;
CREATE USER rpl_user@'%' IDENTIFIED BY 'rpl_pass';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;

Configure this separately on every node; if the user was already created when MySQL was initialized, this step can be skipped

The member a joiner replicates from is chosen at random by the Group Replication plugin, so the user configured for the group_replication_recovery channel must exist on every member

If SET SQL_LOG_BIN = 0 was omitted, the group considers that local transactions exist that did not originate in the group, START GROUP_REPLICATION; fails, and set global group_replication_allow_local_disjoint_gtids_join=ON; is then required

my.cnf configuration
# my.cnf
server_id                   = 1
log_bin                     = binlog
relay_log                   = relay-log
gtid_mode                   = ON
enforce_gtid_consistency    = ON
binlog_format               = ROW
transaction-isolation       = READ-COMMITTED
binlog_checksum             = NONE
master_info_repository      = TABLE
relay_log_info_repository   = TABLE
log_slave_updates           = ON
slave_parallel_type         = LOGICAL_CLOCK
slave_parallel_workers      = 8
slave_preserve_commit_order = ON

# Group Replication
plugin-load                        = group_replication.so
transaction_write_set_extraction   = XXHASH64
loose-group_replication_group_name = 93f19c6c-6447-11e7-9323-cb37b2d517f3
loose-group_replication_start_on_boot = OFF
loose-group_replication_local_address = 'db1:3306'
loose-group_replication_group_seeds   = 'db2:3306,db3:3306'
group_replication_ip_whitelist  = '10.0.0.0/8'
loose-group_replication_bootstrap_group = OFF
# loose-group_replication_single_primary_mode = OFF       # Turn off single-primary mode
# loose-group_replication_enforce_update_everywhere_checks = ON # Multi-Primary Mode (static)
loose-group_replication_transaction_size_limit = 52428800 # 5.7.19 Configures the maximum transaction size in bytes which the group accepts
# loose-group_replication_unreachable_majority_timeout
report_host = 'hostname'
report_port = 3306
loose-group_replication_flow_control_applier_threshold = 250000
loose-group_replication_flow_control_certifier_threshold = 250000

Configuring MGR in an existing, already-initialized environment

# For Group Replication
transaction-isolation           = READ-COMMITTED
binlog_checksum                 = NONE
master_info_repository          = TABLE
relay_log_info_repository       = TABLE
plugin-load                     = group_replication.so
transaction_write_set_extraction = XXHASH64
loose-group_replication_group_name = 33984520-8709-11e7-b883-782bcb105915
loose-group_replication_start_on_boot = OFF
loose-group_replication_local_address = '10.13.2.29:9999'
loose-group_replication_group_seeds   = '10.13.2.29:9999,10.77.16.197:9999,10.73.25.178:9999'
loose-group_replication_bootstrap_group = OFF
# loose-group_replication_enforce_update_everywhere_checks = ON # Multi-Primary Mode
loose-group_replication_transaction_size_limit = 52428800 # 5.7.19 Configures the maximum transaction size in bytes which the group accepts
group_replication_ip_whitelist  = '10.0.0.0/8'
# loose-group_replication_single_primary_mode = OFF
report_host = '10.13.2.29'
report_port = 8888

Configuration notes:

  • Enable the Binlog and Relaylog

  • Enable GTIDs

  • Use ROW format for the Binlog

  • Disable binlog_checksum (MySQL 5.7.17 does not support Binlog Events with checksums)

  • To use multi-source replication, the state of slave channels must be stored in system tables

  • Enable parallel replication

  • Enable primary-key collection

    GR needs the server layer to collect the primary keys of updated rows, which are stored as hashes. Two hash algorithms are supported: XXHASH64 and MURMUR32. The default is transaction_write_set_extraction = OFF, so to use Group Replication every table must have a primary key; otherwise updates fail

    <u>All members of a group must be configured with the same hash algorithm</u>

  • plugin-load = 'group_replication.so' is equivalent to executing INSTALL PLUGIN group_replication SONAME "group_replication.so";

  • group_replication_group_name = <uuid> sets the Group Replication name; a value can be obtained with select uuid();

  • group_replication_start_on_boot = OFF means Group Replication is not started automatically when MySQL starts

  • group_replication_local_address = <ip:port> configures the port the local member listens on; members communicate with each other through this port. If the members are not all on one machine, do not configure 127.0.0.1; use the member's internal IP and port

  • group_replication_group_seeds = <ip:port,ip:port...> configures the seed members. A new member joining the group needs to contact a group member and ask the group to reconfigure itself so that it may join; the list does not have to include every member

  • When several servers join the group at the same time, make sure the seeds are members that already belong to the group, not servers that are themselves still applying to join; joining multiple members while the group is being created is not supported
  • The donor is chosen at random from the current view. When several members join the group, the same server is almost never chosen repeatedly; if a connection fails, a new donor is selected automatically until the retry limit is reached (group_replication_recovery_retry_count = 10, dynamically changeable), after which the recovery thread aborts with an error
  • The recovery process does not sleep after every donor connection attempt, only after all possible donors have been tried without success (group_replication_recovery_reconnect_interval = 60 seconds, dynamically changeable)
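The primary-key collection described in the configuration notes can be sketched as follows. This is an illustrative simplification, not the server's code: each updated row's primary key is hashed (here with Python's hashlib standing in for XXHASH64/MURMUR32, which are not in the standard library), and a transaction's write set is the set of those hashes; certification flags a conflict when two write sets intersect.

```python
# Sketch of write-set extraction: hash each updated row's primary key and
# collect the hashes as the transaction's write set.
import hashlib

def pk_hash(schema, table, pk_value):
    # All group members must use the same algorithm, or certification breaks.
    key = f"{schema}.{table}.{pk_value}".encode()
    return hashlib.sha1(key).hexdigest()

def write_set(rows):
    # rows: iterable of (schema, table, primary key value)
    return {pk_hash(*row) for row in rows}

t1 = write_set([("test", "t1", 1), ("test", "t1", 2)])
t2 = write_set([("test", "t1", 2)])
# Certification flags a conflict when the write sets intersect.
print(bool(t1 & t2))  # True: both transactions updated PK 2 of test.t1
```

This also shows why every table needs a primary key: without one there is nothing to hash, and the conflict check cannot be performed.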

Enhanced Automatic Donor Switchover

  • Purged data: if data needed during recovery has already been purged on the selected donor, an error occurs; the recovery process detects it and selects a new donor
  • Duplicated data: if a joining member already contains data that conflicts with the selected donor's data, recovery raises an error; this can mean the new member carries some errant transactions. One could argue that recovery should fail and exit rather than switch to another donor, but in a heterogeneous group some members share the conflicting transactions and some do not, so when the error occurs recovery may select another donor
  • Other errors: if any of the recovery threads fail (receiver or applier thread), an error occurs and the recovery process selects a new donor within the group

For certain persistent or transient failures, recovery automatically retries against the same or a new donor

  • group_replication_ip_whitelist = <ip,net,...> <u>Always configure the whitelist</u>; if unset it defaults to AUTOMATIC, which automatically recognizes the private addresses and subnets of the local network interfaces. Connections from 127.0.0.1 are always allowed. Group Replication must be stopped before changing the whitelist
  • group_replication_bootstrap_group = OFF; when ON, it tells the Group Replication plugin that this is the first member of the group and that it should bootstrap it; after bootstrapping, set it back to OFF

Use it only when bootstrapping Group Replication or when recovering after the whole group has gone down; configuring it on several servers at once can cause a man-made split-brain

  • Adding the "loose-" prefix to group_replication variables allows them to be written into my.cnf even before the plugin is loaded

Loading the Group Replication Plugin

Install the plugin

# Load the plugin
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

# Start the plugin
START GROUP_REPLICATION;
    > joins this MySQL server to an existing Group Replication group, or bootstraps it as the group's first member

# Stop the plugin
STOP GROUP_REPLICATION;
    > removes this MySQL server from its Group Replication group

Bootstrap the group

INSTALL PLUGIN group_replication SONAME "group_replication.so";
SET GLOBAL group_replication_group_name = "93f19c6c-6447-11e7-9323-cb37b2d517f3";  # can be generated with select uuid();
SET GLOBAL group_replication_local_address = "dbaone:7777";
SET GLOBAL group_replication_bootstrap_group = ON ;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF ;

Adding a new member

INSTALL PLUGIN group_replication SONAME "group_replication.so";
SET GLOBAL group_replication_group_name = "93f19c6c-6447-11e7-9323-cb37b2d517f3";  # can be generated with select uuid();
SET GLOBAL group_replication_local_address = "127.0.0.1:7777";
SET GLOBAL group_replication_group_seeds = "db2:7777";
change master to master_user = 'replica',master_password='eHnNCaQE3ND' for channel 'group_replication_recovery';
START GROUP_REPLICATION;

Note: if these settings are already in my.cnf, some of the bootstrap and join steps are unnecessary

Bootstrap:

START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;

New member joins:

change master to master_user = 'replica',master_password='eHnNCaQE3ND' for channel 'group_replication_recovery';
START GROUP_REPLICATION;

Notes:

  • When a new member joins, it first copies from another node the data produced before it joined. That data cannot be replicated through the Group Replication communication protocol; the asynchronous replication channel group_replication_recovery is used instead
    • Make sure the Binlogs still exist; otherwise the new member's data must be initialized and restored to a recent point in time before it joins the group ???(pending test)
  • The member the joiner connects to for replication is chosen at random by the Group Replication plugin, so the user configured for group_replication_recovery must exist on every member
  • Before starting group_replication_recovery, Group Replication automatically configures its MASTER_HOST and MASTER_PORT
  • When a member joins, it receives configuration information from the other members, including hostnames and ports, taken from the global read-only variables hostname and port; if a hostname cannot be resolved to an IP, or the network uses address mapping, the group_replication_recovery channel cannot work
    • configure every member's hostname-to-IP mapping in /etc/hosts
    • or configure MySQL report_host and report_port, which Group Replication uses in preference

Group Replication Network Partitioning

When several nodes fail unexpectedly, quorum can be lost, for example when a majority of the members drops out of the group.

For example, in a 5-member group, if 3 members suddenly fall silent at the same time, the majority rule is broken and quorum cannot be reached: the remaining 2 members cannot tell whether the other 3 servers crashed or whether the network has split them into a separate partition, so the group cannot be reconfigured.

If a member leaves voluntarily, it tells the group to reconfigure itself. In practice, a departing member notifies the others that it is leaving; they can then reconfigure the group, keep the membership consistent, and recompute the quorum. If 3 of 5 members leave one after another, the total goes from 5 to 2, and quorum is secured at every step.
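The difference between a sudden crash and a graceful exit comes down to whether the group size is recomputed. A minimal sketch, under the assumptions above (crashed members still count toward the group size, graceful leavers shrink it):

```python
# Sketch: quorum requires strictly more than half of the configured group
# size to be reachable.
def has_quorum(group_size, reachable):
    return reachable > group_size // 2

# 5-member group, 3 members crash at once: 2 reachable of 5 -> no quorum;
# the survivors cannot tell a crash from a network partition.
print(has_quorum(5, 2))      # False: the group is blocked

# The same 3 members instead leave one by one, announcing their departure:
size = 5
for _ in range(3):
    size -= 1                # the group reconfigures after each graceful exit
print(has_quorum(size, 2))   # True: 2 of 2 still form a majority
```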

Detecting a Partition

Normally, every group member can query performance_schema.replication_group_members for the status of all members; each member's status is agreed on jointly by all members in the view.

If a network partition exists, members that cannot be reached are shown in the table with state "UNREACHABLE", as determined by Group Replication's built-in local failure detector.

As such, lets assume that there is a group with these 5 servers in it:

  • Server s1 with member identifier 199b2df7-4aaf-11e6-bb16-28b2bd168d07
  • Server s2 with member identifier 199bb88e-4aaf-11e6-babe-28b2bd168d07
  • Server s3 with member identifier 1999b9fb-4aaf-11e6-bb54-28b2bd168d07
  • Server s4 with member identifier 19ab72fc-4aaf-11e6-bb51-28b2bd168d07
  • Server s5 with member identifier 19b33846-4aaf-11e6-ba81-28b2bd168d07
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 1999b9fb-4aaf-11e6-bb54-28b2bd168d07 | 127.0.0.1   |       13002 | ONLINE       |
| group_replication_applier | 199b2df7-4aaf-11e6-bb16-28b2bd168d07 | 127.0.0.1   |       13001 | ONLINE       |
| group_replication_applier | 199bb88e-4aaf-11e6-babe-28b2bd168d07 | 127.0.0.1   |       13000 | ONLINE       |
| group_replication_applier | 19ab72fc-4aaf-11e6-bb51-28b2bd168d07 | 127.0.0.1   |       13003 | ONLINE       |
| group_replication_applier | 19b33846-4aaf-11e6-ba81-28b2bd168d07 | 127.0.0.1   |       13004 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
5 rows in set (0,00 sec)

# When 3 members (s3, s4, s5) lose contact simultaneously, viewed from s1

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 1999b9fb-4aaf-11e6-bb54-28b2bd168d07 | 127.0.0.1   |       13002 | UNREACHABLE  |
| group_replication_applier | 199b2df7-4aaf-11e6-bb16-28b2bd168d07 | 127.0.0.1   |       13001 | ONLINE       |
| group_replication_applier | 199bb88e-4aaf-11e6-babe-28b2bd168d07 | 127.0.0.1   |       13000 | ONLINE       |
| group_replication_applier | 19ab72fc-4aaf-11e6-bb51-28b2bd168d07 | 127.0.0.1   |       13003 | UNREACHABLE  |
| group_replication_applier | 19b33846-4aaf-11e6-ba81-28b2bd168d07 | 127.0.0.1   |       13004 | UNREACHABLE  |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
5 rows in set (0,00 sec)

Recovering from a Partition

Solutions

  • Stop group replication on s1 and s2 (or stop the instances entirely), find out why the other three members stopped, then restart group replication (or the servers)
  • group_replication_force_members forces the members in its list to form the group, expelling the other members (<u>use with caution to avoid split-brain</u>)

Forcing Group Membership

<img src="https://dev.mysql.com/doc/refman/5.7/en/images/gr-majority-lost-to-stable-group.png" style="zoom:50%" />

Suppose only s1 and s2 are online while s3, s4 and s5 left unexpectedly and are offline; s1 and s2 can be forced to form a new membership

group_replication_force_members is a last resort. Use it with great care and only when a majority loss makes quorum impossible; misused, it can cause <u>split-brain and block the whole system</u>

<u>Make sure s3, s4 and s5 really are offline and unreachable; if those three are merely partitioned away (and they are the majority), forcing s1 and s2 into a new group creates a man-made split-brain</u>


# Check on s1

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 1999b9fb-4aaf-11e6-bb54-28b2bd168d07 | 127.0.0.1   |       13002 | UNREACHABLE  |
| group_replication_applier | 199b2df7-4aaf-11e6-bb16-28b2bd168d07 | 127.0.0.1   |       13001 | ONLINE       |
| group_replication_applier | 199bb88e-4aaf-11e6-babe-28b2bd168d07 | 127.0.0.1   |       13000 | ONLINE       |
| group_replication_applier | 19ab72fc-4aaf-11e6-bb51-28b2bd168d07 | 127.0.0.1   |       13003 | UNREACHABLE  |
| group_replication_applier | 19b33846-4aaf-11e6-ba81-28b2bd168d07 | 127.0.0.1   |       13004 | UNREACHABLE  |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
5 rows in set (0,00 sec)

# Check the Group Replication communication addresses (group_replication_local_address) of s1 and s2

mysql> SELECT @@group_replication_local_address;
+-----------------------------------+
| @@group_replication_local_address |
+-----------------------------------+
| 127.0.0.1:10000                   |
+-----------------------------------+
1 row in set (0,00 sec)

mysql> SELECT @@group_replication_local_address;
+-----------------------------------+
| @@group_replication_local_address |
+-----------------------------------+
| 127.0.0.1:10001                   |
+-----------------------------------+
1 row in set (0,00 sec)

# Force the new group membership on s1 (127.0.0.1 is just an example; in a real environment use the values of group_replication_local_address)
SET GLOBAL group_replication_force_members="127.0.0.1:10000,127.0.0.1:10001";

# Check s1

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | b5ffe505-4ab6-11e6-b04b-28b2bd168d07 | 127.0.0.1   |       13000 | ONLINE       |
| group_replication_applier | b60907e7-4ab6-11e6-afb7-28b2bd168d07 | 127.0.0.1   |       13001 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0,00 sec)

# Check s2

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | b5ffe505-4ab6-11e6-b04b-28b2bd168d07 | 127.0.0.1   |       13000 | ONLINE       |
| group_replication_applier | b60907e7-4ab6-11e6-afb7-28b2bd168d07 | 127.0.0.1   |       13001 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0,00 sec)

Multi-Primary Mode

The mode cannot be changed online.

my.cnf settings that differ from single-primary mode

loose-group_replication_single_primary_mode = OFF
loose-group_replication_enforce_update_everywhere_checks = ON # cannot be changed after start group_replication
auto_increment_increment = 
auto_increment_offset    = 

If auto_increment_increment and auto_increment_offset are not configured, auto_increment_offset takes the server_id and auto_increment_increment takes group_replication_auto_increment_increment (default 7)

In multi-primary mode, the status variable group_replication_primary_member is empty

localhost.(none)>show global status like 'group_replication_primary_member';
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| group_replication_primary_member |       |
+----------------------------------+-------+
1 row in set (0.00 sec)

read_only娶吞、super_only_only 均為 OFF

localhost.(none)>show variables like '%read_only%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| innodb_read_only | OFF   |
| read_only        | OFF   |
| super_read_only  | OFF   |
| tx_read_only     | OFF   |
+------------------+-------+
4 rows in set (0.00 sec)

Monitoring

# Basic information about the group members, visible from any member
SELECT * FROM performance_schema.replication_group_members
     There are five states
     * OFFLINE: MySQL Group Replication is not enabled
     * RECOVERING: after MySQL Group Replication starts, the state is first set to RECOVERING while the data from before the join is replicated
     * ONLINE: when RECOVERING finishes, the state becomes ONLINE and the member starts serving requests
     * ERROR: when a local error occurs and Group Replication cannot work, the local member's state becomes ERROR (healthy members of the group may not see the ERROR member)
     * UNREACHABLE: when the network fails or another member goes down, that member's state is shown as UNREACHABLE

+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64  |        6666 | ONLINE       |
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29   |        6666 | ONLINE       |
| group_replication_applier | e99a9946-6619-11e7-9b07-70e28406ebea | 10.77.16.197 |        6666 | ONLINE       |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
3 rows in set (0.00 sec)


# Detailed information about the local member; each member can only query its own details
SELECT * FROM performance_schema.replication_group_member_stats  # related to multi-primary conflict detection

localhost.(none)>SELECT CHANNEL_NAME,VIEW_ID,MEMBER_ID FROM performance_schema.replication_group_member_stats ;
+---------------------------+----------------------+--------------------------------------+
| CHANNEL_NAME              | VIEW_ID              | MEMBER_ID                            |
+---------------------------+----------------------+--------------------------------------+
| group_replication_applier | 14997660180761984:11 | e99a9946-6619-11e7-9b07-70e28406ebea |
+---------------------------+----------------------+--------------------------------------+
1 row in set (0.00 sec)


# When a new member joins, the data produced by the group before the join is first synchronized through the asynchronous channel group_replication_recovery; that channel's status, like any other asynchronous replication channel's, is monitored through this table.
# Group Replication channels do not appear in SHOW SLAVE STATUS
SELECT * FROM performance_schema.replication_connection_status

localhost.(none)>SELECT CHANNEL_NAME,GROUP_NAME,THREAD_ID,SERVICE_STATE FROM performance_schema.replication_connection_status;
+----------------------------+--------------------------------------+-----------+---------------+
| CHANNEL_NAME               | GROUP_NAME                           | THREAD_ID | SERVICE_STATE |
+----------------------------+--------------------------------------+-----------+---------------+
| group_replication_applier  | 93f19c6c-6447-11e7-9323-cb37b2d517f3 |      NULL | ON            |
| group_replication_recovery |                                      |      NULL | OFF           |
+----------------------------+--------------------------------------+-----------+---------------+


# Group Replication executes Binlog Events through the group_replication_applier channel, which, like other asynchronous channels, can be queried through this table
SELECT * FROM performance_schema.replication_applier_status

+----------------------------+---------------+-----------------+----------------------------+
| CHANNEL_NAME               | SERVICE_STATE | REMAINING_DELAY | COUNT_TRANSACTIONS_RETRIES |
+----------------------------+---------------+-----------------+----------------------------+
| group_replication_applier  | ON            |            NULL |                          0 |
| group_replication_recovery | OFF           |            NULL |                          0 |
+----------------------------+---------------+-----------------+----------------------------+
2 rows in set (0.00 sec)

# Status of the Group Replication threads
SELECT * FROM performance_schema.threads
     
     * thread/group_rpl/THD_applier_module_receiver
     * thread/group_rpl/THD_certifier_broadcast
     * thread/group_rpl/THD_recovery

SELECT * FROM mysql.`slave_master_info`
SELECT * FROM mysql.`slave_relay_log_info`

Replication-related views

SELECT * FROM performance_schema.replication_group_members;
SELECT * FROM performance_schema.replication_group_member_stats;
SELECT * FROM performance_schema.replication_applier_configuration;
SELECT * FROM performance_schema.replication_applier_status;
SELECT * FROM performance_schema.replication_applier_status_by_coordinator;
SELECT * FROM performance_schema.replication_applier_status_by_worker;
SELECT * FROM performance_schema.replication_connection_configuration;
SELECT * FROM performance_schema.replication_connection_status;

References:

mysql-group-replication-monitoring

Official documentation: group-replication-monitoring

Group Replication Internals

MGR Transaction Execution Flow

The transaction execution path involves:

  • the local transaction control module
  • the inter-member communication module
  • the global transaction certification module
  • the remote transaction execution module

Flow Control

MySQL Group Replication: understanding Flow Control

Troubleshooting

1. Handling network interruptions/partitions

See Quest for Better Replication in MySQL: Galera vs. Group Replication

When a member is expelled from Group Replication because of a network problem, manual handling is required: it cannot rejoin automatically when the network recovers (PXC can)

The error log shows the following

2017-07-14T11:58:57.208677+08:00 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2017-07-14T11:58:57.273764+08:00 0 [Note] Plugin group_replication reported: 'getstart group_id 6c6c3761'
2017-07-14T11:58:57.274693+08:00 0 [Note] Plugin group_replication reported: 'getstart group_id 6c6c3761'
2017-07-14T11:59:00.531055+08:00 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate'
2017-07-14T11:59:00.531314+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-07-14T11:59:00.531338+08:00 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit'
2017-07-14T11:59:00.531365+08:00 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-07-14T11:59:00.531365+08:00 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-07-14T11:59:00.531401+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'

Current member state:

SELECT * FROM performance_schema.replication_group_members;

+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64  |        6666 | ERROR        |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
1 row in set (0.00 sec)

Recovery steps:

localhost.(none)>start group_replication; 
ERROR 3093 (HY000): The START GROUP_REPLICATION command failed since the group is already running.

localhost.(none)>stop group_replication;
Query OK, 0 rows affected (11.35 sec)

localhost.(none)>start group_replication; 
Query OK, 0 rows affected (5.53 sec)

localhost.(none)>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64  |        6666 | ONLINE       |
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29   |        6666 | ONLINE       |
| group_replication_applier | e99a9946-6619-11e7-9b07-70e28406ebea | 10.77.16.197 |        6666 | ONLINE       |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
3 rows in set (0.00 sec)
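Whether the surviving side of a partition keeps serving depends on majority quorum; a minority partition blocks and its members may be expelled. A hypothetical helper (not server code) makes the rule explicit:

```python
# Majority quorum check for a group of n members (hypothetical helper):
# a partition can continue only if it still sees more than half the members.

def has_quorum(total_members, reachable_members):
    return reachable_members >= total_members // 2 + 1

assert has_quorum(3, 2) is True    # 2 of 3: majority, group keeps running
assert has_quorum(3, 1) is False   # 1 of 3: minority, member blocks / is expelled
assert has_quorum(5, 3) is True
```

This is why three members tolerate one failure but not two, and why a symmetric 2-vs-2 split of a four-member group leaves neither side with quorum.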

2. A member has transactions not present in the group

An isolated member holding transactions that are not part of the Group Replication group cannot rejoin by default. <u>If you can verify the data is actually consistent, you can force it to join; use with great caution.</u>


localhost.(none)>stop group_replication;

localhost.(none)>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29  |        6666 | OFFLINE      |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)

localhost.test>create table t2 (id int primary key ,name varchar(10));
Query OK, 0 rows affected (0.01 sec)

localhost.test>show variables like '%read_only%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| innodb_read_only | OFF   |
| read_only        | OFF   |
| super_read_only  | OFF   |
| tx_read_only     | OFF   |
+------------------+-------+
4 rows in set (0.00 sec)

localhost.test>start group_replication;
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.

The error log:

2017-07-14T15:00:14.966294+08:00 0 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
2017-07-14T15:00:14.966366+08:00 0 [ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 5860d02e-4b55-11e7-8721-40f2e91dc960:1-788,
93f19c6c-6447-11e7-9323-cb37b2d517f3:1-16,
c745233b-6614-11e7-a738-40f2e91dc960:1 > Group transactions: 5860d02e-4b55-11e7-8721-40f2e91dc960:1-788,
93f19c6c-6447-11e7-9323-cb37b2d517f3:1-16'
2017-07-14T15:00:14.966383+08:00 0 [ERROR] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.'
2017-07-14T15:00:14.966392+08:00 0 [Note] Plugin group_replication reported: 'To force this member into the group you can use the group_replication_allow_local_disjoint_gtids_join option'
2017-07-14T15:00:14.966473+08:00 88181 [Note] Plugin group_replication reported: 'Going to wait for view modification'
2017-07-14T15:00:14.968087+08:00 0 [Note] Plugin group_replication reported: 'getstart group_id 6c6c3761'
2017-07-14T15:00:18.464227+08:00 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate'
2017-07-14T15:00:18.464840+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-07-14T15:00:18.464875+08:00 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit'
2017-07-14T15:00:18.465367+08:00 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-07-14T15:00:18.465382+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-07-14T15:00:23.486593+08:00 88181 [Note] Plugin group_replication reported: 'auto_increment_increment is reset to 1'
2017-07-14T15:00:23.486642+08:00 88181 [Note] Plugin group_replication reported: 'auto_increment_offset is reset to 1'
2017-07-14T15:00:23.486838+08:00 88211 [Note] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2017-07-14T15:00:23.487244+08:00 88208 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'

Force the join only when you are certain it will not cause data inconsistency; proceed with caution:

localhost.test>SET GLOBAL group_replication_allow_local_disjoint_gtids_join = ON ;
Query OK, 0 rows affected (0.00 sec)

localhost.test>start group_replication;
Query OK, 0 rows affected (2.26 sec)

localhost.test>SET GLOBAL group_replication_allow_local_disjoint_gtids_join = OFF ;
Query OK, 0 rows affected (0.00 sec)

localhost.test>show variables like '%read_only%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| innodb_read_only | OFF   |
| read_only        | ON    |
| super_read_only  | ON    |
| tx_read_only     | OFF   |
+------------------+-------+
4 rows in set (0.00 sec)
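The "more executed transactions than those present in the group" error above is a GTID-set comparison: the local set is a strict superset of the group's. A minimal sketch of that check (it parses only the simple `uuid:n` / `uuid:1-n` form, not the full GTID grammar):

```python
# Minimal check for "member has more executed transactions than the group".
# Handles only the simple "uuid:n" / "uuid:1-n[:...]" form, not the full
# GTID_EXECUTED grammar.

def parse_gtid_set(text):
    result = {}
    for part in text.replace("\n", "").split(","):
        uuid, _, ranges = part.strip().partition(":")
        txns = set()
        for r in ranges.split(":"):
            lo, _, hi = r.partition("-")
            txns.update(range(int(lo), int(hi or lo) + 1))
        result[uuid] = txns
    return result

def extra_transactions(local, group):
    """GTIDs executed locally but unknown to the group (empty dict = safe)."""
    l, g = parse_gtid_set(local), parse_gtid_set(group)
    return {u: t - g.get(u, set()) for u, t in l.items() if t - g.get(u, set())}

local = ("5860d02e-4b55-11e7-8721-40f2e91dc960:1-788,"
         "c745233b-6614-11e7-a738-40f2e91dc960:1")
group = "5860d02e-4b55-11e7-8721-40f2e91dc960:1-788"
assert extra_transactions(local, group) == {
    "c745233b-6614-11e7-a738-40f2e91dc960": {1}}
```

Here the offending GTID is the local `create table t2` transaction executed while the member was outside the group.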

3. A large transaction gets a member removed or triggers a primary switch

Reproduced as follows on 5.7.17 and 5.7.18; 5.7.19 fixes it by adding group_replication_transaction_size_limit, a parameter that caps transaction size (#84785).

# single-primary mode, 5.7.18
CREATE TABLE `kafkaoffset_api_log` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `developer` varchar(20) NOT NULL DEFAULT '',
  `ctime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'request api time',
  PRIMARY KEY (`id`),
  KEY `idx_time` (`ctime`,`developer`)
);

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 8 rows affected (0.01 sec)
Records: 8  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 16 rows affected (0.00 sec)
Records: 16  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 32 rows affected (0.01 sec)
Records: 32  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 64 rows affected (0.00 sec)
Records: 64  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 128 rows affected (0.00 sec)
Records: 128  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 256 rows affected (0.02 sec)
Records: 256  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 512 rows affected (0.02 sec)
Records: 512  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 1024 rows affected (0.05 sec)
Records: 1024  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 2048 rows affected (0.08 sec)
Records: 2048  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 4096 rows affected (0.17 sec)
Records: 4096  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 8192 rows affected (0.33 sec)
Records: 8192  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 16384 rows affected (0.62 sec)
Records: 16384  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 32768 rows affected (1.09 sec)
Records: 32768  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 65536 rows affected (2.29 sec)
Records: 65536  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 131072 rows affected (5.00 sec)
Records: 131072  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 262144 rows affected (8.44 sec)
Records: 262144  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select devloper,now() from kafkaoffset_api_log;
ERROR 1054 (42S22): Unknown column 'devloper' in 'field list'
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 524288 rows affected (15.79 sec)
Records: 524288  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 1048576 rows affected (36.33 sec)
Records: 1048576  Duplicates: 0  Warnings: 0

localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
ERROR 3101 (HY000): Plugin instructed the server to rollback the current transaction.

# current member
localhost.test>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64 |        6666 | ERROR        |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)

# other members
localhost.test>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29   |        6666 | ONLINE       |
| group_replication_applier | e99a9946-6619-11e7-9b07-70e28406ebea | 10.77.16.197 |        6666 | ONLINE       |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
2 rows in set (0.00 sec)

Because the large transaction stresses memory allocation and network bandwidth, failure detection is triggered and the member ends up unreachable.

Fix: 5.7.19 introduces group_replication_transaction_size_limit. The default 0 means no limit; set a value suited to the MGR workload.
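With an assumed average binlog footprint per row (the bytes-per-row figure below is illustrative, not measured), you can estimate how many rows fit under group_replication_transaction_size_limit, and how quickly the doubling INSERT ... SELECT pattern above crosses it:

```python
# Rough estimate (illustrative only): rows per transaction under
# group_replication_transaction_size_limit, assuming an average binlog
# footprint per row. The doubling INSERT ... SELECT rewrites the whole
# table, so transaction size doubles on every run.

def max_rows(size_limit_bytes, avg_row_bytes):
    return size_limit_bytes // avg_row_bytes

def doublings_until_limit(start_rows, avg_row_bytes, size_limit_bytes):
    rows, steps = start_rows, 0
    while rows * avg_row_bytes <= size_limit_bytes:
        rows *= 2
        steps += 1
    return steps

# e.g. a 150 MB limit with ~100 bytes/row allows ~1.5M rows per transaction
assert max_rows(150 * 1024 * 1024, 100) == 1572864
assert doublings_until_limit(1, 100, 150 * 1024 * 1024) == 21
```

This matches the transcript: the inserts succeed up to about a million rows and the next doubling is rolled back.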

# recover the group member
localhost.test>stop group_replication ;
Query OK, 0 rows affected (5.67 sec)

localhost.test>start group_replication;
Query OK, 0 rows affected (2.40 sec)

localhost.test>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64  |        6666 | ONLINE       |
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29   |        6666 | ONLINE       |
| group_replication_applier | e99a9946-6619-11e7-9b07-70e28406ebea | 10.77.16.197 |        6666 | ONLINE       |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
3 rows in set (0.00 sec)

4. A table without a primary key

localhost.test>create table t1(id int,name varchar(10));
Query OK, 0 rows affected (0.02 sec)

localhost.test>insert into t1 values(1,'test1');
ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.

localhost.test>alter table t1 add primary key(id);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

localhost.test>insert into t1 values(1,'test1');
Query OK, 1 row affected (0.00 sec)

error log

2017-08-22T19:19:38.693424+08:00 4521 [ERROR] Plugin group_replication reported: 'Table t1 does not have any PRIMARY KEY. This is not compatible with Group Replication'

MySQL changelog

1. 5.7.20

  • group_replication_member_weight, default 50, range [0-100]: a <u>single-primary mode</u> election weight. Previously the member with the smallest uuid was elected the new primary; now the member with the highest weight wins, and ties are still broken by the smallest uuid.

  • STOP GROUP_REPLICATION: previously the server could still accept transactions after the stop; now super_read_only takes effect immediately, giving a stronger cluster-consistency guarantee.

  • STOP GROUP_REPLICATION: previously MGR communication stopped but asynchronous replication channels kept running; from 5.7.20 the asynchronous channels are stopped as well.

  • group_replication_force_members gained a restriction: it can only be used when a majority of the members are unreachable.

  • server_uuid must not be the same as group_replication_group_name.
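The 5.7.20 single-primary election rule can be modeled as sorting by descending weight, then ascending uuid (a model of the documented rule, not the server code):

```python
# Sketch of the 5.7.20 single-primary election rule (a model, not server
# code): highest group_replication_member_weight wins; ties are broken by
# the smallest server_uuid.

def elect_primary(members):
    """members: list of (uuid, weight) tuples of ONLINE members."""
    return min(members, key=lambda m: (-m[1], m[0]))[0]

members = [
    ("c745233b-6614-11e7-a738-40f2e91dc960", 50),
    ("817b0415-661a-11e7-842a-782bcb479deb", 80),
    ("e99a9946-6619-11e7-9b07-70e28406ebea", 80),
]
assert elect_primary(members) == "817b0415-661a-11e7-842a-782bcb479deb"
# pre-5.7.20 behaviour is the same call with all weights equal:
assert elect_primary([(u, 50) for u, _ in members]) == \
    "817b0415-661a-11e7-842a-782bcb479deb"
```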

Automation compatibility

Monitoring

  1. group_replication_single_primary_mode differs across nodes
  2. In single-primary mode, move the service domain name / IP when the primary goes on or off line

mysql> show variables like 'group_replication_single_primary_mode';
+---------------------------------------+-------+
| Variable_name                         | Value |
+---------------------------------------+-------+
| group_replication_single_primary_mode | ON    |
+---------------------------------------+-------+
1 row in set (0.00 sec)

  3. Check for inconsistent group membership via performance_schema.replication_group_members
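Check 1 can be probed like this. The values would be collected by running SHOW VARIABLES LIKE 'group_replication_single_primary_mode' on each member; here they are passed in as an already-collected dict, so the snippet is illustration only:

```python
# Monitoring sketch: flag nodes whose
# group_replication_single_primary_mode disagrees with the majority.
# Input is an already-collected dict host -> 'ON'/'OFF' (illustration only;
# collecting it would require a SHOW VARIABLES query per member).
from collections import Counter

def inconsistent_nodes(values):
    """Returns the hosts whose value differs from the majority value."""
    majority = Counter(values.values()).most_common(1)[0][0]
    return sorted(h for h, v in values.items() if v != majority)

assert inconsistent_nodes(
    {"10.55.28.64": "ON", "10.13.2.29": "ON", "10.77.16.197": "OFF"}
) == ["10.77.16.197"]
assert inconsistent_nodes(
    {"10.55.28.64": "ON", "10.13.2.29": "ON", "10.77.16.197": "ON"}
) == []
```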

Backup

mb backup

innobackupex backup

Scaling out

Scale out from a slave (stop group_replication)

Scale out with innobackupex

References

MySQL High Availability with Group Replication - 宋利兵 (IT大咖說)

Group Replication: The Sweet and the Sour

MySQL Group Replication: A Quick Start Guide
