① OpenStack High-Availability Cluster Deployment Guide (Train) — Basic Configuration
② OpenStack High-Availability Cluster Deployment Guide (Train) — Keystone
③ OpenStack High-Availability Cluster Deployment Guide (Train) — Glance
I. Hardware Configuration Reference
12 servers in total; each entry lists IP + hostname + CPU count + core count + disk capacity.
Node roles are assigned by hardware capacity, not by IP order.
10.15.253.225 cs8srv01-c2m8h300.esxi01.rd.zxbj01
10.15.253.226 cs8srv02-c2m16h600.esxi01.rd.zxbj01
10.15.253.227 cs8srv03-c2m32h1200.esxi01.rd.zxbj01
10.15.253.193 cs8srv01-c2m8h300.esxi02.rd.zxbj01
10.15.253.194 cs8srv02-c2m16h600.esxi02.rd.zxbj01
10.15.253.195 cs8srv03-c2m32h1200.esxi02.rd.zxbj01
10.15.253.161 cs8srv01-c2m8h300.esxi03.rd.zxbj01
10.15.253.162 cs8srv02-c2m16h600.esxi03.rd.zxbj01
10.15.253.163 cs8srv03-c2m32h1200.esxi03.rd.zxbj01
10.15.253.129 cs8srv01-c2m8h300.esxi04.rd.zxbj01 × unavailable
10.15.253.130 cs8srv02-c2m16h600.esxi04.rd.zxbj01 × unavailable
10.15.253.131 cs8srv03-c2m32h1200.esxi04.rd.zxbj01 × unavailable
#root password; delete this after deployment is complete
Zx******
System environment
#Kernel version on all VMs
[root@cs8srv01 ~]# uname -r
4.18.0-193.14.2.el8_2.x86_64
#OS version on all VMs
[root@cs8srv01 ~]# cat /etc/redhat-release
CentOS Linux release 8.2.2004 (Core)
二恨狈、節(jié)點(diǎn)整體規(guī)劃
openstack高可用環(huán)境測試需要9臺虛擬機(jī),控制、計算、網(wǎng)絡(luò)计技、存儲、ceph共享存儲集群共9臺山橄,后續(xù)資源充足可以將網(wǎng)絡(luò)節(jié)點(diǎn)和存儲節(jié)點(diǎn)進(jìn)行分離垮媒,單獨(dú)準(zhǔn)備節(jié)點(diǎn)部署。
因控制節(jié)點(diǎn)需要運(yùn)行服務(wù)較多航棱,所以選擇內(nèi)存較大的虛擬機(jī)睡雇,生產(chǎn)中,建議將大磁盤掛載到ceph存儲上
host | IP | Service | Remarks |
---|---|---|---|
controller01 | ens192:10.15.253.163 management & external network<br />ens224:172.31.253.163 VLAN network | 1. keystone 2. glance-api, glance-registry 3. nova-api, nova-conductor, nova-consoleauth, nova-scheduler, nova-novncproxy 4. neutron-api, neutron-linuxbridge-agent, neutron-dhcp-agent, neutron-metadata-agent, neutron-l3-agent 5. cinder-api, cinder-scheduler 6. dashboard 7. mariadb, rabbitmq, memcached, haproxy, etc. | 1. Control node: keystone, glance, horizon, nova & neutron management components<br />2. Network node: VM networking, L2/L3, DHCP, routing, NAT, etc.; 2 vCPUs, 32 GB RAM, 1.2 TB disk<br />3. Storage node: scheduling and monitoring (ceph) components; 2 vCPUs, 32 GB RAM, 1.2 TB disk<br />4. OpenStack base services |
controller02 | ens192:10.15.253.195 management & external network<br />ens224:172.31.253.195 VLAN network | 1. keystone 2. glance-api, glance-registry 3. nova-api, nova-conductor, nova-consoleauth, nova-scheduler, nova-novncproxy 4. neutron-api, neutron-linuxbridge-agent, neutron-dhcp-agent, neutron-metadata-agent, neutron-l3-agent 5. cinder-api, cinder-scheduler 6. dashboard 7. mariadb, rabbitmq, memcached, haproxy, etc. | 1. Control node: keystone, glance, horizon, nova & neutron management components<br />2. Network node: VM networking, L2/L3, DHCP, routing, NAT, etc.; 2 vCPUs, 32 GB RAM, 1.2 TB disk<br />3. Storage node: scheduling and monitoring (ceph) components; 2 vCPUs, 32 GB RAM, 1.2 TB disk<br />4. OpenStack base services |
controller03 | ens192:10.15.253.227 management & external network<br />ens224:172.31.253.227 VLAN network | 1. keystone 2. glance-api, glance-registry 3. nova-api, nova-conductor, nova-consoleauth, nova-scheduler, nova-novncproxy 4. neutron-api, neutron-linuxbridge-agent, neutron-dhcp-agent, neutron-metadata-agent, neutron-l3-agent 5. cinder-api, cinder-scheduler 6. dashboard 7. mariadb, rabbitmq, memcached, haproxy, etc. | 1. Control node: keystone, glance, horizon, nova & neutron management components<br />2. Network node: VM networking, L2/L3, DHCP, routing, NAT, etc.; 2 vCPUs, 32 GB RAM, 1.2 TB disk<br />3. Storage node: scheduling and monitoring (ceph) components; 2 vCPUs, 32 GB RAM, 1.2 TB disk<br />4. OpenStack base services |
compute01 | ens192:10.15.253.162 management & external network<br />ens224:172.31.253.162 VLAN network | 1. nova-compute 2. neutron-linuxbridge-agent 3. cinder-volume (if the backend uses shared storage, deploying it on the controller nodes is recommended) | 1. Compute node: hypervisor (KVM)<br />2. Network node: VM networking, etc.<br />3. Storage node: volume service components |
compute02 | ens192:10.15.253.194 management & external network<br />ens224:172.31.253.194 VLAN network | 1. nova-compute 2. neutron-linuxbridge-agent 3. cinder-volume (if the backend uses shared storage, deploying it on the controller nodes is recommended) | 1. Compute node: hypervisor (KVM)<br />2. Network node: VM networking, etc.<br />3. Storage node: volume service components |
compute03 | ens192:10.15.253.226 management & external network<br />ens224:172.31.253.226 VLAN network | 1. nova-compute 2. neutron-linuxbridge-agent 3. cinder-volume (if the backend uses shared storage, deploying it on the controller nodes is recommended) | 1. Compute node: hypervisor (KVM)<br />2. Network node: VM networking, etc.<br />3. Storage node: volume service components |
cephnode01 | ens192:10.15.253.161<br />ens224:172.31.253.161 | ceph-mon, ceph-mgr | Volume service, block storage components |
cephnode02 | ens192:10.15.253.193<br />ens224:172.31.253.193 | ceph-mon, ceph-mgr, ceph-osd | Volume service, block storage components |
cephnode03 | ens192:10.15.253.225<br />ens224:172.31.253.225 | ceph-mon, ceph-mgr, ceph-osd | Volume service, block storage components |
The advantage of using HAProxy for load balancing is that the nodes are spread across different physical servers; if one physical server goes down, stateless applications keep running on the remaining nodes, achieving HA.
Control, network, storage: 3 nodes
10.15.253.163 c2m32h1200 controller01
10.15.253.195 c2m32h1200 controller02
10.15.253.227 c2m32h1200 controller03
Compute, network, storage: 3 nodes
10.15.253.162 c2m16h600 compute01
10.15.253.194 c2m16h600 compute02
10.15.253.226 c2m16h600 compute03
Ceph shared storage: 3 nodes
10.15.253.161 c2m8h300 cephnode01
10.15.253.193 c2m8h300 cephnode02
10.15.253.225 c2m8h300 cephnode03
Load balancing: 1 node, placed on one of the hosts above
10.15.253.225 c2m8h300 cephnode03
High-availability virtual IP
10.15.253.88 is configured as the VIP
三锋勺、集群高可用說明
參考官方文檔
https://docs.openstack.org/ha-guide/
https://docs.openstack.org/arch-design/design-control-plane.html
https://docs.openstack.org/arch-design/design-control-plane.html#table-deployment-scenarios
https://docs.openstack.org/ha-guide/intro-os-ha-cluster.html
https://docs.openstack.org/ha-guide/storage-ha.html
OpenStack Architecture Design Guide: https://docs.openstack.org/arch-design/
OpenStack API documentation: https://docs.openstack.org/api-quick-start/
Stateless services
A stateless service returns a response to a request and then needs no further attention. To make stateless services highly available, provide redundant nodes and load-balance them. They include nova-api, nova-conductor, glance-api, keystone-api, neutron-api and nova-scheduler.
Stateful services
A stateful service is one where subsequent requests depend on the result of the first request. Stateful services are harder to manage because a single action usually involves multiple requests. Making a stateful service highly available depends on whether you choose an active/passive or an active/active configuration. They include the OpenStack database and message queue.
High-availability scheme for the OpenStack cluster
OpenStack services are deployed on all three controller nodes, which share the database and message queue; HAProxy load-balances requests to the back ends.
Front-end proxy
The front-end proxy can be HAProxy + KeepAlived or HAProxy + Pacemaker; the OpenStack control-plane services expose a VIP for API access. It is recommended to deploy HAProxy separately. The OpenStack documentation uses the open-source pacemaker cluster stack as the cluster high-availability resource manager.
Database cluster
https://docs.openstack.org/ha-guide/control-plane-stateful.html
https://blog.csdn.net/educast/article/details/78678152
MariaDB + Galera is used to form three active nodes, accessed externally through HAProxy in an active + backup configuration. Normally node A is the primary; if A fails, traffic switches to node B or C. In this test the three MariaDB nodes are deployed on the control nodes.
Official recommendation: for a three-node MariaDB/Galera cluster, give each node 4 vCPUs and 8 GB RAM.
RabbitMQ cluster
RabbitMQ uses its native clustering with mirrored queues synchronized across all nodes. Of the three hosts, two RAM nodes mainly serve traffic and one disk node persists messages; clients configure master/slave policies as needed.
In this test the three RabbitMQ nodes are deployed on the control nodes.
四乔煞、基礎(chǔ)環(huán)境
1. 設(shè)置SSH秘鑰分發(fā)與hosts文件
ssh
#為控制節(jié)點(diǎn)controller01 配置ssh免密吁朦,先作為一臺管理機(jī)
yum install sshpass -y
mkdir -p /extend/shell
#Write the key-distribution script, then run it (see below)
cat >>/extend/shell/fenfa_pub.sh<< EOF
#!/bin/bash
ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
for ip in 161 162 163 193 194 195 225 226 227
do
sshpass -pZx****** ssh-copy-id -o StrictHostKeyChecking=no 10.15.253.\$ip
done
EOF
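The heredoc above only writes the script; it still has to be run once from controller01. A minimal run step (assuming bash and the sshpass password shown above):
#Generate the key pair and push the public key to every node
bash /extend/shell/fenfa_pub.sh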
#Test
[root@controller01 ~]# ssh controller03 hostname
controller03
[root@controller01 ~]# ssh compute03 hostname
compute03
[root@controller01 ~]# ssh cephnode03 hostname
cephnode03
hosts
#Keep an identical hosts file on all nodes
cat >>/etc/hosts <<EOF
10.15.253.163 controller01
10.15.253.195 controller02
10.15.253.227 controller03
10.15.253.162 compute01
10.15.253.194 compute02
10.15.253.226 compute03
10.15.253.161 cephnode01
10.15.253.193 cephnode02
10.15.253.225 cephnode03
EOF
#Distribute to all nodes
for ip in 161 162 163 193 194 195 225 226 227 ;do scp -rp /etc/hosts root@10.15.253.$ip:/etc/hosts ;done
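As an optional sanity check (it reuses the passwordless SSH configured above), confirm that every node received the same hosts file:
#All nine nodes should print the same checksum
for ip in 161 162 163 193 194 195 225 226 227 ;do ssh 10.15.253.$ip "md5sum /etc/hosts" ;done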
2. Time synchronization
#Chrony time sync: controller01 acts as the NTP server node
yum install chrony -y
vim /etc/chrony.conf
server ntp1.aliyun.com iburst
allow 10.15.253.163/12
systemctl restart chronyd.service
systemctl enable chronyd.service
chronyc sources
#All other nodes use controller01 as their time-sync server
yum install chrony -y
vim /etc/chrony.conf
server controller01 iburst
systemctl restart chronyd.service
systemctl enable chronyd.service
chronyc sources
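An optional check from controller01, again over the SSH access configured earlier, to confirm that the other nodes really sync from it (the exact chronyc output format may vary):
#Each node should list controller01 as a source, ideally marked with '^*'
for ip in 161 162 193 194 195 225 226 227 ;do ssh 10.15.253.$ip "hostname; chronyc sources | grep controller01" ;done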
3. Kernel parameters, SELinux, iptables
Note: in production, open the required ports with iptables rules instead of disabling the firewall.
#Kernel parameter tuning
echo 'net.ipv4.ip_forward = 1' >>/etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-iptables=1' >>/etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-ip6tables=1' >>/etc/sysctl.conf
#On the control nodes, allow binding to non-local IPs so a running HAProxy instance can bind to the VIP
echo 'net.ipv4.ip_nonlocal_bind = 1' >>/etc/sysctl.conf
sysctl -p
#Disable SELinux
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
#Disable firewalld
systemctl disable firewalld.service
systemctl stop firewalld.service
4. Install the Train release packages
Install the Train yum repository; on CentOS 8, the PowerTools and HighAvailability repositories must also be enabled
yum install centos-release-openstack-train -y
#Enable the HighAvailability repo
yum install yum-utils -y
yum config-manager --set-enabled HighAvailability
yum config-manager --set-enabled PowerTools
#Install the EPEL repo
rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
yum clean all
yum makecache
Install the client; on CentOS 8 this has been updated to python3-openstackclient
yum install python3-openstackclient -y
openstack-utils makes the OpenStack installation easier by allowing configuration files to be modified directly from the command line (all nodes); see the short example after the install commands below
#Create a directory for the downloaded packages
mkdir -p /opt/tools
#Install the required dependency
yum install crudini -y
wget -P /opt/tools https://cbs.centos.org/kojifiles/packages/openstack-utils/2017.1/1.el7/noarch/openstack-utils-2017.1-1.el7.noarch.rpm
rpm -ivh /opt/tools/openstack-utils-2017.1-1.el7.noarch.rpm
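As a quick illustration of what openstack-utils provides (the file path here is just an example, not part of the deployment), openstack-config can set, read and delete ini-style options from the command line:
#Set, read back and delete a key in an ini-style file
openstack-config --set /tmp/demo.conf DEFAULT transport_url rabbit://openstack:password@controller01
openstack-config --get /tmp/demo.conf DEFAULT transport_url
openstack-config --del /tmp/demo.conf DEFAULT transport_url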
Optional: openstack-selinux is needed when SELinux is enabled; in this environment SELinux has been disabled
yum install openstack-selinux -y
#If errors appear after reconnecting, installing this package resolves them
yum install libibverbs -y
5. Passwords for each service component
Password | Description |
---|---|
Zx***** | Password of the admin user |
Zx***** | Database password for the Block Storage service |
Zx***** | Password of the cinder user for the Block Storage service |
Zx***** | Database password for the dashboard |
Zx***** | Database password for the Image service |
Zx***** | Password of the glance user for the Image service |
Zx***** | Database password for the Identity service |
Zx***** | Shared secret for the metadata proxy |
Zx***** | Database password for the Networking service |
Zx***** | Password of the neutron user for the Networking service |
Zx***** | Database password for the Compute service |
Zx***** | Password of the nova user for the Compute service |
Zx***** | Password of the placement user for the Placement service |
Zx***** | Password of the openstack management user for RabbitMQ |
Zx***** | Password of the pacemaker hacluster user |
五囤屹、Mariadb集群(控制節(jié)點(diǎn))
1. 安裝與配置修改
1.1 在全部controller節(jié)點(diǎn)安裝mariadb熬甚,以controller01節(jié)點(diǎn)為例
yum install mariadb mariadb-server python2-PyMySQL -y
1.2 Install the Galera-related packages used to build the cluster
yum install mariadb-server-galera mariadb-galera-common galera xinetd rsync -y
systemctl restart mariadb.service
systemctl enable mariadb.service
1.3 Initialize MariaDB: set the database root password on all control nodes; controller01 is used as the example
[root@controller01 ~]# mysql_secure_installation
#Enter the current root password (press Enter for none)
Enter current password for root (enter for none):
#Set a root password?
Set root password? [Y/n] y
#New password:
New password:
#Re-enter new password:
Re-enter new password:
#Remove anonymous users?
Remove anonymous users? [Y/n] y
#Disallow remote root login?
Disallow root login remotely? [Y/n] n
#Remove the test database and access to it?
Remove test database and access to it? [Y/n] y
#Reload privilege tables now?
Reload privilege tables now? [Y/n] y
1.4 Edit the MariaDB configuration
On every control node, add an openstack.cnf configuration file under /etc/my.cnf.d/ that mainly sets the cluster-synchronization parameters; controller01 is used as the example, and the parameters containing IP addresses/hostnames must be adjusted for each node
Create and edit the file /etc/my.cnf.d/openstack.cnf
#bind-address: the host's IP
#wsrep_node_name: the hostname
#wsrep_node_address: the host's IP
[root@controller01 ~]# cat /etc/my.cnf.d/openstack.cnf
[server]
[mysqld]
bind-address = 10.15.253.163
max_connections = 1000
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mariadb/mariadb.log
pid-file=/run/mariadb/mariadb.pid
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name="mariadb_galera_cluster"
wsrep_cluster_address="gcomm://controller01,controller02,controller03"
wsrep_node_name="controller01"
wsrep_node_address="10.15.253.163"
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_slave_threads=4
innodb_flush_log_at_trx_commit=2
innodb_buffer_pool_size=1024M
wsrep_sst_method=rsync
[embedded]
[mariadb]
[mariadb-10.3]
wsrep_sync_wait: the default is 0; set it to 1 if read/write consistency must be guaranteed, but note that this adds latency
1.5 Copy the configuration file from controller01 to the other two hosts
On those two nodes, adjust the host-specific values: wsrep_node_name, wsrep_node_address and bind-address
scp -rp /etc/my.cnf.d/openstack.cnf controller02:/etc/my.cnf.d/openstack.cnf
scp -rp /etc/my.cnf.d/openstack.cnf controller03:/etc/my.cnf.d/openstack.cnf
Once the installation and configuration above are finished on all control nodes, the cluster can be built; the following steps can be run from any control node.
2. Build the cluster
2.1 Stop the MariaDB service on all control nodes; controller01 is used as the example
systemctl stop mariadb
2.2 Start the MariaDB service on controller01 as follows
[root@controller01 ~]# /usr/libexec/mysqld --wsrep-new-cluster --user=root &
[1] 8255
[root@controller01 ~]# 2020-08-28 14:02:44 0 [Note] /usr/libexec/mysqld (mysqld 10.3.20-MariaDB) starting as process 8255 ...
2.3 Join the other control nodes to the MariaDB cluster
controller02 is used as the example; after starting, it joins the cluster and synchronizes data from controller01. The MariaDB log /var/log/mariadb/mariadb.log can be watched while it syncs
[root@controller02 ~]# systemctl start mariadb.service
2.4 Go back to controller01 and reconfigure MariaDB
#Restart controller01; before starting, delete its previous data
[root@controller01 ~]# pkill -9 mysqld
[root@controller01 ~]# rm -rf /var/lib/mysql/*
#Note the required ownership when starting the mariadb service as a systemd unit
[root@controller01 ~]# chown mysql:mysql /var/run/mariadb/mariadb.pid
## After starting, check the service status; controller01 synchronizes its data from controller02
[root@controller01 ~]# systemctl start mariadb.service
[root@controller01 ~]# systemctl status mariadb.service
2.5 Check the cluster status
[root@controller01 ~]# mysql -uroot -p
MariaDB [(none)]> show status like "wsrep_cluster_size";
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
1 row in set (0.001 sec)
MariaDB [(none)]> SHOW status LIKE 'wsrep_ready';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| wsrep_ready | ON |
+---------------+-------+
1 row in set (0.001 sec)
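Two more status variables are worth a quick look from any node (values will differ per environment):
#wsrep_incoming_addresses should list all three nodes; wsrep_cluster_status should be "Primary"
mysql -uroot -p -e "SHOW STATUS LIKE 'wsrep_incoming_addresses'; SHOW STATUS LIKE 'wsrep_cluster_status';"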
2.6 Create a database on controller01 and verify that it is replicated to the other two nodes
[root@controller01 ~]# mysql -uroot -p
MariaDB [(none)]> create database cluster_test charset utf8mb4;
Query OK, 1 row affected (0.005 sec)
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| cluster_test |
| information_schema |
| mysql |
| performance_schema |
+--------------------+
Check on the other two nodes
[root@controller02 ~]# mysql -uroot -pZx***** -e 'show databases'
+--------------------+
| Database |
+--------------------+
| cluster_test | √
| information_schema |
| mysql |
| performance_schema |
+--------------------+
[root@controller03 ~]# mysql -uroot -pZx***** -e 'show databases'
+--------------------+
| Database |
+--------------------+
| cluster_test | √
| information_schema |
| mysql |
| performance_schema |
+--------------------+
3. Configure the clustercheck heartbeat check
3.1 Download the clustercheck script
Download and edit this script on all control nodes
wget -P /extend/shell/ https://raw.githubusercontent.com/olafz/percona-clustercheck/master/clustercheck
Make sure the account/password in the script match the database user created in step 3.2 below; if you use different credentials, edit the clustercheck script accordingly
[root@controller01 ~]# vim /extend/shell/clustercheck
MYSQL_USERNAME="clustercheck"
MYSQL_PASSWORD="Zx*****"
MYSQL_HOST="localhost"
MYSQL_PORT="3306"
...
#Add execute permission and copy the script to /usr/bin/
[root@controller01 ~]# chmod +x /extend/shell/clustercheck
[root@controller01 ~]# \cp /extend/shell/clustercheck /usr/bin/
3.2 Create the heartbeat-check user
Create the clustercheck user in the database and grant it privileges on any one control node; the other two nodes replicate it automatically
GRANT PROCESS ON *.* TO 'clustercheck'@'localhost' IDENTIFIED BY 'Zx*****';
flush privileges;
3.3 Create the heartbeat-check service file
On all control nodes, add the heartbeat-check service configuration file /etc/xinetd.d/galera-monitor; controller01 is used as the example
[root@controller01 ~]# touch /etc/xinetd.d/galera-monitor
[root@controller01 ~]# cat >/etc/xinetd.d/galera-monitor <<EOF
# default:on
# description: galera-monitor
service galera-monitor
{
port = 9200
disable = no
socket_type = stream
protocol = tcp
wait = no
user = root
group = root
groups = yes
server = /usr/bin/clustercheck
type = UNLISTED
per_source = UNLIMITED
log_on_success =
log_on_failure = HOST
flags = REUSE
}
EOF
3.4 Start the heartbeat-check service
On all control nodes, edit /etc/services to repurpose TCP port 9200; controller01 is used as the example
[root@controller01 ~]# vim /etc/services
...
#wap-wsp 9200/tcp # WAP connectionless session service
galera-monitor 9200/tcp # galera-monitor
Start the xinetd service
#It must be started on all control nodes
systemctl daemon-reload
systemctl enable xinetd
systemctl start xinetd
3.5 Test the heartbeat-check script
Verify on all control nodes; controller01 is used as the example
[root@controller01 ~]# /usr/bin/clustercheck
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40
Percona XtraDB Cluster Node is synced.
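HAProxy will later probe the xinetd-exposed port rather than run the script directly, so it is also worth checking port 9200 over the network (assuming curl is installed):
#The same "synced" response should come back over TCP 9200 from every control node
curl -s http://controller01:9200
curl -s http://controller02:9200
curl -s http://controller03:9200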
4. Recovery after an abnormal shutdown or power failure
After a sudden power failure, all Galera hosts shut down uncleanly; when they come back up, the Galera cluster service cannot start normally. Handle it as follows.
Step 1: Start the mariadb service on the Galera cluster's bootstrap (primary) host.
Step 2: Start the mariadb service on the remaining member hosts.
Exception handling: what if the mysql service will not start on either the bootstrap host or the member hosts?
#Solution 1:
Step 1: On the bootstrap host, delete the state file /var/lib/mysql/grastate.dat,
then start the service with /bin/galera_new_cluster. Once it starts normally, log in and check the wsrep status.
Step 2: On each member host, delete /var/lib/mysql/grastate.dat,
then restart the service with systemctl restart mariadb. Once it starts normally, log in and check the wsrep status.
#Solution 2:
Step 1: On the bootstrap host, change the 0 to 1 in /var/lib/mysql/grastate.dat,
then start the service with /bin/galera_new_cluster. Once it starts normally, log in and check the wsrep status.
Step 2: On each member host, change the 0 to 1 in /var/lib/mysql/grastate.dat,
then restart the service with systemctl restart mariadb. Once it starts normally, log in and check the wsrep status.
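For solution 2, the value flipped from 0 to 1 in grastate.dat is the safe_to_bootstrap flag (an assumption that holds for the Galera version bundled with MariaDB 10.3); a minimal sketch on the bootstrap host:
#Mark this node as safe to bootstrap from, then bootstrap a new cluster
sed -i 's/^safe_to_bootstrap: 0/safe_to_bootstrap: 1/' /var/lib/mysql/grastate.dat
/bin/galera_new_cluster
#Confirm the cluster came back up
mysql -uroot -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"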
六斯稳、RabbitMQ集群(控制節(jié)點(diǎn))
https://www.rabbitmq.com/which-erlang.html
1. 下載相關(guān)軟件包(所有節(jié)點(diǎn))
以controller01節(jié)點(diǎn)為例海铆,RabbbitMQ基與erlang開發(fā),首先安裝erlang挣惰,采用yum方式
[root@controller01 ~]# yum install erlang rabbitmq-server -y
[root@controller01 ~]# systemctl enable rabbitmq-server.service
2. Build the RabbitMQ cluster
2.1 Start the rabbitmq service on one control node first
controller01 is chosen here
[root@controller01 ~]# systemctl start rabbitmq-server.service
[root@controller01 ~]# rabbitmqctl cluster_status
2.2 Distribute .erlang.cookie to the other control nodes
scp /var/lib/rabbitmq/.erlang.cookie controller02:/var/lib/rabbitmq/
scp /var/lib/rabbitmq/.erlang.cookie controller03:/var/lib/rabbitmq/
2.3 Fix the owner/group of the .erlang.cookie file on controller02 and controller03
[root@controller02 ~]# chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
[root@controller03 ~]# chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
Note: check the permissions of the .erlang.cookie file on all control nodes; the default is 400 and can be left as is
2.4 Start the rabbitmq service on controller02 and controller03
[root@controller02 ~]# systemctl start rabbitmq-server
[root@controller03 ~]# systemctl start rabbitmq-server
2.5 Build the cluster: controller02 and controller03 join the cluster as RAM nodes
[root@controller02 ~]# rabbitmqctl stop_app
[root@controller02 ~]# rabbitmqctl join_cluster --ram rabbit@controller01
[root@controller02 ~]# rabbitmqctl start_app
[root@controller03 ~]# rabbitmqctl stop_app
[root@controller03 ~]# rabbitmqctl join_cluster --ram rabbit@controller01
[root@controller03 ~]# rabbitmqctl start_app
2.6 Check the RabbitMQ cluster status from any control node
[root@controller01 ~]# rabbitmqctl cluster_status
Basics
Cluster name: rabbit@controller01
Disk Nodes
rabbit@controller01
RAM Nodes
rabbit@controller02
rabbit@controller03
Running Nodes
rabbit@controller01
rabbit@controller02
rabbit@controller03
Versions
rabbit@controller01: RabbitMQ 3.8.3 on Erlang 22.3.4.1
rabbit@controller02: RabbitMQ 3.8.3 on Erlang 22.3.4.1
rabbit@controller03: RabbitMQ 3.8.3 on Erlang 22.3.4.1
.....
2.7 Create the RabbitMQ administrator account
# Create the account and set its password on any node; controller01 is used as the example
[root@controller01 ~]# rabbitmqctl add_user openstack Zx*****
# Set the tag of the new account
[root@controller01 ~]# rabbitmqctl set_user_tags openstack administrator
# Set the permissions of the new account
[root@controller01 ~]# rabbitmqctl set_permissions -p "/" openstack ".*" ".*" ".*"
# List the accounts
[root@controller01 ~]# rabbitmqctl list_users
Listing users ...
user tags
openstack [administrator]
guest [administrator]
2.8 Mirrored-queue HA
Enable high availability for the mirrored queues
[root@controller01 ~]# rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all"}'
Check the mirrored-queue policy from any control node
[root@controller01 ~]# rabbitmqctl list_policies
Listing policies for vhost "/" ...
vhost name pattern apply-to definition priority
/ ha-all ^ all {"ha-mode":"all"} 0
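Once the OpenStack services start creating queues, you can confirm that the ha-all policy is actually applied to them (at this point the output is empty because no queues exist yet):
#Each queue should show ha-all in the policy column
rabbitmqctl list_queues name policy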
2.9 Install the web management plugin
Install the web management plugin on all control nodes; controller01 is used as the example
[root@controller01 ~]# rabbitmq-plugins enable rabbitmq_management
[16:02 root@controller01 ~]# netstat -lntup|grep 5672
tcp 0 0 0.0.0.0:25672 0.0.0.0:* LISTEN 10461/beam.smp
tcp 0 0 0.0.0.0:15672 0.0.0.0:* LISTEN 10461/beam.smp
tcp6 0 0 :::5672 :::* LISTEN 10461/beam.smp
Browse to any node, e.g. http://10.15.253.163:15672
七板乙、Memcached和Etcd集群(控制節(jié)點(diǎn))
Memcached是一款開源、高性能拳氢、分布式內(nèi)存對象緩存系統(tǒng)募逞,可應(yīng)用各種需要緩存的場景,其主要目的是通過降低對Database的訪問來加速web應(yīng)用程序馋评。
Memcached一般的使用場景是:通過緩存數(shù)據(jù)庫查詢的結(jié)果放接,減少數(shù)據(jù)庫訪問次數(shù),以提高動態(tài)Web應(yīng)用的速度留特、提高可擴(kuò)展性纠脾。
本質(zhì)上玛瘸,memcached是一個基于內(nèi)存的key-value存儲,用于存儲數(shù)據(jù)庫調(diào)用乳乌、API調(diào)用或頁面引用結(jié)果的直接數(shù)據(jù)捧韵,如字符串市咆、對象等小塊任意數(shù)據(jù)汉操。
Memcached是無狀態(tài)的,各控制節(jié)點(diǎn)獨(dú)立部署蒙兰,openstack各服務(wù)模塊統(tǒng)一調(diào)用多個控制節(jié)點(diǎn)的memcached服務(wù)即可
1. Install the memcached packages
Install on all control nodes; on CentOS 8 the client package has been updated to python3-memcached
yum install memcached python3-memcached -y
2. Configure memcached
On every node running memcached, change the listen address so the service is reachable from the other nodes
sed -i 's|127.0.0.1,::1|0.0.0.0|g' /etc/sysconfig/memcached
3. Enable at boot and start
systemctl enable memcached.service
systemctl start memcached.service
systemctl status memcached.service
[root@controller01 ~]# netstat -lntup|grep memcached
tcp 0 0 0.0.0.0:11211 0.0.0.0:* LISTEN 13982/memcached
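An optional end-to-end check of memcached from another node; this assumes the nc utility (nmap-ncat) is available:
#The stats output should include uptime and curr_connections
echo -e "stats\nquit" | nc 10.15.253.163 11211 | head -5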
4. Install the etcd packages
OpenStack services can use Etcd, a distributed, reliable key-value store used for distributed key locking, configuration storage, tracking service liveness, and other scenarios. It provides shared configuration and service discovery; it is secure (optional client-certificate authentication with automatic TLS), fast (benchmarked at 10,000 writes/s) and reliable (properly distributed using Raft).
Install on all control nodes:
yum install -y etcd
5. Configure etcd
Set the configuration to each control node's management IP address so the other nodes can reach it over the management network;
adjust ETCD_NAME to the hostname of the current instance
cp -a /etc/etcd/etcd.conf{,.bak}
[root@controller01 ~]# cat > /etc/etcd/etcd.conf <<EOF
#[Member]
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
ETCD_LISTEN_CLIENT_URLS="http://10.15.253.163:2379,http://127.0.0.1:2379"
ETCD_NAME="controller01"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.15.253.163:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://10.15.253.163:2379"
ETCD_INITIAL_CLUSTER="controller01=http://10.15.253.163:2380,controller02=http://10.15.253.195:2380,controller03=http://10.15.253.227:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-01"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
[root@controller02 ~]# cat > /etc/etcd/etcd.conf <<EOF
#[Member]
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
ETCD_LISTEN_CLIENT_URLS="http://10.15.253.195:2379,http://127.0.0.1:2379"
ETCD_NAME="controller02"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.15.253.195:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://10.15.253.195:2379"
ETCD_INITIAL_CLUSTER="controller01=http://10.15.253.163:2380,controller02=http://10.15.253.195:2380,controller03=http://10.15.253.227:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-01"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
[root@controller03 ~]# cat > /etc/etcd/etcd.conf <<EOF
#[Member]
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://0.0.0.0:2380"
ETCD_LISTEN_CLIENT_URLS="http://10.15.253.227:2379,http://127.0.0.1:2379"
ETCD_NAME="controller03"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.15.253.227:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://10.15.253.227:2379"
ETCD_INITIAL_CLUSTER="controller01=http://10.15.253.163:2380,controller02=http://10.15.253.195:2380,controller03=http://10.15.253.227:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-01"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
6. Modify etcd.service
controller01 is used as the example;
[root@controller01 ~]# vim /usr/lib/systemd/system/etcd.service
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd \
--name=\"${ETCD_NAME}\" \
--data-dir=\"${ETCD_DATA_DIR}\" \
--listen-peer-urls=\"${ETCD_LISTEN_PEER_URLS}\" \
--listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\" \
--initial-advertise-peer-urls=\"${ETCD_INITIAL_ADVERTISE_PEER_URLS}\" \
--advertise-client-urls=\"${ETCD_ADVERTISE_CLIENT_URLS}\" \
--initial-cluster=\"${ETCD_INITIAL_CLUSTER}\" \
--initial-cluster-token=\"${ETCD_INITIAL_CLUSTER_TOKEN}\" \
--initial-cluster-state=\"${ETCD_INITIAL_CLUSTER_STATE}\""
Restart=on-failure
LimitNOFILE=65536
Copy it to the other two control nodes;
scp -rp /usr/lib/systemd/system/etcd.service controller02:/usr/lib/systemd/system/
scp -rp /usr/lib/systemd/system/etcd.service controller03:/usr/lib/systemd/system/
7. Enable at boot and start
Run on all control nodes at the same time;
systemctl enable etcd
systemctl restart etcd
systemctl status etcd
Verify
etcdctl cluster-health
etcdctl member list
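A simple write/read test across the cluster, using the v2 API to match the cluster-health/member list commands above (the key name is arbitrary):
#Write a key on controller01 and read it back through another member
etcdctl set /ha-test ok
etcdctl --endpoints=http://10.15.253.195:2379 get /ha-test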
VIII. Configure the Pacemaker High-Availability Cluster
https://docs.openstack.org/ha-guide/index.html
Service | Role |
---|---|
pacemaker | Cluster resource manager (CRM); starts and stops services, sits at the resource-management/resource-agent layers of the HA stack |
corosync | Messaging layer; handles membership, messaging and quorum, provides communication for the HA environment, sits at the bottom of the HA stack and carries heartbeat information between nodes |
resource-agents | Resource agents; tools (usually scripts) on each node that receive scheduling from the CRM and manage a particular resource |
pcs | Command-line toolset |
fence-agents | Fences a node that becomes unstable or unresponsive so it cannot damage other cluster resources; its main purpose is to prevent split-brain |
The OpenStack documentation uses the open-source pacemaker cluster stack as the cluster high-availability resource manager.
1. Install the packages
Install the required services on all control nodes; controller01 is used as the example
[root@controller01 ~]# yum install pacemaker pcs corosync fence-agents resource-agents -y
2. Build the cluster
2.1 Start the pcsd service
Run on all control nodes; controller01 is used as the example
[root@controller01 ~]# systemctl enable pcsd
[root@controller01 ~]# systemctl start pcsd
2.2 Set the password of the cluster administrator hacluster (created by default)
Run on all control nodes; controller01 is used as the example
[root@controller01 ~]# echo Zx***** | passwd --stdin hacluster
2.3 Authenticate the nodes
Authentication is configured from any one node; controller01 is used as the example;
node authentication, required to form the cluster, uses the hacluster password set in the previous step
[root@controller01 ~]# pcs host auth controller01 controller02 controller03 -u hacluster -p Zx*****
controller01: Authorized
controller03: Authorized
controller02: Authorized
#CentOS 7 equivalent (recorded for reference only)
pcs cluster auth controller01 controller02 controller03 -u hacluster -p Zx***** --force
2.4 Create and name the cluster
Run on any node; controller01 is used as the example;
[root@controller01 ~]# pcs cluster setup openstack-cluster-01 --start controller01 controller02 controller03
No addresses specified for host 'controller01', using 'controller01'
No addresses specified for host 'controller02', using 'controller02'
No addresses specified for host 'controller03', using 'controller03'
Destroying cluster on hosts: 'controller01', 'controller02', 'controller03'...
controller02: Successfully destroyed cluster
controller03: Successfully destroyed cluster
controller01: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'controller01', 'controller02', 'controller03'
controller01: successful removal of the file 'pcsd settings'
controller02: successful removal of the file 'pcsd settings'
controller03: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'controller01', 'controller02', 'controller03'
controller01: successful distribution of the file 'corosync authkey'
controller01: successful distribution of the file 'pacemaker authkey'
controller02: successful distribution of the file 'corosync authkey'
controller02: successful distribution of the file 'pacemaker authkey'
controller03: successful distribution of the file 'corosync authkey'
controller03: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'controller01', 'controller02', 'controller03'
controller01: successful distribution of the file 'corosync.conf'
controller02: successful distribution of the file 'corosync.conf'
controller03: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
Starting cluster on hosts: 'controller01', 'controller02', 'controller03'...
#CentOS 7 equivalent (recorded for reference only)
pcs cluster setup --force --name openstack-cluster-01 controller01 controller02 controller03
2.5 Start the pacemaker cluster
[root@controller01 ~]# pcs cluster start --all
controller03: Starting Cluster...
controller01: Starting Cluster...
controller02: Starting Cluster...
[root@controller01 ~]# pcs cluster enable --all
controller01: Cluster Enabled
controller02: Cluster Enabled
controller03: Cluster Enabled
2.6 Check the pacemaker cluster status
The cluster status can also be viewed with the crm_mon -1 command;
[root@controller01 ~]# pcs cluster status
Cluster Status:
Cluster Summary:
* Stack: corosync
* Current DC: controller02 (version 2.0.3-5.el8_2.1-4b1f869f0f) - partition with quorum
* Last updated: Sat Aug 29 00:37:11 2020
* Last change: Sat Aug 29 00:31:57 2020 by hacluster via crmd on controller02
* 3 nodes configured
* 0 resource instances configured
Node List:
* Online: [ controller01 controller02 controller03 ]
PCSD Status:
controller01: Online
controller03: Online
controller02: Online
The node configuration can be viewed with cibadmin --query --scope nodes
[root@controller01 ~]# cibadmin --query --scope nodes
<nodes>
<node id="1" uname="controller01"/>
<node id="2" uname="controller02"/>
<node id="3" uname="controller03"/>
</nodes>
2.7 Check the corosync status
corosync synchronizes the underlying membership and state information between the nodes
[root@controller01 ~]# pcs status corosync
Membership information
----------------------
Nodeid Votes Name
1 1 controller01 (local)
2 1 controller02
3 1 controller03
2.8 Check nodes and resources
#List the nodes
[root@controller01 ~]# corosync-cmapctl | grep members
runtime.members.1.config_version (u64) = 0
runtime.members.1.ip (str) = r(0) ip(10.15.253.163)
runtime.members.1.join_count (u32) = 1
runtime.members.1.status (str) = joined
runtime.members.2.config_version (u64) = 0
runtime.members.2.ip (str) = r(0) ip(10.15.253.195)
runtime.members.2.join_count (u32) = 1
runtime.members.2.status (str) = joined
runtime.members.3.config_version (u64) = 0
runtime.members.3.ip (str) = r(0) ip(10.15.253.227)
runtime.members.3.join_count (u32) = 1
runtime.members.3.status (str) = joined
#List the resources
[root@controller01 ~]# pcs resource
NO resources configured
2.9 Access pacemaker through the web UI
Browse to any control node: https://10.15.253.163:2224
Account/password (the hacluster password set when building the cluster): hacluster/Zx*****
2.10 Set the high-availability properties
Set the properties on any one control node; controller01 is used as the example;
- Set sensible limits on the history of processed inputs and of the errors and warnings generated by the policy engine; this is useful for troubleshooting
[root@controller01 ~]# pcs property set pe-warn-series-max=1000 \
pe-input-series-max=1000 \
pe-error-series-max=1000
- pacemaker handles state in a time-driven way; cluster-recheck-interval defines how often certain pacemaker operations occur and defaults to 15 min; 5 min or 3 min is recommended
[root@controller01 ~]# pcs property set cluster-recheck-interval=5
- corosync enables stonith by default, but no stonith device (which would shut a node down via IPMI or SSH) has been configured (verify the configuration with crm_verify -L -V; no output means it is correct), so pacemaker would refuse to start any resource. Adjust this as needed in production; in a test environment it can be disabled
[root@controller01 ~]# pcs property set stonith-enabled=false
- By default the cluster considers itself quorate ("legal") when more than half of the nodes are online, i.e. when total_nodes < 2 * active_nodes;
- For a 3-node cluster, losing 2 nodes breaks that rule and the cluster loses quorum; for a 2-node cluster, losing 1 node already does, which is why a "two-node cluster" is of little use;
- In production, a 2-node cluster can be set to ignore loss of quorum; with a 3-node cluster, set the quorum policy according to the availability threshold required
[root@controller01 ~]# pcs property set no-quorum-policy=ignore
- To support multi-node clusters, heartbeat v2 introduced a scoring scheme that controls how resources move between nodes; each node's total score is computed, and the highest-scoring node becomes active and manages the resource (or resource group);
- By default each resource's initial score (global parameter default-resource-stickiness, see "pcs property list --all") is 0, and the score deducted per failure (global parameter default-resource-failure-stickiness) is also 0; in that case heartbeat only restarts a failing resource, no matter how many times it fails, and never moves it to another node;
- If resource-stickiness or resource-failure-stickiness is set for a particular resource, that per-resource value is used instead;
- Normally resource-stickiness is positive and resource-failure-stickiness is negative; the special values INFINITY and -INFINITY mean "never move" and "move on any failure", simple settings for the extreme cases;
- If a node's score is negative, it never takes over the resource (cold standby); if a node's score exceeds that of the node currently running the resource, the resource is released and taken over by the higher-scoring node
- pcs property list shows only the modified properties; add --all to include defaults;
- The settings can also be inspected in /var/lib/pacemaker/cib/cib.xml, with "pcs cluster cib", or with "cibadmin --query --scope crm_config"; use "cibadmin --query --scope resources" to view the resource configuration
[root@controller01 ~]# pcs property list
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: openstack-cluster-01
cluster-recheck-interval: 5
dc-version: 2.0.3-5.el8_2.1-4b1f869f0f
have-watchdog: false
no-quorum-policy: ignore
pe-error-series-max: 1000
pe-input-series-max: 1000
pe-warn-series-max: 1000
stonith-enabled: false
3. Configure the VIP
- Set the VIP (its resource_id) from any control node; it is named vip;
- ocf (the standard): one kind of resource agent; others are systemd, lsb, service, etc.;
- heartbeat (the provider): the supplier of the resource script; the OCF specification allows multiple vendors to provide the same resource agent, and most OCF resource agents use heartbeat as the provider;
- IPaddr2 (the type): the name of the resource agent;
- cidr_netmask: the prefix length of the subnet mask;
- The resource attributes (standard:provider:type) locate the RA script used by the vip resource; on CentOS, OCF-compliant RA scripts live under /usr/lib/ocf/resource.d/, which contains one directory per provider, each holding multiple types;
- op: Operations (here, a monitor with a 30 s interval)
[root@controller01 ~]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=10.15.253.88 cidr_netmask=24 op monitor interval=30s
Check the cluster resources
pcs resource shows that the vip resource is running on controller01; the VIP itself can be seen with ip a show
[root@controller01 ~]# pcs resource
* vip (ocf::heartbeat:IPaddr2): Started controller01
[root@controller01 ~]# ip a show ens192
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:82:82:40 brd ff:ff:ff:ff:ff:ff
inet 10.15.253.163/12 brd 10.15.255.255 scope global noprefixroute ens192
valid_lft forever preferred_lft forever
inet 10.15.253.88/24 brd 10.15.255.255 scope global ens192
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:fe82:8240/64 scope link
valid_lft forever preferred_lft forever
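An optional failover test: put the node that currently holds the VIP into standby, check that the vip resource moves, then bring the node back:
#The vip resource should restart on controller02 or controller03
pcs node standby controller01
pcs resource
pcs node unstandby controller01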
Optional (depending on whether the deployment needs the distinction):
If the APIs separate admin/internal/public endpoints and only the public endpoint is exposed to clients, two VIPs are usually defined, named for example vip_management and vip_public;
it is recommended to constrain vip_management and vip_public to the same node
[root@controller01 ~]# pcs constraint colocation add vip_management with vip_public
4. High-availability management
Browse to any control node: https://10.15.253.163:2224
Account/password (the hacluster password set when building the cluster): hacluster/Zx*****
Although the cluster was set up from the command line, it is not shown in the web UI by default; add the existing cluster manually, which only requires adding any one node of the already-built cluster, as shown below
九冲茸、部署Haproxy
https://docs.openstack.org/ha-guide/control-plane-stateless.html#load-balancer
1. Install HAProxy (control nodes)
Install haproxy on all control nodes; controller01 is used as the example;
[root@controller01 ~]# yum install haproxy -y
2. Configure haproxy.cfg
Configure on all control nodes; controller01 is used as the example;
Create the HAProxy log directory and grant permissions
Enabling HAProxy logging is recommended, as it helps with later troubleshooting
[root@controller01 ~]# mkdir /var/log/haproxy
[root@controller01 ~]# chmod a+w /var/log/haproxy
Modify the following lines in the rsyslog configuration
#Uncomment and add
[root@controller01 ~]# vim /etc/rsyslog.conf
19 module(load="imudp") # needs to be done just once
20 input(type="imudp" port="514")
24 module(load="imtcp") # needs to be done just once
25 input(type="imtcp" port="514")
#Append the haproxy log configuration at the end of the file
local0.=info -/var/log/haproxy/haproxy-info.log
local0.=err -/var/log/haproxy/haproxy-err.log
local0.notice;local0.!=err -/var/log/haproxy/haproxy-notice.log
#Restart rsyslog
[root@controller01 ~]# systemctl restart rsyslog
The cluster's haproxy configuration involves many services; all of the OpenStack services concerned are configured here in one pass:
The VIP 10.15.253.88 is used
[root@controller01 ~]# cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
[root@controller01 ~]# cat /etc/haproxy/haproxy.cfg
global
log 127.0.0.1 local0
chroot /var/lib/haproxy
daemon
group haproxy
user haproxy
maxconn 4000
pidfile /var/run/haproxy.pid
stats socket /var/lib/haproxy/stats
defaults
mode http
log global
maxconn 4000 #maximum connections
option httplog
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 10s
# HAProxy stats page
listen stats
bind 0.0.0.0:1080
mode http
stats enable
stats uri /
stats realm OpenStack\ Haproxy
stats auth admin:admin
stats refresh 30s
stats show-node
stats show-legends
stats hide-version
# Horizon service
listen dashboard_cluster
bind 10.15.253.88:80
balance source
option tcpka
option httpchk
option tcplog
server controller01 10.15.253.163:80 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:80 check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:80 check inter 2000 rise 2 fall 5
# mariadb服務(wù)彼宠;
#設(shè)置controller01節(jié)點(diǎn)為master鳄虱,controller02/03節(jié)點(diǎn)為backup,一主多備的架構(gòu)可規(guī)避數(shù)據(jù)不一致性兵志;
#另外官方示例為檢測9200(心跳)端口醇蝴,測試在mariadb服務(wù)宕機(jī)的情況下,雖然”/usr/bin/clustercheck”腳本已探測不到服務(wù)想罕,但受xinetd控制的9200端口依然正常悠栓,導(dǎo)致haproxy始終將請求轉(zhuǎn)發(fā)到mariadb服務(wù)宕機(jī)的節(jié)點(diǎn),暫時修改為監(jiān)聽3306端口
listen galera_cluster
bind 10.15.253.88:3306
balance source
mode tcp
server controller01 10.15.253.163:3306 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:3306 backup check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:3306 backup check inter 2000 rise 2 fall 5
#Expose an HA cluster access port for RabbitMQ, for the OpenStack services to use;
#If the OpenStack services connect to the RabbitMQ cluster directly, this RabbitMQ load balancing can be omitted
listen rabbitmq_cluster
bind 10.15.253.88:5673
mode tcp
option tcpka
balance roundrobin
timeout client 3h
timeout server 3h
option clitcpka
server controller01 10.15.253.163:5672 check inter 10s rise 2 fall 5
server controller02 10.15.253.195:5672 check inter 10s rise 2 fall 5
server controller03 10.15.253.227:5672 check inter 10s rise 2 fall 5
# glance_api service
listen glance_api_cluster
bind 10.15.253.88:9292
balance source
option tcpka
option httpchk
option tcplog
timeout client 3h
timeout server 3h
server controller01 10.15.253.163:9292 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:9292 check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:9292 check inter 2000 rise 2 fall 5
# keystone_public_api service
listen keystone_public_cluster
bind 10.15.253.88:5000
balance source
option tcpka
option httpchk
option tcplog
server controller01 10.15.253.163:5000 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:5000 check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:5000 check inter 2000 rise 2 fall 5
listen nova_compute_api_cluster
bind 10.15.253.88:8774
balance source
option tcpka
option httpchk
option tcplog
server controller01 10.15.253.163:8774 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:8774 check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:8774 check inter 2000 rise 2 fall 5
listen nova_placement_cluster
bind 10.15.253.88:8778
balance source
option tcpka
option tcplog
server controller01 10.15.253.163:8778 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:8778 check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:8778 check inter 2000 rise 2 fall 5
listen nova_metadata_api_cluster
bind 10.15.253.88:8775
balance source
option tcpka
option tcplog
server controller01 10.15.253.163:8775 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:8775 check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:8775 check inter 2000 rise 2 fall 5
listen nova_vncproxy_cluster
bind 10.15.253.88:6080
balance source
option tcpka
option tcplog
server controller01 10.15.253.163:6080 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:6080 check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:6080 check inter 2000 rise 2 fall 5
listen neutron_api_cluster
bind 10.15.253.88:9696
balance source
option tcpka
option httpchk
option tcplog
server controller01 10.15.253.163:9696 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:9696 check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:9696 check inter 2000 rise 2 fall 5
listen cinder_api_cluster
bind 10.15.253.88:8776
balance source
option tcpka
option httpchk
option tcplog
server controller01 10.15.253.163:8776 check inter 2000 rise 2 fall 5
server controller02 10.15.253.195:8776 check inter 2000 rise 2 fall 5
server controller03 10.15.253.227:8776 check inter 2000 rise 2 fall 5
Copy the configuration file to the other nodes:
scp /etc/haproxy/haproxy.cfg controller02:/etc/haproxy/haproxy.cfg
scp /etc/haproxy/haproxy.cfg controller03:/etc/haproxy/haproxy.cfg
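Before starting the service, it is worth validating the syntax of the copied configuration on each control node:
#"Configuration file is valid" should be reported on all three nodes
haproxy -c -f /etc/haproxy/haproxy.cfg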
3. Configure kernel parameters
These were already set during base-environment preparation and are recorded again here; controller01 is used as the example;
- net.ipv4.ip_nonlocal_bind: whether binding to a non-local IP is allowed, which determines whether the haproxy instance can bind to and fail over the VIP
- net.ipv4.ip_forward: whether IP forwarding is allowed
echo 'net.ipv4.ip_nonlocal_bind = 1' >>/etc/sysctl.conf
echo "net.ipv4.ip_forward = 1" >>/etc/sysctl.conf
sysctl -p
4. Start the service
Start on all control nodes; controller01 is used as the example;
Enabling the service at boot is optional: once the haproxy resource is managed by pacemaker (configured below), pacemaker controls whether haproxy runs on each node
systemctl enable haproxy
systemctl restart haproxy
systemctl status haproxy
5. Access the stats page
Browse to http://10.15.253.88:1080 with username/password admin/admin
The status of every backend is clearly visible; the screenshot in the original article was taken after Glance had been deployed, so at this stage the glance entry shows red because Glance is not yet installed
6. Configure pcs resources
6.1 Add the lb-haproxy-clone resource
Run on any control node; controller01 is used as the example;
[root@controller01 ~]# pcs resource create lb-haproxy systemd:haproxy clone
[root@controller01 ~]# pcs resource
* vip (ocf::heartbeat:IPaddr2): Started controller01
* Clone Set: lb-haproxy-clone [lb-haproxy]:
* Started: [ controller01 ]
6.2 Set the resource start order: vip first, then lb-haproxy-clone;
the resource constraints can be viewed with cibadmin --query --scope constraints
[root@controller01 ~]# pcs constraint order start vip then lb-haproxy-clone kind=Optional
Adding vip lb-haproxy-clone (kind: Optional) (Options: first-action=start then-action=start)
6.3 Colocate the two resources on one node
The official guide recommends running the VIP on the node where haproxy is active, so lb-haproxy-clone is bound to the vip resource and the two are constrained to one node; after the constraint is applied, pcs stops haproxy on the nodes that do not currently hold the VIP
[root@controller01 ~]# pcs constraint colocation add lb-haproxy-clone with vip
[root@controller01 ~]# pcs resource
* vip (ocf::heartbeat:IPaddr2): Started controller01
* Clone Set: lb-haproxy-clone [lb-haproxy]:
* Started: [ controller01 ]
* Stopped: [ controller02 controller03 ]
6.4 Review the resource settings in the pacemaker high-availability web UI
The high-availability configuration (pacemaker & haproxy) is now complete.