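Preface
To keep resource costs down, this Harbor high-availability architecture is a minimum-overhead design. If resources allow, both PG and Redis can instead use the cloud vendor's managed cluster mode.
To keep the configuration simple, it also does not use open-source HA components such as keepalived or heartbeat.
Preparation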
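| Alibaba Cloud SLB | Alibaba Cloud ECS | Shared storage | Redis |
| :---------: | :-------: | :-------: | :---------: |
| Smallest SLB instance | 2 instances, 2c4g each | Alibaba Cloud NFS | Alibaba Cloud Redis |
The operating system is Ubuntu 18.04. A primary/standby PG pair is built on the two ECS instances. If you would rather not use Alibaba Cloud Redis, you can also run Redis on ECS.
Install Harbor first; it is used to export the base Harbor data, which is later restored into the PG cluster.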
- Install docker-compose and Docker
sudo curl -L "https://github.com/docker/compose/releases/download/1.24.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
# Add the GPG key of the Alibaba Cloud Docker mirror (for hosts in mainland China)
curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
# Add the Alibaba Cloud Docker CE repository
sudo add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
# Update the package index
sudo apt-get update
# List the available docker-ce versions
apt-cache madison docker-ce
# Install the latest version
sudo apt-get install -y docker-ce
# Or install a specific version, here 5:19.03.6~3-0~ubuntu-bionic
sudo apt-get install -y docker-ce=5:19.03.6~3-0~ubuntu-bionic
- Configure Docker registry mirrors (the Alibaba Cloud accelerator and the docker-cn mirror)
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": [
    "https://8sab4djv.mirror.aliyuncs.com",
    "https://registry.docker-cn.com"
  ],
  "insecure-registries": ["https://harbor.unixsre.com"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
- Install Harbor 2.3
# Download Harbor
wget -P /usr/local https://github.com/goharbor/harbor/releases/download/v2.3.2/harbor-online-installer-v2.3.2.tgz
# The archive unpacks into a harbor/ directory, so this creates /data/harbor
tar zxf /usr/local/harbor-online-installer-v2.3.2.tgz -C /data
# Edit the configuration file to suit your needs
cd /data/harbor
cp harbor.yml.tmpl harbor.yml
# Change or add the following in harbor.yml as needed
# Configuration file of Harbor
# The IP address or hostname to access admin UI and registry service.
# DO NOT use localhost or 127.0.0.1, because Harbor needs to be accessed by external clients.
hostname: harbor.unixsre.com
# http related config
http:
  # port for http, default is 80. If https enabled, this port will redirect to https port
  port: 80
# https related config
https:
  # https port for harbor, default is 443
  port: 443
  # The path of cert and key files for nginx
  certificate: /data/harbor/ssl/unixsre.com.cer
  private_key: /data/harbor/ssl/unixsre.com.key
# # Uncomment following will enable tls communication between all harbor components
# internal_tls:
#   # set enabled to true means internal tls is enabled
#   enabled: true
#   # put your cert and key files on dir
#   dir: /etc/harbor/tls/internal
# Uncomment external_url if you want to enable external proxy
# And when it enabled the hostname will no longer used
# external_url: https://reg.mydomain.com:8433
# The initial password of Harbor admin
# It only works in first time to install harbor
# Remember Change the admin password from UI after launching Harbor.
# Initial password. Set it to whatever you need here, then change it later in the web UI.
harbor_admin_password: 1234567
## Disable user self-registration
self_registration: off
## Allow only administrators to create projects
project_creation_restriction: adminonly
# The default data volume
data_volume: /data/harbor
# Run the installer
bash /data/harbor/install.sh
# If harbor.yml is changed later, regenerate the configuration with the ./prepare script
./prepare
# Restart
docker-compose restart
- Common command examples
# Log in
docker login https://harbor.unixsre.com
# Pull
docker pull busybox
# Build
docker build -t busybox:v1 .
docker build -t busybox:v1 -f Dockerfile .
# Tag
docker tag busybox:latest harbor.unixsre.com/library/busybox:latest
# Push
docker push harbor.unixsre.com/library/busybox:latest
# Pull with k3s
k3s crictl pull harbor.unixsre.com/library/busybox
- Back up the Harbor databases and export them for the later restore.
# Enter the database container
docker container exec -it harbor-db /bin/bash
# Run the PG dumps
pg_dump -U postgres registry > /tmp/registry.sql
pg_dump -U postgres notarysigner > /tmp/notarysigner.sql
pg_dump -U postgres notaryserver > /tmp/notaryserver.sql
# Copy the dumps to the host
docker container cp harbor-db:/tmp/registry.sql /data/harbor/backup_sql/
docker container cp harbor-db:/tmp/notarysigner.sql /data/harbor/backup_sql/
docker container cp harbor-db:/tmp/notaryserver.sql /data/harbor/backup_sql/
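The three dumps above follow the same pattern, so they can be driven from a loop. The following is a minimal sketch, assuming the `harbor-db` container name and the `/data/harbor/backup_sql` directory used above. The function name `dump_commands` and the `BACKUP_DIR` variable are hypothetical, and the function only prints the commands so they can be reviewed before anything runs.

```shell
#!/bin/sh
# Print the dump-and-copy commands for each Harbor database.
# Assumptions: container "harbor-db" and the backup directory from the steps above.
BACKUP_DIR="${BACKUP_DIR:-/data/harbor/backup_sql}"

dump_commands() {
    for db in registry notarysigner notaryserver; do
        # pg_dump runs inside the container; the redirect writes to the container's /tmp
        echo "docker container exec harbor-db sh -c 'pg_dump -U postgres $db > /tmp/$db.sql'"
        # Copy the dump from the container to the host
        echo "docker container cp harbor-db:/tmp/$db.sql $BACKUP_DIR/"
    done
}
```

On the Harbor host, something like `mkdir -p /data/harbor/backup_sql && dump_commands | sh` would then execute the printed commands.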
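Install the PG primary/standby cluster
PostgreSQL primary/standby replication is a high-availability solution that also enables read/write splitting and real-time backup. It is built on the xlog (the write-ahead log, WAL): the primary has logging enabled, and the standby replays the primary's log to keep its data in sync.
Notes on PG primary/standby replication:
Before starting the standby: do not initialize it. If it has already been initialized, delete the data files in the corresponding directory.
Before starting the standby: sync configuration and data from the primary with pg_basebackup.
Before starting the standby: adjust the synced configuration (standby.signal).
The standby is read-only; it cannot accept writes.
- Install postgresql-13 on each ECS instance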
# Add the PG apt repository
sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
# Refresh the package index
apt-get update
# Install PG 13
apt -y install postgresql-13 postgresql-client-13 postgresql-contrib
# Check that the service started
systemctl status postgresql@13-main.service
# Log in and change the password
sudo -i -u postgres psql -p 5432
ALTER USER postgres WITH PASSWORD '1234567.com';
# Log in again to verify
psql -h localhost -p 5432 -U postgres
- Create the PG data directories on each machine.
# Create the data directory
mkdir -p /data/harbor_nas/pgsql/data && chown postgres:postgres /data/harbor_nas/pgsql/data
# Create the archive directory
mkdir -p /data/harbor_nas/pgsql/pg_archive && chown postgres:postgres /data/harbor_nas/pgsql/pg_archive
# Restrict the directory permissions
chmod 700 /data/harbor_nas/pgsql/pg_archive/ && chmod 700 /data/harbor_nas/pgsql/data/
- Add the data-directory environment variable to the systemd unit file.
vim /lib/systemd/system/postgresql@.service
Environment=PGDATA=/data/harbor_nas/pgsql/data
# Reload systemd
systemctl daemon-reload
# Drop the default cluster
pg_dropcluster --stop 13 main
# Create the cluster in the new directory
pg_createcluster -d /data/harbor_nas/pgsql/data 13 main
# Restart the service
systemctl restart postgresql@13-main.service
# Enable start on boot
systemctl enable postgresql@13-main.service
# Allow external access
vim /etc/postgresql/13/main/pg_hba.conf
local all postgres peer
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all peer
# IPv4 local connections:
host all all 0.0.0.0/0 md5
# IPv6 local connections:
host all all ::1/128 md5
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all peer
host replication all 127.0.0.1/32 md5
host replication all ::1/128 md5
# Change the cluster's listen address
vim /etc/postgresql/13/main/postgresql.conf
listen_addresses = '*'
# Restart the service
systemctl restart postgresql@13-main.service
- Primary server configuration
# Create the user replica with replication privileges
CREATE ROLE replica login replication encrypted password 'Deniss_12PRO@@@';
# Let the standby connect without a password: replica is the user, 172.19.48.254 is the standby's internal IP; md5 would require password authentication, trust requires none.
vim /etc/postgresql/13/main/pg_hba.conf
host replication replica 172.19.48.254/20 trust
# Add the following to postgresql.conf on the primary
vim /etc/postgresql/13/main/postgresql.conf
listen_addresses = '*'
max_connections = 100
archive_mode = on
archive_command = 'test ! -f /data/harbor_nas/pgsql/pg_archive/%f && cp %p /data/harbor_nas/pgsql/pg_archive/%f'
wal_level = replica
# Restart the service
systemctl restart postgresql@13-main.service
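PostgreSQL runs `archive_command` once for each completed WAL segment, substituting `%p` with the segment's path and `%f` with its file name, and treats a non-zero exit status as "not archived, retry later". The sketch below mimics the command configured above as a shell function so that behaviour is easy to see. `archive_segment` and the `ARCHIVE_DIR` variable are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Illustration of: test ! -f ARCHIVE/%f && cp %p ARCHIVE/%f
# $1 plays the role of %p (path to the WAL segment), $2 the role of %f (its file name).
ARCHIVE_DIR="${ARCHIVE_DIR:-/data/harbor_nas/pgsql/pg_archive}"

archive_segment() {
    # Refuse to overwrite a segment that is already archived; only then copy it.
    test ! -f "$ARCHIVE_DIR/$2" && cp "$1" "$ARCHIVE_DIR/$2"
}
```

The `test ! -f` guard makes the function fail when the target already exists, so an archived segment is never silently overwritten.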
- Standby server configuration
# If these steps were already carried out on the standby earlier, skip ahead to the postgres-user step below to clean up and copy the data.
# Create the data directory
mkdir -p /data/harbor_nas/pgsql_replica/data && chown postgres:postgres /data/harbor_nas/pgsql_replica/data
# Create the archive directory
mkdir -p /data/harbor_nas/pgsql_replica/pg_archive && chown postgres:postgres /data/harbor_nas/pgsql_replica/pg_archive
# Restrict the directory permissions
chmod 700 /data/harbor_nas/pgsql_replica/pg_archive/ && chmod 700 /data/harbor_nas/pgsql_replica/data/
# Add the following setting
vim /lib/systemd/system/postgresql@.service
Environment=PGDATA=/data/harbor_nas/pgsql_replica/data/
# Reload systemd
systemctl daemon-reload
# Drop the cluster in the default directory
pg_dropcluster --stop 13 main
# Create the cluster in the new directory
pg_createcluster -d /data/harbor_nas/pgsql_replica/data 13 main
# Restart the service
systemctl restart postgresql@13-main.service
# Stop PG on the standby before wiping its data directory
systemctl stop postgresql@13-main.service
# Switch to the postgres user, remove the freshly initialized data, and copy the data from the primary.
su - postgres
rm -rf /data/harbor_nas/pgsql_replica/data/*
pg_basebackup -h 172.19.48.253 -p 5432 -U replica -Fp -Xs -Pv -R -D /data/harbor_nas/pgsql_replica/data
# PostgreSQL 12 and later only check that standby.signal exists (pg_basebackup -R already creates it); standby_mode is no longer a setting
touch /data/harbor_nas/pgsql_replica/data/standby.signal
# Edit the standby's configuration
vim /etc/postgresql/13/main/postgresql.conf
primary_conninfo = 'host=172.19.48.253 port=5432 user=replica password=Deniss_12PRO@@@'
recovery_target_timeline = latest
max_connections = 100
hot_standby = on
max_standby_streaming_delay = 30s
wal_receiver_status_interval = 10s
hot_standby_feedback = on
# Start the PG database on the standby
systemctl start postgresql@13-main.service
# Log in to the primary and check the replication status
psql -h 172.19.48.253 -p 5432 -U postgres
postgres=# select client_addr,sync_state from pg_stat_replication;
client_addr | sync_state
---------------+------------
172.19.48.254 | async
# At this point, PG primary/standby replication is set up.
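The manual check above can also be scripted. The following is a minimal sketch, assuming the rows come from `psql -At`, which prints unaligned `client_addr|state` pairs; `standby_is_streaming` is a hypothetical helper name. It looks at the `state` column of `pg_stat_replication` rather than `sync_state`, because `streaming` is the value that indicates the standby is actively receiving WAL.

```shell
#!/bin/sh
# Succeeds if the given standby address shows up as "streaming".
# Reads rows in psql's unaligned format (client_addr|state) on stdin.
standby_is_streaming() {
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF "$1|streaming"
}
```

A possible invocation on the primary: `psql -h 172.19.48.253 -U postgres -At -c "select client_addr, state from pg_stat_replication" | standby_is_streaming 172.19.48.254 && echo OK`.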
Point Harbor at the PG primary node
- Log in to the primary, create the harbor user and the databases Harbor needs, and restore the exported data into them.
# Create the Harbor user
CREATE USER harbor LOGIN PASSWORD 'Deniss1112s';
CREATE SCHEMA harbor;
GRANT harbor TO postgres;GRANT USAGE ON SCHEMA harbor TO postgres;
ALTER SCHEMA harbor OWNER TO postgres;
# Create the databases
CREATE DATABASE registry OWNER harbor;
CREATE DATABASE notarysigner OWNER harbor;
CREATE DATABASE notaryserver OWNER harbor;
# Grant privileges
GRANT ALL PRIVILEGES ON DATABASE registry TO harbor;
GRANT ALL PRIVILEGES ON DATABASE notarysigner TO harbor;
GRANT ALL PRIVILEGES ON DATABASE notaryserver TO harbor;
# Restore the databases
psql -h localhost -U harbor registry < /data/harbor/backup_sql/registry.sql
psql -h localhost -U harbor notarysigner < /data/harbor/backup_sql/notarysigner.sql
psql -h localhost -U harbor notaryserver < /data/harbor/backup_sql/notaryserver.sql
- Adjust harbor.yml on both ECS instances: enable the external PG and Redis configuration and comment out the default PG database configuration. Note: both ECS instances must connect to the same Redis and the same PG database.
hostname: harbor.unixsre.com
http:
  port: 80
https:
  port: 443
  certificate: /data/harbor/ssl/unixsre.com.cer
  private_key: /data/harbor/ssl/unixsre.com.key
harbor_admin_password: 1234567
data_volume: /data/harbor_nas/harbor_data
trivy:
  ignore_unfixed: false
  skip_update: false
  insecure: false
jobservice:
  max_job_workers: 10
notification:
  webhook_job_max_retry: 10
chart:
  absolute_url: disabled
log:
  level: info
  local:
    rotate_count: 50
    rotate_size: 200M
    location: /var/log/harbor
_version: 2.3.0
external_database:
  harbor:
    host: 172.19.48.253
    port: 5432
    db_name: registry
    username: harbor
    password: Deniss1112s
    ssl_mode: disable
    max_idle_conns: 2
    max_open_conns: 0
  notary_signer:
    host: 172.19.48.253
    port: 5432
    db_name: notarysigner
    username: harbor
    password: Deniss1112s
    ssl_mode: disable
  notary_server:
    host: 172.19.48.253
    port: 5432
    db_name: notaryserver
    username: harbor
    password: Deniss1112s
    ssl_mode: disable
external_redis:
  host: 172.19.48.253:6379
  password: Deniss1589s
  registry_db_index: 1
  jobservice_db_index: 2
  chartmuseum_db_index: 3
  trivy_db_index: 5
  idle_timeout_seconds: 30
proxy:
  http_proxy:
  https_proxy:
  no_proxy:
  components:
    - core
    - jobservice
    - trivy
- Regenerate the Harbor configuration and restart the containers.
cd /data/harbor/
./prepare
docker-compose down && docker-compose up -d
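- Create a classic SLB on Alibaba Cloud and add a TCP (layer 4) listener on port 443.
- Bind the domain name to the new SLB. The load balancer does not have to be Alibaba Cloud's; one from any cloud works, for example AWS, Microsoft Azure, or GCP.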
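Once the domain resolves to the SLB, an end-to-end check through the load balancer is useful. The following is a minimal sketch, assuming Harbor 2.x exposes its health endpoint at `/api/v2.0/health` and returns JSON with an overall `status` plus one `status` per component; `harbor_is_healthy` is a hypothetical helper that only inspects that JSON text.

```shell
#!/bin/sh
# Succeeds only if the health JSON on stdin reports "healthy"
# and no component reports "unhealthy".
harbor_is_healthy() {
    body="$(cat)"
    printf '%s' "$body" | grep -q '"status": *"healthy"' &&
        ! printf '%s' "$body" | grep -q '"status": *"unhealthy"'
}
```

A possible invocation: `curl -fsS https://harbor.unixsre.com/api/v2.0/health | harbor_is_healthy && echo "Harbor OK"`.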
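PG primary/standby failover
Suppose the primary database or the primary node goes down. Because our Redis is on Alibaba Cloud and Harbor's image data lives on Alibaba Cloud NFS, keeping the service available only requires quickly promoting the standby to primary, updating Harbor's configuration file, and restarting the Harbor services.
The steps below are manual; turning them into a script is recommended so the switchover is fast.
- Simulate the database on the current primary going down.
# Stop the PG service on the primary.
service postgresql@13-main stop
- Promote the standby to primary.
psql -h 172.19.48.254 -p 5432 -U postgres
postgres=# select pg_promote(true,60);
# Verify that it has been promoted to primary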
/usr/lib/postgresql/13/bin/pg_controldata -D /data/harbor_nas/pgsql_replica/data/ |grep cluster
Database cluster state: in production
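For a scripted switchover, the same verification can be reduced to an exit status. The following is a minimal sketch that parses `pg_controldata` output from stdin; `cluster_state` and `is_primary` are hypothetical helper names, and the sketch assumes English (C locale) output as shown above.

```shell
#!/bin/sh
# Extract the value of the "Database cluster state" line from pg_controldata output.
cluster_state() {
    sed -n 's/^Database cluster state:[[:space:]]*//p'
}

# Succeeds when the cluster is running as a primary ("in production").
is_primary() {
    [ "$(cluster_state)" = "in production" ]
}
```

A possible invocation on the promoted node: `LC_ALL=C /usr/lib/postgresql/13/bin/pg_controldata -D /data/harbor_nas/pgsql_replica/data/ | is_primary && echo promoted`.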
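- Update the Harbor configuration and restart all Harbor services
# Point harbor.yml at the new primary (run from /data/harbor)
sed -i 's/172\.19\.48\.253/172.19.48.254/' harbor.yml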
./prepare
docker-compose down && docker-compose up -d
Visit the domain to verify that the Harbor service is available.
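Since scripting the switchover is recommended, the configuration change can be wrapped in a small function. The following is a minimal sketch, assuming harbor.yml has the layout shown earlier, with `external_database:` immediately followed by `external_redis:` at the top level. `switch_db_host` is a hypothetical name. Unlike a plain global substitution, it rewrites only the PG hosts and leaves the Redis host alone.

```shell
#!/bin/sh
# Replace the PG host inside the external_database section of a harbor.yml.
# $1 = old primary address, $2 = new primary address, $3 = path to harbor.yml
switch_db_host() {
    # Escape the dots so the old address is matched literally
    old_re="$(printf '%s' "$1" | sed 's/\./\\./g')"
    # Limit the substitution to the lines from external_database: to external_redis:
    sed "/^external_database:/,/^external_redis:/s/$old_re/$2/g" "$3" > "$3.new" &&
        mv "$3.new" "$3"
}
```

A possible invocation on each Harbor node: `cd /data/harbor && switch_db_host 172.19.48.253 172.19.48.254 harbor.yml && ./prepare && docker-compose down && docker-compose up -d`.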
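Quickly recover the old primary node by setting its PG database up as a standby.
# On 254 (now the primary), let 253 replicate without a password. This can be configured ahead of time, in which case skip it here.
vim /etc/postgresql/13/main/pg_hba.conf
host replication replica 172.19.48.253/20 trust
# Switch user
su - postgres
# Remove the old data
rm -rf /data/harbor_nas/pgsql/data/*
# Sync the data from 254 to 253
pg_basebackup -h 172.19.48.254 -p 5432 -U replica -Fp -Xs -Pv -R -D /data/harbor_nas/pgsql/data/
# PostgreSQL 12 and later only check that standby.signal exists (pg_basebackup -R already creates it); standby_mode is no longer a setting
touch /data/harbor_nas/pgsql/data/standby.signal
# Edit the configuration on 253
vim /etc/postgresql/13/main/postgresql.conf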
primary_conninfo = 'host=172.19.48.254 port=5432 user=replica password=Deniss_12PRO@@@'
recovery_target_timeline = latest
max_connections = 100
hot_standby = on
max_standby_streaming_delay = 30s
wal_receiver_status_interval = 10s
hot_standby_feedback = on
# Start the PG service on 253
systemctl start postgresql@13-main.service
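Re-seeding a standby uses the same commands whichever node is being rebuilt, so it is a natural candidate for the script mentioned earlier. The following is a minimal sketch with a hypothetical function `reseed_commands`, assuming the `replica` user and the `postgresql@13-main` unit from this guide and that it is run as root on the node being rebuilt. It only prints the commands for review.

```shell
#!/bin/sh
# Print the commands that rebuild a standby from the current primary.
# $1 = address of the current primary, $2 = local data directory to rebuild
reseed_commands() {
    echo "systemctl stop postgresql@13-main.service"
    echo "su - postgres -c 'rm -rf $2/*'"
    # -R writes primary_conninfo and creates standby.signal in the new data directory
    echo "su - postgres -c 'pg_basebackup -h $1 -p 5432 -U replica -Fp -Xs -Pv -R -D $2'"
    echo "systemctl start postgresql@13-main.service"
}
```

For example, `reseed_commands 172.19.48.254 /data/harbor_nas/pgsql/data` prints the four commands for turning 253 into a standby of 254.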
- On the current primary, 254, log in and verify that replication is working.
# Log in to the node and check that replication is currently healthy
psql -h localhost -p 5432 -U postgres
postgres=# select client_addr,sync_state from pg_stat_replication;
client_addr | sync_state
---------------+------------
172.19.48.253 | async
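- To make the original database the primary again, remove its standby.signal file, create a standby.signal file in the data directory on the original standby, and restart the PG service. The original text also sets standby_mode = 'on' in that file; on PG 13 the file's presence alone enables standby mode.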
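Disaster recovery
Force majeure is an extreme case that nobody can predict, cloud vendors included. All we can do is take care of everything we can think of and act on. In my setup, full PG backups are uploaded to OSS, and Harbor's image data is kept in one copy on Alibaba Cloud NFS and one on OSS. Disaster recovery depends on two prerequisites:
A usable full PG backup (note: you must be able to tolerate losing the data written between the time of the full backup and the time of the failure).
Usable Harbor image data files on Alibaba Cloud NFS or OSS.
Recovery steps: set up a single-node PG, import the full backup, point Harbor's configuration at that single-node PG, use either a local Redis or the one Harbor starts itself, then bring everything up with docker-compose. The detailed steps are not repeated here.
This is not the fastest approach, though. Is there a better one? Certainly: use managed cloud services and hand everything to the cloud. Even the cloud cannot guarantee 100% availability, however. The disaster-recovery notes here are only meant to start the discussion, not to be a final solution; they offer one line of thinking to build on. If you have a more complete design, you are welcome to share it.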