Postgres-XL集群部署與管理

一 Postgres-XL簡介

Postgres的-XL是一個基于PostgreSQL數(shù)據(jù)庫的橫向擴展開源SQL數(shù)據(jù)庫集群爆侣，具有足夠的靈活性來處理不同的數(shù)據(jù)庫工作負載:

完全ACID窿克，保持事務(wù)一致性
OLTP 寫頻繁的業(yè)務(wù)
需要MPP并行性商業(yè)智能/大數(shù)據(jù)分析
操作數(shù)據(jù)存儲
Key-value 存儲
GIS的地理空間
混合業(yè)務(wù)工作環(huán)境
多租戶服務(wù)提供商托管環(huán)境
Web 2.0

Postgres-XL架構(gòu)

二組件簡介

Global Transaction Monitor (GTM)
全局事務(wù)管理器咪鲜，確保群集范圍內(nèi)的事務(wù)一致性飞几。 GTM負責(zé)發(fā)放事務(wù)ID和快照作為其多版本并發(fā)控制的一部分饿序。
集群可選地配置一個備用GTM刽脖，以改進可用性鳞上。此外怯伊，可以在協(xié)調(diào)器間配置代理GTM借嗽，可用于改善可擴展性态鳖，減少GTM的通信量。
GTM Standby
GTM的備節(jié)點恶导，在pgxc,pgxl中浆竭，GTM控制所有的全局事務(wù)分配，如果出現(xiàn)問題惨寿，就會導(dǎo)致整個集群不可用邦泄，為了增加可用性，增加該備用節(jié)點缤沦。當GTM出現(xiàn)問題時虎韵，GTM Standby可以升級為GTM，保證集群正常工作缸废。
GTM-Proxy
GTM需要與所有的Coordinators通信包蓝，為了降低壓力，可以在每個Coordinator機器上部署一個GTM-Proxy企量。
Coordinator
協(xié)調(diào)員管理用戶會話测萎，并與GTM和數(shù)據(jù)節(jié)點進行交互。協(xié)調(diào)員解析届巩，并計劃查詢硅瞧，并給語句中的每一個組件發(fā)送下一個序列化的全局性計劃。
為節(jié)省機器恕汇，通常此服務(wù)和數(shù)據(jù)節(jié)點部署在一起腕唧。
Data Node
數(shù)據(jù)節(jié)點是數(shù)據(jù)實際存儲的地方。數(shù)據(jù)的分布可以由DBA來配置瘾英。為了提高可用性枣接，可以配置數(shù)據(jù)節(jié)點的熱備以便進行故障轉(zhuǎn)移準備。

總結(jié)：gtm是負責(zé)ACID的缺谴，保證分布式數(shù)據(jù)庫全局事務(wù)一致性但惶。得益于此，就算數(shù)據(jù)節(jié)點是分布的，但是你在主節(jié)點操作增刪改查事務(wù)時膀曾，就如同只操作一個數(shù)據(jù)庫一樣簡單县爬。Coordinator是調(diào)度的，將操作指令發(fā)送到各個數(shù)據(jù)節(jié)點添谊。datanodes是數(shù)據(jù)節(jié)點财喳，分布式存儲數(shù)據(jù)。

三 Postgres-XL環(huán)境配置與安裝

3.1 集群規(guī)劃

準備三臺Centos7服務(wù)器（或者虛擬機）碉钠，集群規(guī)劃如下：

主機名	IP	角色	端口	nodename	數(shù)據(jù)目錄
gtm	192.168.0.125	GTM	6666	gtm	/nodes/gtm
		GTM Slave	20001	gtmSlave	/nodes/gtmSlave
datanode1	192.168.0.127	Coordinator	5432	coord1	/nodes/coord
		Datanode	5433	node1	/nodes/dn_master
		Datanode Slave	15433	node1_slave	/nodes/dn_slave
		GTM Proxy	6666	gtm_pxy1	/nodes/gtm_pxy
datanode2	192.168.0.128	Coordinator	5432	coord2	/nodes/coord
		Datanode	5433	node2	nodes/dn_master
		Datanode Slave	15433	node2_slave	/nodes/dn_slave
		GTM Proxy	6666	gtm_pxy2	/nodes/gtm_pxy

在每臺機器的 /etc/hosts中加入以下內(nèi)容：

192.168.0.125 gtm
192.168.0.126 datanode1
192.168.0.127 datanode2

gtm上部署gtm纲缓，gtm_sandby測試環(huán)境暫未部署。
Coordinator與Datanode節(jié)點一般部署在同一臺機器上喊废。實際上祝高，GTM-proxy,Coordinator與Datanode節(jié)點一般都在同一個機器上，使用時避免端口號與連接池端口號重疊污筷！規(guī)劃datanode1,datanode2作為協(xié)調(diào)節(jié)點與數(shù)據(jù)節(jié)點工闺。

3.2 系統(tǒng)環(huán)境設(shè)置

以下操作，對每個服務(wù)器節(jié)點都適用瓣蛀。
關(guān)閉防火墻：

[root@localhost ~]# systemctl stop firewalld.service
[root@localhost ~]# systemctl disable firewalld.service

selinux設(shè)置:

[root@localhost ~]#vim /etc/selinux/config

設(shè)置SELINUX=disabled陆蟆，保存退出。

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.

安裝依賴包：

[root@localhost ~]# yum install -y flex bison readline-devel zlib-devel openjade docbook-style-dsssl

同步系統(tǒng)時間

[root@localhost ~]# ntpdate asia.pool.ntp.org

重啟服務(wù)器惋增！一定要重啟叠殷！

3.3 新建用戶

每個節(jié)點都建立用戶postgres，并且建立.ssh目錄诈皿，并配置相應(yīng)的權(quán)限：

[root@localhost ~]# useradd postgres
[root@localhost ~]# passwd postgres
[root@localhost ~]# su - postgres
[root@localhost ~]# mkdir ~/.ssh
[root@localhost ~]# chmod 700 ~/.ssh

3.4 ssh免密碼登錄

僅僅在gtm節(jié)點配置如下操作：

[root@localhost ~]# su - postgres
[postgres@localhost ~]# ssh-keygen -t rsa
[postgres@localhost ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[postgres@localhost ~]# chmod 600 ~/.ssh/authorized_keys

將剛生成的認證文件拷貝到datanode1到datanode2中林束，使得gtm節(jié)點可以免密碼登錄datanode1~datanode2的任意一個節(jié)點：

[postgres@localhost ~]# scp ~/.ssh/authorized_keys postgres@datanode1:~/.ssh/
[postgres@localhost ~]# scp ~/.ssh/authorized_keys postgres@datanode2:~/.ssh/

對所有提示都不要輸入，直接enter下一步稽亏。直到最后壶冒，因為第一次要求輸入目標機器的用戶密碼，輸入即可截歉。

3.5 Postgres-XL安裝

pg1-pg3每個節(jié)點都需安裝配置胖腾。切換回root用戶下，執(zhí)行如下步驟安裝

[root@localhost ~]#  cd /opt
[root@localhost ~]# git clone git://git.postgresql.org/git/postgres-xl.git
[root@localhost ~]# cd postgres-xl
[root@localhost ~postgres-xl]# ./configure --prefix=/home/postgres/pgxl/
[root@localhost ~postgres-xl]# make
[root@localhost ~postgres-xl]# make install
[root@localhost ~postgres-xl]# cd contrib/  
[root@localhost ~contrib]# make
[root@localhost ~contrib]# make install

cortrib中有很多postgres很牛的工具瘪松，一般要裝上咸作。如ltree,uuid,postgres_fdw等等。

3.6 配置環(huán)境變量

進入postgres用戶宵睦，修改其環(huán)境變量性宏，開始編輯

[root@localhost ~]#su - postgres
[postgres@localhost ~]#vi .bashrc

在打開的文件末尾，新增如下變量配置：

export PGHOME=/home/postgres/pgxl
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export PATH=$PGHOME/bin:$PATH

按住esc状飞，然后輸入:wq!保存退出。輸入以下命令對更改重啟生效。

[root@localhost ~]# source .bashrc
#輸入以下語句诬辈，如果輸出變量結(jié)果酵使，代表生效
[root@localhost ~]# echo $PGHOME
#應(yīng)該輸出/home/postgres/pgxl代表生效

如上操作，除特別強調(diào)以外焙糟，是datanode1-datanode2節(jié)點都要配置安裝的口渔。

四集群配置

4.1 生成pgxc_ctl配置文件

[postgres@localhost ~]# pgxc_ctl

PGXC prepare ---執(zhí)行該命令將會生成一份配置文件模板
PGXC   ---按ctrl c退出。

4.2 配置pgxc_ctl.conf

在pgxc_ctl文件夾中存在一個pgxc_ctl.conf文件穿撮，編輯如下：

pgxcInstallDir=$PGHOME
pgxlDATA=$PGHOME/data 

pgxcOwner=postgres

#---- GTM Master -----------------------------------------
gtmName=gtm
gtmMasterServer=gtm
gtmMasterPort=6666
gtmMasterDir=$pgxlDATA/nodes/gtm

gtmSlave=y                  # Specify y if you configure GTM Slave.   Otherwise, GTM slave will not be configured and
                            # all the following variables will be reset.
gtmSlaveName=gtmSlave
gtmSlaveServer=gtm      # value none means GTM slave is not available.  Give none if you don't configure GTM Slave.
gtmSlavePort=20001          # Not used if you don't configure GTM slave.
gtmSlaveDir=$pgxlDATA/nodes/gtmSlave    # Not used if you don't configure GTM slave.

#---- GTM-Proxy Master -------
gtmProxyDir=$pgxlDATA/nodes/gtm_proxy
gtmProxy=y                              
gtmProxyNames=(gtm_pxy1 gtm_pxy2)   
gtmProxyServers=(datanode1 datanode2)           
gtmProxyPorts=(6666 6666)               
gtmProxyDirs=($gtmProxyDir $gtmProxyDir)            

#---- Coordinators ---------
coordMasterDir=$pgxlDATA/nodes/coord
coordNames=(coord1 coord2)      
coordPorts=(5432 5432)          
poolerPorts=(6667 6667)         
coordPgHbaEntries=(0.0.0.0/0)

coordMasterServers=(datanode1 datanode2)    
coordMasterDirs=($coordMasterDir $coordMasterDir)
coordMaxWALsernder=0    #沒設(shè)置備份節(jié)點缺脉，設(shè)置為0
coordMaxWALSenders=($coordMaxWALsernder $coordMaxWALsernder) #數(shù)量保持和coordMasterServers一致

coordSlave=n

#---- Datanodes ----------
datanodeMasterDir=$pgxlDATA/nodes/dn_master
primaryDatanode=node1               # 主數(shù)據(jù)節(jié)點
datanodeNames=(node1 node2)
datanodePorts=(5433 5433)   
datanodePoolerPorts=(6668 6668) 
datanodePgHbaEntries=(0.0.0.0/0)

datanodeMasterServers=(datanode1 datanode2)
datanodeMasterDirs=($datanodeMasterDir $datanodeMasterDir)
datanodeMaxWalSender=4
datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)

datanodeSlave=n
#將datanode1節(jié)點的slave做到了datanode2服務(wù)器上，交叉做了備份
datanodeSlaveServers=(datanode2 datanode1)  # value none means this slave is not available
datanodeSlavePorts=(15433 15433)    # value none means this slave is not available
datanodeSlavePoolerPorts=(20012 20012)  # value none means this slave is not available
datanodeSlaveSync=y     # If datanode slave is connected in synchronized mode
datanodeSlaveDirs=($datanodeSlaveDir $datanodeSlaveDir)
datanodeArchLogDirs=( $datanodeArchLogDir $datanodeArchLogDir)

如上配置悦穿，都沒有配置slave攻礼，具體生產(chǎn)環(huán)境，請閱讀配置文件栗柒，自行配置礁扮。

4.3 集群初始化，啟動瞬沦，停止

第一次啟動集群太伊，需要初始化，初始化如下：

[postgres@pg1 pgxc_ctl]$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf init all

初始化后會直接啟動集群逛钻。

/bin/bash
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
Reading configuration using /home/postgres/pgxc_ctl/pgxc_ctl_bash --home /home/postgres/pgxc_ctl --configuration /home/postgres/pgxc_ctl/pgxc_ctl.conf
Finished reading configuration.
   ******** PGXC_CTL START ***************

Current directory: /home/postgres/pgxc_ctl
Initialize GTM master
ERROR: target directory (/home/postgres/pgxl/data/nodes/gtm) exists and not empty. Skip GTM initilialization
1:1430554432:2017-07-11 23:31:14.737 PDT -FATAL:  lock file "gtm.pid" already exists
2:1430554432:2017-07-11 23:31:14.737 PDT -HINT:  Is another GTM (PID 2823) running in data directory "/home/postgres/pgxl/data/nodes/gtm"?
LOCATION:  CreateLockFile, main.c:2099
waiting for server to shut down.... done
server stopped
Done.
Start GTM master
server starting
Initialize all the gtm proxies.
Initializing gtm proxy gtm_pxy1.
Initializing gtm proxy gtm_pxy2.
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.


fixing permissions on existing directory /home/postgres/pgxl/data/nodes/gtm_pxy ... ok
creating configuration files ... ok

Success.
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.


fixing permissions on existing directory /home/postgres/pgxl/data/nodes/gtm_pxy ... ok
creating configuration files ... ok

Success.
Done.
Starting all the gtm proxies.
Starting gtm proxy gtm_pxy1.
Starting gtm proxy gtm_pxy2.
server starting
server starting
Done.
Initialize all the coordinator masters.
Initialize coordinator master coord1.
Initialize coordinator master coord2.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/pgxl/data/nodes/coord ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/pgxl/data/nodes/coord ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
Done.
Starting coordinator master.
Starting coordinator master coord1
Starting coordinator master coord2
2017-07-11 23:31:31.116 PDT [3650] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2017-07-11 23:31:31.116 PDT [3650] LOG:  listening on IPv6 address "::", port 5432
2017-07-11 23:31:31.118 PDT [3650] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2017-07-11 23:31:31.126 PDT [3650] LOG:  redirecting log output to logging collector process
2017-07-11 23:31:31.126 PDT [3650] HINT:  Future log output will appear in directory "pg_log".
2017-07-11 23:31:31.122 PDT [3613] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2017-07-11 23:31:31.122 PDT [3613] LOG:  listening on IPv6 address "::", port 5432
2017-07-11 23:31:31.124 PDT [3613] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2017-07-11 23:31:31.132 PDT [3613] LOG:  redirecting log output to logging collector process
2017-07-11 23:31:31.132 PDT [3613] HINT:  Future log output will appear in directory "pg_log".
Done.
Initialize all the datanode masters.
Initialize the datanode master datanode1.
Initialize the datanode master datanode2.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/pgxl/data/nodes/dn_master ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/pgxl/data/nodes/dn_master ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
Done.
Starting all the datanode masters.
Starting datanode master datanode1.
Starting datanode master datanode2.
2017-07-11 23:31:37.013 PDT [3995] LOG:  listening on IPv4 address "0.0.0.0", port 5433
2017-07-11 23:31:37.013 PDT [3995] LOG:  listening on IPv6 address "::", port 5433
2017-07-11 23:31:37.014 PDT [3995] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5433"
2017-07-11 23:31:37.021 PDT [3995] LOG:  redirecting log output to logging collector process
2017-07-11 23:31:37.021 PDT [3995] HINT:  Future log output will appear in directory "pg_log".
2017-07-11 23:31:37.008 PDT [3958] LOG:  listening on IPv4 address "0.0.0.0", port 5433
2017-07-11 23:31:37.008 PDT [3958] LOG:  listening on IPv6 address "::", port 5433
2017-07-11 23:31:37.009 PDT [3958] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5433"
2017-07-11 23:31:37.017 PDT [3958] LOG:  redirecting log output to logging collector process
2017-07-11 23:31:37.017 PDT [3958] HINT:  Future log output will appear in directory "pg_log".
Done.
ALTER NODE coord1 WITH (HOST='datanode1', PORT=5432);
ALTER NODE
CREATE NODE coord2 WITH (TYPE='coordinator', HOST='datanode2', PORT=5432);
CREATE NODE
CREATE NODE datanode1 WITH (TYPE='datanode', HOST='datanode1', PORT=5433, PRIMARY, PREFERRED);
CREATE NODE
CREATE NODE datanode2 WITH (TYPE='datanode', HOST='datanode2', PORT=5433);
CREATE NODE
SELECT pgxc_pool_reload();
 pgxc_pool_reload 
------------------
 t
(1 row)

CREATE NODE coord1 WITH (TYPE='coordinator', HOST='datanode1', PORT=5432);
CREATE NODE
ALTER NODE coord2 WITH (HOST='datanode2', PORT=5432);
ALTER NODE
CREATE NODE datanode1 WITH (TYPE='datanode', HOST='datanode1', PORT=5433, PRIMARY);
CREATE NODE
CREATE NODE datanode2 WITH (TYPE='datanode', HOST='datanode2', PORT=5433, PREFERRED);
CREATE NODE
SELECT pgxc_pool_reload();
 pgxc_pool_reload 
------------------
 t
(1 row)

Done.
EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''datanode1'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''datanode2'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'ALTER NODE datanode1 WITH (TYPE=''datanode'', HOST=''datanode1'', PORT=5433, PRIMARY, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'CREATE NODE datanode2 WITH (TYPE=''datanode'', HOST=''datanode2'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'SELECT pgxc_pool_reload()';
 pgxc_pool_reload 
------------------
 t
(1 row)

EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''datanode1'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''datanode2'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'CREATE NODE datanode1 WITH (TYPE=''datanode'', HOST=''datanode1'', PORT=5433, PRIMARY, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'ALTER NODE datanode2 WITH (TYPE=''datanode'', HOST=''datanode2'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'SELECT pgxc_pool_reload()';
 pgxc_pool_reload 
------------------
 t
(1 row)

Done.

以后啟動僚焦，直接執(zhí)行如下命令：

[postgres@pg1 pgxc_ctl]$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf start all

停止集群如下：

[postgres@pg1 pgxc_ctl]$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf stop all

這幾個主要命令暫時這么多，更多請從pgxc_ctl --help中獲取更多信息曙痘。

五 Postgres-XL集群測試

5.1 插入數(shù)據(jù)

在datanode1節(jié)點芳悲，執(zhí)行psql -p 5432進入數(shù)據(jù)庫操作。

[postgres@localhost]$ psql -p 5432
psql (PGXL 10alpha1, based on PG 10beta1 (Postgres-XL 10alpha1))
Type "help" for help.

postgres=# select * from pgxc_node;
 node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+-----------+----------------+------------------+-------------
 coord1    | C         |      5432 | datanode1 | f              | f                |  1885696643
 coord2    | C         |      5432 | datanode2 | f              | f                | -1197102633
 datanode1 | D         |      5433 | datanode1 | t              | t                |   888802358
 datanode2 | D         |      5433 | datanode2 | f              | f                |  -905831925
(4 rows)
postgres=# create table test1(id it,name text);
postgres=#  insert into test1(id,name) select generate_series(1,8),'測試';

5.2 查看數(shù)據(jù)分布

在datanode1節(jié)點上查看數(shù)據(jù)

[postgres@bogon ~]$ psql -p 5433
psql (PGXL 10alpha1, based on PG 10beta1 (Postgres-XL 10alpha1))
Type "help" for help.

postgres=# select * from test1;
 id | name 
----+------
  1 | 測試
  2 | 測試
  5 | 測試
  6 | 測試
  8 | 測試
(5 rows)

在datanode2節(jié)點上查看數(shù)據(jù)

postgres=# select * from test1;
 id | name 
----+------
  3 | 測試
  4 | 測試
  7 | 測試
(3 rows)

注意：由于所有的數(shù)據(jù)節(jié)點組成了完整的數(shù)據(jù)視圖屡江，所以一個數(shù)據(jù)節(jié)點down機芭概，整個pgxl都啟動不了了，所以實際生產(chǎn)中惩嘉，為了提高可用性罢洲，一定要配置數(shù)據(jù)節(jié)點的熱備以便進行故障轉(zhuǎn)移準備。

六集群應(yīng)用與管理

6.1 建表說明

REPLICATION表：各個datanode節(jié)點中文黎，表的數(shù)據(jù)完全相同惹苗，也就是說，插入數(shù)據(jù)時耸峭，會分別在每個datanode節(jié)點插入相同數(shù)據(jù)桩蓉。讀數(shù)據(jù)時，只需要讀任意一個datanode節(jié)點上的數(shù)據(jù)劳闹。
建表語法：

postgres=#  CREATE TABLE repltab (col1 int, col2 int) DISTRIBUTE BY REPLICATION;

DISTRIBUTE ：會將插入的數(shù)據(jù)院究，按照拆分規(guī)則洽瞬，分配到不同的datanode節(jié)點中存儲，也就是sharding技術(shù)业汰。每個datanode節(jié)點只保存了部分數(shù)據(jù)伙窃，通過coordinate節(jié)點可以查詢完整的數(shù)據(jù)視圖。

postgres=#  CREATE TABLE disttab(col1 int, col2 int, col3 text) DISTRIBUTE BY HASH(col1);

模擬部分數(shù)據(jù)样漆，插入測試數(shù)據(jù)：

#任意登錄一個coordinate節(jié)點進行建表操作
[postgres@gtm ~]$  psql -h  datanode1 -p 5432 -U postgres
postgres=# INSERT INTO disttab SELECT generate_series(1,100), generate_series(101, 200), 'foo';
INSERT 0 100
postgres=# INSERT INTO repltab SELECT generate_series(1,100), generate_series(101, 200);
INSERT 0 100

查看數(shù)據(jù)分布結(jié)果：

#DISTRIBUTE表分布結(jié)果
postgres=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
 xc_node_id | count 
------------+-------
 1148549230 |    42
 -927910690 |    58
(2 rows)
#REPLICATION表分布結(jié)果
postgres=# SELECT count(*) FROM repltab;
 count 
-------
   100
(1 row)

查看另一個datanode2中repltab表結(jié)果

[postgres@datanode2 pgxl9.5]$ psql -p 5433
psql (PGXL 9.5r1.3, based on PG 9.5.4 (Postgres-XL 9.5r1.3))
Type "help" for help.

postgres=# SELECT count(*) FROM repltab;
 count 
-------
   100
(1 row)

結(jié)論：REPLICATION表中为障，datanode1,datanode2中表是全部數(shù)據(jù)，一模一樣放祟。而DISTRIBUTE表鳍怨，數(shù)據(jù)散落近乎平均分配到了datanode1,datanode2節(jié)點中。

6.2新增datanode節(jié)點與數(shù)據(jù)重分布

6.2.1 新增datanode節(jié)點

在gtm集群管理節(jié)點上執(zhí)行pgxc_ctl命令

[postgres@gtm ~]$ pgxc_ctl
/bin/bash
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /home/postgres/pgxc_ctl/pgxc_ctl_bash.
Reading configuration using /home/postgres/pgxc_ctl/pgxc_ctl_bash --home /home/postgres/pgxc_ctl --configuration /home/postgres/pgxc_ctl/pgxc_ctl.conf
Finished reading configuration.
   ******** PGXC_CTL START ***************

Current directory: /home/postgres/pgxc_ctl
PGXC

在PGXC后面執(zhí)行新增數(shù)據(jù)節(jié)點命令:

Current directory: /home/postgres/pgxc_ctl
# 在服務(wù)器datanode1上跪妥，新增一個master角色的datanode節(jié)點鞋喇，名稱是dn3
# 端口號暫定5430，pool master暫定6669 骗奖，指定好數(shù)據(jù)目錄位置确徙，從兩個節(jié)點升級到3個節(jié)點，之后要寫3個none
# none應(yīng)該是datanodeSpecificExtraConfig或者datanodeSpecificExtraPgHba配置
PGXC add datanode master dn3 datanode1 5430 6669 /home/postgres/pgxl9.5/data/nodes/dn_master3 none none none

等待新增完成后执桌，查詢集群節(jié)點狀態(tài)：

[postgres@gtm ~]$ psql -h datanode1 -p 5432 -U postgres
psql (PGXL 9.5r1.3, based on PG 9.5.4 (Postgres-XL 9.5r1.3))
Type "help" for help.

postgres=# select * from pgxc_node;
 node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+-----------+----------------+------------------+-------------
 coord1    | C         |      5432 | datanode1 | f              | f                |  1885696643
 coord2    | C         |      5432 | datanode2 | f              | f                | -1197102633
 node1     | D         |      5433 | datanode1 | f              | t                |  1148549230
 node2     | D         |      5433 | datanode2 | f              | f                |  -927910690
 dn3       | D         |      5430 | datanode1 | f              | f                |  -700122826
(5 rows)

可以發(fā)現(xiàn)節(jié)點新增完畢鄙皇。

6.2.2 數(shù)據(jù)重分布

之前我們的DISTRIBUTE表分布在了node1,node2節(jié)點上，如下：

postgres=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
 xc_node_id | count 
------------+-------
 1148549230 |    42
 -927910690 |    58
(2 rows)

新增一個節(jié)點后仰挣，將sharding表數(shù)據(jù)重新分配到三個節(jié)點上伴逸，將repl表復(fù)制到新節(jié)點：

# 重分布sharding表
postgres=# ALTER TABLE disttab ADD NODE (dn3);
ALTER TABLE
# 復(fù)制數(shù)據(jù)到新節(jié)點
postgres=#  ALTER TABLE repltab ADD NODE (dn3);
ALTER TABLE

查看新的數(shù)據(jù)分布：

postgres=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
 xc_node_id | count 
------------+-------
 -700122826 |    36
 -927910690 |    32
 1148549230 |    32
(3 rows)

登錄dn3(新增的時候，放在了datanode1服務(wù)器上膘壶，端口5430)節(jié)點查看數(shù)據(jù)：

[postgres@gtm ~]$ psql -h datanode1 -p 5430 -U postgres
psql (PGXL 9.5r1.3, based on PG 9.5.4 (Postgres-XL 9.5r1.3))
Type "help" for help.
postgres=# select count(*) from repltab;
 count 
-------
   100
(1 row)

很明顯,通過 ALTER TABLE tt ADD NODE (dn)命令错蝴，可以將DISTRIBUTE表數(shù)據(jù)重新分布到新節(jié)點，重分布過程中會中斷所有事務(wù)颓芭∏昝蹋可以將REPLICATION表數(shù)據(jù)復(fù)制到新節(jié)點。

6.2.3 從datanode節(jié)點中回收數(shù)據(jù)

postgres=# ALTER TABLE disttab DELETE NODE (dn3);
ALTER TABLE
postgres=# ALTER TABLE repltab DELETE NODE (dn3);
ALTER TABLE

6.3 刪除數(shù)據(jù)節(jié)點

Postgresql-XL并沒有檢查將被刪除的datanode節(jié)點是否有replicated/distributed表的數(shù)據(jù)亡问，為了數(shù)據(jù)安全官紫，在刪除之前需要檢查下被刪除節(jié)點上的數(shù)據(jù)，有數(shù)據(jù)的話州藕，要回收掉分配到其他節(jié)點束世，然后才能安全刪除。刪除數(shù)據(jù)節(jié)點分為四步驟：

查詢要刪除節(jié)點dn3的oid

postgres=#  SELECT oid, * FROM pgxc_node;
  oid  | node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-------+-----------+-----------+-----------+-----------+----------------+------------------+-------------
 11819 | coord1    | C         |      5432 | datanode1 | f              | f                |  1885696643
 16384 | coord2    | C         |      5432 | datanode2 | f              | f                | -1197102633
 16385 | node1     | D         |      5433 | datanode1 | f              | t                |  1148549230
 16386 | node2     | D         |      5433 | datanode2 | f              | f                |  -927910690
 16397 | dn3       | D         |      5430 | datanode1 | f              | f                |  -700122826
(5 rows)

查詢dn3對應(yīng)的oid中是否有數(shù)據(jù)

testdb=# SELECT * FROM pgxc_class WHERE nodeoids::integer[] @> ARRAY[16397];
 pcrelid | pclocatortype | pcattnum | pchashalgorithm | pchashbuckets |     nodeoids      
---------+---------------+----------+-----------------+---------------+-------------------
   16388 | H             |        1 |               1 |          4096 | 16397 16385 16386
   16394 | R             |        0 |               0 |             0 | 16397 16385 16386
(2 rows)

有數(shù)據(jù)的先回收數(shù)據(jù)

postgres=# ALTER TABLE disttab DELETE NODE (dn3);
ALTER TABLE
postgres=# ALTER TABLE repltab DELETE NODE (dn3);
ALTER TABLE
postgres=# SELECT * FROM pgxc_class WHERE nodeoids::integer[] @> ARRAY[16397];
 pcrelid | pclocatortype | pcattnum | pchashalgorithm | pchashbuckets | nodeoids 
---------+---------------+----------+-----------------+---------------+----------
(0 rows)

安全刪除dn3

PGXC$  remove datanode master dn3 clean

6.4 coordinate節(jié)點管理

同datanode節(jié)點相似床玻，列出語句不做測試了：

# 新增coordinate
PGXC$  add coordinator master coord3 localhost 30003 30013 $dataDirRoot/coord_master.3 none none none
# 刪除coordinate,clean選項可以將相應(yīng)的數(shù)據(jù)目錄也刪除
PGXC$  remove coordinator master coord3 clean

6.5 故障切換

查看當前數(shù)據(jù)集群

postgres=# SELECT oid, * FROM pgxc_node;
  oid  | node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-------+-----------+-----------+-----------+-----------+----------------+------------------+-------------
 11819 | coord1    | C         |      5432 | datanode1 | f              | f                |  1885696643
 16384 | coord2    | C         |      5432 | datanode2 | f              | f                | -1197102633
 16385 | node1     | D         |      5433 | datanode1 | f              | t                |  1148549230
 16386 | node2     | D         |      5433 | datanode2 | f              | f                |  -927910690
(4 rows)

模擬node1節(jié)點故障

PGXC$  stop -m immediate datanode master node1
Stopping datanode master node1.
Done.

測試集群查詢

postgres=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
ERROR:  Failed to get pooled connections
postgres=# SELECT xc_node_id, * FROM disttab WHERE col1 = 3;
 xc_node_id | col1 | col2 | col3 
------------+------+------+------
 -927910690 |    3 |  103 | foo
(1 row)

測試發(fā)現(xiàn)毁涉，查詢范圍如果涉及到故障的node1節(jié)點，會報錯锈死，而查詢的數(shù)據(jù)范圍不在node1上的話贫堰，仍然可以查詢穆壕。

手動切換node1的slave

PGXC$  failover datanode node1
# 切換完成后，查詢集群
postgres=# SELECT oid, * FROM pgxc_node;
  oid  | node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-------+-----------+-----------+-----------+-----------+----------------+------------------+-------------
 11819 | coord1    | C         |      5432 | datanode1 | f              | f                |  1885696643
 16384 | coord2    | C         |      5432 | datanode2 | f              | f                | -1197102633
 16386 | node2     | D         |      5433 | datanode2 | f              | f                |  -927910690
 16385 | node1     | D         |     15433 | datanode2 | f              | t                |  1148549230
(4 rows)

發(fā)現(xiàn)node1節(jié)點的ip和端口都已經(jīng)替換為配置的slave了严嗜。

七部署遇到的問題

在配置的時候一定要細心粱檀，避免端口號之類的配置沖突等錯誤。
錯誤一：

postgres=# create table test1(id integer,name varchar(20));
LOG:  failed to connect to node, connection string (host=192.168.0.125 port=1925 dbname=postgres user=postgres application_name=pgxc sslmode=disable options='-c remotetype=coordinator -c parentnode=coord1  -c DateStyle=iso,mdy -c timezone=prc -c geqo=on -c intervalstyle=postgres -c lc_monetary=C'), connection error (fe_sendauth: no password supplied
        )
WARNING:  can not connect to node 16384
WARNING:  Health map updated to reflect DOWN node (16384)
LOG:  Pooler could not open a connection to node 16384
LOG:  failed to acquire connections
STATEMENT:  create table test1(id integer,name varchar(20));
ERROR:  Failed to get pooled connections
HINT:  This may happen because one or more nodes are currently unreachable, either because of node or network failure.
         Its also possible that the target node may have hit the connection limit or the pooler is configured with low connections.
         Please check if all nodes are running fine and also review max_connections and max_pool_size configuration parameters
STATEMENT:  create table test1(id integer,name varchar(20));

原因：這個是由于某些環(huán)境或配置出了問題漫玄，我的就是pg_hba.conf配置出了問題，Ipv4要改成 0:0:0:0/0 trust才行压彭。
但這僅僅是一個問題睦优，開發(fā)者搭建環(huán)境遇到這個錯誤，一定要檢查如下:

** 各個機器的防火墻是否關(guān)閉壮不？**
** 各個機器的SELINUX狀態(tài)是否是disabled汗盘？**
** 各個機器的ssh免密登錄是否成功？**
** 各個節(jié)點的pg_hba.conf,postgresql.conf是否配置為信任登錄询一？是否有IP限制隐孽？**
超過某些節(jié)點的最大連接數(shù)？（對于我們測試環(huán)境來說健蕊，肯定不會是這個問題）

作者搭建pgxl是為地理大數(shù)據(jù)做技術(shù)預(yù)研的菱阵，使用postgis作為空間數(shù)據(jù)，歡迎postgis開發(fā)者參與交流缩功。

最后編輯于：2018.12.29 07:11:59

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者

人面猴
序言：七十年代末晴及，一起剝皮案震驚了整個濱河市，隨后出現(xiàn)的幾起案子嫡锌，更是在濱河造成了極大的恐慌虑稼，老刑警劉巖，帶你破解...
沈念sama閱讀 217,509評論 6贊 504
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件势木，死亡現(xiàn)場離奇詭異蛛倦，居然都是意外死亡，警方通過查閱死者的電腦和手機啦桌，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 92,806評論 3贊 394
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進店門溯壶，熙熙樓的掌柜王于貴愁眉苦臉地迎上來，“玉大人震蒋，你說我怎么就攤上這事茸塞。” “怎么了查剖？”我有些...
開封第一講書人閱讀 163,875評論 0贊 354
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵钾虐，是天一觀的道長。經(jīng)常有香客問我笋庄，道長效扫，這世上最難降的妖魔是什么倔监？我笑而不...
開封第一講書人閱讀 58,441評論 1贊 293
?港島之戀（遺憾婚禮）
正文為了忘掉前任，我火速辦了婚禮菌仁，結(jié)果婚禮上浩习，老公的妹妹穿的比我還像新娘。我一直安慰自己济丘，他們只是感情好谱秽，可當我...
茶點故事閱讀 67,488評論 6贊 392
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布。她就那樣靜靜地躺著摹迷，像睡著了一般疟赊。火紅的嫁衣襯著肌膚如雪。梳的紋絲不亂的頭發(fā)上峡碉，一...
開封第一講書人閱讀 51,365評論 1贊 302
城市分裂傳說
那天近哟，我揣著相機與錄音，去河邊找鬼鲫寄。笑死吉执，一個胖子當著我的面吹牛，可吹牛的內(nèi)容都是我干的地来。我是一名探鬼主播戳玫，決...
沈念sama閱讀 40,190評論 3贊 418
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼，長吁一口氣：“原來是場噩夢啊……” “哼靠抑！你這毒婦竟也來了量九？” 一聲冷哼從身側(cè)響起，我...
開封第一講書人閱讀 39,062評論 0贊 276
萬榮殺人案實錄
序言：老撾萬榮一對情侶失蹤颂碧，失蹤者是張志新（化名）和其女友劉穎荠列，沒想到半個月后，有當?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體载城，經(jīng)...
沈念sama閱讀 45,500評論 1贊 314
?護林員之死
正文獨居荒郊野嶺守林人離奇死亡肌似，尸身上長有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點故事閱讀 37,706評論 3贊 335
?白月光啟示錄
正文我和宋清朗相戀三年，在試婚紗的時候發(fā)現(xiàn)自己被綠了诉瓦。大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片川队。...
茶點故事閱讀 39,834評論 1贊 347
活死人
序言：一個原本活蹦亂跳的男人離奇死亡，死狀恐怖睬澡，靈堂內(nèi)的尸體忽然破棺而出固额，到底是詐尸還是另有隱情，我是刑警寧澤煞聪，帶...
沈念sama閱讀 35,559評論 5贊 345
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布斗躏，位于F島的核電站，受9級特大地震影響昔脯，放射性物質(zhì)發(fā)生泄漏啄糙。R本人自食惡果不足惜笛臣，卻給世界環(huán)境...
茶點故事閱讀 41,167評論 3贊 328
男人毒藥：我在死后第九天來索命
文/蒙蒙一、第九天我趴在偏房一處隱蔽的房頂上張望隧饼。院中可真熱鬧沈堡，春花似錦、人聲如沸燕雁。這莊子的主人今日做“春日...
開封第一講書人閱讀 31,779評論 0贊 22
一樁弒父案，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽拐格。三九已至率拒，卻和暖如春，著一層夾襖步出監(jiān)牢的瞬間禁荒，已是汗流浹背。一陣腳步聲響...
開封第一講書人閱讀 32,912評論 1贊 269
情欲美人皮
我被黑心中介騙來泰國打工角撞，沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留呛伴，地道東北人。一個月前我還...
沈念sama閱讀 47,958評論 2贊 370
代替公主和親
正文我出身青樓谒所，卻偏偏與公主長得像热康，于是被迫代替她去往敵國和親。傳聞我的和親對象是個殘疾皇子劣领，可洞房花燭夜當晚...
茶點故事閱讀 44,779評論 2贊 354