Introduction
Documentation: https://www.postgres-xl.org/documentation/index.html
https://www.postgres-xl.org/overview/
https://wiki.postgresql.org/wiki/Postgres-XC
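Postgres-XL is an open-source PostgreSQL (PG) clustering product. "XL" stands for eXtensible Lattice, that is, an extensible "lattice" of PG nodes. It is referred to below as PGXL.
According to the project, it suits both write-heavy OLTP applications and read-mostly big-data applications. Its predecessor is Postgres-XC (PGXC), which added clustering to PG and mainly targeted OLTP. PGXL builds on PGXC and adds features aimed at OLAP workloads, such as Massively Parallel Processing (MPP).
Put simply, the PGXL source tree already contains the PG code, so building a cluster with PGXL does not require a separate PG installation. The drawback is that you cannot pick an arbitrary PG version. Fortunately PGXL tracks PG fairly closely; the latest release at the time of writing is Postgres-XL 10R1, based on PG 10.
A Postgres-XL cluster consists of multiple PostgreSQL database clusters but looks like a single database cluster. Depending on your design, each table can be either replicated or distributed across the member databases.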
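To achieve this, Postgres-XL is made up of three components: GTM, Coordinator, and Datanode. GTM provides the ACID guarantees for transactions. A Datanode stores data and executes SQL operations (only on the data it stores itself). A Coordinator analyzes SQL coming from applications, determines which Datanodes hold the data, and sends instructions to the right Datanodes.
GTM should normally be installed on a dedicated server, because it handles the transaction requests of every Coordinator and Datanode. You can configure GTM-Proxy to group the requests and responses of the Coordinators and Datanodes running on the same server, which reduces both the number of interactions with GTM and the amount of data exchanged. GTM-Proxy also helps handle GTM failures.
Deploying a Coordinator and a Datanode on the same server is usually good practice: there is no need to worry about balancing load between the two, and for replicated tables the data can be read locally without an extra network request. You can deploy any number of such servers (each running both a Coordinator and a Datanode). Coordinators and Datanodes are both PostgreSQL instances, so they must be configured to avoid resource conflicts. In particular, give them different working directories and port numbers.
Postgres-XL lets multiple Coordinators accept SQL from applications independently rather than through a single central entry point. Writes can go through any Coordinator with no difference in behavior; together they look like a single database. A Coordinator accepts and dispatches SQL statements, finds which Datanodes store the relevant data, sends query plans to those Datanodes where needed, and then collects the results and returns them to the application.
A Coordinator does not store user data. It stores only catalog data, which it uses to decide how to process SQL statements and to locate the target Datanodes. You do not need to worry much about a Coordinator failing: when one fails, you can switch to another.
GTM is a potential single point of failure (SPOF). To guard against this, you can run a second GTM (GTM-Standby) that backs up the state of the primary GTM. When the primary GTM fails, GTM-Proxy can switch over to the standby.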
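As noted above, Postgres-XL Coordinators and Datanodes are all PostgreSQL databases. In database terms, PostgreSQL uses a client/server model. A PostgreSQL session consists of the following two cooperating parts:
server: the server process, which manages the database files, accepts connections from client applications, and performs database operations on behalf of clients. This process is called postgres.
client: the client application that wants to perform database operations. Client applications vary widely: a text-oriented tool, a graphical application, a web server that accesses the database to display web pages, or a specialized database maintenance tool. Some client applications ship with the PostgreSQL distribution; most are developed by users.
As is typical of client/server applications, the client and the server can be on different hosts, in which case they communicate over a TCP/IP network connection. Keep in mind that files accessible on the client machine might not be accessible on the database server machine (or might only be accessible under a different file name).
The PostgreSQL server can handle multiple concurrent client connections. To do so, it starts a new process for each connection. Once the connection is established, the original postgres process no longer takes part in the communication between the client and the new server process. The main server process is always running and waiting for client connections, while individual connections come and go.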
Components
Global Transaction Monitor (GTM)
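The global transaction manager ensures cluster-wide transaction consistency. GTM issues transaction IDs and snapshots as part of its multi-version concurrency control (MVCC).
A cluster can optionally be configured with a standby GTM to improve availability. In addition, GTM proxies can be configured alongside the Coordinators to improve scalability and reduce the traffic to GTM.
GTM Standby
The standby node for GTM. In PGXC and PGXL, GTM controls the allocation of all global transactions, so a GTM failure makes the whole cluster unavailable. The standby is added to improve availability: when GTM fails, GTM Standby can be promoted to GTM so that the cluster keeps working.
GTM-Proxy
GTM has to communicate with every Coordinator. To reduce that load, a GTM-Proxy can be deployed on each Coordinator machine.
Coordinator
The Coordinator manages user sessions and interacts with GTM and the data nodes. It parses and plans queries, and sends a serialized global plan to each component involved in a statement.
To save machines, this service is usually deployed together with a data node.
Data Node
The data node is where the data is actually stored. How data is distributed can be configured by the DBA. To improve availability, hot standbys of the data nodes can be configured in preparation for failover.
Summary: GTM is responsible for ACID and guarantees global transaction consistency across the distributed database. Thanks to it, even though the data nodes are distributed, running insert, delete, update, and select transactions through a Coordinator is as simple as working with a single database. The Coordinator does the scheduling and sends operations to the individual data nodes. The Datanodes are the data nodes, which store the data in a distributed fashion.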
Planning
Prepare three CentOS 7 servers (or virtual machines) running "CentOS Linux release 7.6.1810 (Core)". The cluster is laid out as follows:
Host | IP | Role | Port | nodename | Data directory |
---|---|---|---|---|---|
lhrpgxl90 | 172.72.6.90 | GTM | 6666 | gtm | $PGHOME/data/gtm |
 | | GTM Slave | 20001 | gtmSlave | $PGHOME/data/gtmSlave |
lhrpgxl91 | 172.72.6.91 | Coordinator | 5432 | coord1 | $PGHOME/data/coord |
 | | Datanode | 5433 | datanode1 | $PGHOME/data/dn_master |
 | | Datanode Slave | 15433 | datanode1_slave | $PGHOME/data/dn_slave |
 | | GTM Proxy | 6666 | gtm_pxy1 | $PGHOME/data/gtm_pxy |
lhrpgxl92 | 172.72.6.92 | Coordinator | 5432 | coord2 | $PGHOME/data/coord |
 | | Datanode | 5433 | datanode2 | $PGHOME/data/dn_master |
 | | Datanode Slave | 15433 | datanode2_slave | $PGHOME/data/dn_slave |
 | | GTM Proxy | 6666 | gtm_pxy2 | $PGHOME/data/gtm_pxy |
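Because several components share each host, it is worth confirming that no two of them claim the same port on the same machine. The following is a minimal sketch, not part of Postgres-XL: the function name find_port_conflicts and the "host port component" input format are made up for illustration, and the plan text is copied from the table above.

```shell
#!/usr/bin/env bash
# Report host/port pairs claimed by more than one component.
# Input on stdin: one "host port component" triple per line (illustrative format).
find_port_conflicts() {
  awk '{ key = $1 ":" $2; count[key]++; names[key] = names[key] " " $3 }
       END { for (k in count) if (count[k] > 1) print k ":" names[k] }' | sort
}

# Plan copied from the planning table.
plan='lhrpgxl90 6666 gtm
lhrpgxl90 20001 gtmSlave
lhrpgxl91 5432 coord1
lhrpgxl91 5433 datanode1
lhrpgxl91 15433 datanode1_slave
lhrpgxl91 6666 gtm_pxy1
lhrpgxl92 5432 coord2
lhrpgxl92 5433 datanode2
lhrpgxl92 15433 datanode2_slave
lhrpgxl92 6666 gtm_pxy2'

conflicts=$(printf '%s\n' "$plan" | find_port_conflicts)
if [ -z "$conflicts" ]; then
  echo "no port conflicts"
else
  echo "conflicts: $conflicts"
fi
```

The GTM and the GTM proxies can all use port 6666 because they run on different hosts; the check only flags duplicates within one host.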
# Create the Docker network
docker network create --subnet=172.72.6.0/24 pg-network

docker rm -f lhrpgxl90
docker run -d --name lhrpgxl90 -h lhrpgxl90 \
  --net=pg-network --ip 172.72.6.90 \
  -p 64390:5432 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true lhrbest/lhrcentos76:8.5 \
  /usr/sbin/init

docker rm -f lhrpgxl91
docker run -d --name lhrpgxl91 -h lhrpgxl91 \
  --net=pg-network --ip 172.72.6.91 \
  -p 64391:5432 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true lhrbest/lhrcentos76:8.5 \
  /usr/sbin/init

docker rm -f lhrpgxl92
docker run -d --name lhrpgxl92 -h lhrpgxl92 \
  --net=pg-network --ip 172.72.6.92 \
  -p 64392:5432 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true lhrbest/lhrcentos76:8.5 \
  /usr/sbin/init

[root@docker35 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
198a42183f53 lhrbest/lhrcentos76:8.5 "/usr/sbin/init" 54 seconds ago Up 52 seconds 0.0.0.0:64392->5432/tcp, :::64392->5432/tcp lhrpgxl92
90ea3592000b lhrbest/lhrcentos76:8.5 "/usr/sbin/init" 56 seconds ago Up 54 seconds 0.0.0.0:64391->5432/tcp, :::64391->5432/tcp lhrpgxl91
c11615c37160 lhrbest/lhrcentos76:8.5 "/usr/sbin/init" 58 seconds ago Up 56 seconds 0.0.0.0:64390->5432/tcp, :::64390->5432/tcp lhrpgxl90
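Environment preparation
Before installing, make sure the machines meet the following prerequisites.
The node that runs pgxc_ctl needs passwordless ssh access to the other nodes.
On every machine, PATH must include the Postgres-XL binaries, in particular when commands are run over ssh.
pg_hba.conf must be configured to allow remote access. Settings in pgxc_ctl.conf such as coordPgHbaEntries and datanodePgHbaEntries may need to be adjusted accordingly.
The firewall and iptables must be configured so that the required ports are reachable.
If pgxc_ctl is not installed, it can be compiled and installed from source.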
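The first two prerequisites (passwordless ssh and a PATH that contains the Postgres-XL binaries) can be verified before running pgxc_ctl. The sketch below is an illustration only: path_contains and check_host are hypothetical helper names, and /postgresxl/bin is the install location used in this walkthrough. ssh's BatchMode=yes option makes ssh fail instead of prompting for a password.

```shell
#!/usr/bin/env bash
# Succeed if DIR is one of the entries of the colon-separated PATH_VALUE.
path_contains() {
  local dir=$1 path_value=$2
  case ":$path_value:" in
    *":$dir:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check passwordless ssh and the remote PATH for one host (hypothetical helper).
check_host() {
  local host=$1 bindir=$2 remote_path
  remote_path=$(ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" 'echo "$PATH"') || {
    echo "$host: passwordless ssh failed"
    return 1
  }
  path_contains "$bindir" "$remote_path" || {
    echo "$host: $bindir is missing from PATH"
    return 1
  }
  echo "$host: ok"
}

# Usage, once the hosts exist (not executed here):
#   for h in lhrpgxl90 lhrpgxl91 lhrpgxl92; do check_host "$h" /postgresxl/bin; done
```

The PATH check is done on the value seen by a non-interactive ssh command, since that is the environment pgxc_ctl works in when it runs commands remotely.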
yum install -y flex bison readline-devel zlib-devel openjade docbook-style-dsssl gcc make

groupadd -g 5432 postgres
useradd -u 5432 -g postgres postgres
echo "postgres:lhr" | chpasswd

mkdir -p /postgresxl
chown -R postgres:postgres /postgresxl

cat >> /home/postgres/.bashrc <<"EOF"
export PGHOME=/postgresxl
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export PATH=$PGHOME/bin:$PATH
export PGUSER=postgres
export PGXC_CTL_HOME=/postgresxl/bin
EOF

echo "postgres ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
Download and installation
https://www.postgres-xl.org/download/
https://git.postgresql.org/gitweb/?p=postgres-xl.git;a=summary
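Postgres-XL must be installed on all three hosts. The download is about 300 MB:
su - postgres
git clone git://git.postgresql.org/git/postgres-xl.git
cd postgres-xl
./configure --prefix=/postgresxl
make -j4
sudo make install
cd contrib
make -j4
sudo make install

# "make install" ran as root, so hand the tree back to postgres
sudo chown -R postgres:postgres /postgresxl/
contrib contains many very useful PostgreSQL tools and is generally worth installing, for example ltree, uuid, and postgres_fdw.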
Configure passwordless ssh from the main node to the other nodes
./sshUserSetup.sh -user postgres -hosts "lhrpgxl90 lhrpgxl91 lhrpgxl92" -advanced exverify -confirm

sudo chmod 600 /home/postgres/.ssh/config
Cluster configuration
The following only needs to be run on lhrpgxl90.
Generate the pgxc_ctl configuration file
[postgres@lhrpgxl90 ~]$ pgxc_ctl prepare
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
ERROR: File "/postgresxl/bin/pgxc_ctl.conf" not found or not a regular file. No such file or directory
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
   ******** PGXC_CTL START ***************

Current directory: /postgresxl/bin
[postgres@lhrpgxl90 ~]$ ll /postgresxl/bin/pgxc_ctl.conf
-rw-rw-r-- 1 postgres postgres 17815 Feb 21 17:18 /postgresxl/bin/pgxc_ctl.conf
Configure pgxc_ctl.conf
Run this on lhrpgxl90 only.
cat > /postgresxl/bin/pgxc_ctl.conf <<"EOF"

pgxcInstallDir=$PGHOME
pgxlDATA=$PGHOME/data

pgxcOwner=postgres
pgxcUser=postgres
tmpDir=/tmp
localTmpDir=$tmpDir

#==========================================================================================================================

#---- GTM Master ---------------
gtmName=gtm
gtmMasterServer=lhrpgxl90
gtmMasterPort=6666
gtmMasterDir=$pgxlDATA/gtm

#---- Configuration
gtmExtraConfig=none
gtmMasterSpecificExtraConfig=none

#---- GTM Slave configuration
gtmSlave=y                      # Specify y if you configure GTM Slave. Otherwise, GTM slave will not be configured and
                                # all the following variables will be reset.
gtmSlaveName=gtmSlave
gtmSlaveServer=lhrpgxl90        # value none means GTM slave is not available. Give none if you don't configure GTM Slave.
gtmSlavePort=20001              # Not used if you don't configure GTM slave.
gtmSlaveDir=$pgxlDATA/gtmSlave  # Not used if you don't configure GTM slave.

#---- Configuration
gtmSlaveSpecificExtraConfig=none

#==========================================================================================================================

#---- GTM Proxy configuration, ideally one per datanode host
#---- GTM-Proxy Master -------
gtmProxyDir=$pgxlDATA/gtm_proxy
gtmProxy=y
gtmProxyNames=(gtm_pxy1 gtm_pxy2)
gtmProxyServers=(lhrpgxl91 lhrpgxl92)
gtmProxyPorts=(6666 6666)
gtmProxyDirs=($gtmProxyDir $gtmProxyDir)

#---- Configuration
gtmPxyExtraConfig=none
gtmPxySpecificExtraConfig=(none none)

#==========================================================================================================================

#---- Coordinators ---------
coordMasterDir=$pgxlDATA/coord
coordNames=(coord1 coord2)
coordPorts=(5432 5432)
poolerPorts=(6667 6667)
coordPgHbaEntries=(0.0.0.0/0)

coordMasterServers=(lhrpgxl91 lhrpgxl92)
coordMasterDirs=($coordMasterDir $coordMasterDir)
coordMaxWALsernder=0
coordMaxWALSenders=($coordMaxWALsernder $coordMaxWALsernder)

coordSlave=n

#==========================================================================================================================

#---- Datanodes ----------
datanodeMasterDir=$pgxlDATA/dn_master
# NOTE: the template generated by "pgxc_ctl prepare" sets primaryDatanode to a datanode
# name (for example datanode1), not a host name. With the value below, no datanode is
# marked as primary, which matches the pgxc_node output shown later.
primaryDatanode=lhrpgxl91
datanodeNames=(datanode1 datanode2)
datanodePorts=(5433 5433)
datanodePoolerPorts=(6668 6668)
datanodePgHbaEntries=(0.0.0.0/0)

datanodeMasterServers=(lhrpgxl91 lhrpgxl92)
datanodeMasterDirs=($datanodeMasterDir $datanodeMasterDir)
datanodeMaxWalSender=4
datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)

datanodeSlave=n

#==========================================================================================================================

EOF
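pgxc_ctl.conf is a bash fragment in which each group of settings is a set of parallel arrays: entry i of coordNames, coordPorts, poolerPorts, and coordMasterServers together describe one coordinator. A mismatch in array length is an easy mistake to make, so a quick check before "init all" can help. This is a minimal sketch; array_len and check_same_length are hypothetical helpers, not pgxc_ctl features, and the array values are copied from the configuration above.

```shell
#!/usr/bin/env bash
# Print the number of elements in the bash array whose NAME is given.
# Uses indirect expansion, which also works on the bash 4.2 shipped with CentOS 7.
array_len() {
  local ref="$1[@]"
  local items=("${!ref}")
  echo "${#items[@]}"
}

# Succeed only if every named array has as many elements as the first one.
check_same_length() {
  local expected name
  expected=$(array_len "$1")
  for name in "$@"; do
    if [ "$(array_len "$name")" -ne "$expected" ]; then
      echo "$name has $(array_len "$name") entries, expected $expected"
      return 1
    fi
  done
}

# Values copied from the coordinator section of the configuration.
coordNames=(coord1 coord2)
coordPorts=(5432 5432)
poolerPorts=(6667 6667)
coordMasterServers=(lhrpgxl91 lhrpgxl92)

check_same_length coordNames coordPorts poolerPorts coordMasterServers \
  && echo "coordinator arrays are consistent"
```

The same call can be repeated for the gtmProxy* and datanode* arrays after sourcing the real file in a shell where PGHOME is set.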
Initialize the cluster
pgxc_ctl -c /postgresxl/bin/pgxc_ctl.conf init all
Output:
[postgres@lhrpgxl90 ~]$ pgxc_ctl -c /postgresxl/bin/pgxc_ctl.conf init all
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
   ******** PGXC_CTL START ***************

Current directory: /postgresxl/bin
Initialize GTM master
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.

fixing permissions on existing directory /postgresxl/data/gtm ... ok
creating configuration files ... ok
creating control file ... ok

Success.
waiting for server to shut down.... done
server stopped
Done.
Start GTM master
server starting
Initialize GTM slave
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.

fixing permissions on existing directory /postgresxl/data/gtmSlave ... ok
creating configuration files ... ok
creating control file ... ok

Success.
Done.
Start GTM slaveserver starting
Done.
Initialize all the gtm proxies.
Initializing gtm proxy gtm_pxy1.
Initializing gtm proxy gtm_pxy2.
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.

fixing permissions on existing directory /postgresxl/data/gtm_proxy ... ok
creating configuration files ... ok

Success.
The files belonging to this GTM system will be owned by user "postgres".
This user must also own the server process.

fixing permissions on existing directory /postgresxl/data/gtm_proxy ... ok
creating configuration files ... ok

Success.
Done.
Starting all the gtm proxies.
Starting gtm proxy gtm_pxy1.
Starting gtm proxy gtm_pxy2.
server starting
server starting
Done.
Initialize all the coordinator masters.
Initialize coordinator master coord1.
Initialize coordinator master coord2.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /postgresxl/data/coord ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /postgresxl/data/coord ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
Done.
Starting coordinator master.
Starting coordinator master coord1
Starting coordinator master coord2
2022-02-22 15:22:42.142 CST [23303] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-02-22 15:22:42.142 CST [23303] LOG: listening on IPv6 address "::", port 5432
2022-02-22 15:22:42.203 CST [23303] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-02-22 15:22:42.291 CST [23304] LOG: database system was shut down at 2022-02-22 15:22:38 CST
2022-02-22 15:22:42.322 CST [23303] LOG: database system is ready to accept connections
2022-02-22 15:22:42.323 CST [23311] LOG: cluster monitor started
2022-02-22 15:22:42.142 CST [23180] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-02-22 15:22:42.142 CST [23180] LOG: listening on IPv6 address "::", port 5432
2022-02-22 15:22:42.203 CST [23180] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-02-22 15:22:42.291 CST [23181] LOG: database system was shut down at 2022-02-22 15:22:38 CST
2022-02-22 15:22:42.322 CST [23180] LOG: database system is ready to accept connections
2022-02-22 15:22:42.323 CST [23188] LOG: cluster monitor started
Done.
Initialize all the datanode masters.
Initialize the datanode master datanode1.
Initialize the datanode master datanode2.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /postgresxl/data/dn_master ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /postgresxl/data/dn_master ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... creating cluster information ... ok
syncing data to disk ... ok
freezing database template0 ... ok
freezing database template1 ... ok
freezing database postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success.
Done.
Starting all the datanode masters.
Starting datanode master datanode1.
Starting datanode master datanode2.
2022-02-22 15:22:50.478 CST [23764] LOG: listening on IPv4 address "0.0.0.0", port 5433
2022-02-22 15:22:50.478 CST [23764] LOG: listening on IPv6 address "::", port 5433
2022-02-22 15:22:50.555 CST [23764] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2022-02-22 15:22:50.665 CST [23764] LOG: redirecting log output to logging collector process
2022-02-22 15:22:50.665 CST [23764] HINT: Future log output will appear in directory "pg_log".
2022-02-22 15:22:50.478 CST [23641] LOG: listening on IPv4 address "0.0.0.0", port 5433
2022-02-22 15:22:50.478 CST [23641] LOG: listening on IPv6 address "::", port 5433
2022-02-22 15:22:50.522 CST [23641] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2022-02-22 15:22:50.625 CST [23641] LOG: redirecting log output to logging collector process
2022-02-22 15:22:50.625 CST [23641] HINT: Future log output will appear in directory "pg_log".
Done.
ALTER NODE coord1 WITH (HOST='lhrpgxl91', PORT=5432);
ALTER NODE
CREATE NODE coord2 WITH (TYPE='coordinator', HOST='lhrpgxl92', PORT=5432);
CREATE NODE
CREATE NODE datanode1 WITH (TYPE='datanode', HOST='lhrpgxl91', PORT=5433, PREFERRED);
CREATE NODE
CREATE NODE datanode2 WITH (TYPE='datanode', HOST='lhrpgxl92', PORT=5433);
CREATE NODE
SELECT pgxc_pool_reload();
 pgxc_pool_reload
------------------
 t
(1 row)

CREATE NODE coord1 WITH (TYPE='coordinator', HOST='lhrpgxl91', PORT=5432);
CREATE NODE
ALTER NODE coord2 WITH (HOST='lhrpgxl92', PORT=5432);
ALTER NODE
CREATE NODE datanode1 WITH (TYPE='datanode', HOST='lhrpgxl91', PORT=5433);
CREATE NODE
CREATE NODE datanode2 WITH (TYPE='datanode', HOST='lhrpgxl92', PORT=5433, PREFERRED);
CREATE NODE
SELECT pgxc_pool_reload();
 pgxc_pool_reload
------------------
 t
(1 row)

Done.
EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''lhrpgxl91'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''lhrpgxl92'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'ALTER NODE datanode1 WITH (TYPE=''datanode'', HOST=''lhrpgxl91'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'CREATE NODE datanode2 WITH (TYPE=''datanode'', HOST=''lhrpgxl92'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode1) 'SELECT pgxc_pool_reload()';
 pgxc_pool_reload
------------------
 t
(1 row)

EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''lhrpgxl91'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''lhrpgxl92'', PORT=5432)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'CREATE NODE datanode1 WITH (TYPE=''datanode'', HOST=''lhrpgxl91'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'ALTER NODE datanode2 WITH (TYPE=''datanode'', HOST=''lhrpgxl92'', PORT=5433, PREFERRED)';
EXECUTE DIRECT
EXECUTE DIRECT ON (datanode2) 'SELECT pgxc_pool_reload()';
 pgxc_pool_reload
------------------
 t
(1 row)

Done.

[postgres@lhrpgxl90 ~]$ pgxc_ctl
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
   ******** PGXC_CTL START ***************

Current directory: /postgresxl/bin
PGXC show config all
========= Postgres-XL configuration Common Info ========================
=== Overall ===
Postgres-XL owner: postgres
Postgres-XL user: postgres
Postgres-XL install directory: /postgresxl
pgxc_ctl home: /postgresxl/bin
pgxc_ctl configuration file: /postgresxl/bin/pgxc_ctl.conf
pgxc_ctl tmpDir: /tmp
pgxc_ctl localTempDir: /tmp
pgxc_ctl log file: /home/postgres/pgxc_ctl/pgxc_log/24719_pgxc_ctl.log
pgxc_ctl configBackup: n
pgxc_ctl configBackupHost: none
pgxc_ctl configBackupFile: none
========= Postgres-XL configuration End Common Info ===================
====== Server: lhrpgxl90 =======
GTM Master:
    Nodename: 'gtm', port: 6666, dir: '/postgresxl/data/gtm' ExtraConfig: 'none', Specific Extra Config: 'none'
GTM Slave:
    Nodename: 'gtmSlave', port: 20001, dir: '/postgresxl/data/gtmSlave' ExtraConfig: 'none', Specific Extra Config: 'none'
====== Server: lhrpgxl91 =======
GTM Proxy:
    Nodename: 'gtm_pxy1', port: 6666, dir: '/postgresxl/data/gtm_proxy' ExtraConfig: 'none', Specific Extra Config: 'none'
Coordinator Master:
    Nodename: 'coord1', port: 5432, pooler port: 6667
    MaxWalSenders: 0, Dir: '/postgresxl/data/coord'
    ExtraConfig: '(null)', Specific Extra Config: '(null)'
    pg_hba entries ( '0.0.0.0/0' )
    Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)'
Datanode Master:
    Nodename: 'datanode1', port: 5433, pooler port 6667
    MaxWALSenders: 4, Dir: '/postgresxl/data/dn_master'
    ExtraConfig: '(null)', Specific Extra Config: '(null)'
    pg_hba entries ( '0.0.0.0/0' )
    Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)'
====== Server: lhrpgxl92 =======
GTM Proxy:
    Nodename: 'gtm_pxy2', port: 6666, dir: '/postgresxl/data/gtm_proxy' ExtraConfig: 'none', Specific Extra Config: 'none'
Coordinator Master:
    Nodename: 'coord2', port: 5432, pooler port: 6667
    MaxWalSenders: 0, Dir: '/postgresxl/data/coord'
    ExtraConfig: '(null)', Specific Extra Config: '(null)'
    pg_hba entries ( '0.0.0.0/0' )
    Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)'
Datanode Master:
    Nodename: 'datanode2', port: 5433, pooler port 6667
    MaxWALSenders: 4, Dir: '/postgresxl/data/dn_master'
    ExtraConfig: '(null)', Specific Extra Config: '(null)'
    pg_hba entries ( '0.0.0.0/0' )
    Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)'
PGXC monitor all
Running: gtm master
Running: gtm slave
Running: gtm proxy gtm_pxy1
Running: gtm proxy gtm_pxy2
Running: coordinator master coord1
Running: coordinator master coord2
Running: datanode master datanode1
Running: datanode master datanode2
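The "monitor all" output lends itself to a simple health check in a script. The sketch below assumes that a stopped component is reported with a "Not running: " prefix, in contrast to the "Running: " prefix shown above; that prefix is an assumption to verify against your pgxc_ctl version. report_not_running is a hypothetical helper name, and the sample text is made up to exercise the failure path.

```shell
#!/usr/bin/env bash
# Read "pgxc_ctl monitor all" output on stdin, list every component that is
# not running, and return non-zero if at least one was found.
# ASSUMPTION: stopped components appear as "Not running: <component>".
report_not_running() {
  local line bad=0
  while IFS= read -r line; do
    case "$line" in
      "Running: "*) ;;
      "Not running: "*) echo "DOWN: ${line#Not running: }"; bad=1 ;;
    esac
  done
  return "$bad"
}

# Made-up sample with one failed component.
sample='Running: gtm master
Running: coordinator master coord1
Not running: datanode master datanode2'

printf '%s\n' "$sample" | report_not_running || echo "cluster degraded"
```

Against a live cluster the input would come from something like "pgxc_ctl monitor all | report_not_running"; lines that match neither prefix (the pgxc_ctl banner, for instance) are ignored.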
View cluster information
On node lhrpgxl91, run psql -p 5432 to connect to the database.
[root@lhrpgxl91 /]# su - postgres
Last login: Mon Feb 21 17:11:45 CST 2022 on pts/0
[postgres@lhrpgxl91 ~]$ psql -p 5432
psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2))
Type "help" for help.

postgres=#
postgres=# select * from pgxc_node;
 node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id
-----------+-----------+-----------+-----------+----------------+------------------+-------------
 coord1    | C         |      5432 | lhrpgxl91 | f              | f                |  1885696643
 coord2    | C         |      5432 | lhrpgxl92 | f              | f                | -1197102633
 datanode1 | D         |      5433 | lhrpgxl91 | f              | t                |   888802358
 datanode2 | D         |      5433 | lhrpgxl92 | f              | f                |  -905831925
(4 rows)

-- In node_type, C stands for coordinator and D for datanode

postgres=# create database lhrdb;
CREATE DATABASE
postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 lhrdb     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

postgres=# \c lhrdb
psql (14.0, server 10beta3 (Postgres-XL 10alpha2))
You are now connected to database "lhrdb" as user "postgres".
lhrdb=# create table test1(id int,name text);
CREATE TABLE
lhrdb=# insert into test1(id,name) select generate_series(1,8),'test';
INSERT 0 8
lhrdb=#
lhrdb=# select count(*) from test1;
 count
-------
     8
(1 row)
lhrdb=# SELECT xc_node_id, count(*) FROM test1 GROUP BY xc_node_id;
 xc_node_id | count
------------+-------
 -905831925 |     3
  888802358 |     5
(2 rows)

[postgres@lhrpgxl91 ~]$ psql -p 5433 -d lhrdb
psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2))
Type "help" for help.

lhrdb=# select count(*) from test1;
 count
-------
     5
(1 row)

lhrdb=#

[postgres@lhrpgxl91 ~]$ psql -p 5433 -d lhrdb -h lhrpgxl92
psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2))
Type "help" for help.

lhrdb=# select count(*) from test1;
 count
-------
     3
(1 row)

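Note: all the data nodes together make up the complete view of the data, so if a single data node goes down the whole PGXL cluster becomes unusable. In production, to improve availability, always configure hot standbys for the data nodes so that failover is possible.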
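The configuration used in this walkthrough sets datanodeSlave=n. As a rough sketch of what enabling datanode standbys in pgxc_ctl.conf might look like, the fragment below uses the variable names from the template that "pgxc_ctl prepare" generates; check them against your generated file. The values are illustrative assumptions: port 15433 comes from the planning table, while the pooler port 16668, the archive directory name, and the cross-host placement (each standby on the other machine, unlike the planning table) are choices made here for the example.

```shell
# Hypothetical additions to pgxc_ctl.conf (bash syntax, values are examples only).
pgxlDATA=${pgxlDATA:-/postgresxl/data}      # already defined earlier in the real file

datanodeSlave=y
datanodeSlaveDir=$pgxlDATA/dn_slave
datanodeArchLogDir=$pgxlDATA/datanode_archlog

# Entry i is the standby of datanodeNames[i]; each standby is placed on the
# other host so that losing one machine still leaves a copy of both datanodes.
datanodeSlaveServers=(lhrpgxl92 lhrpgxl91)
datanodeSlavePorts=(15433 15433)
datanodeSlavePoolerPorts=(16668 16668)
datanodeSlaveSync=y
datanodeSlaveDirs=($datanodeSlaveDir $datanodeSlaveDir)
datanodeArchLogDirs=($datanodeArchLogDir $datanodeArchLogDir)
```

The masters already have datanodeMaxWalSender=4 in the configuration above, which leaves room for a streaming standby per datanode.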
Processes and ports
[postgres@lhrpgxl90 ~]$ netstat -tulnp | grep gtm
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:20001 0.0.0.0:* LISTEN 3002/gtm
tcp 0 0 0.0.0.0:6666 0.0.0.0:* LISTEN 2885/gtm
tcp6 0 0 :::20001 :::* LISTEN 3002/gtm
tcp6 0 0 :::6666 :::* LISTEN 2885/gtm
[postgres@lhrpgxl90 ~]$ ps -ef|grep gtm
postgres 2885 1 0 10:00 ? 00:00:00 gtm -D /postgresxl/data/gtm
postgres 3002 1 0 10:00 ? 00:00:00 gtm -D /postgresxl/data/gtmSlave
postgres 3291 485 0 10:02 pts/0 00:00:00 grep --color=auto gtm

[postgres@lhrpgxl91 ~]$ netstat -tulnp | grep "gtm\|postgres"
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:6666 0.0.0.0:* LISTEN 2827/gtm_proxy
tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 2929/postgres
tcp 0 0 0.0.0.0:5433 0.0.0.0:* LISTEN 3038/postgres
tcp6 0 0 :::6666 :::* LISTEN 2827/gtm_proxy
tcp6 0 0 :::5432 :::* LISTEN 2929/postgres
tcp6 0 0 :::5433 :::* LISTEN 3038/postgres
[postgres@lhrpgxl91 ~]$ ps -ef|grep postgres
root 1749 295 0 09:46 pts/0 00:00:00 su - postgres
postgres 1750 1749 0 09:46 pts/0 00:00:00 -bash
postgres 2827 1 0 10:00 ? 00:00:00 gtm_proxy -D /postgresxl/data/gtm_proxy
postgres 2929 1 0 10:00 ? 00:00:00 /postgresxl/bin/postgres --coordinator -D /postgresxl/data/coord -i
postgres 2931 2929 0 10:00 ? 00:00:00 postgres: pooler process
postgres 2932 2929 0 10:00 ? 00:00:00 postgres: checkpointer process
postgres 2933 2929 0 10:00 ? 00:00:00 postgres: writer process
postgres 2934 2929 0 10:00 ? 00:00:00 postgres: wal writer process
postgres 2935 2929 0 10:00 ? 00:00:00 postgres: autovacuum launcher process
postgres 2936 2929 0 10:00 ? 00:00:00 postgres: stats collector process
postgres 2937 2929 0 10:00 ? 00:00:00 postgres: cluster monitor process
postgres 2938 2929 0 10:00 ? 00:00:00 postgres: bgworker: logical replication launcher
postgres 3038 1 0 10:00 ? 00:00:00 /postgresxl/bin/postgres --datanode -D /postgresxl/data/dn_master -i
postgres 3039 3038 0 10:00 ? 00:00:00 postgres: logger process
postgres 3042 3038 0 10:00 ? 00:00:00 postgres: pooler process
postgres 3043 3038 0 10:00 ? 00:00:00 postgres: checkpointer process
postgres 3044 3038 0 10:00 ? 00:00:00 postgres: writer process
postgres 3045 3038 0 10:00 ? 00:00:00 postgres: wal writer process
postgres 3046 3038 0 10:00 ? 00:00:00 postgres: autovacuum launcher process
postgres 3047 3038 0 10:00 ? 00:00:00 postgres: stats collector process
postgres 3048 3038 0 10:00 ? 00:00:00 postgres: cluster monitor process
postgres 3049 3038 0 10:00 ? 00:00:00 postgres: bgworker: logical replication launcher
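Creating tables
REPLICATION tables: every datanode holds exactly the same data for the table. An insert is therefore applied on each datanode, and a read only needs to fetch the data from any one datanode.
lhrdb=# CREATE TABLE repltab (col1 int, col2 int) DISTRIBUTE BY REPLICATION;
DISTRIBUTE (distributed) tables: inserted rows are assigned to different datanodes according to the distribution rule, in other words sharding. Each datanode stores only part of the data, and the complete view of the data is available by querying through a coordinator.
lhrdb=# CREATE TABLE disttab(col1 int, col2 int, col3 text) DISTRIBUTE BY HASH(col1);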
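To make the idea of a distribution rule concrete, here is a deliberately simplified sketch of hash-style routing. It is not how Postgres-XL computes placement: the real system applies its own hash function to the column value, whereas this illustration uses a plain modulo on an integer key. route_row is a hypothetical name.

```shell
#!/usr/bin/env bash
# Simplified stand-in for DISTRIBUTE BY HASH(col1): pick a datanode from the
# integer key with a plain modulo. Postgres-XL's real hash function differs,
# so actual row placement will not match this output.
datanodes=(datanode1 datanode2)

route_row() {
  local key=$1
  echo "${datanodes[key % ${#datanodes[@]}]}"
}

for id in 1 2 3 4; do
  echo "col1=$id -> $(route_row "$id")"
done
```

The point is only that the target node is a pure function of the distribution column, which is why a coordinator can send a query with "WHERE col1 = ..." to a single datanode, while a replicated table needs no routing at all for reads.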
Insert some test data:
# Log in to any coordinator and insert the data
[postgres@lhrpgxl91 ~]$ psql -p 5432 -d lhrdb
lhrdb=# INSERT INTO disttab SELECT generate_series(1,100), generate_series(101, 200), 'foo';
INSERT 0 100
lhrdb=# INSERT INTO repltab SELECT generate_series(1,100), generate_series(101, 200);
INSERT 0 100
Check how the data is distributed:
# Distribution of the DISTRIBUTE table
lhrdb=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id;
 xc_node_id | count
------------+-------
 -905831925 |    58
  888802358 |    42
(2 rows)

lhrdb=# select oid,* from pgxc_node;
  oid  | node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id
-------+-----------+-----------+-----------+-----------+----------------+------------------+-------------
 11739 | coord1    | C         |      5432 | lhrpgxl91 | f              | f                |  1885696643
 16384 | coord2    | C         |      5432 | lhrpgxl92 | f              | f                | -1197102633
 16385 | datanode1 | D         |      5433 | lhrpgxl91 | f              | t                |   888802358
 16386 | datanode2 | D         |      5433 | lhrpgxl92 | f              | f                |  -905831925
(4 rows)

# Distribution of the REPLICATION table

lhrdb=# SELECT xc_node_id, count(*) FROM repltab GROUP BY xc_node_id;
 xc_node_id  | count
-------------+-------
 -1151313560 |   100
(1 row)
Check the repltab table on the other datanode, datanode2:
[postgres@lhrpgxl92 ~]$ psql -p 5433 -d lhrdb
psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2))
Type "help" for help.

lhrdb=# SELECT count(*) FROM repltab;
 count
-------
   100
(1 row)

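Conclusion: for the REPLICATION table, datanode1 and datanode2 each hold the full, identical data set. For the DISTRIBUTE table, the rows are spread across datanode1 and datanode2 in roughly equal shares (58 and 42 rows out of 100 here).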
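To quantify how even a distribution is, the per-node counts from the "GROUP BY xc_node_id" query can be turned into percentages. A minimal sketch follows; distribution_share is a hypothetical helper, and the two input rows are the disttab counts shown above.

```shell
#!/usr/bin/env bash
# Read "node_id count" pairs on stdin and print each node's share of all rows,
# in input order.
distribution_share() {
  awk '{ id[NR] = $1; n[NR] = $2; total += $2 }
       END { for (i = 1; i <= NR; i++) printf "%s %.1f%%\n", id[i], 100 * n[i] / total }'
}

# Counts taken from the disttab query result.
printf '%s\n' '-905831925 58' '888802358 42' | distribution_share
```

For the counts above this prints shares of 58.0% and 42.0%. With only 100 rows some imbalance is expected; the shares normally move closer together as the table grows, provided the distribution column has many distinct values.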
启动和关闭集群
以后启动,直接执行如下命令:
[postgres@lhrpgxl90 ~]$ pgxc_ctl start all
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
 ******** PGXC_CTL START ***************

Current directory: /postgresxl/bin
Start GTM master
server starting
Start GTM slaveserver starting
Done.
Starting all the gtm proxies.
Starting gtm proxy gtm_pxy1.
Starting gtm proxy gtm_pxy2.
server starting
server starting
Done.
Starting coordinator master.
Starting coordinator master coord1
Starting coordinator master coord2
2022-02-22 15:31:29.336 CST [24824] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-02-22 15:31:29.336 CST [24824] LOG: listening on IPv6 address "::", port 5432
2022-02-22 15:31:29.401 CST [24824] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-02-22 15:31:29.571 CST [24825] LOG: database system was shut down at 2022-02-22 15:31:00 CST
2022-02-22 15:31:29.617 CST [24824] LOG: database system is ready to accept connections
2022-02-22 15:31:29.618 CST [24832] LOG: cluster monitor started
2022-02-22 15:31:29.336 CST [24697] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-02-22 15:31:29.336 CST [24697] LOG: listening on IPv6 address "::", port 5432
2022-02-22 15:31:29.401 CST [24697] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-02-22 15:31:29.571 CST [24698] LOG: database system was shut down at 2022-02-22 15:31:00 CST
2022-02-22 15:31:29.617 CST [24697] LOG: database system is ready to accept connections
2022-02-22 15:31:29.618 CST [24706] LOG: cluster monitor started
Done.
Starting all the datanode masters.
Starting datanode master datanode1.
Starting datanode master datanode2.
2022-02-22 15:31:31.157 CST [24933] LOG: listening on IPv4 address "0.0.0.0", port 5433
2022-02-22 15:31:31.157 CST [24933] LOG: listening on IPv6 address "::", port 5433
2022-02-22 15:31:31.220 CST [24933] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2022-02-22 15:31:31.322 CST [24933] LOG: redirecting log output to logging collector process
2022-02-22 15:31:31.322 CST [24933] HINT: Future log output will appear in directory "pg_log".
2022-02-22 15:31:31.157 CST [24807] LOG: listening on IPv4 address "0.0.0.0", port 5433
2022-02-22 15:31:31.157 CST [24807] LOG: listening on IPv6 address "::", port 5433
2022-02-22 15:31:31.220 CST [24807] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2022-02-22 15:31:31.322 CST [24807] LOG: redirecting log output to logging collector process
2022-02-22 15:31:31.322 CST [24807] HINT: Future log output will appear in directory "pg_log".
Done.
To stop the cluster:
[postgres@lhrpgxl90 ~]$ pgxc_ctl stop all
/usr/bin/bash
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash.
Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf
Finished reading configuration.
 ******** PGXC_CTL START ***************

Current directory: /postgresxl/bin
Stopping all the coordinator masters.
Stopping coordinator master coord1.
Stopping coordinator master coord2.
Done.
Stopping all the datanode masters.
Stopping datanode master datanode1.
Stopping datanode master datanode2.
Done.
Stopping all the gtm proxies.
Stopping gtm proxy gtm_pxy1.
Stopping gtm proxy gtm_pxy2.
waiting for server to shut down.... done
server stopped
waiting for server to shut down.... done
server stopped
Done.
Stop GTM slave
waiting for server to shut down.... done
server stopped
Stop GTM master
waiting for server to shut down.... done
server stopped
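If the start and stop commands need to be driven from a script, a thin wrapper is one option. The Python sketch below is only an illustration: the function names pgxc_ctl_command and run_cluster_action are invented here, it covers only the two invocations shown above (pgxc_ctl start all and pgxc_ctl stop all), and it defaults to a dry run so that nothing is executed unless explicitly requested.

```python
import shutil
import subprocess


def pgxc_ctl_command(action):
    """Build the argv for the two whole-cluster commands shown above."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action!r}")
    return ["pgxc_ctl", action, "all"]


def run_cluster_action(action, dry_run=True):
    """Return the command line; execute it only when dry_run is False."""
    argv = pgxc_ctl_command(action)
    if dry_run:
        return argv
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError("pgxc_ctl not found on PATH")
    # check=True raises CalledProcessError if pgxc_ctl exits with an error.
    subprocess.run(argv, check=True)
    return argv


# Dry run: only prints the command that would be executed.
print(" ".join(run_cluster_action("start")))
```

Calling run_cluster_action("stop", dry_run=False) as the postgres user on the pgxc_ctl host would then run the same stop command as in the listing above.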
Those are the main commands for now; see pgxc_ctl --help for more information.
[postgres@lhrpgxl90 ~]$ pgxc_ctl --help
/usr/bin/bash
pgxc_ctl [option ...] [command]
option:
 -c or --configuration conf_file: Specify configruration file.
 -v or --verbose: Specify verbose output.
 -V or --version: Print version and exit.
 -l or --logdir log_directory: specifies what directory to write logs.
 -L or --logfile log_file: Specifies log file.
 --home home_direcotry: Specifies pgxc_ctl work director.
 -i or --infile input_file: Specifies inptut file.
 -o or --outfile output_file: Specifies output file.
 -h or --help: Prints this message and exits.
For more deatils, refer to pgxc_ctl reference manual included in
postgres-xc reference manual.
References
https://www.jianshu.com/p/82aaf352b772