
Online Expansion and Shrinking of a KingbaseES (KES) V9 RWC Cluster

Original article by 飞天, 2025-02-27

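I. Introduction to Online Expansion and Shrinking

KingbaseES provides a database scaling tool for expanding and shrinking a database cluster online. For servers without GUI support, it also offers a command-line method. This article describes how to use the command line to expand and shrink a KES V9 RWC cluster online.

For deploying a one-primary, one-standby RWC cluster, see: KingbaseES(KES)V9 RWC集群部署实战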

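II. KES V9 RWC Cluster Environment

The following two-node RWC cluster (one primary, one standby) already exists:

Hostname | IP address | OS version | Memory, CPU           | Node role | Database port | Cluster install directory | Data directory
---------+------------+------------+-----------------------+-----------+---------------+---------------------------+---------------
node1    | 192.*.*.60 | CentOS 7.9 | 4 GB, 1 dual-core CPU | Primary   | 54321         | /opt/kes/v9               | /data/cluster
node2    | 192.*.*.62 | CentOS 7.9 | 4 GB, 1 dual-core CPU | Standby   | 54321         | /opt/kes/v9               | /data/cluster

Cluster VIP address: 192.*.*.64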

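III. Expanding the KES V9 RWC Cluster

Requirement: add node3 as a standby to the existing two-node (one primary, one standby) RWC cluster:

Hostname | IP address | OS version | Memory, CPU           | Node role | Database port | Cluster install directory | Data directory
---------+------------+------------+-----------------------+-----------+---------------+---------------------------+---------------
node3    | 192.*.*.66 | CentOS 7.9 | 4 GB, 1 dual-core CPU | Standby   | 54321         | /opt/kes/v9               | /data/cluster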

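Detailed expansion steps

1. Prepare the operating system environment on node3, the node to be added
See the <安装前环境准备> (pre-installation environment preparation) section of KingbaseES(KES)V9 RWC集群部署实战.

[Note] Add the node3 entry to /etc/hosts on all three hosts (node1, node2, and node3):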

192.*.*.66 node3
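
Since the same entry has to go into /etc/hosts on three hosts, an idempotent helper avoids duplicate lines when the step is repeated. The following is a minimal sketch, not part of the KingbaseES tooling: the function name add_host_entry is hypothetical, and the masked address 192.*.*.66 has to be replaced with the real IP of node3.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME
# is not already mapped on some non-comment line.
add_host_entry() {
    local hosts_file="$1" ip="$2" name="$3"
    # Look for NAME as a whole field (column 2 or later) on any non-comment line.
    if awk -v n="$name" '!/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$hosts_file"; then
        echo "entry for $name already present in $hosts_file"
    else
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
        echo "added $ip $name to $hosts_file"
    fi
}
```

Run as root on each of node1, node2, and node3, for example `add_host_entry /etc/hosts <node3-ip> node3`.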

Before preparing the files, check the current cluster status on node1 or node2:

[kingbase@node1 ~]$ repmgr service status
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 77003 | no      | n/a
 2  | node2 | standby |   running | node1    | running | 43158 | no      | 1 second(s) ago
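
The status table can also be checked mechanically before going further. The sketch below is an illustration only: check_service_status is a hypothetical function that reads the `repmgr service status` table from standard input and assumes the column layout shown above (Status in column 4, repmgrd in column 6, Paused? in column 8).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if every node row reports a running
# database, a running repmgrd, and an unpaused repmgrd.
check_service_status() {
    awk -F'|' '
        # Data rows start with a numeric node ID; header and separator rows do not.
        $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            rows++
            name = $2; status = $4; daemon = $6; paused = $8
            gsub(/[ \t]/, "", name); gsub(/[ \t]/, "", status)
            gsub(/[ \t]/, "", daemon); gsub(/[ \t]/, "", paused)
            # The primary is shown as "* running", standbys as "running".
            healthy = (status == "running" || status == "*running") && daemon == "running" && paused == "no"
            if (!healthy) { bad++; print "node needs attention: " name }
        }
        # Fail on an empty table as well, so a failed repmgr call is not mistaken for success.
        END { exit (rows == 0 || bad > 0) }
    '
}
```

Typical use as the kingbase user: `repmgr service status | check_service_status && echo "cluster looks healthy"`.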

2. Prepare the files required for expansion
From node1, the current primary node of the cluster, obtain the files db.zip, license.dat, install.conf, cluster_install.sh, and trust_cluster.sh, and copy them to node3, the host to be added.

# Go to {KES software install directory}/KESRealPro/V009R001C002B0014/ClientTools/guitools/DeployTools/zip
[root@node1 ~]# cd /opt/kes/v9/KESRealPro/V009R001C002B0014/ClientTools/guitools/DeployTools/zip/
[root@node1 zip]# ll
total 322412
-rwxrwxr-x 1 kingbase kingbase    252402 Sep 23 18:41 cluster_install.sh
-rw-rw-r-- 1 kingbase kingbase 327258132 Sep 23 18:41 db.zip
-rw-rw-r-- 1 kingbase kingbase     19580 Jan 18 12:24 install.conf
-rw-rw-r-- 1 kingbase kingbase      3676 Jan 18 11:47 license.dat
-rw-rw-r-- 1 kingbase kingbase   2595145 Sep 23 18:41 securecmdd.zip
-rwxrwxr-x 1 kingbase kingbase      9677 Sep 23 18:41 trust_cluster.sh
[root@node1 zip]#
# Copy the files to node3, the node to be added
[root@node1 zip]# scp * node3:/soft
# Log in to node3 and change the ownership of the files needed for expansion:
[root@node3 ~]# chown -R kingbase:kingbase /soft/*
[root@node3 ~]# ll /soft/*
total 322432
-rwxr-xr-x 1 kingbase kingbase    252402 Feb 27 17:59 cluster_install.sh
-rw-r--r-- 1 kingbase kingbase 327258132 Feb 27 17:59 db.zip
-rw-r--r-- 1 kingbase kingbase     19678 Feb 27 20:40 install.conf
-rw-r--r-- 1 kingbase kingbase      3676 Feb 27 17:59 license.dat
-rw-r--r-- 1 kingbase kingbase   2595145 Feb 27 17:59 securecmdd.zip
-rwxr-xr-x 1 kingbase kingbase      9677 Feb 27 17:59 trust_cluster.sh
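
A quick check that the copy is complete can save a failed expansion run. This is a minimal sketch with a hypothetical function name, check_expand_files. It only tests for the five files named in this step and for the execute bit on the two scripts; it does not validate their contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that the directory holds the five files
# listed in this step and that the two scripts are executable.
check_expand_files() {
    local dir="$1" problems=0 f
    for f in db.zip license.dat install.conf cluster_install.sh trust_cluster.sh; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            problems=1
        fi
    done
    for f in cluster_install.sh trust_cluster.sh; do
        if [ -f "$dir/$f" ] && [ ! -x "$dir/$f" ]; then
            echo "not executable: $dir/$f"
            problems=1
        fi
    done
    return "$problems"
}
```

On node3 this would be called as `check_expand_files /soft`.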

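All of the following operations are performed on node3, the node to be added.

3. Configure the install.conf file
3.1 Edit the parameters under the install tag in install.conf

Add the IP address of node3, the host to be added, to the all_ip line:

[root@node3 ~]# cd /soft
[root@node3 soft]# vi install.conf
# Add the IP of node3 to the all_ip line: 192.*.*.66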
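
The all_ip edit can also be scripted instead of done in vi. The sketch below rests on an assumption that this article does not confirm: that the all_ip line uses the same parenthesized, space-separated form as net_device=(ens33) in the same file, for example all_ip=(addr1 addr2). The function name add_ip_to_all_ip is hypothetical, and GNU sed is assumed (as shipped with CentOS 7). Check the real line in install.conf before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper. ASSUMPTION: the line looks like all_ip=(addr1 addr2).
add_ip_to_all_ip() {
    local conf="$1" ip="$2" line current
    line=$(grep -m1 '^all_ip=(' "$conf") || {
        echo "no all_ip=(...) line found in $conf" >&2
        return 1
    }
    # Extract the text between the parentheses.
    current=${line#'all_ip=('}
    current=${current%%')'*}
    case " $current " in
        *" $ip "*) echo "$ip is already listed in all_ip"; return 0 ;;
    esac
    # Append the address inside the parentheses of the first all_ip line only.
    sed -i "0,/^all_ip=(/ s|^all_ip=(\([^)]*\))|all_ip=(\1 $ip)|" "$conf"
}
```

For example, `add_ip_to_all_ip /soft/install.conf <node3-ip>`; running it a second time leaves the file unchanged.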


3.2 Edit the parameters under the expand tag in install.conf

[expand]
expand_type="0" # The node type of standby/witness node, which would be add to cluster. 0:standby 1:witness
primary_ip="192.*.*.60" # The ip addr of cluster primary node, which need to expand a standby/witness node.
expand_ip="192.*.*.66" # The ip addr of standby/witness node, which would be add to cluster.
node_id="3" # The node_id of standby/witness node, which would be add to cluster. It does not the same with any one in cluster node
# for example: node_id="3"
sync_type="" # the sync_type parameter is used to specify the sync type for expand node. 0:sync 1:potential 2:async
# this parameter is only valid when expand_type="0" and the synchronous parameter of the cluster is set to custom mode.
## Specific instructions ,see it under [install]
install_dir="/opt/kes/v9" # the last layer of directory could not add '/'
zip_package="/soft/db.zip"
net_device=(ens33) # if virtual_ip set,it must be set
net_device_ip=(192.*.*.66) # if virtual_ip set,it must be set
license_file=(license.dat)
deploy_by_sshd="1"
ssh_port="22"
scmd_port="8890"
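
Because install.conf repeats key names such as primary_ip, node_id, and install_dir in more than one section, it helps to read a value back from a specific section before running the script. The following is a small sketch, assuming the key="value" # comment layout shown above; get_conf_value is a hypothetical name, not a KingbaseES utility.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of KEY inside [SECTION] of install.conf,
# without the trailing comment and without surrounding quotes or parentheses.
get_conf_value() {
    local conf="$1" section="$2" key="$3"
    awk -v s="[$section]" -v k="$key" '
        # A line starting with [name] switches the current section.
        /^\[[^]]*\]/ { in_section = ($1 == s); next }
        in_section && index($0, k "=") == 1 {
            v = substr($0, length(k) + 2)
            sub(/[ \t]*#.*/, "", v)        # drop the trailing comment
            gsub(/^["(]|[")]$/, "", v)     # drop surrounding quotes or parentheses
            print v
            exit
        }
    ' "$conf"
}
```

For example, `get_conf_value /soft/install.conf expand node_id` should print 3 for the configuration above, and the printed expand_ip can be compared against the output of `ip addr` on node3.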

[Note] To change the SSH connection port, first change the ssh_port value in install.conf, then change the Port value in the system file /etc/ssh/sshd_config, and finally restart the sshd service.

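4. Configure passwordless SSH
On node3, the host to be added, configure passwordless SSH between all nodes for both the root and kingbase users, as follows:

# Configure passwordless SSH
[root@node3 soft]# ./trust_cluster.sh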


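5. Expand the cluster
The expansion succeeds whether it is run as root or as kingbase. This article runs "cluster_install.sh expand" as the kingbase user; the script then completes the cluster expansion automatically according to the configuration.
[Note] The expansion automatically creates the cluster install directory /opt/kes/v9, but by default the kingbase user has no permission to create files under /opt. Create the /opt/kes directory in advance and set its owner to kingbase:kingbase. When expanding as root, there is no need to create /opt/kes beforehand.

As root, create the directory and set its ownership:

[root@node3 ~]# mkdir /opt/kes
[root@node3 ~]# chown -R kingbase:kingbase /opt/kes

Run the expansion as the kingbase user:

[kingbase@node3 soft]$ ./cluster_install.sh expand

The expansion log is as follows: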

[kingbase@node3 soft]$ ./cluster_install.sh expand
[CONFIG_CHECK] will deploy the cluster of
[RUNNING] success connect to the target "192.*.*.66" ..... OK
[RUNNING] success connect to "192.*.*.66" from current node by 'ssh' ..... OK
[RUNNING] success connect to the target "192.*.*.60" ..... OK
[RUNNING] success connect to "192.*.*.60" from current node by 'ssh' ..... OK
[RUNNING] Primary node ip is 192.*.*.60 ...
[RUNNING] Primary node ip is 192.*.*.60 ... OK
[CONFIG_CHECK] set install_with_root=1
[RUNNING] success connect to the target "192.*.*.66" ..... OK
[RUNNING] success connect to "192.*.*.66" from current node by 'ssh' ..... OK
[RUNNING] success connect to the target "192.*.*.60" ..... OK
[RUNNING] success connect to "192.*.*.60" from current node by 'ssh' ..... OK
[INSTALL] load config from cluster.....
[INFO] db_user=system
[INFO] db_port=54321
[INFO] use_scmd=1
[INFO] data_directory=/data/cluster
[INFO] scmd_port=8890
[INFO] recovery=standby
[INFO] use_check_disk=off
./cluster_install.sh: line 4981: 192.*.*.62: command not found
[INFO] trusted_servers=192.*.*.60 192.*.*.62
[INFO] virtual_ip=192.*.*.64/24
[INFO] ipaddr_path=/usr/sbin
[INFO] ping_path=/usr/bin
[INFO] arping_path=/opt/kes/bin
[INFO] reconnect_attempts=10
[INFO] reconnect_interval=6
[INFO] auto_cluster_recovery_level=1
[INFO] synchronous=quorum
[INSTALL] load config from cluster.....OK
[CONFIG_CHECK] success to access license_file: /soft/license.dat
[CONFIG_CHECK] file format is correct ... OK
[CONFIG_CHECK] check database connection ...
[CONFIG_CHECK] check database connection ... OK
[CONFIG_CHECK] expand_ip[192.*.*.66] is not used in the cluster ...
[CONFIG_CHECK] expand_ip[192.*.*.66] is not used in the cluster ...ok
[CONFIG_CHECK] The localhost is expand_ip:[192.*.*.66] ...
[CONFIG_CHECK] The localhost is expand_ip:[192.*.*.66] ...ok
[CONFIG_CHECK] check node_id is in cluster ...
[CONFIG_CHECK] check node_id is in cluster ...OK
[RUNNING] check the db is running or not...
[RUNNING] the db is not running on "192.*.*.66:54321" ..... OK
[RUNNING] the install dir is not exist on "192.*.*.66" ..... OK
[RUNNING] check the sys_securecmdd is running or not...
[RUNNING] the sys_securecmdd is not running on "192.*.*.66:8890" ..... OK
[CONFIG_CHECK] The virtual ip [192.*.*.64] exists on primary host [192.*.*.60].....
[CONFIG_CHECK] The virtual ip [192.*.*.64] exists on primary host [192.*.*.60].....OK
[CONFIG_CHECK] The net_device_ip:[192.*.*.66] exists on dev ens33 on [192.*.*.66].....
[CONFIG_CHECK] The net_device_ip:[192.*.*.66] exists on host "192.*.*.66" on dev ens33 .....OK
[INFO] use_ssl=0
2025-02-27 21:12:21 [INFO] start to check system parameters on 192.*.*.66 ...
2025-02-27 21:12:21 [WARNING] [GSSAPIAuthentication] yes (should be: no) on 192.*.*.66
2025-02-27 21:12:21 [INFO] [UseDNS] is null on 192.*.*.66
2025-02-27 21:12:22 [INFO] [UsePAM] yes on 192.*.*.66
2025-02-27 21:12:22 [INFO] [ulimit.open files] 65536 on 192.*.*.66
2025-02-27 21:12:22 [INFO] [ulimit.open proc] 65536 on 192.*.*.66
2025-02-27 21:12:22 [INFO] [ulimit.core size] unlimited on 192.*.*.66
2025-02-27 21:12:22 [INFO] [ulimit.mem lock] 50000000 on 192.*.*.66
2025-02-27 21:12:23 [INFO] [kernel.sem] 5010 641280 5010 256 on 192.*.*.66
2025-02-27 21:12:23 [INFO] [RemoveIPC] no on 192.*.*.66
2025-02-27 21:12:23 [INFO] [DefaultTasksAccounting] no on 192.*.*.66
2025-02-27 21:12:23 [INFO] write file "/etc/udev/rules.d/kingbase.rules" on 192.*.*.66
2025-02-27 21:12:24 [INFO] [crontab] chmod /usr/bin/crontab ...
2025-02-27 21:12:24 [INFO] [crontab] chmod /usr/bin/crontab ... Done
2025-02-27 21:12:24 [INFO] [crontab access] OK
2025-02-27 21:12:25 [INFO] [cron.deny] kingbase not exists in cron.deny
2025-02-27 21:12:25 [INFO] [crontab auth] crontab is accessible by kingbase now on 192.*.*.66
2025-02-27 21:12:25 [INFO] [SELINUX] disabled on 192.*.*.66
2025-02-27 21:12:26 [INFO] [firewall] down on 192.*.*.66
2025-02-27 21:12:26 [INFO] [The memory] OK on 192.*.*.66
2025-02-27 21:12:26 [INFO] [The hard disk] OK on 192.*.*.66
2025-02-27 21:12:26 [INFO] [ping] chmod /usr/bin/ping ...
2025-02-27 21:12:26 [INFO] [ping] chmod /usr/bin/ping ... Done
2025-02-27 21:12:27 [INFO] [ping access] OK
2025-02-27 21:12:27 [INFO] [/bin/cp --version] on 192.*.*.66 OK
2025-02-27 21:12:27 [INFO] [ip command path] on 192.*.*.66 OK
[INSTALL] create the install dir "/opt/kes/v9/kingbase" on 192.*.*.66 ...
[INSTALL] success to create the install dir "/opt/kes/v9/kingbase" on "192.*.*.66" ..... OK
[INSTALL] try to copy the zip package "/soft/db.zip" to /opt/kes/v9/kingbase of "192.*.*.66" .....
[INSTALL] success to scp the zip package "/soft/db.zip" /opt/kes/v9/kingbase of to "192.*.*.66" ..... OK
[INSTALL] decompress the "/opt/kes/v9/kingbase" to "/opt/kes/v9/kingbase" on 192.*.*.66
[INSTALL] success to decompress the "/opt/kes/v9/kingbase/db.zip" to "/opt/kes/v9/kingbase" on "192.*.*.66"..... OK
[RUNNING] chmod u+s and a+x for "/usr/sbin" and "/opt/kes/bin" on 192.*.*.66
[RUNNING] chmod u+s and a+x /usr/sbin/ip on "192.*.*.66" ..... OK
[RUNNING] chmod u+s and a+x /opt/kes/bin/arping on "192.*.*.66" ..... OK
[INSTALL] check license_file "license.dat"
[INSTALL] Scp license to /opt/kes/v9/kingbase/../license.dat on 192.*.*.66
[INSTALL] success to copy /soft/license.dat to /opt/kes/v9/kingbase/../ on 192.*.*.66
[RUNNING] config sys_securecmdd and start it ...
[RUNNING] config the sys_securecmdd port to 8890 ...
[RUNNING] success to config the sys_securecmdd port on 192.*.*.66 ... OK
successfully initialized the sys_securecmdd, please use "/opt/kes/v9/kingbase/bin/sys_HAscmdd.sh start" to start the sys_securecmdd
[RUNNING] success to config sys_securecmdd on 192.*.*.66 ... OK
Created symlink from /etc/systemd/system/multi-user.target.wants/securecmdd.service to /etc/systemd/system/securecmdd.service.
[RUNNING] success to start sys_securecmdd on 192.*.*.66 ... OK
[INSTALL] success to access file: /opt/kes/v9/kingbase/etc/all_nodes_tools.conf
[INSTALL] success to scp the /opt/kes/v9/kingbase/etc/repmgr.conf from 192.*.*.60 to "192.*.*.66"..... ok
[INSTALL] success to scp the ~/.encpwd from 192.*.*.60 to "192.*.*.66"..... ok
[INSTALL] success to scp /opt/kes/v9/kingbase/etc/all_nodes_tools.conf from "192.*.*.60" to "192.*.*.66" ...ok
[INSTALL] success to chmod 600 the ~/.encpwd on 192.*.*.66..... ok
[INFO] parameter_name=node_id
[INFO] parameter_values='3'
[INFO] [parameter_name] para_exist=1
[INFO] sed -i "/[#]*node_id[ ]*=/cnode_id='3'" /opt/kes/v9/kingbase/etc/repmgr.conf
[INFO] parameter_name=node_name
[INFO] parameter_values='node3'
[INFO] [parameter_name] para_exist=1
[INFO] sed -i "/[#]*node_name[ ]*=/cnode_name='node3'" /opt/kes/v9/kingbase/etc/repmgr.conf
[INFO] parameter_name=conninfo
[INFO] parameter_values='host
[INFO] [parameter_name] para_exist=1
[INFO] sed -i "/[#]*conninfo[ ]*=/cconninfo='host=192.*.*.66 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000'" /opt/kes/v9/kingbase/etc/repmgr.conf
[INFO] parameter_name=ping_path
[INFO] parameter_values='/usr/bin'
[INFO] [parameter_name] para_exist=1
[INFO] sed -i "/[#]*ping_path[ ]*=/cping_path='/usr/bin'" /opt/kes/v9/kingbase/etc/repmgr.conf
[INFO] parameter_name=net_device
[INFO] parameter_values='ens33'
[INFO] [parameter_name] para_exist=1
[INFO] sed -i "/[#]*net_device[ ]*=/cnet_device='ens33'" /opt/kes/v9/kingbase/etc/repmgr.conf
[INFO] parameter_name=net_device_ip
[INFO] parameter_values='192.*.*.66'
[INFO] [parameter_name] para_exist=1
[INFO] sed -i "/[#]*net_device_ip[ ]*=/cnet_device_ip='192.*.*.66'" /opt/kes/v9/kingbase/etc/repmgr.conf
[INFO] parameter_name=arping_path
[INFO] parameter_values='/opt/kes/bin'
[INFO] [parameter_name] para_exist=1
[INFO] sed -i "/[#]*arping_path[ ]*=/carping_path='/opt/kes/bin'" /opt/kes/v9/kingbase/etc/repmgr.conf
[INFO] parameter_name=ipaddr_path
[INFO] parameter_values='/usr/sbin'
[INFO] [parameter_name] para_exist=1
[INFO] sed -i "/[#]*ipaddr_path[ ]*=/cipaddr_path='/usr/sbin'" /opt/kes/v9/kingbase/etc/repmgr.conf
[RUNNING] standby clone ...
[WARNING] following problems with command line parameters detected:
-D/--sysdata will be ignored if a repmgr configuration file is provided
[NOTICE] destination directory "/data/cluster" provided
[INFO] connecting to source node
[DETAIL] connection string is: host=192.*.*.60 user=esrep port=54321 dbname=esrep
[DETAIL] current installation size is 87 MB
[NOTICE] checking for available walsenders on the source node (2 required)
[NOTICE] checking replication connections can be made to the source server (2 required)
[INFO] checking and correcting permissions on existing directory "/data/cluster"
[INFO] creating replication slot as user "esrep"
[NOTICE] starting backup (using sys_basebackup)...
[INFO] executing: /opt/kes/v9/kingbase/bin/sys_basebackup -l "repmgr base backup" -D /data/cluster -h 192.*.*.60 -p 54321 -U esrep -c fast -X stream -S repmgr_slot_3
[NOTICE] standby clone (using sys_basebackup) complete
[NOTICE] you can now start your Kingbase server
[HINT] for example: sys_ctl -D /data/cluster start
[HINT] after starting the server, you need to register this standby with "repmgr standby register"
[RUNNING] standby clone ...OK
[RUNNING] db start ...
waiting for server to start.... done
server started
[RUNNING] db start ...OK
[INFO] connecting to local node "node3" (ID: 3)
[INFO] connecting to primary database
[WARNING] --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1)
[INFO] standby registration complete
[NOTICE] standby node "node3" (ID: 3) successfully registered
2025-02-27 21:12:52 begin to start DB on "[localhost]".
2025-02-27 21:12:53 DB on "[localhost]" already started, connect to check it.
2025-02-27 21:12:54 DB on "[localhost]" start success.
2025-02-27 21:12:54 Ready to start local kbha daemon and repmgrd daemon ...
2025-02-27 21:12:54 begin to start repmgrd on "[localhost]".
[2025-02-27 21:12:55] [NOTICE] using provided configuration file "/opt/kes/v9/kingbase/bin/../etc/repmgr.conf"
[2025-02-27 21:12:55] [INFO] creating directory "/opt/kes/v9/kingbase/log"...
[2025-02-27 21:12:55] [NOTICE] redirecting logging output to "/opt/kes/v9/kingbase/log/hamgr.log"
2025-02-27 21:12:56 repmgrd on "[localhost]" start success.
[2025-02-27 21:12:58] [NOTICE] redirecting logging output to "/opt/kes/v9/kingbase/log/kbha.log"
2025-02-27 21:12:59 Done.
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        |         | host=192.*.*.60 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.*.*.62 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 3  | node3 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.*.*.66 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
[RUNNING] query archive command at 192.*.*.60 ...
[RUNNING] current cluster not config sys_rman,return.
[root@node3 soft]#

6. After the expansion finishes, check the cluster status

[kingbase@node3 soft]$ repmgr service status
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 77003 | no      | n/a
 2  | node2 | standby |   running | node1    | running | 43158 | no      | 1 second(s) ago
 3  | node3 | standby |   running | node1    | running | 15704 | no      | 0 second(s) ago
[kingbase@node3 soft]$ repmgr cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        |         | host=192.168.100.60 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.168.100.62 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 3  | node3 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.168.100.66 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
[kingbase@node3 soft]$

At this point, node3 has joined the cluster successfully and the cluster status is normal.
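
The same conclusion can be drawn by a script from the `repmgr cluster show` table. This is a minimal sketch with a hypothetical function name, check_standby_joined. It assumes the column order printed above (Role in column 3, Status in column 4, Upstream in column 5, LSN_Lag in column 9) and only reports the lag rather than enforcing a threshold, since a busy cluster may legitimately show a non-zero value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `repmgr cluster show` output on stdin and succeed
# if the named node is listed as a running standby. Prints its upstream and lag.
check_standby_joined() {
    local node="$1"
    awk -F'|' -v n="$node" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        trim($2) == n {
            found = 1
            ok = (trim($3) == "standby" && trim($4) == "running")
            print n " upstream=" trim($5) " lag=" trim($9)
        }
        END { exit !(found && ok) }
    '
}
```

For example: `repmgr cluster show | check_standby_joined node3`.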

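IV. Shrinking the KES V9 RWC Cluster

Requirement: remove node3 from the following RWC cluster:

Hostname | IP address | OS version | Memory, CPU           | Node role | Database port | Cluster install directory | Data directory
---------+------------+------------+-----------------------+-----------+---------------+---------------------------+---------------
node1    | 192.*.*.60 | CentOS 7.9 | 4 GB, 1 dual-core CPU | Primary   | 54321         | /opt/kes/v9               | /data/cluster
node2    | 192.*.*.62 | CentOS 7.9 | 4 GB, 1 dual-core CPU | Standby   | 54321         | /opt/kes/v9               | /data/cluster
node3    | 192.*.*.66 | CentOS 7.9 | 4 GB, 1 dual-core CPU | Standby   | 54321         | /opt/kes/v9               | /data/cluster

Cluster VIP address: 192.*.*.64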

Detailed shrink steps

1. Check the current cluster status on any node

[kingbase@node3 soft]$ repmgr service status
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 77003 | no      | n/a
 2  | node2 | standby |   running | node1    | running | 43158 | no      | 1 second(s) ago
 3  | node3 | standby |   running | node1    | running | 15704 | no      | 1 second(s) ago
[kingbase@node3 soft]$

All of the following operations are performed on node3, the node to be removed.
2. Configure the install.conf file
2.1 Edit the parameters under the shrink tag in install.conf

[shrink]
shrink_type="standby" # The node type of standby/witness node, which would be delete from cluster. 0:standby 1:witness
primary_ip="192.168.100.60" # The ip addr of cluster primary node, which need to shrink a standby/witness node.
shrink_ip="192.168.100.66" # The ip addr of standby/witness node, which would be delete from cluster.
node_id="3" # The node_id of standby/witness node, which would be delete from cluster. It does not the same with any one in cluster node
# for example: node_id="3"
## Specific instructions ,see it under [install]
install_dir="/opt/kes/v9" # the last layer of directory could not add '/'
ssh_port="22" # the port of ssh, default is 22
scmd_port="8890" # the port of sys_securecmd, default is 8890

[Note] To change the SSH connection port, first change the ssh_port value in install.conf, then change the Port value in the system file /etc/ssh/sshd_config, and finally restart the sshd service.

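3. Shrink the cluster
The shrink succeeds whether it is run as root or as kingbase. This article runs "cluster_install.sh shrink" as the kingbase user; the script then completes the cluster shrink automatically according to the configuration.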

[kingbase@node3 soft]$ ./cluster_install.sh shrink
[CONFIG_CHECK] will deploy the cluster of
[RUNNING] success connect to the target "192.*.*.66" ..... OK
[RUNNING] success connect to "192.*.*.66" from current node by 'ssh' ..... OK
[RUNNING] success connect to the target "192.*.*.60" ..... OK
[RUNNING] success connect to "192.*.*.60" from current node by 'ssh' ..... OK
[RUNNING] Primary node ip is 192.*.*.60 ...
[RUNNING] Primary node ip is 192.*.*.60 ... OK
[CONFIG_CHECK] set install_with_root=1
[RUNNING] success connect to "" from current node by 'ssh' ..... OK
[RUNNING] success connect to the target "192.*.*.60" ..... OK
[RUNNING] success connect to "192.*.*.60" from current node by 'ssh' ..... OK
[INSTALL] load config from cluster.....
[INFO] db_user=system
[INFO] db_port=54321
[INFO] use_scmd=1
[INFO] auto_cluster_recovery_level=1
[INFO] synchronous=quorum
[INSTALL] load config from cluster.....OK
[CONFIG_CHECK] check database connection ...
[CONFIG_CHECK] check database connection ... OK
[CONFIG_CHECK] shrink_ip[192.*.*.66] is a standby node IP in the cluster ...
[CONFIG_CHECK] shrink_ip[192.*.*.66] is a standby node IP in the cluster ...ok
[CONFIG_CHECK] The localhost is shrink_ip:[192.*.*.66] or primary_ip:[192.*.*.60]...
[CONFIG_CHECK] The localhost is shrink_ip:[192.*.*.66] or primary_ip:[192.*.*.60]...ok
[RUNNING] Primary node ip is 192.*.*.60 ...
[RUNNING] Primary node ip is 192.*.*.60 ... OK
[CONFIG_CHECK] check node_id is in cluster ...
[CONFIG_CHECK] check node_id is in cluster ...OK
[RUNNING] The /opt/kes/v9/kingbase/bin dir exist on "192.*.*.66" ...
[RUNNING] The /opt/kes/v9/kingbase/bin dir exist on "192.*.*.66" ... OK
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        |         | host=192.*.*.60 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.*.*.62 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 3  | node3 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.*.*.66 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
[RUNNING] Del node is standby ...
[INFO] node:192.*.*.66 can be deleted ... OK
[RUNNING] query archive command at 192.*.*.60 ...
[RUNNING] current cluster not config sys_rman,return.
[2025年 02月 27日 星期四 22:42:06 CST] [INFO] /opt/kes/v9/kingbase/bin/repmgr standby unregister --node-id=3 ...
[INFO] connecting to local standby
[INFO] connecting to primary database
[NOTICE] unregistering node 3
[INFO] SET synchronous TO "quorum" on primary host
[INFO] change synchronous_standby_names from "ANY 1( node2,node3)" to "ANY 1( node2)"
[INFO] try to drop slot "repmgr_slot_3" of node 3 on primary node
[WARNING] replication slot "repmgr_slot_3" is still active on node 3
[INFO] standby unregistration complete
[2025年 02月 27日 星期四 22:42:07 CST] [INFO] /opt/kes/v9/kingbase/bin/repmgr standby unregister --node-id=3 ...OK
[2025年 02月 27日 星期四 22:42:07 CST] [INFO] check db connection ...
[2025年 02月 27日 星期四 22:42:07 CST] [INFO] check db connection ...ok
2025-02-27 22:42:07 Ready to stop local kbha daemon and repmgrd daemon ...
2025-02-27 22:42:11 begin to stop repmgrd on "[localhost]".
2025-02-27 22:42:12 repmgrd on "[localhost]" stop success.
2025-02-27 22:42:12 Done.
2025-02-27 22:42:12 begin to stop DB on "[localhost]".
waiting for server to shut down.... done
server stopped
2025-02-27 22:42:12 DB on "[localhost]" stop success.
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        |         | host=192.*.*.60 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.*.*.62 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
[2025年 02月 27日 星期四 22:42:12 CST] [INFO] drop replication slot:repmgr_slot_3...
 pg_drop_replication_slot
--------------------------

(1 row)
[2025年 02月 27日 星期四 22:42:13 CST] [INFO] drop replication slot:repmgr_slot_3...OK
[2025年 02月 27日 星期四 22:42:13 CST] [INFO] modify synchronous parameter configuration...
[2025年 02月 27日 星期四 22:42:14 CST] [INFO] modify synchronous parameter configuration...ok
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        |         | host=192.*.*.60 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.*.*.62 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
[kingbase@node3 soft]$

4. After the shrink finishes, check the cluster status
Log in to node1 or node2 and check the cluster status:

[kingbase@node2 ~]$ repmgr service status
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 77003 | no      | n/a
 2  | node2 | standby |   running | node1    | running | 43158 | no      | 0 second(s) ago
[kingbase@node2 ~]$ repmgr cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        |         | host=192.168.100.60 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.168.100.62 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
[kingbase@node2 ~]$

At this point, node3 has been removed from the cluster successfully and the cluster status is normal.
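
A scripted version of this last check confirms that the removed node no longer appears in the repmgr table. The sketch below uses a hypothetical function name, node_absent, and reads either `repmgr service status` or `repmgr cluster show` output from standard input (both print the node name in column 2). It deliberately fails on an empty table, so an error from repmgr is not mistaken for a successful removal.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the table on stdin has at least one
# node row and none of the rows carries the given node name.
node_absent() {
    local node="$1"
    awk -F'|' -v n="$node" '
        # Only rows that start with a numeric node ID are node rows.
        $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            rows++
            name = $2
            gsub(/[ \t]/, "", name)
            if (name == n) found = 1
        }
        END { exit (rows == 0 || found) }
    '
}
```

Run on node1 or node2, for example: `repmgr cluster show | node_absent node3 && echo "node3 is no longer registered"`.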

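V. References

https://bbs.kingbase.com.cn/docHtml?recId=d16e9a1be637c8fe4644c2c82fe16444&url=aHR0cHM6Ly9iYnMua2luZ2Jhc2UuY29tLmNuL2tpbmdiYXNlLWRvYy92OS9oaWdobHkvYXZhaWxhYmlsaXR5L2luZGV4Lmh0bWw
Detailed path: KingbaseES > 高可用 (High Availability) > 金仓数据守护集群和读写分离集群使用手册 (user manual for the Kingbase data guard cluster and read/write-splitting cluster) > 第7章 日常运维管理 (Chapter 7, Routine Operations and Maintenance) > 7.5. 在线扩缩容 (Online Expansion and Shrinking)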

VI. Summary

Online expansion and shrinking of a KingbaseES (KES) V9 RWC cluster went very smoothly. Give it a try!

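About the author:
Screen name: 飞天. Named an Outstanding Original Author of 2024 on 墨天轮. Holds the Oracle 10g OCM and PGCE certifications as well as OBCA, KCP, ACP, 磐维, and many other Chinese domestic database certifications. Currently works on administration and operations of Oracle, MySQL, PostgreSQL, and 磐维 databases, enjoys meeting like-minded people, and is keen on researching and sharing database technology.
WeChat official account: 飞天online
墨天轮: https://www.modb.pro/u/15197
If you have any questions, feel free to leave a comment so we can discuss them together.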

Last modified: 2025-02-28 09:41:52