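openGauss 3.0 has been released with quite a few new features. I was especially interested in the CM (Cluster Manager) feature, so I installed it right away to test it.
This test uses VirtualBox to build a three-node cluster. The operating system is openEuler 20.03 LTS SP2.
For the operating system installation, see the openEuler installation article.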
1. System resources
Operating system version
[root@node1 ~]# cat /etc/os-release
NAME="openEuler"
VERSION="20.03 (LTS-SP2)"
ID="openEuler"
VERSION_ID="20.03"
PRETTY_NAME="openEuler 20.03 (LTS-SP2)"
ANSI_COLOR="0;31"
Memory
[root@node1 ~]# free -g
              total        used        free      shared  buff/cache   available
Mem:             15           0          14           0           0          13
Swap:             7           0           7
Disk space
[root@node1 ~]# df -h
Filesystem                  Size  Used Avail Use% Mounted on
devtmpfs                    7.6G     0  7.6G   0% /dev
tmpfs                       7.6G     0  7.6G   0% /dev/shm
tmpfs                       7.6G   17M  7.6G   1% /run
tmpfs                       7.6G     0  7.6G   0% /sys/fs/cgroup
/dev/mapper/openeuler-root   43G   16G   25G  39% /
tmpfs                       7.6G     0  7.6G   0% /tmp
/dev/sda1                   976M  188M  722M  21% /boot
tmpfs                       1.6G     0  1.6G   0% /run/user/0
2. Installing openGauss 3.0
Download the installation package
cd /srv
wget https://opengauss.obs.cn-south-1.myhuaweicloud.com/3.0.0/x86_openEuler/openGauss-3.0.0-openEuler-64bit-all.tar.gz
Extract the installation package
mkdir soft
cd soft
tar xvf ../openGauss-3.0.0-openEuler-64bit-all.tar.gz
tar xvf openGauss-3.0.0-openEuler-64bit.tar.bz2
tar xvf openGauss-3.0.0-openEuler-64bit-cm.tar.gz
tar xvf openGauss-3.0.0-openEuler-64bit-om.tar.gz
[root@node2 soft]# tree -L 1
.
├── etc
├── gs3cm.xml
├── include
├── jre
├── lib
├── openGauss-3.0.0-openEuler-64bit-cm.sha256
├── openGauss-3.0.0-openEuler-64bit-cm.tar.gz
├── openGauss-3.0.0-openEuler-64bit-om.sha256
├── openGauss-3.0.0-openEuler-64bit-om.tar.gz
├── openGauss-3.0.0-openEuler-64bit.sha256
├── openGauss-3.0.0-openEuler-64bit.tar.bz2
├── openGauss-Package-bak_02c14696.tar.gz
├── script
├── share
├── simpleInstall
├── upgrade_sql.sha256
├── upgrade_sql.tar.gz
└── version.cfg
7 directories, 11 files
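The listing above shows a .sha256 file next to each archive. The following is a minimal sketch for checking an archive against its checksum file before unpacking it. It assumes that the first whitespace-separated field of the .sha256 file is the hex digest (the original post does not show the file contents) and that sha256sum from coreutils is installed. verify_sha256 is a hypothetical helper name, not part of openGauss.

```shell
#!/bin/sh
# Hypothetical helper: compare an archive's SHA-256 digest with the digest
# stored in its companion .sha256 file.
# Assumption: the first whitespace-separated field of the .sha256 file is
# the hex digest.
verify_sha256() {
    archive=$1
    sumfile=$2
    expected=$(awk 'NR == 1 { print tolower($1) }' "$sumfile")
    actual=$(sha256sum "$archive" | awk '{ print tolower($1) }')
    if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
        echo "OK: $archive"
        return 0
    fi
    echo "MISMATCH: $archive" >&2
    return 1
}
```

For example: verify_sha256 openGauss-3.0.0-openEuler-64bit-cm.tar.gz openGauss-3.0.0-openEuler-64bit-cm.sha256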
Prepare the configuration file
vim gs3cm.xml
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
  <CLUSTER>
    <PARAM name="clusterName" value="dbCluster" />
    <PARAM name="nodeNames" value="node1,node2,node3" />
    <PARAM name="backIp1s" value="192.168.56.7,192.168.56.8,192.168.56.9"/>
    <PARAM name="gaussdbAppPath" value="/opt/huawei/install/app" />
    <PARAM name="gaussdbLogPath" value="/opt/huawei/omm" />
    <PARAM name="tmpMppdbPath" value="/opt/huawei/tmp"/>
    <PARAM name="gaussdbToolPath" value="/opt/huawei/install/om" />
    <PARAM name="corePath" value="/opt/huawei/corefile"/>
    <PARAM name="clusterType" value="single-inst"/>
  </CLUSTER>
  <DEVICELIST>
    <DEVICE sn="1000001">
      <PARAM name="name" value="node1"/>
      <PARAM name="azName" value="AZ1"/>
      <PARAM name="azPriority" value="1"/>
      <PARAM name="backIp1" value="192.168.56.7"/>
      <PARAM name="sshIp1" value="192.168.56.7"/>
      <!--CM-->
      <PARAM name="cmDir" value="/opt/huawei/install/data/cm" />
      <PARAM name="cmsNum" value="1" />
      <PARAM name="cmServerPortBase" value="5000" />
      <PARAM name="cmServerlevel" value="1" />
      <PARAM name="cmServerListenIp1" value="192.168.56.7,192.168.56.8,192.168.56.9" />
      <PARAM name="cmServerRelation" value="node1,node2,node3" />
      <!--dbnode-->
      <PARAM name="dataNum" value="1"/>
      <PARAM name="dataPortBase" value="26000"/>
      <PARAM name="dataNode1" value="/opt/huawei/install/data/db1,node2,/opt/huawei/install/data/db1,node3,/opt/huawei/install/data/db1"/>
      <PARAM name="dataNode1_syncNum" value="0"/>
    </DEVICE>
    <DEVICE sn="1000002">
      <PARAM name="name" value="node2"/>
      <PARAM name="azName" value="AZ1"/>
      <PARAM name="azPriority" value="1"/>
      <PARAM name="backIp1" value="192.168.56.8"/>
      <PARAM name="sshIp1" value="192.168.56.8"/>
      <PARAM name="cmDir" value="/opt/huawei/install/data/cm" />
    </DEVICE>
    <DEVICE sn="1000003">
      <PARAM name="name" value="node3"/>
      <PARAM name="azName" value="AZ1"/>
      <PARAM name="azPriority" value="1"/>
      <PARAM name="backIp1" value="192.168.56.9"/>
      <PARAM name="sshIp1" value="192.168.56.9"/>
      <PARAM name="cmDir" value="/opt/huawei/install/data/cm" />
    </DEVICE>
  </DEVICELIST>
</ROOT>
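In this file the node list, the IP list and the DEVICE entries all have to describe the same three hosts, and a miscount is easy to miss by eye. The sketch below counts the entries in nodeNames and backIp1s and the number of DEVICE elements, and fails if they differ. It is a rough line-based check rather than a real XML parser, and it assumes each PARAM sits on one line with name= before value=, as in the file above. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value attribute of the first <PARAM> with
# the given name. Assumes one PARAM per line, name= before value=.
param_value() {
    sed -n "s/.*<PARAM name=\"$1\"[[:space:]]*value=\"\([^\"]*\)\".*/\1/p" "$2" | head -n 1
}

# Count comma-separated items in a string (an empty string counts as 0).
count_items() {
    printf '%s\n' "$1" | awk -F, '{ print NF }'
}

# Succeed only if nodeNames, backIp1s and <DEVICE> all have the same count.
check_cluster_xml() {
    xml=$1
    nodes=$(count_items "$(param_value nodeNames "$xml")")
    ips=$(count_items "$(param_value backIp1s "$xml")")
    devices=$(grep -c '<DEVICE ' "$xml" || true)
    echo "nodeNames=$nodes backIp1s=$ips DEVICE=$devices"
    [ "$nodes" -eq "$ips" ] && [ "$nodes" -eq "$devices" ]
}
```

Running check_cluster_xml gs3cm.xml prints the three counts and returns a non-zero status when they disagree.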
Run the preinstall
script/gs_preinstall -X gs3cm.xml -U omm -G dbgrp
Start the installation
chown -R omm:dbgrp /srv/soft
su - omm
cd /srv/soft
[omm@node1 srv]$ script/gs_install -X gs3cm.xml
Parsing the configuration file.
Check preinstall on every node.
Successfully checked preinstall on every node.
Creating the backup directory.
Successfully created the backup directory.
begin deploy..
Installing the cluster.
begin prepare Install Cluster..
Checking the installation environment on all nodes.
begin install Cluster..
Installing applications on all nodes.
Successfully installed APP.
begin init Instance..
encrypt cipher and rand files for database.
Please enter password for database:
Please repeat for database:
begin to create CA cert files
The sslcert will be generated in /opt/huawei/install/app/share/sslcert/om
Create CA files for cm beginning.
Create CA files on directory [/opt/huawei/install/app_02c14696/share/sslcert/cm]. file list: ['client.key.cipher', 'server.crt', 'server.key.rand', 'server.key.cipher', 'client.key.rand', 'cacert.pem', 'client.key', 'server.key', 'client.crt']
Cluster installation is completed.
Configuring.
Deleting instances from all nodes.
Successfully deleted instances from all nodes.
Checking node configuration on all nodes.
Initializing instances on all nodes.
Updating instance configuration on all nodes.
Check consistence of memCheck and coresCheck on database nodes.
Successful check consistence of memCheck and coresCheck on all nodes.
Configuring pg_hba on all nodes.
Configuration is completed.
Starting cluster.
======================================================================
Successfully started primary instance. Wait for standby instance.
======================================================================
.
Successfully started cluster.
======================================================================
cluster_state : Normal
redistributing : No
node_count : 3
Datanode State
primary : 1
standby : 2
secondary : 0
cascade_standby : 0
building : 0
abnormal : 0
down : 0
Successfully installed application.
end deploy..
The installation succeeded.
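The summary printed at the end of gs_install can also be checked mechanically. The sketch below reads such a summary on standard input and succeeds only if cluster_state is Normal and the abnormal and down counters are both 0. It assumes the "key : value" layout shown in the output above. summary_ok is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read a status summary on stdin and succeed only if
# cluster_state is Normal and the abnormal/down counters are both 0.
# Assumption: lines look like "key : value", as in the gs_install output.
summary_ok() {
    awk -F: '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim surrounding blanks
            gsub(/^[ \t]+|[ \t]+$/, "", val)
        }
        key == "cluster_state" { state = val }
        key == "abnormal"      { abnormal = val }
        key == "down"          { down = val }
        END { exit !(state == "Normal" && abnormal == "0" && down == "0") }
    '
}
```

Feed it a saved copy of the output, for example summary_ok < install.log, where install.log is a hypothetical file name.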
3. Trying out CM
I did not know how to use the new feature, so I first looked at the command's help.
[omm@node1 srv]$ cm_ctl --help
cm_ctl is a utility to start, stop, query or control a mppdb cluster.
Usage:
cm_ctl start [-z AVAILABILITY_ZONE [--cm_arbitration_mode=ARBITRATION_MODE]] | [-n NODEID [-D DATADIR]] [-t SECS]
cm_ctl switchover [-z AVAILABILITY_ZONE] | [-n NODEID -D DATADIR [-f]] | [-a] | [-A] [-t SECS]
cm_ctl finishredo
cm_ctl build [-c] [-n NODEID] [-D DATADIR [-t SECS] [-f] [-b full] [-j NUM]]
cm_ctl check -B BINNAME -T DATAPATH
cm_ctl stop [[-z AVAILABILITY_ZONE] | [-n NODEID [-D DATADIR]]] [-t SECS] [-m SHUTDOWN-MODE]
cm_ctl query [-z ALL] [-l FILENAME] [-v [-C [-s] [-S] [-d] [-i] [-F] [-x] [-p]] | [-r]] [-t SECS] [--minorityAz=AZ_NAME]
cm_ctl view [-v | -N | -n NODEID] [-l FILENAME]
cm_ctl set [--log_level=LOG_LEVEL] [--cm_arbitration_mode=ARBITRATION_MODE] [--cm_switchover_az_mode=SWITCHOVER_AZ_MODE] [--cmsPromoteMode=CMS_PROMOTE_MODE -I INSTANCEID]
cm_ctl set --param --agent | --server [-n [NODEID]] -k [PARAMETER]="[value]"
cm_ctl get [--log_level] [--cm_arbitration_mode] [--cm_switchover_az_mode]
cm_ctl setrunmode -n NODEID -D DATADIR [[--xmode=normal] | [--xmode=minority --votenum=NUM]]
cm_ctl changerole [--role=PASSIVE | --role=FOLLOWER] -n NODEID -D DATADIR [-t SECS]
cm_ctl changemember [--role=PASSIVE | --role=FOLLOWER] [--group=xx] [--priority=xx] -n NODEID -D DATADIR [-t SECS]
cm_ctl reload --param [--agent | --server]
cm_ctl list --param --agent | --server
cm_ctl encrypt [-M MODE] -D DATADIR
cm_ctl ddb DCC_CMD
cm_ctl switch [--ddb_type=[DDB]] [--commit] [--rollback]
Common options:
-D DATADIR location of the database storage area
-l FILENAME write (or append) result to FILENAME
-n NODEID node id
-z AVAILABILITY_ZONE availability zone name
-t SECS seconds to wait
-V, --version output version information, then exit
-?, -h, --help show this help, then exit
Options for switchover:
-a auto switchover to rebalance mppdb service
-A switch all the datanode's standby instances with their master instances
-f fast switchover
Options for build:
-f force build
-b full full build
-c cm server build
-j [num] parallelism
Options for check:
-B BINNAME BINNAME can be "cm_agent", "gaussdb" or "cm_server"
-T DATAPATH location of the database storage area
Options for stop:
-m MODE MODE can be "smart" "fast" "immediate"
Options for query:
-s show instances that need to switchover
-C show query result by HA relation
-v show detail query result
-d show instance datapath
-i show physical node ip
-F show all fenced UDF master process status
-z show all availability zone status. The value must be "ALL"
-r show standby DN redo status
-g show backup and recovery cluster info
-x show abnormal instances
-S show the results of the status check when the cluster was started
--minorityAz check the cms status only in the pointed AZ
-p show the port of datanode
Options for set:
--log_level=LOG_LEVEL LOG_LEVEL can be "DEBUG5", "DEBUG1", "LOG", "WARNING", "ERROR" or "FATAL"
--cm_arbitration_mode=ARBITRATION_MODE ARBITRATION_MODE can be "MAJORITY", "MINORITY"
--cm_switchover_az_mode= SWITCHOVER_AZ_MODE SWITCHOVER_AZ_MODE can be "NON_AUTO", "AUTO"
--cmsPromoteMode=CMS_PROMOTE_MODE -I INSTANCEID CMS_PROMOTE_MODE can be "AUTO", "PRIMARY_F"
--agent set cm agent conf
--server set cm server conf
--k set parameter and value
Options for get:
--log_level show LOG_LEVEL
--cm_arbitration_mode show cm server arbitration mode
--cm_switchover_az_mode show az switchover mode
Options for view:
-v show details of static config
-N show local node static config
Options for setrunmode:
--xmode minority or normal.
--votenum in minority mode,available dn vote number.
Options for changerole:
--role switch dcf role to passive or to follower.
Options for changemember:
--role switch dcf role to passive or to follower.
--group change dcf group id.
--priority change dcf election priority.
Options for reload:
reload reload cluster static config online.
--agent reload cm_agent conf.
--server reload cm_server conf.
Options for list:
--agent list the cm_agent parameter.
--server list the cm_server parameter.
Options for encrypt:
-M encrypt mode (server,client), default value is server mode.
-D appoint encrypt file path.
Options for switch ddb:
--ddb_type switch to which ddb type.
--commit after switch success, need do commit.
--rollback when something wrong, can do rollback.
Shutdown modes are:
smart quit with fast shutdown on primary, and recovery done on standby
fast quit directly, with proper shutdown
immediate quit without complete shutdown; will lead to recovery on restart
Cluster state including:
Normal cluster is available with data replication
Degraded cluster is available without data replication
Unavailable cluster is unavailable
Instance state including:
Primary database system run as a primary server, send xlog to standby server
Standby database system run as a standby server, receive xlog from primary server
Cascade Standby database system run as a cascade standby server, receive xlog from standby server
Pending database system run as a pending server, wait for promoting to primary or demoting to standby
Down database system not running
Unknown database system not connected
HA state including:
Normal database system is normal
Need repair database system is not connected with primary/standby server or not matched with primary/standby server
Wait promoting database system is waiting to promote during switchover
Promoting database system is promoting
Building database system is building
Catchup database system is catching up xlog
Demoting database system is demoting
Starting database system is starting up
Manually stopped database system is down for being manually stopped
Disk damaged database system is down for disk damaged
Port conflicting database system is down for port conflicting
Unknown database system is down for some internal error
Options for dcc cmd:
--help, -h Shows help information of dcc cmd.
--version, -v Shows version information of dcc.
--get key Queries the value of a specified key.
--put key val Updates or insert the value of a specified key.
--delete key Deletes the specified key.
--prefix Prefix matching --get or --delete.
--cluster_info show cluster info.
--leader_info show leader nodeid.
With so many options, I will test only a few.
Check the cluster status
[omm@node1 srv]$ cm_ctl query -v -C
[ CMServer State ]
node     instance state
-------------------------
1  node1 1        Primary
2  node2 2        Standby
3  node3 3        Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Datanode State ]
node     instance state            | node     instance state            | node     instance state
---------------------------------------------------------------------------------------------------------------
1  node1 6001     P Primary Normal | 2  node2 6002     S Standby Normal | 3  node3 6003     S Standby Normal
Node 1 is both the CMServer primary and the DN primary.
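Reading the primaries off this output by eye works for three nodes, but it is also easy to script. The sketch below extracts the node name of the DN primary and of the CMServer primary from cm_ctl query -v -C output read on standard input. It assumes the column layout shown above (for the datanode row: node id, node name, instance id, role letter, role, HA state). dn_primary and cms_primary are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: print the node name of the DN primary.
# Assumption: each "|"-separated segment of the datanode row looks like
# "1 node1 6001 P Primary Normal".
dn_primary() {
    awk '
        /Datanode State/ { in_dn = 1; next }
        in_dn && /\|/ {
            n = split($0, seg, "|")
            for (i = 1; i <= n; i++) {
                m = split(seg[i], f, " ")
                if (m >= 5 && f[5] == "Primary") print f[2]
            }
        }
    '
}

# Hypothetical helper: print the node name of the CMServer primary.
# Assumption: CMServer rows look like "1 node1 1 Primary".
cms_primary() {
    awk '
        /CMServer State/ { in_cms = 1; next }
        /Cluster State/  { in_cms = 0 }
        in_cms && NF == 4 && $4 == "Primary" { print $2 }
    '
}
```

For example: cm_ctl query -v -C | dn_primary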
The cluster status can also be checked with gs_om.
Stop the cluster. This can be run on any node; here it is run on node 3.
cm_ctl stop
cm_ctl: stop cluster.
cm_ctl: stop nodeid: 1
cm_ctl: stop nodeid: 2
cm_ctl: stop nodeid: 3
...............
cm_ctl: stop cluster successfully.
Start the cluster. This can also be run on any node; here it is started on node 2.
[omm@node2 ~]$ cm_ctl start
cm_ctl: checking cluster status.
cm_ctl: checking cluster status.
cm_ctl: checking finished in 13290 ms.
cm_ctl: start cluster.
cm_ctl: start nodeid: 1
cm_ctl: start nodeid: 2
cm_ctl: start nodeid: 3
............
cm_ctl: start cluster successfully.
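After a start it can be handy to block until the cluster actually reports Normal before running further tests. The sketch below polls a status command until its output contains "cluster_state : Normal" or a retry budget is used up. The status command is passed in as arguments, so nothing about cm_ctl is hard-coded. wait_for_normal and its retry and interval parameters are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a status command repeatedly until its output
# reports "cluster_state : Normal" or the retry budget is used up.
# Usage: wait_for_normal RETRIES INTERVAL_SECS COMMAND [ARGS...]
wait_for_normal() {
    retries=$1
    interval=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$retries" ]; do
        if "$@" 2>/dev/null | grep -Eq 'cluster_state[[:space:]]*:[[:space:]]*Normal'; then
            echo "cluster is Normal after $attempt attempt(s)"
            return 0
        fi
        # Sleep only if another attempt will follow.
        if [ "$attempt" -lt "$retries" ]; then
            sleep "$interval"
        fi
        attempt=$((attempt + 1))
    done
    echo "cluster did not reach Normal" >&2
    return 1
}
```

For example: wait_for_normal 30 2 cm_ctl query -v -C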
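Stop node 3 from node 1. Note that the command captured below is a switchover targeting node 3; per the usage text above, stopping a single node takes the form cm_ctl stop -n NODEID -D DATADIR.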
[omm@node1 srv]$ cm_ctl switchover -n 3 -D /opt/huawei/install/data/db1
............
cm_ctl: switchover successfully.
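Cluster status now
The standby node is stopped, and the primary is still on node 1.
Simulate a crash of the primary node
Cluster status
Node 2 is promoted to DN primary and node 3 is promoted to CM primary, with no manual failover needed.
Start node 1
It rejoins the cluster automatically as a standby node.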
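Conclusion: openGauss 3.0 is noticeably more robust. It has become an unkillable "Xiaoqiang" (the proverbial cockroach that will not die) and no longer needs third-party tools for automatic failover. One shortcoming remains: in the cm_ctl output, the CMServer section matches administrators' habits better than the DN section does.
Last modified: 2023-02-15 18:14:36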