
Trying Out CM, a New Feature of openGauss 3.0

Original post by lqkitten, 2022-04-25

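openGauss 3.0 has been released with quite a few new features. I was especially interested in CM (Cluster Manager), so I installed it right away to test it.
For this test I built a three-node cluster on VirtualBox. The operating system is openEuler 20.03 SP2.
For the operating system installation, see the article "openEuler Installation".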

1. System Resources

Operating system version

[root@node1 ~]# cat /etc/os-release
NAME="openEuler"
VERSION="20.03 (LTS-SP2)"
ID="openEuler"
VERSION_ID="20.03"
PRETTY_NAME="openEuler 20.03 (LTS-SP2)"
ANSI_COLOR="0;31"

Memory

[root@node1 ~]# free -g
              total        used        free      shared  buff/cache   available
Mem:             15           0          14           0           0          13
Swap:             7           0           7

Disk space

[root@node1 ~]# df -h
Filesystem                  Size  Used Avail Use% Mounted on
devtmpfs                    7.6G     0  7.6G   0% /dev
tmpfs                       7.6G     0  7.6G   0% /dev/shm
tmpfs                       7.6G   17M  7.6G   1% /run
tmpfs                       7.6G     0  7.6G   0% /sys/fs/cgroup
/dev/mapper/openeuler-root   43G   16G   25G  39% /
tmpfs                       7.6G     0  7.6G   0% /tmp
/dev/sda1                   976M  188M  722M  21% /boot
tmpfs                       1.6G     0  1.6G   0% /run/user/0

2. Installing openGauss 3.0

Download the installation package

cd /srv
wget https://opengauss.obs.cn-south-1.myhuaweicloud.com/3.0.0/x86_openEuler/openGauss-3.0.0-openEuler-64bit-all.tar.gz

Extract the installation package

mkdir soft
cd soft
tar xvf ../openGauss-3.0.0-openEuler-64bit-all.tar.gz
tar xvf openGauss-3.0.0-openEuler-64bit.tar.bz2
tar xvf openGauss-3.0.0-openEuler-64bit-cm.tar.gz
tar xvf openGauss-3.0.0-openEuler-64bit-om.tar.gz
[root@node2 soft]# tree -L 1
.
├── etc
├── gs3cm.xml
├── include
├── jre
├── lib
├── openGauss-3.0.0-openEuler-64bit-cm.sha256
├── openGauss-3.0.0-openEuler-64bit-cm.tar.gz
├── openGauss-3.0.0-openEuler-64bit-om.sha256
├── openGauss-3.0.0-openEuler-64bit-om.tar.gz
├── openGauss-3.0.0-openEuler-64bit.sha256
├── openGauss-3.0.0-openEuler-64bit.tar.bz2
├── openGauss-Package-bak_02c14696.tar.gz
├── script
├── share
├── simpleInstall
├── upgrade_sql.sha256
├── upgrade_sql.tar.gz
└── version.cfg

7 directories, 11 files
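
The extracted directory contains a `.sha256` file next to each package, so the packages can be checked before installing. The following is a minimal sketch, assuming each `.sha256` file stores the hex digest as its first whitespace-separated field (the article does not show the file contents). The function name `verify_sha256` is hypothetical.

```shell
#!/bin/sh
# verify_sha256 FILE SHAFILE
# Returns 0 when the SHA-256 digest of FILE equals the first field of SHAFILE.
# Assumption: SHAFILE holds the hex digest as its first whitespace-separated field.
verify_sha256() {
    file=$1
    shafile=$2
    expected=$(awk 'NR == 1 { print tolower($1) }' "$shafile")
    actual=$(sha256sum "$file" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage, run inside /srv/soft after extraction:
#   for pkg in openGauss-3.0.0-openEuler-64bit.tar.bz2 \
#              openGauss-3.0.0-openEuler-64bit-cm.tar.gz \
#              openGauss-3.0.0-openEuler-64bit-om.tar.gz; do
#       sha=${pkg%.tar.*}.sha256      # strip .tar.bz2 / .tar.gz, append .sha256
#       verify_sha256 "$pkg" "$sha" && echo "OK   $pkg" || echo "FAIL $pkg"
#   done
```

The function only compares digests; it does not run anything on the packages themselves.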

Prepare the configuration file

vim gs3cm.xml
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
    <CLUSTER>
        <PARAM name="clusterName" value="dbCluster" />
        <PARAM name="nodeNames" value="node1,node2,node3" />
        <PARAM name="backIp1s" value="192.168.56.7,192.168.56.8,192.168.56.9"/>
        <PARAM name="gaussdbAppPath" value="/opt/huawei/install/app" />
        <PARAM name="gaussdbLogPath" value="/opt/huawei/omm" />
        <PARAM name="tmpMppdbPath" value="/opt/huawei/tmp"/>
        <PARAM name="gaussdbToolPath" value="/opt/huawei/install/om" />
        <PARAM name="corePath" value="/opt/huawei/corefile"/>
        <PARAM name="clusterType" value="single-inst"/>
    </CLUSTER>
    <DEVICELIST>
        <DEVICE sn="1000001">
            <PARAM name="name" value="node1"/>
            <PARAM name="azName" value="AZ1"/>
            <PARAM name="azPriority" value="1"/>
            <PARAM name="backIp1" value="192.168.56.7"/>
            <PARAM name="sshIp1" value="192.168.56.7"/>
            <!--CM-->
            <PARAM name="cmDir" value="/opt/huawei/install/data/cm" />
            <PARAM name="cmsNum" value="1" />
            <PARAM name="cmServerPortBase" value="5000" />
            <PARAM name="cmServerlevel" value="1" />
            <PARAM name="cmServerListenIp1" value="192.168.56.7,192.168.56.8,192.168.56.9" />
            <PARAM name="cmServerRelation" value="node1,node2,node3" />
            <!--dbnode-->
            <PARAM name="dataNum" value="1"/>
            <PARAM name="dataPortBase" value="26000"/>
            <PARAM name="dataNode1" value="/opt/huawei/install/data/db1,node2,/opt/huawei/install/data/db1,node3,/opt/huawei/install/data/db1"/>
            <PARAM name="dataNode1_syncNum" value="0"/>
        </DEVICE>
        <DEVICE sn="1000002">
            <PARAM name="name" value="node2"/>
            <PARAM name="azName" value="AZ1"/>
            <PARAM name="azPriority" value="1"/>
            <PARAM name="backIp1" value="192.168.56.8"/>
            <PARAM name="sshIp1" value="192.168.56.8"/>
            <PARAM name="cmDir" value="/opt/huawei/install/data/cm" />
        </DEVICE>
        <DEVICE sn="1000003">
            <PARAM name="name" value="node3"/>
            <PARAM name="azName" value="AZ1"/>
            <PARAM name="azPriority" value="1"/>
            <PARAM name="backIp1" value="192.168.56.9"/>
            <PARAM name="sshIp1" value="192.168.56.9"/>
            <PARAM name="cmDir" value="/opt/huawei/install/data/cm" />
        </DEVICE>
    </DEVICELIST>
</ROOT>
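
In a layout like this one, where every node also runs a CM server, `nodeNames`, `backIp1s` and `cmServerRelation` should each list three entries. A rough pre-check could look like the sketch below. It uses plain `sed` text matching rather than a real XML parser, and the helper names (`xml_param`, `csv_count`, `check_cluster_xml`) are hypothetical, not part of the openGauss tooling.

```shell
#!/bin/sh
# xml_param FILE NAME
# Prints the value attribute of the first <PARAM name="NAME" value="..."/> in FILE.
# Text matching only: assumes name and value sit on one line, in that order.
xml_param() {
    sed -n "s/.*<PARAM name=\"$2\" value=\"\([^\"]*\)\".*/\1/p" "$1" | head -n 1
}

# csv_count STRING
# Prints the number of comma-separated items in STRING (0 for an empty string).
csv_count() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# check_cluster_xml FILE
# Succeeds when nodeNames, backIp1s and cmServerRelation have the same,
# non-zero number of entries.
check_cluster_xml() {
    nodes=$(csv_count "$(xml_param "$1" nodeNames)")
    ips=$(csv_count "$(xml_param "$1" backIp1s)")
    cms=$(csv_count "$(xml_param "$1" cmServerRelation)")
    [ "$nodes" -gt 0 ] && [ "$nodes" -eq "$ips" ] && [ "$nodes" -eq "$cms" ]
}

# Usage:
#   check_cluster_xml gs3cm.xml && echo "entry counts match" || echo "entry counts differ"
```

This catches only mismatched list lengths. `gs_preinstall` and `gs_install` still do the real validation.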

Run the pre-installation

script/gs_preinstall -X gs3cm.xml -U omm -G dbgrp

Screenshot_44.png

Start the installation

chown -R omm:dbgrp /srv/soft
su - omm
cd /srv/soft

[omm@node1 srv]$ script/gs_install -X gs3cm.xml
Parsing the configuration file.
Check preinstall on every node.
Successfully checked preinstall on every node.
Creating the backup directory.
Successfully created the backup directory.
begin deploy..
Installing the cluster.
begin prepare Install Cluster..
Checking the installation environment on all nodes.
begin install Cluster..
Installing applications on all nodes.
Successfully installed APP.
begin init Instance..
encrypt cipher and rand files for database.
Please enter password for database:
Please repeat for database:
begin to create CA cert files
The sslcert will be generated in /opt/huawei/install/app/share/sslcert/om
Create CA files for cm beginning.
Create CA files on directory [/opt/huawei/install/app_02c14696/share/sslcert/cm]. file list: ['client.key.cipher', 'server.crt', 'server.key.rand', 'server.key.cipher', 'client.key.rand', 'cacert.pem', 'client.key', 'server.key', 'client.crt']
Cluster installation is completed.
Configuring.
Deleting instances from all nodes.
Successfully deleted instances from all nodes.
Checking node configuration on all nodes.
Initializing instances on all nodes.
Updating instance configuration on all nodes.
Check consistence of memCheck and coresCheck on database nodes.
Successful check consistence of memCheck and coresCheck on all nodes.
Configuring pg_hba on all nodes.
Configuration is completed.
Starting cluster.
======================================================================
Successfully started primary instance. Wait for standby instance.
======================================================================
.
Successfully started cluster.
======================================================================
cluster_state      : Normal
redistributing     : No
node_count         : 3
Datanode State
    primary           : 1
    standby           : 2
    secondary         : 0
    cascade_standby   : 0
    building          : 0
    abnormal          : 0
    down              : 0

Successfully installed application.
end deploy..

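The installation succeeded.

3. Trying Out CM

I don't know how to use the new feature yet, so let's first look at the command's help.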

[omm@node1 srv]$ cm_ctl --help
cm_ctl is a utility to start, stop, query or control a mppdb cluster.
Usage:
  cm_ctl start [-z AVAILABILITY_ZONE [--cm_arbitration_mode=ARBITRATION_MODE]] | [-n NODEID [-D DATADIR]] [-t SECS]
  cm_ctl switchover [-z AVAILABILITY_ZONE] | [-n NODEID -D DATADIR [-f]] | [-a] | [-A] [-t SECS]
  cm_ctl finishredo
  cm_ctl build [-c] [-n NODEID] [-D DATADIR [-t SECS] [-f] [-b full] [-j NUM]]
  cm_ctl check -B BINNAME -T DATAPATH
  cm_ctl stop [[-z AVAILABILITY_ZONE] | [-n NODEID [-D DATADIR]]] [-t SECS] [-m SHUTDOWN-MODE]
  cm_ctl query [-z ALL] [-l FILENAME] [-v [-C [-s] [-S] [-d] [-i] [-F] [-x] [-p]] | [-r]] [-t SECS] [--minorityAz=AZ_NAME]
  cm_ctl view [-v | -N | -n NODEID] [-l FILENAME]
  cm_ctl set [--log_level=LOG_LEVEL] [--cm_arbitration_mode=ARBITRATION_MODE] [--cm_switchover_az_mode=SWITCHOVER_AZ_MODE] [--cmsPromoteMode=CMS_PROMOTE_MODE -I INSTANCEID]
  cm_ctl set --param --agent | --server [-n [NODEID]] -k [PARAMETER]="[value]"
  cm_ctl get [--log_level] [--cm_arbitration_mode] [--cm_switchover_az_mode]
  cm_ctl setrunmode -n NODEID -D DATADIR  [[--xmode=normal] | [--xmode=minority --votenum=NUM]]
  cm_ctl changerole [--role=PASSIVE | --role=FOLLOWER] -n NODEID -D DATADIR [-t SECS]
  cm_ctl changemember [--role=PASSIVE | --role=FOLLOWER] [--group=xx] [--priority=xx] -n NODEID -D DATADIR [-t SECS]
  cm_ctl reload --param [--agent | --server]
  cm_ctl list --param --agent | --server
  cm_ctl encrypt [-M MODE] -D DATADIR
  cm_ctl ddb DCC_CMD
  cm_ctl switch [--ddb_type=[DDB]] [--commit] [--rollback]

Common options:
  -D DATADIR             location of the database storage area
  -l FILENAME            write (or append) result to FILENAME
  -n NODEID              node id
  -z AVAILABILITY_ZONE   availability zone name
  -t SECS                seconds to wait
  -V, --version          output version information, then exit
  -?, -h, --help         show this help, then exit

Options for switchover:
  -a                     auto switchover to rebalance mppdb service
  -A                     switch all the datanode's standby instances with their master instances
  -f                     fast switchover

Options for build:
  -f                     force build
  -b full                full build
  -c                     cm server build
  -j [num]               parallelism

Options for check:
  -B BINNAME             BINNAME can be "cm_agent", "gaussdb" or "cm_server"
  -T DATAPATH            location of the database storage area

Options for stop:
  -m MODE                MODE can be "smart" "fast" "immediate"

Options for query:
  -s                     show instances that need to switchover
  -C                     show query result by HA relation
  -v                     show detail query result
  -d                     show instance datapath
  -i                     show physical node ip
  -F                     show all fenced UDF master process status
  -z                     show all availability zone status. The value must be "ALL"
  -r                     show standby DN redo status
  -g                     show backup and recovery cluster info
  -x                     show abnormal instances
  -S                     show the results of the status check when the cluster was started
  --minorityAz           check the cms status only in the pointed AZ
  -p                     show the port of datanode

Options for set:
  --log_level=LOG_LEVEL                           LOG_LEVEL can be "DEBUG5", "DEBUG1", "LOG", "WARNING", "ERROR" or "FATAL"
  --cm_arbitration_mode=ARBITRATION_MODE          ARBITRATION_MODE can be "MAJORITY", "MINORITY"
  --cm_switchover_az_mode= SWITCHOVER_AZ_MODE     SWITCHOVER_AZ_MODE can be "NON_AUTO", "AUTO"
  --cmsPromoteMode=CMS_PROMOTE_MODE -I INSTANCEID CMS_PROMOTE_MODE can be "AUTO", "PRIMARY_F"
  --agent                set cm agent conf
  --server               set cm server conf
  --k                    set parameter and value

Options for get:
  --log_level              show LOG_LEVEL
  --cm_arbitration_mode    show cm server arbitration mode
  --cm_switchover_az_mode  show az switchover mode

Options for view:
  -v                     show details of static config
  -N                     show local node static config

Options for setrunmode:
  --xmode                minority or normal.
  --votenum              in minority mode,available dn vote number.

Options for changerole:
  --role                 switch dcf role to passive or to follower.

Options for changemember:
  --role                 switch dcf role to passive or to follower.
  --group                change dcf group id.
  --priority             change dcf election priority.

Options for reload:
  reload                 reload cluster static config online.
  --agent                reload cm_agent conf.
  --server               reload cm_server conf.

Options for list:
  --agent                list the cm_agent parameter.
  --server               list the cm_server parameter.

Options for encrypt:
  -M                     encrypt mode (server,client), default value is server mode.
  -D                     appoint encrypt file path.

Options for switch ddb:
  --ddb_type             switch to which ddb type.
  --commit               after switch success, need do commit.
  --rollback             when something wrong, can do rollback.

Shutdown modes are:
  smart                  quit with fast shutdown on primary, and recovery done on standby
  fast                   quit directly, with proper shutdown
  immediate              quit without complete shutdown; will lead to recovery on restart

Cluster state including:
  Normal                 cluster is available with data replication
  Degraded               cluster is available without data replication
  Unavailable            cluster is unavailable

Instance state including:
  Primary                database system run as a primary server, send xlog to standby server
  Standby                database system run as a standby server, receive xlog from primary server
  Cascade Standby        database system run as a cascade standby server, receive xlog from standby server
  Pending                database system run as a pending server, wait for promoting to primary or demoting to standby
  Down                   database system not running
  Unknown                database system not connected

HA state including:
  Normal                 database system is normal
  Need repair            database system is not connected with primary/standby server or not matched with primary/standby server
  Wait promoting         database system is waiting to promote during switchover
  Promoting              database system is promoting
  Building               database system is building
  Catchup                database system is catching up xlog
  Demoting               database system is demoting
  Starting               database system is starting up
  Manually stopped       database system is down for being manually stopped
  Disk damaged           database system is down for disk damaged
  Port conflicting       database system is down for port conflicting
  Unknown                database system is down for some internal error

Options for dcc cmd:
  --help, -h             Shows help information of dcc cmd.
  --version, -v          Shows version information of dcc.
  --get key              Queries the value of a specified key.
  --put key val          Updates or insert the value of a specified key.
  --delete key           Deletes the specified key.
  --prefix               Prefix matching --get or --delete.
  --cluster_info         show cluster info.
  --leader_info          show leader nodeid.

There are a lot of options, so I'll test only a few.
Check the cluster status

[omm@node1 srv]$ cm_ctl query -v -C
[  CMServer State   ]
node     instance state
-------------------------
1  node1 1        Primary
2  node2 2        Standby
3  node3 3        Standby
[   Cluster State   ]
cluster_state   : Normal
redistributing  : No
balanced        : Yes
current_az      : AZ_ALL

[  Datanode State   ]

node     instance state            | node     instance state            | node     instance state
---------------------------------------------------------------------------------------------------------------
1  node1 6001     P Primary Normal | 2  node2 6002     S Standby Normal | 3  node3 6003     S Standby Normal

Screenshot_49.png
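
For scripting, the two facts most often needed from this output are the overall `cluster_state` and which node holds the DN primary. The sketch below pulls both out with `awk`. It assumes the column layout shown above (node id, node name, instance id, role flag, role, HA state) and does not handle the two-word `Cascade Standby` role. The function names are hypothetical.

```shell
#!/bin/sh
# cluster_state
# Reads `cm_ctl query -v -C` output on stdin and prints the cluster_state value.
cluster_state() {
    awk -F':' '/^cluster_state/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# dn_roles
# Reads the same output on stdin and prints one "node role" pair per datanode.
# Assumes each "|"-separated segment is: nodeid node instance flag role ha_state.
dn_roles() {
    awk '
        /Datanode State/ { in_dn = 1; next }
        in_dn && /^[0-9]/ {
            n = split($0, seg, "|")
            for (i = 1; i <= n; i++) {
                if (split(seg[i], f, " ") >= 5) {
                    print f[2], f[5]
                }
            }
        }
    '
}

# dn_primary
# Prints the name of the node whose datanode role is Primary.
dn_primary() {
    dn_roles | awk '$2 == "Primary" { print $1 }'
}

# Usage:
#   cm_ctl query -v -C | cluster_state
#   cm_ctl query -v -C | dn_primary
```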

Node 1 is both the CM server primary and the DN primary.
The cluster status can also be checked with gs_om.
Screenshot_50.png
Stop the cluster. This can be run on any node; here it is run on node 3.

cm_ctl stop 
cm_ctl: stop cluster.
cm_ctl: stop nodeid: 1
cm_ctl: stop nodeid: 2
cm_ctl: stop nodeid: 3
...............
cm_ctl: stop cluster successfully.


Screenshot_46.png
Start the cluster. This can also be run on any node; here it is run on node 2.

[omm@node2 ~]$ cm_ctl start
cm_ctl: checking cluster status.
cm_ctl: checking cluster status.
cm_ctl: checking finished in 13290 ms.
cm_ctl: start cluster.
cm_ctl: start nodeid: 1
cm_ctl: start nodeid: 2
cm_ctl: start nodeid: 3
............
cm_ctl: start cluster successfully.

Screenshot_45.png
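
When a start is scripted, it can help to wait until the cluster actually reports `Normal` before continuing. The following is a minimal polling sketch. The function name `wait_for_cluster_state` and the timeout and interval values are assumptions. The query command is passed as arguments, so nothing here depends on flags beyond the `cm_ctl query -v -C` call already shown.

```shell
#!/bin/sh
# wait_for_cluster_state EXPECTED TIMEOUT_SECS INTERVAL_SECS CMD [ARGS...]
# Runs CMD repeatedly until its output contains "cluster_state : EXPECTED".
# Returns 0 on success, 1 when the timeout expires first.
wait_for_cluster_state() {
    expected=$1
    timeout=$2
    interval=$3
    shift 3
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        # Extract the value after "cluster_state :" from the command output.
        state=$("$@" 2>/dev/null |
            awk -F':' '/^cluster_state/ { gsub(/[ \t]/, "", $2); print $2; exit }') || true
        if [ "$state" = "$expected" ]; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1
}

# Usage (values are illustrative):
#   cm_ctl start
#   wait_for_cluster_state Normal 120 5 cm_ctl query -v -C || echo "cluster not Normal yet"
```

Passing the command as arguments keeps the function easy to try with a stub before pointing it at a real cluster.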

Stop node 3 from node 1

[omm@node1 srv]$ cm_ctl switchover -n 3 -D /opt/huawei/install/data/db1
............
cm_ctl: switchover successfully.

Current cluster status:
Screenshot_49.png
With the standby node stopped, the primary is still on node 1.
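
For reference, the help output above gives the single-node forms as `cm_ctl stop [-n NODEID [-D DATADIR]]` and `cm_ctl start [-n NODEID [-D DATADIR]]`. The small helper below only prints such a command line so it can be reviewed before being run. The name `node_cmd` is hypothetical, and the node id and data directory in the usage lines are taken from this cluster's configuration.

```shell
#!/bin/sh
# node_cmd ACTION NODEID [DATADIR]
# Prints (does not execute) the cm_ctl command for stopping or starting one node.
# Only the stop and start forms listed in `cm_ctl --help` are composed here.
node_cmd() {
    action=$1
    nodeid=$2
    datadir=${3:-}
    case "$action" in
        stop|start) ;;
        *) echo "unsupported action: $action" >&2; return 1 ;;
    esac
    if [ -n "$datadir" ]; then
        echo "cm_ctl $action -n $nodeid -D $datadir"
    else
        echo "cm_ctl $action -n $nodeid"
    fi
}

# Usage:
#   node_cmd stop 3 /opt/huawei/install/data/db1
#   node_cmd start 3 /opt/huawei/install/data/db1
```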
Simulate a crash of the primary node
Screenshot_51.png
Cluster status:
Screenshot_52.png
Screenshot_53.png
Node 2 was promoted to DN primary and node 3 to CM server primary, with no manual failover required.
Start node 1
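
A failover like this can also be confirmed from saved query output rather than by eye. The sketch below extracts the CM server primary from the `[ CMServer State ]` section and compares two snapshots. The function names are hypothetical, and because the post-failover output appears here only as screenshots, the format of the "after" snapshot is assumed to match the pre-failover text shown earlier.

```shell
#!/bin/sh
# cms_primary
# Reads `cm_ctl query -v -C` output on stdin and prints the node name of the
# CM server whose state column is Primary.
cms_primary() {
    awk '
        /CMServer State/ { in_cms = 1; next }
        /^\[/            { in_cms = 0 }
        in_cms && /^[0-9]/ && $NF == "Primary" { print $2; exit }
    '
}

# cms_primary_moved BEFORE_FILE AFTER_FILE
# Succeeds when both snapshots name a CM server primary and the names differ.
cms_primary_moved() {
    before=$(cms_primary < "$1")
    after=$(cms_primary < "$2")
    [ -n "$before" ] && [ -n "$after" ] && [ "$before" != "$after" ]
}

# Usage:
#   cm_ctl query -v -C > before.txt     # before the primary goes down
#   cm_ctl query -v -C > after.txt      # after the failover
#   cms_primary_moved before.txt after.txt && echo "CM server primary changed"
```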
Screenshot_54.png
It rejoins the cluster automatically, as a standby node.
Screenshot_55.png
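Conclusion: openGauss 3.0 is more robust and has become an unkillable "Xiaoqiang" (the proverbial cockroach that refuses to die). Third-party tools are no longer needed for automatic failover. One shortcoming is that, in the cm_ctl output, the CM server section matches an administrator's habits better than the DN section does.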

Last modified: 2023-02-15 18:14:36