暂无图片
暂无图片
3
暂无图片
暂无图片
暂无图片

OceanBase 离线手动部署详细教程和一些理解

原创 Chanson 2021-12-26
2931

伟大的梦想始于渺小的起点,渺小的起点可以是一个想法,也可以是一个选择…
而学习伟大的数据库的起点可以是一次认证,也可以是一份安装教程文档。
 

OceanBase的梦想是成为最优秀的世界级数据库。

 
那本文主要目的是按照官方教程的指引,一步步实践,总结部署过程中遇到的问题和一些思考,尽可能为学习安装OceanBase的伙伴做到一次看懂,一次部署完成 ^ v ^

一. 前提须知

1.1 本文原则

  1. 按照官方教程实践,把部署过程尽可能做到一次完成。
  2. 将实践与理论进行结合,理解OceanBase的架构。
     

1.2 部署方式

本文使用手动离线部署集群。虽然OBD工具很不错,但是笔者认为手动部署掌控性更强,有助于理解OceanBase的架构,并且比较适用于生产环境。
 

二. 部署条件

OceanBase社区版安装包加起来不过上百MB,但是集群的最低硬件配置还是挺高的。

硬件配置.png

2.1 服务器配置

笔者的电脑只有20个CPU,为了能够部署OB Cluster,使用超额分配的方式,在使用的时候,需要多留意电脑的资源。
笔者在使用过程中,由于没有注意内存的使用,已经重启好几次。

服务器 OS版本 IP 地址 CPU 核数 内存 GB 磁盘 用途
RHOB-DB00 RHEL 7.6 192.168.117.170 1 1 SSD OBD、OBproxy、OBclient
RHOB-DB01 RHEL 7.6 192.168.117.171 16 8 SSD OBServer1
RHOB-DB02 RHEL 7.6 192.168.117.172 16 8 SSD OBServer2
RHOB-DB03 RHEL 7.6 192.168.117.173 16 8 SSD OBServer3

 

2.2 存储配置

添加3块硬盘(按需划分即可),分别为/dev/sdb、/dev/sdc、/dev/sdd,用于构建oblog和obdata文件系统。

[root@RHOB-DB01 ~]# fdisk -l |grep "Disk /dev/sd" Disk /dev/sda: 85.9 GB, 85899345920 bytes, 167772160 sectors Disk /dev/sdb: 10.7 GB, 10737418240 bytes, 20971520 sectors Disk /dev/sdc: 21.5 GB, 21474836480 bytes, 41943040 sectors Disk /dev/sdd: 53.7 GB, 53687091200 bytes, 104857600 sectors

2.2.1 创建pv

[root@RHOB-DB01 ~]# pvcreate /dev/sdb /dev/sdc /dev/sdd Physical volume "/dev/sdb" successfully created. Physical volume "/dev/sdc" successfully created. Physical volume "/dev/sdd" successfully created.`

2.2.2 创建vg

[root@RHOB-DB01 ~]# vgcreate obdatavg /dev/sdb /dev/sdc /dev/sdd Volume group "obdatavg" successfully created`

2.2.3 创建lv

[root@RHOB-DB01 ~]# lvcreate -n lvoblog -L 20G obdatavg Logical volume “lvoblog” created. [root@RHOB-DB01 ~]# lvcreate -n lvobdata -l 100%FREE obdatavg Logical volume “lvobdata” created.

2.2.4 格式化

[root@RHOB-DB01 ~]# mkfs.ext4 /dev/obdatavg/lvoblog [root@RHOB-DB01 ~]# mkfs.ext4 /dev/obdatavg/lvobdata

2.2.5 挂盘

[root@RHOB-DB01 ~]# mkdir /oblog [root@RHOB-DB01 ~]# mkdir /obdata [root@RHOB-DB01 ~]# mount /dev/obdatavg/lvobdata /obdata [root@RHOB-DB01 ~]# mount /dev/obdatavg/lvoblog /oblog [root@RHOB-DB01 ~]# df -h Filesystem Size Used Avail Use% Mounted on /dev/mapper/rhel-root 50G 5.6G 45G 12% / devtmpfs 894M 0 894M 0% /dev tmpfs 910M 0 910M 0% /dev/shm tmpfs 910M 9.9M 900M 2% /run tmpfs 910M 0 910M 0% /sys/fs/cgroup /dev/sda1 1014M 178M 837M 18% /boot /dev/mapper/rhel-home 27G 37M 27G 1% /home tmpfs 182M 0 182M 0% /run/user/0 /dev/mapper/obdatavg-lvobdata 59G 53M 56G 1% /obdata /dev/mapper/obdatavg-lvoblog 20G 45M 19G 1% /oblog

2.2.6 挂盘重启自动生效

添加以下内容

参照官方教程配置。

[root@RHOB-DB01 ~]# cat /etc/fstab |grep obdata /dev/mapper/obdatavg-lvoblog /oblog ext4 defaults,noatime,nodiratime,nodelalloc,barrier=0 0 0 /dev/mapper/obdatavg-lvobdata /obdata ext4 defaults,noatime,nodiratime,nodelalloc,barrier=0 0 0

2.3 检查网卡

检查网卡.png

 

三. 环境配置

 

3.1 内核参数

[root@RHOB-DB01 ~]# cat /etc/sysctl.conf | grep -Ev '^#|^$' net.core.somaxconn = 2048 net.core.netdev_max_backlog = 10000 net.core.rmem_default = 16777216 net.core.wmem_default = 16777216 net.core.rmem_max = 16777216 net.core.wmem_max = 16777216 net.ipv4.ip_local_port_range = 3500 65535 net.ipv4.ip_forward = 0 net.ipv4.conf.default.rp_filter = 1 net.ipv4.conf.default.accept_source_route = 0 net.ipv4.tcp_syncookies = 0 net.ipv4.tcp_rmem = 4096 87380 16777216 net.ipv4.tcp_wmem = 4096 65536 16777216 net.ipv4.tcp_max_syn_backlog = 16384 net.ipv4.tcp_fin_timeout = 15 net.ipv4.tcp_max_syn_backlog = 16384 net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_tw_recycle = 1 net.ipv4.tcp_slow_start_after_idle=0 vm.swappiness = 0 vm.min_free_kbytes = 2097152 vm.max_map_count=655360 fs.aio-max-nr=1048576

 
配置生效

[root@RHOB-DB01 ~]# sysctl -p

vm.min_free_kbytes = 2097152 ,如果虚拟机内存不够,可能会卡住。
解决方案:关掉虚拟机,增大虚拟机的内存,重启虚拟机。
 

3.2 修改会话变量设置

查看ulimit的默认值

[root@RHOB-DB01 ~]# ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 31078 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 31078 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited

 
添加参数

[root@RHOB-DB01 ~]# cat /etc/security/limits.conf |grep -Ev '^#|^$' * soft nofile 655360 * hard nofile 655360 * soft nproc 655360 * hard nproc 655360 * soft core unlimited * hard core unlimited * soft stack unlimited * hard stack unlimited

 

查看ulimit的最新值

[root@RHOB-DB01 ~]# ulimit -a core file size (blocks, -c) unlimited data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 31078 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 655360 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) unlimited cpu time (seconds, -t) unlimited max user processes (-u) 655360 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited

 

3.3 关闭防火墙和 selinux

3.3.1 关闭防火墙

systemctl disable firewalld systemctl stop firewalld systemctl status firewalld

3.3.2 selinux

[root@RHOB-DB01 ~]# cat /etc/selinux/config |grep disabled disabled - No SELinux policy is loaded. SELINUX=disabled
setenforce 0

3.4 新建用户

3.4.1 新建admin账号

如果不想用root用户,那就使用admin用户。
笔者尝试新建obadmin用户,会默认没有admin用户,自动使用root用户安装。

解决方案 : 新建admin用户

[root@RHOB-DB01 ~]# useradd admin [root@RHOB-DB01 ~]# passwd admin Changing password for user admin. New password: [root@RHOB-DB01 ~]# [root@RHOB-DB01 ~]# [root@RHOB-DB01 ~]# echo 'admin:ObAdMin&123' | chpasswd [root@RHOB-DB01 ~]# usermod admin -G wheel [root@RHOB-DB01 ~]# id admin uid=1002(admin) gid=1002(admin) groups=1002(admin),10(wheel)

3.4.2 配置sudo

[root@RHOB-DB01 ~]# cat /etc/sudoers |grep wheel Allows people in group wheel to run all commands %wheel ALL=(ALL) ALL # %wheel ALL=(ALL) NOPASSWD: ALL

新增如下配置,每次使用sudo,均需要密码。

[root@RHOB-DB01 ~]# cat /etc/sudoers |grep admin admin ALL=(ALL) ALL

注意 : 以上的操作,笔者都是在一台虚拟机完成,然后按照前文规划,复制3台服务器后,逐个修改主机名和IP地址,再继续下面的操作。
 


 

3.5 SSH互信

总共有四台机服务器,笔者将四台服务器均配置互信。

操作方法:每台服务器使用admin账号均执行以下命令。

–- 密码 : ObAdMin&123 [admin@RHOB-DB01 ~]$ ssh-keygen -t rsa # 敲几次回车。 [admin@RHOB-DB01 ~]$ ssh-copy-id 192.168.117.171 # 输入密码 [admin@RHOB-DB01 ~]$ ssh-copy-id 192.168.117.172 [admin@RHOB-DB01 ~]$ ssh-copy-id 192.168.117.173 [admin@RHOB-DB01 ~]$ ssh-copy-id 192.168.117.170 [admin@RHOB-DB01 ~]$ ssh RHOB-DB01 [admin@RHOB-DB01 ~]$ ssh RHOB-DB02 [admin@RHOB-DB01 ~]$ ssh RHOB-DB03 [admin@RHOB-DB01 ~]$ ssh RHOB-DB00

3.6 配置时间同步服务

3.6.1 查看时区设置

时区为Asia/Shanghai,没有问题。

[root@RHOB-DB01 ~]# timedatectl Local time: Sun 2021-12-19 14:37:59 CST Universal time: Sun 2021-12-19 06:37:59 UTC RTC time: Sun 2021-12-19 06:37:59 Time zone: Asia/Shanghai (CST, +0800) NTP enabled: yes NTP synchronized: no RTC in local TZ: no DST active: n/a

3.6.2 配置 chrony

OB Cluster的三个节点时间同步误差如果超过 50ms,则后面初始化集群一定会失败。
部署成功后,如果某个节点的时间误差大于50ms ,该节点就会掉线。

3.6.2.1 服务端配置

192.168.117.171 [ RHOB-DB01 ] 作为服务端。

[root@RHOB-DB01 ~]# cat /etc/chrony.conf |grep -Ev '^$|^#' server 127.127.1.0 driftfile /var/lib/chrony/drift makestep 1.0 3 rtcsync allow 192.168.117.0/24 logdir /var/log/chrony

3.6.2.2 客户端配置

其他的服务器均以客户端的形式配置。

[root@RHOB-DB02 ~]# cat /etc/chrony.conf |grep -Ev '^$|^#' server 192.168.117.171 iburst driftfile /var/lib/chrony/drift makestep 1.0 3 rtcsync logdir /var/log/chrony

3.6.2.3 启动chrony & 开机自启动

[root@RHOB-DB01 ~]# systemctl start chronyd.service [root@RHOB-DB01 ~]# systemctl enable chronyd.service

3.6.2.4 chrony 常用命令

非重点内容,根据官方教程的指示,每台服务器各跑一条命令,主要感受一下。

# 查看时间同步活动 [root@RHOB-DB01 ~]# chronyc activity 200 OK 1 sources online 0 sources offline 0 sources doing burst (return to online) 0 sources doing burst (return to offline) 0 sources with unknown address # 查看时间服务器 [admin@RHOB-DB02 ~]$ chronyc sources 210 Number of sources = 1 MS Name/IP address Stratum Poll Reach LastRx Last sample =============================================================================== ^* RHOB-DB01 10 10 377 352 -7308ns[-9110ns] +/- 274us # 查看同步状态 [admin@RHOB-DB03 ~]$ chronyc sources -v 210 Number of sources = 1 .-- Source mode '^' = server, '=' = peer, '#' = local clock. / .- Source state '*' = current synced, '+' = combined , '-' = not combined, | / '?' = unreachable, 'x' = time may be in error, '~' = time too variable. || .- xxxx [ yyyy ] +/- zzzz || Reachability register (octal) -. | xxxx = adjusted offset, || Log2(Polling interval) --. | | yyyy = measured offset, || \ | | zzzz = estimated error. || | | \ MS Name/IP address Stratum Poll Reach LastRx Last sample =============================================================================== ^* RHOB-DB01 10 10 377 549 -38us[ -44us] +/- 265us # 校准时间服务器: [root@RHOB-DB00 ~]# chronyc tracking Reference ID : C0A875AB (RHOB-DB01) Stratum : 11 Ref time (UTC) : Fri Dec 24 15:05:59 2021 System time : 0.000022480 seconds fast of NTP time Last offset : +0.000023738 seconds RMS offset : 0.000058456 seconds Frequency : 0.061 ppm slow Residual freq : +0.001 ppm Skew : 0.012 ppm Root delay : 0.000333177 seconds Root dispersion : 0.000083476 seconds Update interval : 1034.6 seconds Leap status : Normal

3.6.3 clockdiff & ping

有了chrony时间同步服务后,可以使用clockdiff检测一下时间延迟。

官方说明 :>50ms的延迟无法部署集群。

3.6.3.1 clockdiff 延迟测试

## RHOB-DB01 clockdiff [root@RHOB-DB01 ~]# clockdiff 192.168.117.172 . host=192.168.117.172 rtt=750(187)ms/0ms delta=0ms/0ms Sun Dec 19 14:35:57 2021 [root@RHOB-DB01 ~]# clockdiff 192.168.117.173 … host=192.168.117.173 rtt=315(315)ms/0ms delta=0ms/0ms Sun Dec 19 14:36:02 2021 ##RHOB-DB02 clockdiff [root@RHOB-DB02 ~]# clockdiff 192.168.117.171 . host=192.168.117.171 rtt=750(187)ms/0ms delta=0ms/0ms Sun Dec 19 14:36:47 2021 [root@RHOB-DB02 ~]# clockdiff 192.168.117.173 … host=192.168.117.173 rtt=421(315)ms/0ms delta=0ms/0ms Sun Dec 19 14:36:51 2021 ## RHOB-DB03 clockdiff [root@RHOB-DB03 ~]# clockdiff 192.168.117.171 … host=192.168.117.171 rtt=3(18)ms/0ms delta=0ms/0ms Sun Dec 19 14:37:34 2021 [root@RHOB-DB03 ~]# clockdiff 192.168.117.172 … host=192.168.117.172 rtt=315(315)ms/0ms delta=0ms/0ms Sun Dec 19 14:37:37 2021 ## RHOB-DB00 clockdiff [root@RHOB-DB00 ~]# clockdiff 192.168.117.171 . host=192.168.117.171 rtt=750(187)ms/0ms delta=0ms/0ms Sun Dec 19 21:42:21 2021 [root@RHOB-DB00 ~]# clockdiff 192.168.117.172 … host=192.168.117.172 rtt=562(280)ms/0ms delta=0ms/0ms Sun Dec 19 21:42:40 2021 [root@RHOB-DB00 ~]# clockdiff 192.168.117.173 . host=192.168.117.173 rtt=750(187)ms/0ms delta=0ms/0ms Sun Dec 19 21:42:41 2021

3.6.3.2 ping 延迟 测试

[root@RHOB-DB01 ~]# ping -T tsandaddr 192.168.117.172 -c 2 PING 192.168.117.172 (192.168.117.172) 56(124) bytes of data. 64 bytes from 192.168.117.172: icmp_seq=1 ttl=64 time=0.603 ms TS: 192.168.117.171 23780762 absolute 192.168.117.172 0 192.168.117.172 0 192.168.117.171 0 64 bytes from 192.168.117.172: icmp_seq=2 ttl=64 time=0.550 ms TS: 192.168.117.171 23781762 absolute 192.168.117.172 0 192.168.117.172 0 192.168.117.171 1 — 192.168.117.172 ping statistics — 2 packets transmitted, 2 received, 0% packet loss, time 1000ms rtt min/avg/max/mdev = 0.550/0.576/0.603/0.035 ms [root@RHOB-DB01 ~]# ping -T tsandaddr 192.168.117.173 -c 2 PING 192.168.117.173 (192.168.117.173) 56(124) bytes of data. 64 bytes from 192.168.117.173: icmp_seq=1 ttl=64 time=0.475 ms TS: 192.168.117.171 23790341 absolute 192.168.117.173 0 192.168.117.173 0 192.168.117.171 1 64 bytes from 192.168.117.173: icmp_seq=2 ttl=64 time=0.650 ms TS: 192.168.117.171 23791342 absolute 192.168.117.173 1 192.168.117.173 0 192.168.117.171 0 — 192.168.117.173 ping statistics — 2 packets transmitted, 2 received, 0% packet loss, time 1001ms rtt min/avg/max/mdev = 0.475/0.562/0.650/0.090 ms

 

四. 安装部署

4.1 安装包

链接
1 . 阿里云镜像
2 . 官方链接

ID 最新版本 用途
1 obclient-2.0.0-2.el7.x86_64.rpm obclient
2 libobclient-2.0.0-2.el7.x86_64.rpm obclient
3 obproxy-3.2.0-1.el7.x86_64.rpm obproxy
4 ob-deploy-1.1.2-1.el7.x86_64.rpm obd部署工具[可选]
5 oceanbase-ce-3.1.1-4.el7.x86_64.rpm oceanbase 数据库
6 oceanbase-ce-libs-3.1.1-4.el7.x86_64.rpm oceanbase lib包

 

4.2 安装oceanbase

[admin@RHOB-DB01 OceanBase]$ sudo rpm -ivh oceanbase-ce-libs-3.1.1-4.el7.x86_64.rpm [sudo] password for admin: warning: oceanbase-ce-libs-3.1.1-4.el7.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID e9b4a7aa: NOKEY Preparing… ################################# [100%] Updating / installing… 1:oceanbase-ce-libs-3.1.1-4.el7 ################################# [100%] [admin@RHOB-DB01 OceanBase]$ sudo rpm -ivh oceanbase-ce-3.1.1-4.el7.x86_64.rpm warning: oceanbase-ce-3.1.1-4.el7.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID e9b4a7aa: NOKEY Preparing… ################################# [100%] Updating / installing… 1:oceanbase-ce-3.1.1-4.el7 ################################# [100%] [admin@RHOB-DB01 OceanBase]$ rpm -qa|grep oceanbase oceanbase-ce-libs-3.1.1-4.el7.x86_64 oceanbase-ce-3.1.1-4.el7.x86_64 [admin@RHOB-DB01 ~]$ ll total 0 drwxr-xr-x 5 root root 39 Dec 19 17:14 oceanbase [admin@RHOB-DB01 ~]$ tree oceanbase oceanbase ├── bin │   ├── import_time_zone_info.py │   └── observer ├── etc │   └── timezone_V1.log └── lib ├── libaio.so -> libaio.so.1.0.1 ├── libaio.so.1 -> libaio.so.1.0.1 ├── libaio.so.1.0.1 ├── libmariadb.so -> libmariadb.so.3 └── libmariadb.so.3 3 directories, 8 files

 

4.3 创建OB目录及授权

[admin@RHOB-DB01 ~]$ ll drwxr-xr-x 3 admin admin 17 Dec 19 17:15 obproxy-3.2.0 drwxr-xr-x 5 root root 39 Dec 19 17:14 oceanbase [root@RHOB-DB01 bin]# chown -R admin:admin /obdata [root@RHOB-DB01 bin]# chown -R admin:admin /oblog [root@RHOB-DB01 bin]# chmod -R 755 /obdata [root@RHOB-DB01 bin]# chmod -R 755 /oblog [root@RHOB-DB01 ~]# chown -R admin:admin /home/admin/oceanbase/ [root@RHOB-DB01 ~]# su - admin [admin@RHOB-DB01 ~]$ mkdir -p ~/oceanbase/store/obdemo /obdata/obdemo/{sstable,etc3} /oblog/obdemo/{clog,ilog,slog,etc2} [admin@RHOB-DB01 ~]$ for f in {clog,ilog,slog,etc2}; do ln -s /oblog/obdemo/$f ~/oceanbase/store/obdemo/$f ; done [admin@RHOB-DB01 ~]$ for f in {sstable,etc3}; do ln -s /obdata/obdemo/$f ~/oceanbase/store/obdemo/$f; done

 

4.4 检查observer执行程序

检查执行程序是否有缺少lib包

[admin@RHOB-DB01 bin]$ ldd observer linux-vdso.so.1 => (0x00007fff319fe000) libmariadb.so.3 => not found libaio.so.1 => /lib64/libaio.so.1 (0x00002b56da968000) libm.so.6 => /lib64/libm.so.6 (0x00002b56dab6a000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b56dae6c000) libdl.so.2 => /lib64/libdl.so.2 (0x00002b56db088000) librt.so.1 => /lib64/librt.so.1 (0x00002b56db28c000) libc.so.6 => /lib64/libc.so.6 (0x00002b56db494000) /lib64/ld-linux-x86-64.so.2 (0x00002b56da744000)

添加环境变量

[admin@RHOB-DB01 bin]$ cd [admin@RHOB-DB01 ~]$ echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/oceanbase/lib' >> ~/.bash_profile [admin@RHOB-DB01 ~]$ . ~/.bash_profile

再次检查,没有问题

[admin@RHOB-DB01 bin]$ ldd observer linux-vdso.so.1 => (0x00007ffc44f9d000) libmariadb.so.3 => /home/admin/oceanbase/lib/libmariadb.so.3 (0x00002ab8bf977000) libaio.so.1 => /home/admin/oceanbase/lib/libaio.so.1 (0x00002ab8bfbdd000) libm.so.6 => /lib64/libm.so.6 (0x00002ab8bfddf000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00002ab8c00e1000) libdl.so.2 => /lib64/libdl.so.2 (0x00002ab8c02fd000) librt.so.1 => /lib64/librt.so.1 (0x00002ab8c0501000) libc.so.6 => /lib64/libc.so.6 (0x00002ab8c0709000) /lib64/ld-linux-x86-64.so.2 (0x00002ab8bf753000)

 

4.5 启动observer

4.5.1 查看observer参数选项

[admin@RHOB-DB01 bin]$ ./observer -h ./observer -h observer [OPTIONS] -h,–help print this help -V,–version print the information of version -z,–zone ZONE zone -p,–mysql_port PORT mysql port -P,–rpc_port PORT rpc port -N,–nodaemon dont run in daemon -n,–appname APPNAME application name -c,–cluster_id ID cluster id -d,–data_dir DIR OceanBase data directory -i,–devname DEV net dev interface -o,–optstr OPTSTR extra options string -r,–rs_list RS_LIST root service list -l,–log_level LOG_LEVEL server log level -6,–ipv6 USE_IPV6 server use ipv6 address -m,–mode MODE server mode -f,–scn flashback_scn

注意

  1. 以下的命令在哪台服务器执行。
  2. syslog_level 默认INFO ,建议修改成ERROR,避免INFO产生大量的信息。

4.5.2 启动observer-zone1

[admin@RHOB-DB01 ~]$ cd ~/oceanbase && bin/observer -i ens32 -p 2881 -P 2882 -z zone1 -d ~/oceanbase/store/obdemo -r '192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881' -c 20211224 -n obdemo -o “memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2” bin/observer -i ens32 -p 2881 -P 2882 -z zone1 -d /home/admin/oceanbase/store/obdemo -r 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 -c 20211224 -n obdemo -o memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2 devname: ens32 mysql port: 2881 rpc port: 2882 zone: zone1 data_dir: /home/admin/oceanbase/store/obdemo rs list: 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 cluster id: 20211224 appname: obdemo optstr: memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2

 
检查
zone1.png

4.5.3 启动observer-zone2

[admin@RHOB-DB02 ~]$ cd ~/oceanbase && bin/observer -i ens32 -p 2881 -P 2882 -z zone2 -d ~/oceanbase/store/obdemo -r '192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881' -c 20211224 -n obdemo -o “memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2” bin/observer -i ens32 -p 2881 -P 2882 -z zone2 -d /home/admin/oceanbase/store/obdemo -r 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 -c 20211224 -n obdemo -o memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2 devname: ens32 mysql port: 2881 rpc port: 2882 zone: zone2 data_dir: /home/admin/oceanbase/store/obdemo rs list: 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 cluster id: 20211224 appname: obdemo optstr: memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2

 
检查
zone2.png

4.5.4 启动observer-zone3

[admin@RHOB-DB03 ~]$ cd ~/oceanbase && bin/observer -i ens32 -p 2881 -P 2882 -z zone3 -d ~/oceanbase/store/obdemo -r '192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881' -c 20211224 -n obdemo -o “memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2” bin/observer -i ens32 -p 2881 -P 2882 -z zone3 -d /home/admin/oceanbase/store/obdemo -r 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 -c 20211224 -n obdemo -o memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2 devname: ens32 mysql port: 2881 rpc port: 2882 zone: zone3 data_dir: /home/admin/oceanbase/store/obdemo rs list: 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 cluster id: 20211224 appname: obdemo optstr: memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2

 
检查
zone3.png

注意 : 如果使用OBD自动化部署,参数文件有密码,那么进程信息也会显示密码。

 

4.6 集群初始化

使用 mysql客户端 或者 obclient 进入OceanBase数据库。

本次使用 192.168.117.170 [RHOB-DB00]的mysql客户端工具登录192.168.117.171 [RHOB-DB01]的OceanBase。

[admin@RHOB-DB00 ~]$ mysql -h 192.168.117.171 -u root -P 2881 -p -c -A # 直接敲回车键登录数据库。 Enter password: Welcome to the MySQL monitor. Commands end with ; or \\g. Your MySQL connection id is 3221225472 Server version: 5.7.25 OceanBase 3.1.1 (r4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e) (Built Oct 21 2021 10:33:14) Copyright ? 2000, 2020, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\\h' for help. Type '\\c' to clear the current input statement. mysql> show databases; ERROR 1146 (42S02): Table 'oceanbase.__all_database' doesnt exist mysql> set session ob_query_timeout=1000000000; Query OK, 0 rows affected (0.01 sec) mysql> alter system bootstrap ZONE 'zone1' SERVER '192.168.117.171:2882', ZONE 'zone2' SERVER '192.168.117.172:2882', ZONE 'zone3' SERVER '192.168.117.173:2882' ; Query OK, 0 rows affected (1 min 22.02 sec) # 初始化集群后,可以进行操作。 mysql> show databases; +--------------------+ | Database | +--------------------+ | oceanbase | | information_schema | | mysql | | SYS | | LBACSYS | | ORAAUDITOR | | test | +--------------------+ 7 rows in set (0.03 sec) mysql> select a.zone,concat(a.svr_ip,':',a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free, round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024) mem_free_gb, usec_to_time(b.last_offline_time) last_offline_time, usec_to_time(b.start_service_time) start_service_time, b.status, usec_to_time(b.stop_time) stop_time, b.build_version from __all_virtual_server_stat a join __all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port) order by a.zone, a.svr_ip;

集群资源.png

查看租户的zone_list、primary_zone、locality属性。

租户.png
 

5. 安装obclient

如果没有mysql客户端,可以安装obclient,直接登陆数据库,比较方便。

[admin@RHOB-DB00 OceanBase]$ sudo rpm -ivh libobclient-2.0.0-2.el7.x86_64.rpm [sudo] password for admin: warning: libobclient-2.0.0-2.el7.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID e9b4a7aa: NOKEY Preparing… ################################# [100%] Updating / installing… 1:libobclient-2.0.0-2.el7 ################################# [100%] [admin@RHOB-DB00 OceanBase]$ sudo rpm -ivh obclient-2.0.0-2.el7.x86_64.rpm warning: obclient-2.0.0-2.el7.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID e9b4a7aa: NOKEY Preparing… ################################# [100%] Updating / installing… 1:obclient-2.0.0-2.el7 ################################# [100%] [admin@RHOB-DB00 OceanBase]$ which obclient /bin/obclient

 

6. obproxy

OBProxy实现接受来自应用的请求,并转发给OBServer,然后OBServer将数据返回给OBProxy,OBProxy将数据转发给应用客户端。

支持的路由策略的方式多样,详见官方文档 OBProxy 的路由策略
 

6.1 安装obproxy

[admin@RHOB-DB00 OceanBase]$ sudo rpm -ivh obproxy-3.2.0-1.el7.x86_64.rpm warning: obproxy-3.2.0-1.el7.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID e9b4a7aa: NOKEY Preparing… ################################# [100%] Updating / installing… 1:obproxy-3.2.0-1.el7 ################################# [100%] [admin@RHOB-DB00 OceanBase]$ rpm -qa|grep obproxy obproxy-3.2.0-1.el7.x86_64

6.2 启动obproxy

[admin@RHOB-DB00 ~]$ cd ~/obproxy-3.2.0/ && bin/obproxy -r “192.168.117.171:2881;192.168.117.172:2881;192.168.117.173:2881” -p 2883 -o “enable_strict_kernel_release=false,enable_cluster_checkout=false,enable_metadb_used=false” -c obdemo bin/obproxy -r 192.168.117.171:2881;192.168.117.172:2881;192.168.117.173:2881 -p 2883 -o enable_strict_kernel_release=false,enable_cluster_checkout=false,enable_metadb_used=false -c obdemo rs list: 192.168.117.171:2881;192.168.117.172:2881;192.168.117.173:2881 listen port: 2883 optstr: enable_strict_kernel_release=false,enable_cluster_checkout=false,enable_metadb_used=false cluster_name: obdemo

启动obproxy.png
 

7. 设置密码

本次把root、proxyro等等用户一起配置。

用户 端口 用途
root 2881 数据库 root 用户
proxyro 2881 数据库 proxyro 用户
proxysys 2883 obproxy的sys用户,登录obproxy
observer_sys_password 2883 数据库 proxyro 用户,obproxy通过proxyro 用户连接数据库
obproxy_sys_password 2883 obproxy的sys用户密码

 

[admin@RHOB-DB00 ~]$ mysql -h 192.168.117.171 -u root -P 2881 -p -c -A # 直接敲回车键登录数据库。 # 本次登录是sys租户。 Enter password: # root用户修改密码后,以后都要用密码登录。 mysql> alter user 'root'@'%' identified by 'rO0t&123' ; Query OK, 0 rows affected (0.12 sec) # observer_sys_password 密码 要 和 proxyro 的密码一致。 mysql> create user proxyro identified by 'Serproxy@123' ; Query OK, 0 rows affected (0.10 sec) mysql> grant select on oceanbase.* to proxyro; Query OK, 0 rows affected (0.07 sec) [admin@RHOB-DB00 ~]$ obclient -h 192.168.117.170 -u root@proxysys -P 2883 -p # 修改 OBPROXY 的密码 : obproxy_sys_password mysql> alter proxyconfig set obproxy_sys_password = 'Sysproxy@123' ; Query OK, 0 rows affected (0.01 sec) # observer_sys_password # 修改 OBPROXY 连接 OceanBase 集群用户 proxyro 的密码。这样 OBPROXY 才能跟 OceanBase 集群正常连接。 # observer_sys_password 密码 要 和 proxyro 的密码一致。 mysql> alter proxyconfig set observer_sys_password = 'Serproxy@123' ; Query OK, 0 rows affected (0.00 sec) mysql> show proxyconfig like '%sys_password%'; +------------------------+------------------------------------------+--------------------------------+-------------+---------------+ | name | value | info | need_reboot | visible_level | +------------------------+------------------------------------------+--------------------------------+-------------+---------------+ | observer_sys_password1 | | password for observer sys user | false | SYS | | observer_sys_password | e1e4ccbd352584c0e44a20a89debca6cfa64c5f7 | password for observer sys user | false | SYS | | obproxy_sys_password | f8543646780b3c88b1a9b11be3e62881e437b95c | password for obproxy sys user | false | SYS | +------------------------+------------------------------------------+--------------------------------+-------------+---------------+ 3 rows in set (0.01 sec)

 

登录proxy 管理使用 obproxy_sys_password 的密码登录。

[admin@RHOB-DB00 ~]$ obclient -h 192.168.117.170 -u root@proxysys -P 2883 -p Enter password: Welcome to the OceanBase. Commands end with ; or \\g. Your MySQL connection id is 9 Server version: 5.6.25 Copyright ? 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\\h' for help. Type '\\c' to clear the current input statement. MySQL [(none)]>

 
通过 OBPROXY 连接 OceanBase 集群看看, 如果能查看所有会话,则说明 OBPROXY 部署成功。

[admin@RHOB-DB00 ~]$ mysql -h192.168.117.170 -uroot@sys#obdemo -P2883 -p'rO0t&123' -c -A oceanbase 或者 [admin@RHOB-DB00 ~]$ obclient -h192.168.117.170 -uroot@sys#obdemo -P2883 -p'rO0t&123' -c -A oceanbase

proxy登录.png
 
proxy_show.png

 

8. 启动&关闭

看了比较多的文章,好像都讲了如何部署、启动,但其实正确地关闭数据库也是非常重要的。

小插曲 : 笔者的电脑由于内存不够,重启了好几回。每次重启试图开启集群,但都失败了。
如果逐个正常关闭<=2台服务器,然后启动observer,集群是正常的。

手动部署集群的灵活性要比OBD工具部署集群好很多,OBD工具应该没有办法对某个节点启停,只能对整个集群启停。
笔者尝试了对OBD部署集群的某个节点进行关闭,但发现启动无从下手。最后,尝试obd stop 再 start 集群,结果报错"[ERROR] Cluster NTP is out of sync"。
后续有机会再尝试解决。

8.1 检查集群的状态

检查集群关闭前的状态,确认集群正常。

MySQL [oceanbase]> select a.zone,concat(a.svr_ip,':',a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free, round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024) mem_free_gb, usec_to_time(b.last_offline_time) last_offline_time, usec_to_time(b.start_service_time) start_service_time, b.status, usec_to_time(b.stop_time) stop_time, b.build_version from __all_virtual_server_stat a join __all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port) order by a.zone, a.svr_ip; +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ | zone | observer | cpu_total | cpu_free | mem_total_gb | mem_free_gb | last_offline_time | start_service_time | status | stop_time | build_version | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ | zone1 | 192.168.117.171:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 10:00:16.958704 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone2 | 192.168.117.172:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 10:00:18.078459 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone3 | 192.168.117.173:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 10:00:20.061450 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+

8.2 关闭单个Server

关闭单个OBServer [以observer3为例]

[admin@RHOB-DB03 ~]$ ps -ef|grep observer admin 11383 1 99 18:08 ? 07:34:59 bin/observer -i ens32 -p 2881 -P 2882 -z zone3 -d /home/admin/oceanbase/store/obdemo -r 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 -c 20211224 -n obdemo -o memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2 -d /home/admin/oceanbase/store/obdemo admin 14895 13253 0 21:13 pts/1 00:00:00 grep --color=auto observer

 
不要使用kill -9,等待1分钟左右。

参考官方教程。

[admin@RHOB-DB03 ~]$ kill `pidof observer` [admin@RHOB-DB03 ~]$ ps -ef | grep observer | grep -v grep admin 15805 1 99 09:59 ? 02:23:08 bin/observer -i ens32 -p 2881 -P 2882 -z zone3 -d /home/admin/oceanbase/store/obdemo -r 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 -c 20211224 -n OBCluster -o memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2

 
status已经变为inactive,本次只关闭一个observer,集群可以正常对外提供服务。

MySQL [oceanbase]> select a.zone,concat(a.svr_ip,':',a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free, round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024) mem_free_gb, usec_to_time(b.last_offline_time) last_offline_time, usec_to_time(b.start_service_time) start_service_time, b.status, usec_to_time(b.stop_time) stop_time, b.build_version from __all_virtual_server_stat a join __all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port) order by a.zone, a.svr_ip; +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ | zone | observer | cpu_total | cpu_free | mem_total_gb | mem_free_gb | last_offline_time | start_service_time | status | stop_time | build_version | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ | zone1 | 192.168.117.171:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 10:12:55.991612 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone2 | 192.168.117.172:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 10:00:18.078459 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone3 | 192.168.117.173:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 1970-01-01 08:00:00.000000 | inactive | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ 3 rows in set (0.007 sec)

 
确保进程已经不在,否则重启会报错。

[admin@RHOB-DB03 ~]$ ps -ef | grep observer | grep -v grep

8.3 启动 observer

关闭两个observer后 [observer2、observer3],集群无法对外提供服务。

[admin@RHOB-DB02 ~]$ kill `pidof observer` [admin@RHOB-DB02 ~]$ [admin@RHOB-DB02 ~]$ ps -ef | grep observer | grep -v grep

 

# 未登录的场景 [admin@RHOB-DB00 ~]$ obclient -h192.168.117.171 -uroot -P2881 -p'rO0t&123' -c -A oceanbase ERROR 4012 (25000): Statement is timeout # 已登陆的场景 MySQL [oceanbase]> show databases; ERROR 4012 (25000): Statement is timeout

 
启动单个OBServer [以observer3为例]

[admin@RHOB-DB03 ~]$ cd ~/oceanbase && bin/observer -i ens32 -p 2881 -P 2882 -z zone3 -d ~/oceanbase/store/obdemo -r '192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881' -c 20211224 -n obdemo -o "memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2" bin/observer -i ens32 -p 2881 -P 2882 -z zone2 -d /home/admin/oceanbase/store/obdemo -r 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 -c 20211224 -n obdemo -o memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2 devname: ens32 mysql port: 2881 rpc port: 2882 zone: zone3 data_dir: /home/admin/oceanbase/store/obdemo rs list: 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 cluster id: 20211224 appname: obdemo optstr: memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,syslog_level=ERROR,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2 [admin@RHOB-DB03 oceanbase]$ ps -ef | grep observer | grep -v grep admin 11257 1 99 21:17 ? 00:04:11 bin/observer -i ens32 -p 2881 -P 2882 -z zone3 -d /home/admin/oceanbase/store/obdemo -r 192.168.117.171:2882:2881;192.168.117.172:2882:2881;192.168.117.173:2882:2881 -c 20211224 -n obdemo -o memory_limit=8G,cache_wash_threshold=1G,__min_full_resource_pool_memory=268435456,system_memory=3G,memory_chunk_cache_size=128M,cpu_count=16,net_thread_count=4,datafile_size=20G,stack_size=1536K,config_additional_dir=/obdata/obdemo/etc3;/oblog/obdemo/etc2 -d /home/admin/oceanbase/store/obdemo

 
在启动observer后,跟踪集群资源的状态变化。

mysql> use oceanbase; Database changed # 检查集群资源,可以看到 192.168.117.173 observer3的status为inactive,且资源为NULL。 MySQL [oceanbase]> select a.zone,concat(a.svr_ip,':',a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free, round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024) mem_free_gb, usec_to_time(b.last_offline_time) last_offline_time, usec_to_time(b.start_service_time) start_service_time, b.status, usec_to_time(b.stop_time) stop_time, b.build_version from __all_virtual_server_stat a join __all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port) order by a.zone, a.svr_ip; +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ | zone | observer | cpu_total | cpu_free | mem_total_gb | mem_free_gb | last_offline_time | start_service_time | status | stop_time | build_version | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ | zone1 | 192.168.117.171:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 11:07:55.559403 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone2 | 192.168.117.172:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 10:59:23.040243 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone3 | 192.168.117.173:2882 | NULL | NULL | NULL | NULL | 1970-01-01 08:00:00.000000 | 1970-01-01 08:00:00.000000 | inactive | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ # 集群尚未修改 start_service_time 信息 mysql> select a.zone,concat(a.svr_ip,':',a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free, round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024) mem_free_gb, usec_to_time(b.last_offline_time) last_offline_time, usec_to_time(b.start_service_time) start_service_time, b.status, usec_to_time(b.stop_time) stop_time, b.build_version from __all_virtual_server_stat a join __all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port) order by a.zone, a.svr_ip; +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ | zone | observer | cpu_total | cpu_free | mem_total_gb | mem_free_gb | last_offline_time | start_service_time | status | stop_time | build_version | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ | zone1 | 192.168.117.171:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 11:07:55.559403 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone2 | 192.168.117.172:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 10:59:23.040243 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone3 | 192.168.117.173:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 1970-01-01 08:00:00.000000 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+----------+----------------------------+------------------------------------------------------------------------+ # 集群已经修改start_service_time 信息,集群启动完成 MySQL [oceanbase]> select a.zone,concat(a.svr_ip,':',a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free, round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024) mem_free_gb, usec_to_time(b.last_offline_time) last_offline_time, usec_to_time(b.start_service_time) start_service_time, b.status, usec_to_time(b.stop_time) stop_time, b.build_version from __all_virtual_server_stat a join __all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port) order by a.zone, a.svr_ip; +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+--------+----------------------------+------------------------------------------------------------------------+ | zone | observer | cpu_total | cpu_free | mem_total_gb | mem_free_gb | last_offline_time | start_service_time | status | stop_time | build_version | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+--------+----------------------------+------------------------------------------------------------------------+ | zone1 | 192.168.117.171:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 11:07:55.559403 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone2 | 192.168.117.172:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 10:59:23.040243 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | | zone3 | 192.168.117.173:2882 | 14 | 11.5 | 5 | 4 | 1970-01-01 08:00:00.000000 | 2021-12-25 11:12:21.488835 | active | 1970-01-01 08:00:00.000000 | 3.1.1_4-8c615943cbd25a6f7b8bdfd8677a13a21709a05e(Oct 21 2021 10:33:14) | +-------+----------------------+-----------+----------+--------------+-------------+----------------------------+----------------------------+--------+----------------------------+------------------------------------------------------------------------+

 

9. 了解 zone

本次部署的集群有三个节点,分别是zone1,zone2,zone3,每个zone各一台服务器。

如果OB cluster的集群不需要占用太多的资源,那么通过扩容实践将能更好地理解OceanBase架构。
同时,笔者也认为会有更多的学习者投入时间和精力研究。

OceanBase 集群由若干个 Zone 组成。

zone可以代表是一个机房的某些服务器,也可以是一个机房、一个数据中心。
OceanBase 可以跨机房部署,即可理解成不同地域(Region),一个 Region 可以包含一个或者多个 Zone。

YY一下,本集群的每个zone各来自不同地域(Region)的机房的1个服务器。

如果想要扩展,可以使用以下命令,由于笔者的电脑资源有限,只能等资源充足再进行演示。

把各机房再划分一台服务器到对应的zone上。

alter system add server '192.168.117.174:3882' zone 'zone1', '192.168.117.175:3882' zone 'zone2', '192.168.117.176:3882' zone 'zone3';

 

10. 了解observer

见第9点,可以知道每一个服务器都可以成为一个observer。
 

11. 了解资源池和资源单位

 
首先,创建资源单元。[它是资源分配的最小单位,同一个Unit不能跨OBServer]
其次,把资源单位按照一定的数量分配给资源池。
最后,把资源池绑定到租户/实例。
 

数据是有多份冗余的。把一个资源单位给了资源池,意味着每一个zone都要挖出一块资源给到资源池。
每一张表在每一个zone都会有备份,这和mycat中间件架构是完全不同的。

 
创建资源单元

MySQL [oceanbase]> create resource unit mysql_s1 max_cpu=1,min_cpu=1,max_memory='1G',min_memory='1G',max_iops=20000,min_iops=20000,max_session_num=10000,max_disk_size='4G'; Query OK, 0 rows affected (0.053 sec) MySQL [oceanbase]> create resource unit mysql_s2 max_cpu=1,min_cpu=1,max_memory='1G',min_memory='1G',max_iops=40000,min_iops=40000,max_session_num=100000,max_disk_size='4G'; Query OK, 0 rows affected (0.070 sec)

 
创建资源池

MySQL [oceanbase]> create resource pool POO1_MYSQL_S1 unit='mysql_s1' ,unit_num=1; Query OK, 0 rows affected (0.070 sec) MySQL [oceanbase]> create resource pool POO1_MYSQL_S2 unit='mysql_s2' , unit_num=1, zone_list=('zone1' ,'zone2') ; Query OK, 0 rows affected (0.045 sec)

 

12. 了解 primary zone 和 locality

12.1 primary zone

参考官方 OBCP学习资料 。

数据的leader副本在哪个节点的Unit里,业务读写请求就落在哪个节点上。

leader副本默认位置由表的primary_zone属性控制,可以继承自数据库以及租户(实例)的primary_zone设置。

笔者尝试查看某表的primary_zone,信息显示为zone1,zone2,zone3。
那究竟哪个才是是leader副本呢?见下面信息。

 
重点:即使primary_zone设置为RANDOM,在只有三个Unit的情况下(即Resource Pool的unit_num=1的情况下),每个leader副本会默认在第一个Zone里。

本集群的租户分配的资源池的资源单元unit_num=1,所以,leader副本在第一个zone。
那如果unit_num>=2呢?待笔者了解后,再来演示,敬请期待!
 

12.2 locality

参考官方文档 。

不同的租户在同一个集群内可以配置不同的 Locality 并且彼此之间相互独立不受影响。

Locality 的设置通常用于集群的副本数升级、降级或集群的搬迁:

集群副本数升级

以租户为粒度,对集群中的每一个租户,增加租户下 Partition 的副本数。例如,将 Locality 由 F@z1,F@z2,F@z3变更为 F@z1,F@z2,F@z3,F@z4,F@z5,租户从 3 副本变为 5 副本。

集群副本数降级

以租户为粒度,对集群内的每一个租户减少其中 Partition 的副本数。例如,将 Locality 由 F@z1,F@z2,F@z3,F@z4,F@z5变更为 F@z1,F@z2,F@z3,F@z4,租户从 5 副本变为 4 副本。

集群搬迁

以租户为粒度,对集群内的每一个租户通过若干次 Locality 变更。比如,将 Locality 从 F@hz1,F@hz2,F@hz3变更为 F@hz1,F@sh1,F@sh2即代表将原集群中属于杭州的两个 Zone 迁到上海。

简单理解,Zone的数量多或者少了,locality的可选择性也就多了。
副本可以增减、也可以变更地点。

 

13. 了解租户

13.1 租户概念

参考官方 OBCP学习资料 。

OceanBase集群是一个大资源池,不可能把所有资源全部给某一个业务用。

OceanBase集群实行的是按需分配给不同的业务,以租户/实例的形式实现。

实现资源统一管理,便于IT人员掌控。

 

13.2 创建租户

MySQL [oceanbase]> create tenant mysql_sales resource_pool_list=('POO1_MYSQL_S1'),primary_zone='zone1;zone2,zone3',charset='utf8mb4' set ob_tcp_invited_nodes='%',ob_compatibility_mode='mysql'; Query OK, 0 rows affected (3.827 sec) MySQL [oceanbase]> create tenant mysql_conf resource_pool_list=('POO1_MYSQL_S2'), primary_zone='RANDOM',comment 'mysql tenant/instance', charset='utf8mb4' set ob_tcp_invited_nodes='%' ; Query OK, 0 rows affected (5.152 sec) # 检查 MySQL [oceanbase]> select * from __all_tenant; +----------------------------+----------------------------+-----------+-------------+-------------+-------------------+-------------------+--------+----------------+-----------------------+-----------+-----------------------+---------------------------------------------+---------------------+-------------------+------------------------+-----------------------------+-----------------------+--------------------+------------------+----------------------+---------------+ | gmt_create | gmt_modified | tenant_id | tenant_name | replica_num | zone_list | primary_zone | locked | collation_type | info | read_only | rewrite_merge_version | locality | logonly_replica_num | previous_locality | storage_format_version | storage_format_work_version | default_tablegroup_id | compatibility_mode | drop_tenant_time | status | in_recyclebin | +----------------------------+----------------------------+-----------+-------------+-------------+-------------------+-------------------+--------+----------------+-----------------------+-----------+-----------------------+---------------------------------------------+---------------------+-------------------+------------------------+-----------------------------+-----------------------+--------------------+------------------+----------------------+---------------+ | 2021-12-24 22:02:46.183518 | 2021-12-24 22:02:46.183518 | 1 | sys | -1 | zone1;zone2;zone3 | zone1;zone2,zone3 | 0 | 0 | system tenant | 0 | 0 | FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3 | 0 | | 0 | 0 | -1 | 0 | -1 | TENANT_STATUS_NORMAL | 0 | | 2021-12-25 12:35:39.886851 | 2021-12-25 12:35:39.886851 | 1001 | mysql_sales | -1 | zone1;zone2;zone3 | zone1;zone2,zone3 | 0 | 0 | | 0 | 0 | FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3 | 0 | | 0 | 0 | -1 | 0 | -1 | TENANT_STATUS_NORMAL | 0 | | 2021-12-25 12:48:20.793068 | 2021-12-25 12:48:20.793068 | 1003 | mysql_conf | -1 | zone1;zone2 | RANDOM | 0 | 0 | mysql tenant/instance | 0 | 0 | FULL{1}@zone1, FULL{1}@zone2 | 0 | | 0 | 0 | -1 | 0 | -1 | TENANT_STATUS_NORMAL | 0 | +----------------------------+----------------------------+-----------+-------------+-------------+-------------------+-------------------+--------+----------------+-----------------------+-----------+-----------------------+---------------------------------------------+---------------------+-------------------+------------------------+-----------------------------+-----------------------+--------------------+------------------+----------------------+---------------+ 3 rows in set (0.005 sec) # 查看现有的租户/实例。 # 一个视图可以看到一个集群的所有实例。 # 试想一下,对于一个公司来说,一个集群就可以满足管理N个实例,一目了然,是不是很酷。 MySQL [oceanbase]> select * from gv$tenant; +-----------+-------------+-------------------+-------------------+----------------+-----------------------+-----------+---------------------------------------------+ | tenant_id | tenant_name | zone_list | primary_zone | collation_type | info | read_only | locality | +-----------+-------------+-------------------+-------------------+----------------+-----------------------+-----------+---------------------------------------------+ | 1 | sys | zone1;zone2;zone3 | zone1;zone2,zone3 | 0 | system tenant | 0 | FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3 | | 1001 | mysql_sales | zone1;zone2;zone3 | zone1;zone2,zone3 | 0 | | 0 | FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3 | | 1003 | mysql_conf | zone1;zone2 | RANDOM | 0 | mysql tenant/instance | 0 | FULL{1}@zone1, FULL{1}@zone2 | +-----------+-------------+-------------------+-------------------+----------------+-----------------------+-----------+---------------------------------------------+ 3 rows in set (0.026 sec) # 查看剩余的资源 MySQL [oceanbase]> select a.zone,concat(a.svr_ip,':',a.svr_port) observer, cpu_total, cpu_assigned, (cpu_total-cpu_assigned) cpu_free, mem_total/1024/1024/1024 mem_total_gb, mem_assigned/1024/1024/1024 mem_assign_gb, (mem_total-mem_assigned)/1024/1024/1024 mem_free_gb from __all_virtual_server_stat a join __all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port) order by a.zone, a.svr_ip; +-------+----------------------+-----------+--------------+----------+----------------+----------------+----------------+ | zone | observer | cpu_total | cpu_assigned | cpu_free | mem_total_gb | mem_assign_gb | mem_free_gb | +-------+----------------------+-----------+--------------+----------+----------------+----------------+----------------+ | zone1 | 192.168.117.171:2882 | 14 | 4.5 | 9.5 | 5.000000000000 | 3.250000000000 | 1.750000000000 | | zone2 | 192.168.117.172:2882 | 14 | 4.5 | 9.5 | 5.000000000000 | 3.250000000000 | 1.750000000000 | | zone3 | 192.168.117.173:2882 | 14 | 3.5 | 10.5 | 5.000000000000 | 2.250000000000 | 2.750000000000 | +-------+----------------------+-----------+--------------+----------+----------------+----------------+----------------+ 3 rows in set (0.009 sec)

 

13.3 还原环境[可选]

删除租户
MySQL [oceanbase]> DROP TENANT mysql_conf FORCE;

删除资源池
MySQL [oceanbase]> DROP RESOURCE POOL POO1_MYSQL_S2;

删除资源单位
MySQL [oceanbase]> DROP RESOURCE UNIT mysql_s2;
 

13.4 登录租户/实例

13.4.1 修改新建实例的root密码

生成一个随机密码

[admin@RHOB-DB00 ~]$ strings /dev/urandom |tr -dc A-Za-z0-9 | head -c8; echo ZbkErGYS

 

13.4.2 登录新建的mysql租户

登录租户,修改密码
mysql租户.png
 

14. 建表测试

14.1 分区表

use test; CREATE TABLE test_range( id INT, gmt_create TIMESTAMP, info VARCHAR(20), PRIMARY KEY (gmt_create)) PARTITION BY RANGE(UNIX_TIMESTAMP(gmt_create)) (PARTITION p2014 VALUES LESS THAN (UNIX_TIMESTAMP('2015-01-01 00:00:00')), PARTITION p2015 VALUES LESS THAN (UNIX_TIMESTAMP('2016-01-01 00:00:00')), PARTITION p2016 VALUES LESS THAN (UNIX_TIMESTAMP('2017-01-01 00:00:00'))); MySQL [test]> insert into test_range values (1,'2014-06-02','hhha'); Query OK, 1 row affected (0.200 sec) MySQL [test]> insert into test_range values (2,'2015-07-03','hxxhha'); Query OK, 1 row affected (0.015 sec) MySQL [test]> insert into test_range values (3,'2016-09-09','xxx'); Query OK, 1 row affected (0.011 sec) MySQL [test]> select * from test_range; +------+---------------------+--------+ | id | gmt_create | info | +------+---------------------+--------+ | 1 | 2014-06-02 00:00:00 | hhha | | 2 | 2015-07-03 00:00:00 | hxxhha | | 3 | 2016-09-09 00:00:00 | xxx | +------+---------------------+--------+ 3 rows in set (0.042 sec) MySQL [test]> select * from test_range partition (p2016); +------+---------------------+------+ | id | gmt_create | info | +------+---------------------+------+ | 3 | 2016-09-09 00:00:00 | xxx | +------+---------------------+------+ 1 row in set (0.004 sec)

14.2 分区表的分布

笔者想通过视图了解各个分区表的分区的leader副本在哪里,但没有找出,后续有时间再试试看。

即使primary_zone设置为RANDOM,在只有三个Unit的情况下(即Resource Pool的unit_num=1的情况下),每个leader副本会默认在第一个Zone里。

MySQL [oceanbase]> select * from gv$table where table_name ='test_range'\G *************************** 1. row *************************** tenant_id: 1001 tenant_name: mysql_sales table_id: 1100611139453777 table_name: test_range database_id: 1100611139404776 database_name: test tablegroup_id: -1 tablegroup_name: NULL table_type: 3 zone_list: zone1;zone2;zone3 primary_zone: zone1;zone2,zone3 collation_type: 45 locality: FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3 schema_version: 1640439545025432 read_only: 0 comment: index_status: 1 index_type: 0 part_level: 1 part_func_type: 3 part_func_expr: UNIX_TIMESTAMP(gmt_create) part_num: 3 sub_part_func_type: 0 sub_part_func_expr: sub_part_num: 1 dop: 1 auto_part: 0 auto_part_size: -1 1 row in set (0.015 sec) MySQL [oceanbase]> select * from v$partition where table_id =1100611139453777 order by zone; +-----------+------------------+---------------+--------------+-----------------+----------+----------+---------+---------------+-------+------+-------------------------------------------------------------------------------------------------------------------+-----------+-----------+--------------+--------------------+---------------+--------------+-----------------+---------+--------------+---------------+-----------------------+------------+--------+ | tenant_id | table_id | tablegroup_id | partition_id | svr_ip | svr_port | sql_port | unit_id | partition_cnt | zone | role | member_list | row_count | data_size | data_version | partition_checksum | data_checksum | row_checksum | column_checksum | rebuild | replica_type | required_size | status | is_restore | quorum | +-----------+------------------+---------------+--------------+-----------------+----------+----------+---------+---------------+-------+------+-------------------------------------------------------------------------------------------------------------------+-----------+-----------+--------------+--------------------+---------------+--------------+-----------------+---------+--------------+---------------+-----------------------+------------+--------+ | 1001 | 1100611139453777 | -1 | 0 | 192.168.117.171 | 2882 | 2881 | 1001 | 0 | zone1 | 1 | 192.168.117.171:2882:1640439545129458,192.168.117.172:2882:1640439545129458,192.168.117.173:2882:1640439545129458 | 0 | 0 | 1 | 0 | 0 | 0 | | 0 | 0 | 0 | REPLICA_STATUS_NORMAL | 0 | 3 | | 1001 | 1100611139453777 | -1 | 1 | 192.168.117.171 | 2882 | 2881 | 1001 | 0 | zone1 | 1 | 192.168.117.171:2882:1640439545129583,192.168.117.172:2882:1640439545129583,192.168.117.173:2882:1640439545129583 | 0 | 0 | 1 | 0 | 0 | 0 | | 0 | 0 | 0 | REPLICA_STATUS_NORMAL | 0 | 3 | | 1001 | 1100611139453777 | -1 | 2 | 192.168.117.171 | 2882 | 2881 | 1001 | 0 | zone1 | 1 | 192.168.117.171:2882:1640439545129686,192.168.117.172:2882:1640439545129686,192.168.117.173:2882:1640439545129686 | 0 | 0 | 1 | 0 | 0 | 0 | | 0 | 0 | 0 | REPLICA_STATUS_NORMAL | 0 | 3 | | 1001 | 1100611139453777 | -1 | 0 | 192.168.117.172 | 2882 | 2881 | 1002 | 0 | zone2 | 2 | 192.168.117.171:2882:1640439545129458,192.168.117.172:2882:1640439545129458,192.168.117.173:2882:1640439545129458 | 0 | 0 | 1 | 0 | 0 | 0 | | 0 | 0 | 0 | REPLICA_STATUS_NORMAL | 0 | 3 | | 1001 | 1100611139453777 | -1 | 1 | 192.168.117.172 | 2882 | 2881 | 1002 | 0 | zone2 | 2 | 192.168.117.171:2882:1640439545129583,192.168.117.172:2882:1640439545129583,192.168.117.173:2882:1640439545129583 | 0 | 0 | 1 | 0 | 0 | 0 | | 0 | 0 | 0 | REPLICA_STATUS_NORMAL | 0 | 3 | | 1001 | 1100611139453777 | -1 | 2 | 192.168.117.172 | 2882 | 2881 | 1002 | 0 | zone2 | 2 | 192.168.117.171:2882:1640439545129686,192.168.117.172:2882:1640439545129686,192.168.117.173:2882:1640439545129686 | 0 | 0 | 1 | 0 | 0 | 0 | | 0 | 0 | 0 | REPLICA_STATUS_NORMAL | 0 | 3 | | 1001 | 1100611139453777 | -1 | 0 | 192.168.117.173 | 2882 | 2881 | 1003 | 0 | zone3 | 2 | 192.168.117.171:2882:1640439545129458,192.168.117.172:2882:1640439545129458,192.168.117.173:2882:1640439545129458 | 0 | 0 | 1 | 0 | 0 | 0 | | 0 | 0 | 0 | REPLICA_STATUS_NORMAL | 0 | 3 | | 1001 | 1100611139453777 | -1 | 1 | 192.168.117.173 | 2882 | 2881 | 1003 | 0 | zone3 | 2 | 192.168.117.171:2882:1640439545129583,192.168.117.172:2882:1640439545129583,192.168.117.173:2882:1640439545129583 | 0 | 0 | 1 | 0 | 0 | 0 | | 0 | 0 | 0 | REPLICA_STATUS_NORMAL | 0 | 3 | | 1001 | 1100611139453777 | -1 | 2 | 192.168.117.173 | 2882 | 2881 | 1003 | 0 | zone3 | 2 | 192.168.117.171:2882:1640439545129686,192.168.117.172:2882:1640439545129686,192.168.117.173:2882:1640439545129686 | 0 | 0 | 1 | 0 | 0 | 0 | | 0 | 0 | 0 | REPLICA_STATUS_NORMAL | 0 | 3 | +-----------+------------------+---------------+--------------+-----------------+----------+----------+---------+---------------+-------+------+-------------------------------------------------------------------------------------------------------------------+-----------+-----------+--------------+--------------------+---------------+--------------+-----------------+---------+--------------+---------------+-----------------------+------------+--------+ 9 rows in set (0.048 sec)

 

15. 了解表组

分区的分布以及leader的分布是随机的,在分布式数据库里,跨节点的请求时性能会有下降。

OceanBase 对于同一个表分组中的表的同号分区会管理为一个分区组。
同一个分区组中的分区,OceanBase 会尽可能的分配到同一个节点内部,这样就可以规避跨节点的请求。

举例:前提条件: 表>=2 且为分区表。
ordr 和 ordl都为hash分区表,它们的同个分区号 如p0会在同一个observer里面,而不会跨observer分布。
目前官方举例hash分区,不知道range分区是否可行?后续有机会测试一波。

15.1 创建表组

MySQL [test]> show tablegroups; +-----------------+------------+---------------+ | Tablegroup_name | Table_name | Database_name | +-----------------+------------+---------------+ | oceanbase | NULL | NULL | +-----------------+------------+---------------+ 3 rows in set (0.017 sec) MySQL [test]> create tablegroup tgorder partition by hash partitions 3; Query OK, 0 rows affected (0.053 sec) MySQL [test]> show tablegroups; +-----------------+------------+---------------+ | Tablegroup_name | Table_name | Database_name | +-----------------+------------+---------------+ | oceanbase | NULL | NULL | | tgorder | NULL | NULL | +-----------------+------------+---------------+ 3 rows in set (0.017 sec)

15.2 新建表并加入表组

create table ordr ( o_w_id int , o_d_id int , o_id int , o_c_id int , o_carrier_id int , o_ol_cnt int , o_all_local int , o_entry_d date , index iordr(o_w_id, o_d_id, o_c_id, o_id) local , primary key ( o_w_id, o_d_id, o_id ) )tablegroup tgorder partition by hash(o_w_id) partitions 3; create table ordl ( ol_w_id int , ol_d_id int , ol_o_id int , ol_number int , ol_delivery_d date , ol_amount decimal(6, 2) , ol_i_id int , ol_supply_w_id int , ol_quantity int , ol_dist_info char(24) , primary key (ol_w_id, ol_d_id, ol_o_id, ol_number ) )tablegroup tgorder partition by hash(ol_w_id) partitions 3; MySQL [test]> show tablegroups; +-----------------+------------+---------------+ | Tablegroup_name | Table_name | Database_name | +-----------------+------------+---------------+ | oceanbase | NULL | NULL | | tgorder | ordl | test | | tgorder | ordr | test | +-----------------+------------+---------------+ 4 rows in set (0.014 sec) # 或者使用以下方法,加入表组。 MySQL [test]> alter tablegroup tgorder add ordl , ordr ; Query OK, 0 rows affected (0.016 sec) MySQL [test]> show tablegroups; +-----------------+------------+---------------+ | Tablegroup_name | Table_name | Database_name | +-----------------+------------+---------------+ | oceanbase | NULL | NULL | | tgorder | ordl | test | | tgorder | ordr | test | +-----------------+------------+---------------+ 3 rows in set (0.004 sec)

15.3 将表从表组移出

MySQL [test]> alter table ordl tablegroup = ''; Query OK, 0 rows affected (0.148 sec) MySQL [test]> alter table ordr tablegroup = ''; Query OK, 0 rows affected (0.018 sec) MySQL [test]> show tablegroups; +-----------------+------------+---------------+ | Tablegroup_name | Table_name | Database_name | +-----------------+------------+---------------+ | oceanbase | NULL | NULL | | tgorder | NULL | NULL | +-----------------+------------+---------------+ 2 rows in set (0.004 sec)

 

16. 了解复制表

参考官方文档。

普通表
普通的表在生产环境,默认有三副本,其中一个主副本和两个备副本。备副本通过同步主副本的事务日志 clog 保持同步,同步协议是 Paxos 协议,主副本的事务日志只有在多数成员里确认落盘后,事务修改才会生效。

复制表

普通表可以变为复制表,然后主副本和所有备副本之间使用全同步协议,主副本的事务日志只有在所有副本成员里确认落盘后,事务修改才会生效。

注意 : 不要被 冗余副本影响了,通过区分协议理解普通表和复制表的概念。

复制表解决的问题

业务上存在一些表,这些表的更新的频率很低,但是访问的频率非常高,并且要求总是能够访问到最新的数据
而普通表的风险就是备副本的读会有些许延迟,并且可能有远程SQL的问题。

create table test_dup ( id bigint not null auto_increment , c1 varchar(50), c2 timestamp not null default current_timestamp ) duplicate_scope='cluster' ; Query OK, 0 rows affected (0.12 sec) MySQL [oceanbase]> select table_name,duplicate_scope from oceanbase.__all_table_v2 where table_name in ('test_range','test_dup'); +------------+-----------------+ | table_name | duplicate_scope | +------------+-----------------+ | test_range | 0 | | test_dup | 1 | +------------+-----------------+ 2 rows in set (0.003 sec)

17. 查看执行计划

OceanBase的执行计划,阅读起来还是可以的。后续有机会对比下Oracle和MySQL8的执行计划。

17.1 复制表/普通表

 
简单了解非分区表的执行计划。

MySQL [test]> desc test_dup; +-------+-------------+------+-----+-------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+-------------------+----------------+ | id | bigint(20) | NO | | NULL | auto_increment | | c1 | varchar(50) | YES | | NULL | | | c2 | timestamp | NO | | CURRENT_TIMESTAMP | | +-------+-------------+------+-----+-------------------+----------------+ # 估算行数 好像有点不对劲 ? 空表 100000 rows 还是挺多的。 MySQL [test]> explain select * from test_dup\G *************************** 1. row *************************** Query Plan: ======================================== |ID|OPERATOR |NAME |EST. ROWS|COST | ---------------------------------------- |0 |TABLE SCAN|test_dup|100000 |66272| ======================================== Outputs & filters: ------------------------------------- 0 - output([test_dup.id], [test_dup.c1], [test_dup.c2]), filter(nil), access([test_dup.id], [test_dup.c1], [test_dup.c2]), partitions(p0) 1 row in set (1.085 sec) MySQL [test]> insert into test_dup values (1,'haha','2021-12-25'); Query OK, 1 row affected (1.683 sec) MySQL [test]> commit; Query OK, 0 rows affected (0.003 sec) MySQL [test]> select * from test_dup; +----+------+---------------------+ | id | c1 | c2 | +----+------+---------------------+ | 1 | haha | 2021-12-25 00:00:00 | +----+------+---------------------+ 1 row in set (0.070 sec) # 插入数据后,应该是重新估算了,行数正常。 # partitions(p0) , 非分区表的partitions默认为p0。 MySQL [test]> explain select * from test_dup\G *************************** 1. row *************************** Query Plan: ======================================= |ID|OPERATOR |NAME |EST. ROWS|COST| --------------------------------------- |0 |TABLE SCAN|test_dup|1 |36 | ======================================= Outputs & filters: ------------------------------------- 0 - output([test_dup.id], [test_dup.c1], [test_dup.c2]), filter(nil), access([test_dup.id], [test_dup.c1], [test_dup.c2]), partitions(p0)

17.2 分区表

根据使用分区条件简单了解OceanBase的执行计划。

17.2.1 查询分区表全表

可以看到左下角的partitionsp[0-2],代表所有分区都会被扫描。
Snipaste_20211226_121032.png

17.2.1 查询分区

可以看到执行计划中,显示的partitions是p2,即只扫描了P2016分区,不过不是笔者指定的分区名字。

Snipaste_20211226_121136.png

 

18. 总结经验

  1. 内存要充足。因为笔者的电脑资源有限,所以关闭了一个节点,发现无法truncate和新建表。OceanBase 数据库中的系统变量 ob_create_table_strict_mode= TRUE,表示会严格按照 Locality 建立副本,任何副本建立失败,则创建表失败。将 ob_create_table_strict_mode 设置为 FALSE,保证在节点下线期间可以正常创建表。
    所以,资源充足的情况下,要看看是不是有个节点下线了。
ERROR 4624 (HY000): machine resource is not enough to hold a new unit
  1. 因为OBD工具有维护功能,笔者尝试利用OBD工具加入手动部署的集群,但失败告终。其实问题不大,查询状态的信息基本上可以通过脚本实现,修改配置也可以通过修改启动observer的配置完成。另外,手动部署的集群在启停方面的灵活性比OBD工具部署的集群更好,请见本文第8点!
     

  2. 为了进一步了解zone的扩展,笔者尝试先部署zone1-observer1,然后初始化集群,最后加入新的zone,但发现新增的zone的zone_type信息为LOCAL,而不是ReadWrite。通过查询__all_tenant的zone_list、primary_zone、locality只有zone1,尝试ALTER TENANT 增加 zone2和zone3,但该版本不支持修改。所以,笔者建议社区版初始化集群时,就把rs_list写全。
     

  3. 了解分区表的leader主副本分布,由于笔者经验有限,没有找出,但笔者相信肯定是可以找到的,因为了解主副本的分布,对于架构优化,还是有挺大的帮助。
     

  4. 由于笔者电脑资源有限,无法演示 1-1-1 集群扩展至 3-3-3 集群,比较遗憾,否则将会更好地理解OceanBase的易扩展性。后续有机会再补上。
     

  5. 集群的某个节点正常关闭,过一段时间后,启动该节点,集群正常,且相应的数据变更会同步到刚启动的节点。
     
    最后,感谢您的阅读,如有不妥之处,欢迎提出,谢谢!

最后修改时间:2022-01-12 21:25:01
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论