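This section describes how to run the TPC-C benchmark against MogDB, along with the key system-level tuning needed to reach the best tpmC performance.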
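Hardware environment
- Servers
  - The best TPC-C results were obtained with one 4-socket Kunpeng server (256 cores, 512 GB to 1024 GB of memory) plus one 2-socket Kunpeng server.
  - A typical setup uses two 2-socket Kunpeng servers (128 cores, 512 GB to 1024 GB of memory).
  - Two x86 servers also work, but this guide does not apply NUMA optimization on x86.
- Disks
  - On the database side, use two NVMe flash cards if possible.
  - Otherwise, use 3 to 4 SSDs.
- NICs
  - Hi1822, the NIC that ships with Kunpeng servers.
  - On x86, use a 10 GbE NIC if possible.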
Software environment
- Database: MogDB 2.1.1
- TPC-C client: BenchmarkSQL 5.0 as optimized by TiDB (https://github.com/pingcap/benchmarksql)
- Dependencies
Required software    Recommended version
numactl              -
jdk                  1.8.0-242
ant                  1.10.5
htop                 -
Test procedure
- Install MogDB. See "Install MogDB"; a single-node deployment is sufficient.
- Set the initial parameters and restart the database so that they take effect. See "Recommended Parameter Settings and Creating a Test Database".
- Download BenchmarkSQL 5.0, the standard TPC-C test tool.
[root@node151 ~]# git clone -b 5.0-mysql-support-opt-2.1 https://github.com/pingcap/benchmarksql.git
Cloning into 'tpcc-mysql'...
remote: Enumerating objects: 106, done.
remote: Total 106 (delta 0), reused 0 (delta 0), pack-reused 106
Receiving objects: 100% (106/106), 64.46 KiB | 225.00 KiB/s, done.
Resolving deltas: 100% (30/30), done.
- Download and install the JDK and ant dependency packages.
[root@node151 ~]# rpm -ivh ant-1.10.5-6.oe1.noarch.rpm jdk-8u281-linux-aarch64.rpm --force --nodeps
warning: ant-1.10.5-6.oe1.noarch.rpm: Header V3 RSA/SHA1 Signature, key ID b25e7f66: NOKEY
warning: jdk-8u281-linux-aarch64.rpm: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
1:jdk1.8-2000:1.8.0_281-fcs ################################# [ 50%]
Unpacking JAR files...
tools.jar...
rt.jar...
jsse.jar...
charsets.jar...
localedata.jar...
2:ant-0:1.10.5-6.oe1 ################################# [100%]
- Configure the Java environment variables.
[root@node151 ~]# tail -3 /root/.bashrc
export JAVA_HOME=/usr/java/jdk1.8.0_281-aarch64
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib:$BENCHMARKSQLPATH/run/ojdbc7.jar
- In the BenchmarkSQL directory, run ant to compile. A successful build creates the build and dist directories.
[root@node151 benchmarksql-5.0-mysql-support-opt-2.1]# pwd
/tmp/benchmarksql-5.0-mysql-support-opt-2.1
[root@node151 benchmarksql-5.0-mysql-support-opt-2.1]# ant
Buildfile: /tmp/benchmarksql-5.0-mysql-support-opt-2.1/build.xml
init:
[mkdir] Created dir: /tmp/benchmarksql-5.0-mysql-support-opt-2.1/build
compile:
[javac] Compiling 12 source files to /tmp/benchmarksql-5.0-mysql-support-opt-2.1/build
dist:
[mkdir] Created dir: /tmp/benchmarksql-5.0-mysql-support-opt-2.1/dist
[jar] Building jar: /tmp/benchmarksql-5.0-mysql-support-opt-2.1/dist/BenchmarkSQL-5.0.jar
BUILD SUCCESSFUL
Total time: 1 second
- Download the JDBC driver that matches your system architecture (https://opengauss.org/zh/download.html) into the lib/postgres folder of the BenchmarkSQL directory, extract it, and delete the bundled JDBC driver.
[root@node151 postgres]# pwd
/tmp/benchmarksql-5.0-mysql-support-opt-2.1/lib/postgres/
[root@node151 postgres]# ls
openGauss-2.0.0-JDBC.tar.gz postgresql-9.3-1102.jdbc41.jar
[root@node151 postgres]# rm -f postgresql-9.3-1102.jdbc41.jar
[root@node151 postgres]# tar -xf openGauss-2.0.0-JDBC.tar.gz
[root@node151 postgres]# ls
openGauss-2.0.0-JDBC.tar.gz postgresql.jar
- Prepare the database side: create the database tpcc_db and the user tpcc.
[omm@node151 ~]$ gsql -d postgres -p 26000 -r
postgres=# create database tpcc_db;
CREATE DATABASE
postgres=# \q
[omm@node151 ~]$ gsql -d tpcc_db -p 26000 -r
tpcc_db=# CREATE USER tpcc WITH PASSWORD "tpcc@123";
CREATE ROLE
tpcc_db=# GRANT ALL ON schema public TO tpcc;
GRANT
tpcc_db=# ALTER User tpcc sysadmin;
ALTER ROLE
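- Prepare the client side: go to the run folder under the BenchmarkSQL directory and edit the BenchmarkSQL configuration file to set the test parameters, including the database user name, password, IP address, port, and database name. Keep comments on their own lines, because a Java properties file treats # as a comment marker only at the start of a line.
[root@node151 db1]# cd /tmp/benchmarksql-5.0-mysql-support-opt-2.1/run
[root@node151 run]# vim props.mogdb
db=postgres
driver=org.postgresql.Driver
# Connection string: set the IP address, port, and database name
conn=jdbc:postgresql://172.16.0.176:26000/tpcc_db?prepareThreshold=1&batchMode=on&fetchsize=10&loggerLevel=off
# User name
user=tpcc
# Password
password=tpcc@123
# Number of warehouses
warehouses=100
# Number of concurrent terminals
terminals=300
# Run time in minutes
runMins=5
runTxnsPerTerminal=0
loadWorkers=100
limitTxnsPerMin=0
terminalWarehouseFixed=false
newOrderWeight=45
paymentWeight=43
orderStatusWeight=4
deliveryWeight=4
stockLevelWeight=4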
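The five transaction weights in the sample file add up to 100 (45 + 43 + 4 + 4 + 4). A quick check along the following lines can catch an edit that breaks that total. This is only a sketch: the helper name sum_weights and the temporary demo file are made up for illustration, and whether BenchmarkSQL itself rejects other totals is not something this guide states.

```shell
# sum_weights FILE: print the sum of all "...Weight=" entries in a props file.
# (Hypothetical helper, not part of BenchmarkSQL.)
sum_weights() {
    awk -F= '/^[ \t]*[A-Za-z]+Weight[ \t]*=/ {
        gsub(/[ \t]/, "", $2)      # tolerate spaces around the value
        total += $2
    }
    END { print total + 0 }' "$1"
}

# Demo on a temporary copy of the weights used in this guide.
demo_props=$(mktemp)
cat > "$demo_props" <<'EOF'
newOrderWeight=45
paymentWeight=43
orderStatusWeight=4
deliveryWeight=4
stockLevelWeight=4
EOF
weight_total=$(sum_weights "$demo_props")
echo "weight total: $weight_total"
rm -f "$demo_props"
```

Against a real file, the call would be `sum_weights props.mogdb`.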
- Initialize the data.
[root@node151 run]# sh runDatabaseBuild.sh props.mogdb
# ------------------------------------------------------------
# Loading SQL file ./sql.common/tableCreates.sql
# ------------------------------------------------------------
create table bmsql_config (
cfg_name varchar(30) primary key,
cfg_value varchar(50)
);
......
# ------------------------------------------------------------
# Loading SQL file ./sql.postgres/buildFinish.sql
# ------------------------------------------------------------
-- ----
-- Extra commands to run after the tables are created, loaded,
-- indexes built and extra's created.
-- PostgreSQL version.
-- ----
vacuum analyze;
- Edit runBenchmark.sh so that it references the actual path of funcs.sh.
[root@node151 run]# vim runBenchmark.sh
#!/usr/bin/env bash
if [ $# -ne 1 ] ; then
    echo "usage: $(basename $0) PROPS_FILE" >&2
    exit 2
fi
SEQ_FILE="./.jTPCC_run_seq.dat"
if [ ! -f "${SEQ_FILE}" ] ; then
    echo "0" > "${SEQ_FILE}"
fi
SEQ=$(expr $(cat "${SEQ_FILE}") + 1) || exit 1
echo "${SEQ}" > "${SEQ_FILE}"
source /tmp/benchmarksql-5.0-mysql-support-opt-2.1/run/funcs.sh $1 # Change this path to the actual location of funcs.sh
setCP || exit 1
myOPTS="-Dprop=$1 -DrunID=${SEQ}"
java -cp "$myCP" $myOPTS jTPCC
- Start the test and run the TPC-C benchmark. The tpmC value is the test result; the output is also saved to runLog_mmdd-hh24miss.log.
[root@node151 run]# sh runBenchmark.sh props.mogdb | tee runLog_`date +%m%d-%H%M%S`.log
...
15:08:26,663 [Thread-16] INFO jTPCC : Term-00, Measured tpmC (NewOrders) = 106140.46
15:08:26,663 [Thread-16] INFO jTPCC : Term-00, Measured tpmTOTAL = 235800.39
15:08:26,664 [Thread-16] INFO jTPCC : Term-00, Session Start = 2021-08-04 15:03:26
15:08:26,664 [Thread-16] INFO jTPCC : Term-00, Session End = 2021-08-04 15:08:26
15:08:26,664 [Thread-16] INFO jTPCC : Term-00, Transaction Count = 1179449
15:08:26,664 [Thread-16] INFO jTPCC : executeTime[Payment]=29893614
15:08:26,664 [Thread-16] INFO jTPCC : executeTime[Order-Status]=2564424
15:08:26,664 [Thread-16] INFO jTPCC : executeTime[Delivery]=4438389
15:08:26,664 [Thread-16] INFO jTPCC : executeTime[Stock-Level]=4259325
15:08:26,664 [Thread-16] INFO jTPCC : executeTime[New-Order]=48509926
Adjust props.mogdb, or use several props files, to run additional tests as needed.
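When several runs are compared, it can help to pull the tpmC figure out of each saved log instead of reading it by eye. The sketch below assumes the log line format shown above; the function name extract_tpmc and the temporary demo log are illustrative only.

```shell
# extract_tpmc FILE: print the "Measured tpmC (NewOrders)" value from a run log.
# (Hypothetical helper; relies on the value being the last field of the line.)
extract_tpmc() {
    awk '/Measured tpmC \(NewOrders\)/ { print $NF }' "$1"
}

# Demo on two lines copied from the sample output in this guide.
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
15:08:26,663 [Thread-16] INFO jTPCC : Term-00, Measured tpmC (NewOrders) = 106140.46
15:08:26,663 [Thread-16] INFO jTPCC : Term-00, Measured tpmTOTAL = 235800.39
EOF
tpmc=$(extract_tpmc "$demo_log")
echo "tpmC: $tpmc"
rm -f "$demo_log"
```

To list the result of every saved run, something like `for f in runLog_*.log; do echo "$f $(extract_tpmc "$f")"; done` should work.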
- To keep repeated runs from growing the data set and degrading performance, you can drop the data and start over.
[root@node151 run]# sh runDatabaseDestroy.sh props.mogdb
# ------------------------------------------------------------
# Loading SQL file ./sql.common/tableDrops.sql
# ------------------------------------------------------------
drop table bmsql_config;
drop table bmsql_new_order;
drop table bmsql_order_line;
drop table bmsql_oorder;
drop table bmsql_history;
drop table bmsql_customer;
drop table bmsql_stock;
drop table bmsql_item;
drop table bmsql_district;
drop table bmsql_warehouse;
drop sequence bmsql_hist_id_seq;
When debugging, you can build once and run many times. For a formal test, run a full Build / Run / Destroy cycle each time.
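The full Build / Run / Destroy cycle can be wrapped in a small script so that a formal run always starts from freshly loaded data. The following is a minimal sketch that only uses the three scripts named above; run_step, full_cycle, and the DRY_RUN switch are hypothetical names introduced here, and by default the script only prints the commands.

```shell
# Set DRY_RUN=0 to actually execute; the default only prints the commands.
DRY_RUN=${DRY_RUN:-1}

# run_step CMD...: print the command in dry-run mode, otherwise run it.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# full_cycle PROPS_FILE: one Build / Run / Destroy cycle,
# stopping at the first step that fails.
full_cycle() {
    run_step sh runDatabaseBuild.sh "$1" &&
    run_step sh runBenchmark.sh "$1" &&
    run_step sh runDatabaseDestroy.sh "$1"
}

full_cycle props.mogdb
```

Run it from the BenchmarkSQL run directory with `DRY_RUN=0` once the printed commands look right.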
Tuning
1. Host optimization (Kunpeng only)
Adjust the BIOS
- BIOS > Advanced > MISC Config: set Support Smmu to Disabled
- BIOS > Advanced > MISC Config: set CPU Prefetching Configuration to Disabled
- BIOS > Advanced > Memory Config: set Die Interleaving to Disable
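2. Operating system optimization (Kunpeng only)
- Set the kernel PAGESIZE to 64 KB (usually the default).
- Stop irqbalance.
systemctl stop irqbalance
- Disable automatic NUMA balancing.
echo 0 > /proc/sys/kernel/numa_balancing
- Disable transparent huge pages.
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag
- Set the I/O scheduler for NVMe disks to none. A loop is used because a redirect to a glob that matches several files fails.
for f in /sys/block/nvme*n*/queue/scheduler; do echo none > "$f"; done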
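These settings do not survive a reboot unless they are made persistent, so it is worth confirming them before each run. The read-only sketch below reports the current values of the files written above; active_value and check_setting are hypothetical helper names. It relies on the sysfs convention of marking the active choice in square brackets, for example "always madvise [never]".

```shell
# active_value FILE: print the active setting from a sysfs/procfs file.
# Files that list several choices mark the active one in brackets;
# files that hold a single value are printed as-is.
active_value() {
    v=$(sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1")
    if [ -n "$v" ]; then echo "$v"; else cat "$1"; fi
}

# check_setting FILE EXPECTED: report whether FILE currently holds EXPECTED.
check_setting() {
    if [ ! -r "$1" ]; then
        echo "SKIP $1 (not readable)"
    elif [ "$(active_value "$1")" = "$2" ]; then
        echo "OK   $1 = $2"
    else
        echo "FAIL $1 = $(active_value "$1"), expected $2"
    fi
}

check_setting /proc/sys/kernel/numa_balancing 0
check_setting /sys/kernel/mm/transparent_hugepage/enabled never
check_setting /sys/kernel/mm/transparent_hugepage/defrag never
for f in /sys/block/nvme*n*/queue/scheduler; do
    check_setting "$f" none
done
```

The script changes nothing; on a host without NVMe devices the last check simply reports SKIP.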
3. File system configuration
- Format the disk as XFS with an 8 KB block size.
mkfs.xfs -b size=8192 /dev/nvme0n1 -f
4. Network configuration
- Configure multiple interrupt queues on the NIC.
Download IN500_solution_5.1.0.SPC401.zip and install hinicadm.
[root@node151 fc]# pwd
/root/IN500_solution_5/tools/linux_arm/fc
[root@node151 fc]# rpm -ivh hifcadm-2.4.1.0-1.aarch64.rpm
Verifying... ################################# [100%]
Preparing... ################################# [100%]
package hifcadm-2.4.1.0-1.aarch64 is already installed
[root@node151 fc]#
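- Change the maximum number of interrupt queues supported by the system.
[root@node151 config]# pwd
/root/IN500_solution_5/tools/linux_arm/nic/config
[root@node151 config]# ./hinicconfig hinic0 -f std_sh_4x25ge_dpdk_cfg_template0.ini
[root@node151 config]# reboot
[root@node151 config]# ethtool -L enp3s0 combined 48
The optimal value can differ across platforms and applications. On the current 128-core platform, the tuned value is 12 on the server side and 48 on the client side.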
- Interrupt tuning: enable the TSO, LRO, GRO, and GSO features.
ethtool -K enp3s0 tso on
ethtool -K enp3s0 lro on
ethtool -K enp3s0 gro on
ethtool -K enp3s0 gso on
- Check the NIC firmware version.
[root@node151 ~]# ethtool -i enp3s0
driver: hinic
version: 2.3.2.11
firmware-version: 2.4.1.0
expansion-rom-version:
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no
The NIC firmware version should be 2.4.1.0.
- Update the NIC firmware if needed.
[root@node151 cfg_data_nic_prd_1h_4x25G]# pwd
/root/IN500_solution_5/firmware/update_bin/cfg_data_nic_prd_1h_4x25G
[root@node151 cfg_data_nic_prd_1h_4x25G]# hinicadm updatefw -i enp3s0 -f /root/IN500_solution_5/firmware/update_bin/cfg_data_nic_prd_1h_4x25G/Hi1822_nic_prd_1h_4x25G.bin
Restart the server, then confirm that the NIC firmware version has been updated to 2.4.1.0.
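5. Core binding for the database server and client
- NUMA binding optimization on Kunpeng (128 cores)
  - Install the NUMA tools on the database host and the client host.
yum install 'numa*' -y
Install the NIC driver on both the database host and the client host; see the network configuration section above.
  - Database host
cp `find /opt -name "bind*.sh"|head -1 ` /root
sh /root/bind_net_irq.sh 12
Set the database parameters:
thread_pool_attr = '345,4,(cpubind:1-28,32-60,64-92,96-124)'
enable_thread_pool = on
Stop the database, then start it with the following command instead:
numactl -C 1-28,32-60,64-92,96-124 mogdb --single_node -D /opt/data/db2/ -p 26000 &
  - Client: copy /root/bind_net_irq.sh to the client.
sh /root/bind_net_irq.sh 48
Change the benchmark start command to:
numactl -C 0-19,32-51,64-83,96-115 sh runBenchmark.sh props.mogdb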
- NUMA binding optimization on Kunpeng (256 cores)
  - Install the NUMA tools on the database host and the client host.
yum install 'numa*' -y
Install the NIC driver on both the database host and the client host; see the network configuration section above.
  - Database host
cp `find /opt -name "bind*.sh"|head -1 ` /root
sh /root/bind_net_irq.sh 24
Set the database parameters:
thread_pool_attr = '696,4,(cpubind:1-28,32-60,64-92,96-124,128-156,160-188,192-220,224-252)'
enable_thread_pool = on
Stop the database, then start it with the following command instead:
numactl -C 1-28,32-60,64-92,96-124,128-156,160-188,192-220,224-252 mogdb --single_node -D /opt/data/db2/ -p 26000 &
  - Client: copy /root/bind_net_irq.sh to the client.
sh /root/bind_net_irq.sh 48
Change the benchmark start command to:
numactl -C 0-19,32-51,64-83,96-115 sh runBenchmark.sh props.mogdb
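When adapting the core lists to a different machine, it helps to know how many cores a list actually names. The sketch below counts them; count_cpus is a hypothetical helper. Assuming thread_pool_attr follows the openGauss-style layout of thread count, group count, and CPU binding, the 128-core example works out to 345 threads over 115 bound cores, or 3 per core. That ratio is an observation from these numbers, not a documented rule.

```shell
# count_cpus LIST: count the cores named by a list such as "1-28,32-60".
# Each comma-separated item is either a single core or an inclusive range.
count_cpus() {
    echo "$1" | awk -F, '{
        n = 0
        for (i = 1; i <= NF; i++) {
            if (split($i, r, "-") == 2) n += r[2] - r[1] + 1
            else n += 1
        }
        print n
    }'
}

count_cpus '1-28,32-60,64-92,96-124'   # database cores, 128-core host: prints 115
count_cpus '0-19,32-51,64-83,96-115'   # client cores: prints 80
```

The same call on the 256-core list `1-28,32-60,64-92,96-124,128-156,160-188,192-220,224-252` gives 231.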
6. Database parameter optimization (general)
Edit postgresql.conf under PGDATA, then restart the database.
max_connections = 4096
allow_concurrent_tuple_update = true
audit_enabled = off
cstore_buffers =16MB
enable_alarm = off
enable_codegen = false
enable_data_replicate = off
full_page_writes = off
max_files_per_process = 100000
max_prepared_transactions = 2048
shared_buffers = 350GB
use_workload_manager = off
wal_buffers = 1GB
work_mem = 1MB
transaction_isolation = 'read committed'
default_transaction_isolation = 'read committed'
synchronous_commit = on
fsync = on
maintenance_work_mem = 2GB
vacuum_cost_limit = 2000
autovacuum = on
autovacuum_mode = vacuum
autovacuum_vacuum_cost_delay =10
xloginsert_locks = 48
update_lockwait_timeout =20min
enable_mergejoin = off
enable_nestloop = off
enable_hashjoin = off
enable_bitmapscan = on
enable_material = off
wal_log_hints = off
log_duration = off
checkpoint_timeout = 15min
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_scale_factor = 0.02
enable_save_datachanged_timestamp =FALSE
log_timezone = 'PRC'
timezone = 'PRC'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
enable_double_write = on
enable_incremental_checkpoint = on
enable_opfusion = on
numa_distribute_mode = 'all'
track_activities = off
enable_instr_track_wait = off
enable_instr_rt_percentile = off
track_counts =on
track_sql_count = off
enable_instr_cpu_timer = off
plog_merge_age = 0
session_timeout = 0
enable_instance_metric_persistent = off
enable_logical_io_statistics = off
enable_user_metric_persistent =off
enable_xlog_prune = off
enable_resource_track = off
instr_unique_sql_count = 0
enable_beta_opfusion = on
enable_thread_pool = on
# Core 0 is reserved for binding the walwriter thread (wal_writer_cpu)
enable_partition_opfusion=off
wal_writer_cpu=0
xlog_idle_flushes_before_sleep = 500000000
max_io_capacity = 2GB
dirty_page_percent_max = 0.1
candidate_buf_percent_target = 0.7
bgwriter_delay = 500
pagewriter_sleep = 30
checkpoint_segments =10240
advance_xlog_file_num = 100
autovacuum_max_workers = 20
autovacuum_naptime = 5s
bgwriter_flush_after = 256kB
data_replicate_buffer_size = 16MB
enable_stmt_track = off
remote_read_mode=non_authentication
wal_level = archive
hot_standby = off
hot_standby_feedback = off
client_min_messages = ERROR
log_min_messages = FATAL
enable_asp = off
enable_bbox_dump = off
enable_ffic_log = off
enable_twophase_commit = off
minimum_pool_size = 200
wal_keep_segments = 1025
incremental_checkpoint_timeout = 5min
max_process_memory = 12GB
vacuum_cost_limit = 10000
xloginsert_locks = 8
wal_writer_delay = 100
wal_file_init_num = 30
wal_level=minimal
max_wal_senders=0
fsync=off
synchronous_commit = off
enable_indexonlyscan=on
thread_pool_attr = '345,4,(cpubind:1-28,32-60,64-92,96-124)'
enable_page_lsn_check = off
enable_double_write = off
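The list above sets several parameters more than once with different values, for example fsync, synchronous_commit, wal_level, xloginsert_locks, vacuum_cost_limit, and enable_double_write. In PostgreSQL-style configuration files the last occurrence normally takes effect; assuming MogDB behaves the same way, the sketch below reports which parameters are repeated and which value comes last. The function name effective_duplicates and the demo file are illustrative, and the parser ignores inline comments for simplicity.

```shell
# effective_duplicates FILE: for every parameter set more than once, print
# "name: N settings, last value = V" in order of first appearance.
effective_duplicates() {
    awk '
        /^[ \t]*#/ { next }                 # skip comment lines
        /^[ \t]*$/ { next }                 # skip blank lines
        {
            eq = index($0, "=")
            if (eq == 0) next
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            count[key]++
            last[key] = val
            if (count[key] == 1) order[++n] = key
        }
        END {
            for (i = 1; i <= n; i++)
                if (count[order[i]] > 1)
                    printf "%s: %d settings, last value = %s\n", order[i], count[order[i]], last[order[i]]
        }
    ' "$1"
}

# Demo on a few lines taken from the parameter list in this section.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
synchronous_commit = on
fsync = on
xloginsert_locks = 48
max_connections = 4096
xloginsert_locks = 8
fsync=off
synchronous_commit = off
EOF
dupes=$(effective_duplicates "$demo_conf")
echo "$dupes"
rm -f "$demo_conf"
```

Running `effective_duplicates "$PGDATA/postgresql.conf"` before a test makes it easier to see which of the conflicting values is intended to win.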
7. BenchmarkSQL tuning
- Connection string
conn=jdbc:postgresql://10.10.10.40:26000/tpcc?prepareThreshold=1&batchMode=on&fetchsize=10&loggerLevel=off
- Edit the table creation file to spread the data across tablespaces, adjust FILLFACTOR, and partition the data.
[root@node151 ~]# ls benchmarksql-5.0-mysql-support-opt-2.1/run/sql.common/tableCreates.sql
benchmarksql-5.0-mysql-support-opt-2.1/run/sql.common/tableCreates.sql
[root@node151 sql.common]# cat tableCreates.sql
CREATE TABLESPACE example2 relative location 'tablespace2';
CREATE TABLESPACE example3 relative location 'tablespace3';

create table bmsql_config (
  cfg_name varchar(30),
  cfg_value varchar(50)
);

create table bmsql_warehouse (
  w_id integer not null,
  w_ytd decimal(12,2),
  w_tax decimal(4,4),
  w_name varchar(10),
  w_street_1 varchar(20),
  w_street_2 varchar(20),
  w_city varchar(20),
  w_state char(2),
  w_zip char(9)
) WITH (FILLFACTOR=80);

create table bmsql_district (
  d_w_id integer not null,
  d_id integer not null,
  d_ytd decimal(12,2),
  d_tax decimal(4,4),
  d_next_o_id integer,
  d_name varchar(10),
  d_street_1 varchar(20),
  d_street_2 varchar(20),
  d_city varchar(20),
  d_state char(2),
  d_zip char(9)
) WITH (FILLFACTOR=80);

create table bmsql_customer (
  c_w_id integer not null,
  c_d_id integer not null,
  c_id integer not null,
  c_discount decimal(4,4),
  c_credit char(2),
  c_last varchar(16),
  c_first varchar(16),
  c_credit_lim decimal(12,2),
  c_balance decimal(12,2),
  c_ytd_payment decimal(12,2),
  c_payment_cnt integer,
  c_delivery_cnt integer,
  c_street_1 varchar(20),
  c_street_2 varchar(20),
  c_city varchar(20),
  c_state char(2),
  c_zip char(9),
  c_phone char(16),
  c_since timestamp,
  c_middle char(2),
  c_data varchar(500)
) WITH (FILLFACTOR=80) tablespace example2;

create sequence bmsql_hist_id_seq;

create table bmsql_history (
  hist_id integer,
  h_c_id integer,
  h_c_d_id integer,
  h_c_w_id integer,
  h_d_id integer,
  h_w_id integer,
  h_date timestamp,
  h_amount decimal(6,2),
  h_data varchar(24)
) WITH (FILLFACTOR=80);

create table bmsql_new_order (
  no_w_id integer not null,
  no_d_id integer not null,
  no_o_id integer not null
) WITH (FILLFACTOR=80);

create table bmsql_oorder (
  o_w_id integer not null,
  o_d_id integer not null,
  o_id integer not null,
  o_c_id integer,
  o_carrier_id integer,
  o_ol_cnt integer,
  o_all_local integer,
  o_entry_d timestamp
) WITH (FILLFACTOR=80);

create table bmsql_order_line (
  ol_w_id integer not null,
  ol_d_id integer not null,
  ol_o_id integer not null,
  ol_number integer not null,
  ol_i_id integer not null,
  ol_delivery_d timestamp,
  ol_amount decimal(6,2),
  ol_supply_w_id integer,
  ol_quantity integer,
  ol_dist_info char(24)
) WITH (FILLFACTOR=80);

create table bmsql_item (
  i_id integer not null,
  i_name varchar(24),
  i_price decimal(5,2),
  i_data varchar(50),
  i_im_id integer
);

create table bmsql_stock (
  s_w_id integer not null,
  s_i_id integer not null,
  s_quantity integer,
  s_ytd integer,
  s_order_cnt integer,
  s_remote_cnt integer,
  s_data varchar(50),
  s_dist_01 char(24),
  s_dist_02 char(24),
  s_dist_03 char(24),
  s_dist_04 char(24),
  s_dist_05 char(24),
  s_dist_06 char(24),
  s_dist_07 char(24),
  s_dist_08 char(24),
  s_dist_09 char(24),
  s_dist_10 char(24)
) WITH (FILLFACTOR=80) tablespace example3;
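8. Database file location optimization (general)
Place the default data directory, xlog, example2, and example3 on separate underlying disks to avoid I/O bottlenecks. If only two high-performance disks are available, move xlog first; with three, move xlog and example2 first. An example of how to separate them: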
PGDATA=/opt/data/mogdb
cd $PGDATA
mv pg_xlog /tpccdir1
ln -s /tpccdir1/pg_xlog .
cd pg_location
mv tablespace2 /tpccdir2
ln -s /tpccdir2/tablespace2 .
mv tablespace3 /tpccdir3
ln -s /tpccdir3/tablespace3 .
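The move-then-symlink pattern above can be wrapped in a helper that refuses to act when the source is already a symlink or the target already exists. This is a sketch only: relocate_dir is a hypothetical name, the demo works on a throwaway temporary directory, and on a real instance it should only be used while the database is stopped.

```shell
# relocate_dir SRC DEST_PARENT: move directory SRC under DEST_PARENT and
# leave a symlink at the original path, mirroring the manual mv + ln -s steps.
relocate_dir() {
    src=$1
    dest_parent=$2
    name=$(basename "$src")
    if [ -L "$src" ]; then
        echo "$src is already a symlink, skipping"
        return 0
    fi
    [ -d "$src" ] || { echo "$src is not a directory" >&2; return 1; }
    [ -d "$dest_parent" ] || { echo "$dest_parent does not exist" >&2; return 1; }
    [ ! -e "$dest_parent/$name" ] || { echo "$dest_parent/$name already exists" >&2; return 1; }
    mv "$src" "$dest_parent/" && ln -s "$dest_parent/$name" "$src"
}

# Demo in a temporary directory that stands in for PGDATA and /tpccdir1.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/data/pg_xlog" "$demo_root/tpccdir1"
relocate_dir "$demo_root/data/pg_xlog" "$demo_root/tpccdir1"
ls -ld "$demo_root/data/pg_xlog"
rm -rf "$demo_root"
```

With the paths from this guide, the equivalent call would be `relocate_dir /opt/data/mogdb/pg_xlog /tpccdir1`.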
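9. System resource monitoring tools
- htop: monitors CPU usage. On ARM platforms it must be compiled from source.
Use htop to monitor CPU utilization on the database server and the TPC-C client. In a best-performance run, utilization on every CPU serving the workload is very high (> 90%). If some CPUs fall short, the core binding may be wrong or there may be another problem; locate the root cause and adjust.
- iostat: shows system I/O usage.
- sar: shows network usage.
- nmon: monitors overall system resources.
Selected screenshots
- htop on the database server
- htop on the client
- iostat showing system I/O usage
- sar showing network usage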
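Ideal results
4-socket Kunpeng, 256 cores, 1000 warehouses, 500 concurrent terminals: 2.5 million tpmC
2-socket Kunpeng server, 100 warehouses, 100 concurrent terminals: 900,000 tpmC
2-socket Kunpeng server, 100 warehouses, 300 concurrent terminals: 1.4 million tpmC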