暂无图片
暂无图片
2
暂无图片
暂无图片
暂无图片

OB 租户线程池、队列测试

OCEANBASE 处理客户端请求(SQL)为线程池模式,在出现性能问题时可以一定程度上避免OS LOAD过高,但也会导致慢SQL(大查询)占用worker线程导致正常业务SQL小查询不能被处理响应变慢、无法响应而影响业务。

OB相关与线程池及相关机制

线程分类

OBServer 会启动很多不同功能的线程,细化的分类请参考ObServer线程的相关章节,本节按照最粗略的标准可以分为以下两类:

  • 一类是处理 SQL 和事务提交的线程,统称为 Worker 线程。

  • 其余的是处理网络 IO、磁盘 IO、Compaction 以及定时任务的线程。

Worker 线程是分租户的,非 Worker 线程是所有租户共享的。本节的租户 CPU 隔离都是针对 Worker线程的。

基于线程数的 CPU 隔离

Unit 的 CPU 隔离是通过一个 Unit 的活跃 Worker 线程数实现的。

由于 SQL 执行过程中可能会有 IO 等待、锁等待等,所以一个线程无法用满一个物理 CPU,故在缺省配置下,OBServer 会给每个 CPU 启动 4 个线程,4 这个倍数可以通过配置 cpu_quota_concurrency 来控制。这就意味着如果一个 Unit 的 MAX_CPU 是10,那么它能同时运行的活跃线程是 40,最大物理 CPU 的占用是 400%, 也就是 10 个 CPU 被超卖成了 40 个 CPU。

大查询处理

我们认为相比于大查询,让短查询尽快返回对用户更有意义,即大查询的查询优先级更低,当大查询和短查询同时争抢 CPU 时,系统会限制大查询的 CPU 使用。

当一个线程执行的 SQL 查询耗时太长,这条查询就会被判定为大查询, 一旦判定为大查询,执行大查询的 Worker 会等在一个 Pthread Condition 上,这样就为其它的 Worker 线程让出了 CPU。

具体实现上,OBServer 在代码中插入了很多检查点,Worker 线程在运行过程中会通过检查点定期检查自己的状态,如果判断应该挂起,那么线程就会等待在一个 Pthread Conditon 上,等到合适的时机再被唤醒。

如果同时有大查询和小查询,大查询最多占用 30% 的 Worker 线程,30% 这个百分比值可以通过配置项 large_query_worker_percentage 来设置。

  • 最大线程数 = unit_max_cpu * worker_per_cpu_quota
  • 活跃线程数 = unit_min_cpu * cpu_quota_concurrency
  • 大查询线程数 = 活跃线程数 * large_query_worker_percentage

有两点需要说明:

  • 当没有小查询的时候,大查询可以用到 100% 的 Worker 线程。只有当同时有大查询和小查询时,30% 的比例才生效。

  • 一个 Worker 因为执行大查询被挂起时,作为补偿,系统可能会新创建一个 Worker 线程,但是总的 Worker 线程不能超过 MAX_CPU 的 10 倍,10 这个倍数可以通过配置项 workers_per_cpu_quota 来设置。

  • large_query_worker_percentage 当大查询较多,并发较高时,并不能限制住活跃线程数的使用,仍然会导致活跃线程数全部占用,导致无法运行小查询,未能起到隔离大查询的效果。

提前识别大查询

由于 OBServer 挂起一个大查询线程,就会启动一个新的 Worker 线程, 但是如果有大量大查询涌入,OBServer 新创建的线程还是被用来处理大查询,很快达到 Worker 数上限,在这批大查询消耗完之前就没有机会再处理短查询了。

为了优化这个场景,OBServer 会在 SQL 开始执行之前预判它是不是大查询,预判的本质就是估计 SQL 的执行时间。预判主要依据以下假设场景:如果两条 SQL 的执行 Plan 是一样的,可以猜测它们的执行时间也是相似的,这样就可以用 Plan 最近的执行时间来判断 SQL 会不会是大查询。

如果某条 SQL 被预判为大查询,那么该查询就会被放入一个特殊的大查询队列,其 Worker 线程会被释放,系统就会接着执行后面的请求了。

相关参数

  • cpu_quota_concurrency

    配置项用于设置集群中每个 CPU 所允许的最大并发数,也可以理解为活跃线程数,限制的最大并发处理客户端请求的线程数,超过会队列阻塞,包括登录

  • workers_per_cpu_quota

    最大worker线程数,目前还不确定作用,因为有cpu_quota_concurrency限制,不确定内部rpc调用或者px是否能额外使用

  • large_query_worker_percentage

    大查询可用的worker比例

上面3个参数均为集群级参数

针对租户最大活跃线程数、大查询隔离效果、队列测试

调整参数

cpu_quota_concurrency <<<限制的最大并发处理客户端请求的线程数,超过会队列阻塞,包括登录
workers_per_cpu_quota <<<最大worker线程数,目前还不确定作用,因为有cpu_quota_concurrency限制,不确定内部rpc调用或者px是否能额外使用
large_query_worker_percentage <<<大查询可用的worker比例

MySQL [oceanbase]> alter system set cpu_quota_concurrency=2;
Query OK, 0 rows affected (0.070 sec)

MySQL [oceanbase]> alter system set workers_per_cpu_quota=4;
Query OK, 0 rows affected (0.066 sec)

MySQL [oceanbase]>

MySQL [oceanbase]> select distinct  name,value from __all_virtual_sys_parameter_stat where name  in ('workers_per_cpu_quota','cpu_quota_concurrency','px_workers_per_cpu_quota')
    -> ;
+--------------------------+-------+
| name                     | value |
+--------------------------+-------+
| px_workers_per_cpu_quota | 10    |
| cpu_quota_concurrency    | 2     |
| workers_per_cpu_quota    | 4     |
+--------------------------+-------+
3 rows in set (0.030 sec)


obclient [SYS]> show parameters like '%large_query_worker_percentage%';
+------+----------+---------------+----------+-------------------------------+-----------+-------+----------------------------------------------------------------------------------------------------+---------+---------+---------+-------------------+
| ZONE | SVR_TYPE | SVR_IP        | SVR_PORT | NAME                          | DATA_TYPE | VALUE | INFO                                                                                               | SECTION | SCOPE   | SOURCE  | EDIT_LEVEL        |
+------+----------+---------------+----------+-------------------------------+-----------+-------+----------------------------------------------------------------------------------------------------+---------+---------+---------+-------------------+
| z2   | observer | 192.168.56.37 |     2882 | large_query_worker_percentage | NULL      | 30    | the percentage of the workers reserved to serve large query request. Range: [0, 100] in percentage | TENANT  | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| z1   | observer | 192.168.56.36 |     2882 | large_query_worker_percentage | NULL      | 30    | the percentage of the workers reserved to serve large query request. Range: [0, 100] in percentage | TENANT  | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
+------+----------+---------------+----------+-------------------------------+-----------+-------+----------------------------------------------------------------------------------------------------+---------+---------+---------+-------------------+
2 rows in set (0.060 sec)

租户max_cpu为3
cpu_quota_concurrency = 2 <<允许最大并发活动线程数为6
workers_per_cpu_quota = 4
large_query_worker_percentage = 30 << large query 限制 30%线程,也就是1-2个


[root@ocp ~]# sh obm queue
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
| tenant_id | svr_ip        | unit_max_cpu | token_cnt | ass_token_cnt | used_lq_tokens | lq_waiting_workers | req_queue_total_size | queue_0 | queue_1 | queue_2 | queue_3 | queue_4 | queue_5 | large_queued |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
|      1004 | 192.168.56.36 |           12 |        24 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.36 |            3 |         6 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.37 |           12 |        24 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.37 |            3 |         6 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.38 |            3 |         6 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+

查询测试租户ID

obclient [oceanbase]> select tenant_id,tenant_name from __all_tenant;
+-----------+-------------+
| tenant_id | tenant_name |
+-----------+-------------+
|         1 | sys         |
|      1004 | test        |
|      1006 | test2       |<<<
+-----------+-------------+
3 rows in set (0.003 sec)

准备测试角本,执行,把active worker压满

角本连接192.168.56.36这台observer,并发6个客户端,运行慢查询
vi test.sh
for i in {1..6}
do
obclient -h192.168.56.36 -P2881 -usys@test2 -pEnmo_2024 -c<<EOF &
with t as (select rownum n from dual connect by rownum<=100000) select count(*) from t a,t b,t c;
EOF
done;
wait

sh test.sh

---6个正在执行的客户端进程
[root@ocp ~]# ps -ef |grep obclient
root     3739370 3739369  0 15:09 pts/0    00:00:00 obclient -h192.168.56.36 -P2881 -usys@test2 -px xxxxxxxxx -c
root     3739371 3739369  0 15:09 pts/0    00:00:00 obclient -h192.168.56.36 -P2881 -usys@test2 -px xxxxxxxxx -c
root     3739372 3739369  0 15:09 pts/0    00:00:00 obclient -h192.168.56.36 -P2881 -usys@test2 -px xxxxxxxxx -c
root     3739373 3739369  0 15:09 pts/0    00:00:00 obclient -h192.168.56.36 -P2881 -usys@test2 -px xxxxxxxxx -c
root     3739374 3739369  0 15:09 pts/0    00:00:00 obclient -h192.168.56.36 -P2881 -usys@test2 -px xxxxxxxxx -c
root     3739375 3739369  0 15:09 pts/0    00:00:00 obclient -h192.168.56.36 -P2881 -usys@test2 -px xxxxxxxxx -c
[root@ocp ~]#

查询队列

[root@ocp ~]# sh obm queue
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
| tenant_id | svr_ip        | unit_max_cpu | token_cnt | ass_token_cnt | used_lq_tokens | lq_waiting_workers | req_queue_total_size | queue_0 | queue_1 | queue_2 | queue_3 | queue_4 | queue_5 | large_queued |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
|      1004 | 192.168.56.37 |           12 |        24 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.37 |            3 |         6 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.36 |           12 |        24 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.36 |            3 |         6 |             6 |              1 |                  0 |                    2 |       0 |       0 |       0 |       0 |       2 |       0 |            0 |
|      1006 | 192.168.56.38 |            3 |         6 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+

关注192.168.56.36
used_lq_tokens 1
req_queue_total_size 2

查询活动会话

查看活动会话,test2 只有 4个
[root@ocp ~]# sh obm as2
+---------------+------------+-------------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+
| svr_ip        | id         | user        | tenant | db        | sql_id                           | time       | retry_cnt | command | info                                                         | user_client_ip |
+---------------+------------+-------------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+
| 192.168.56.36 | 3221588599 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 128.443141 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221617751 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 128.443264 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221598994 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 128.443415 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221594000 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 128.443523 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.37 | 3221849116 | ocp_monitor | sys    | oceanbase | 2A6D6C59E6D86C57519267B5EC2CBEC3 |   0.506379 |         0 | Query   | select /*+ query_timeout(10000000) */ sid as session_id, eve | 127.0.0.1      |
| 192.168.56.38 | 3222511209 | root        | sys    | oceanbase | 675CEF94E296A769020C364B4BF14F4A |    0.02959 |         0 | Query   | select svr_ip,id,user,tenant,db,ifnull(sql_id,'') sql_id,tim | 192.168.56.43  |
+---------------+------------+-------------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+


这时登录192.168.56.36,被阻塞,登录无响应

登录不进去
[root@ocp ~]# obclient -h192.168.56.36 -P2881 -usys@test2 -pEnmo_2024 -c

无响应

做简单分析

[root@ocp ~]# sh obm queue
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
| tenant_id | svr_ip        | unit_max_cpu | token_cnt | ass_token_cnt | used_lq_tokens | lq_waiting_workers | req_queue_total_size | queue_0 | queue_1 | queue_2 | queue_3 | queue_4 | queue_5 | large_queued |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
|      1004 | 192.168.56.36 |           12 |        24 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.36 |            3 |         6 |             6 |              1 |                  0 |                    3 |       0 |       0 |       0 |       0 |       3 |       0 |            0 |
|      1006 | 192.168.56.38 |            3 |         6 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.37 |           12 |        24 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.37 |            3 |         6 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+

queue0: high prio rpc,事务相关 rpc
queue1: lock retry,锁冲突重试
queue2: normal prio rpc,普通 rpc,inner sql,sql 触发的 rpc
queue3: lock retry SQL req;OB_TASK;OB_GTS_TASK
queue4: normal SQL req;OB_SQL_TASK
queue5: warmup req


应该有6个可以运行的worker,为什么看不到?
[root@ocp ~]# sh obm as2
+---------------+------------+-------------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+
| svr_ip        | id         | user        | tenant | db        | sql_id                           | time       | retry_cnt | command | info                                                         | user_client_ip |
+---------------+------------+-------------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+
| 192.168.56.36 | 3221588599 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 128.443141 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221617751 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 128.443264 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221598994 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 128.443415 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221594000 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 128.443523 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.37 | 3221849116 | ocp_monitor | sys    | oceanbase | 2A6D6C59E6D86C57519267B5EC2CBEC3 |   0.506379 |         0 | Query   | select /*+ query_timeout(10000000) */ sid as session_id, eve | 127.0.0.1      |
| 192.168.56.38 | 3222511209 | root        | sys    | oceanbase | 675CEF94E296A769020C364B4BF14F4A |    0.02959 |         0 | Query   | select svr_ip,id,user,tenant,db,ifnull(sql_id,'') sql_id,tim | 192.168.56.43  |
+---------------+------------+-------------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+

活动会话未达到6, 就不能登录了,是不是有固定2个预留worker,不能用来执行普通请求?

used_lq_tokens(1)+req_queue_total_size(3)=4 < token_cnt(6)  


cat observer.log|grep "dump tenant info(tenant={id:1006"|tr ',' '\n'


[2024-10-21 15:15:14.083539] INFO  [SERVER.OMT] ob_multi_tenant.cpp:834 [3714414][0][Y0-0000000000000000-0-0] [lt=10] [dc=0] dump tenant info(tenant={id:1006
 compat_mode:1
 unit_min_cpu:"3.000000000000000000e+00"
 unit_max_cpu:"3.000000000000000000e+00"
 slice:"0.000000000000000000e+00"
 slice_remain:"0.000000000000000000e+00"
 token_cnt:6
 sug_token_cnt:6
 ass_token_cnt:6
 lq_wait_timeout_workers:0
 lq_tokens:1
 used_lq_tokens:1
 stopped:false
 idle_us:925888
 recv_hp_rpc_cnt:372446
 recv_np_rpc_cnt:763336
 recv_lp_rpc_cnt:0
 recv_mysql_ps_close_cnt:0
 recv_mysql_cnt:222
 recv_task_cnt:64
 recv_large_req_cnt:0
 tt_large_quries:58110948
 pop_normal_cnt:5094841
 actives:6
 workers:6
 nesting workers:7
 lq waiting workers:0
 req_queue:total_size=3 queue[0]=0 queue[1]=0 queue[2]=0 queue[3]=0 queue[4]=3 queue[5]=0
 large queued:0
 reserve queued:0
 multi_level_queue:total_size=0 queue[0]=0 queue[1]=0 queue[2]=0 queue[3]=0 queue[4]=0 queue[5]=0 queue[6]=0 queue[7]=0
 recv_level_rpc_cnt:cnt[0]=0 cnt[1]=0 cnt[2]=0 cnt[3]=0 cnt[4]=0 cnt[5]=45424 cnt[6]=0 cnt[7]=0
 group_map:
 rpc_stat_info: pcode=0x717:cnt=397 pcode=0x710:cnt=200 pcode=0x51c:cnt=32})
[admin@ob1 log]$

下一步测试计划

调整参数再测试能否看到更多活动会话,是否最后2个worker不能使用?

调大cpu_quota_concurrency 未生效

cpu_quota_concurrency修改后未能立即生效,token_cnt增加,ass_token_cnt未增加

这次测试还到一个信息,大查询的worker比例是基于token_cnt计算的

obclient [oceanbase]> alter system set cpu_quota_concurrency=4;
Query OK, 0 rows affected (0.072 sec)

obclient [oceanbase]> alter system set workers_per_cpu_quota=8;
Query OK, 0 rows affected (0.075 sec)


obclient [oceanbase]> select distinct  name,value from __all_virtual_sys_parameter_stat where name  in ('workers_per_cpu_quota','cpu_quota_concurrency','px_workers_per_cpu_quota')
    -> ;
+--------------------------+-------+
| name                     | value |
+--------------------------+-------+
| px_workers_per_cpu_quota | 10    |
| cpu_quota_concurrency    | 4     |
| workers_per_cpu_quota    | 8     |
+--------------------------+-------+
3 rows in set (0.035 sec)



[root@ocp ~]# sh obm queue
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
| tenant_id | svr_ip        | unit_max_cpu | token_cnt | ass_token_cnt | used_lq_tokens | lq_waiting_workers | req_queue_total_size | queue_0 | queue_1 | queue_2 | queue_3 | queue_4 | queue_5 | large_queued |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
|      1004 | 192.168.56.37 |           12 |        48 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.37 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.36 |           12 |        48 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.36 |            3 |        
|             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.38 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+

cpu_quota_concurrency从2改为4,token_cnt增长到12了,为什么ass_token_cnt没有动?


cpu_quota_concurrency修改后未能立即生效


把并发调到12,重复进行前面的测试
vi test.sh
for i in {1..12}
do
obclient -h192.168.56.36 -P2881 -usys@test2 -pEnmo_2024 -c<<EOF &
with t as (select rownum n from dual connect by rownum<=100000) select count(*) from t a,t b,t c;
EOF
done;
wait


[root@ocp ~]# sh obm as2
+---------------+------------+-------------+--------+-----------+----------------------------------+-----------+-----------+---------+--------------------------------------------------------------+----------------+
| svr_ip        | id         | user        | tenant | db        | sql_id                           | time      | retry_cnt | command | info                                                         | user_client_ip |
+---------------+------------+-------------+--------+-----------+----------------------------------+-----------+-----------+---------+--------------------------------------------------------------+----------------+
| 192.168.56.36 | 3221551599 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 18.921777 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221535022 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 18.921817 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221531880 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 18.921851 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221543406 | SYS         | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 18.921883 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.38 | 3222513182 | root        | sys    | oceanbase | 675CEF94E296A769020C364B4BF14F4A |  0.026282 |         0 | Query   | select svr_ip,id,user,tenant,db,ifnull(sql_id,'') sql_id,tim | 192.168.56.43  |
| 192.168.56.38 | 3222513092 | ocp_monitor | sys    | oceanbase | 2A6D6C59E6D86C57519267B5EC2CBEC3 |  2.911484 |         0 | Query   | select /*+ query_timeout(10000000) */ sid as session_id, eve | 127.0.0.1      |
+---------------+------------+-------------+--------+-----------+----------------------------------+-----------+-----------+---------+--------------------------------------------------------------+----------------+
可使用以下方法获取trace_id
select svr_ip,trace_id from __all_virtual_processlist where sql_id='xxxx'
select svr_ip,trace_id from __all_virtual_processlist where id='xxxx'
[root@ocp ~]#


[root@ocp ~]# sh obm queue
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
| tenant_id | svr_ip        | unit_max_cpu | token_cnt | ass_token_cnt | used_lq_tokens | lq_waiting_workers | req_queue_total_size | queue_0 | queue_1 | queue_2 | queue_3 | queue_4 | queue_5 | large_queued |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
|      1004 | 192.168.56.37 |           12 |        48 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.37 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.36 |           12 |        48 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.36 |            3 |        12 |             6 |              3 |                  0 |                   10 |       0 |       0 |       0 |       0 |      10 |       0 |            0 |
|      1006 | 192.168.56.38 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+


这时虽然ass_token_cnt 还是6,但used_lq_tokens 增加到了3,req_queue_total_size增加到了10
并发运行worker没有增长,但可用的lq worker增加了,queue增加了

[root@ocp ~]# sh obm as2
+---------------+------------+------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+
| svr_ip        | id         | user | tenant | db        | sql_id                           | time       | retry_cnt | command | info                                                         | user_client_ip |
+---------------+------------+------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+
| 192.168.56.36 | 3221551599 | SYS  | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 |  211.08908 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221535022 | SYS  | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 211.089114 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221531880 | SYS  | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 211.089161 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221543406 | SYS  | test2  | SYS       | B384CE21C242FBA38FB4FF826A7080B6 | 211.089199 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.37 | 3221851781 | root | sys    | oceanbase | 675CEF94E296A769020C364B4BF14F4A |   0.027516 |         0 | Query   | select svr_ip,id,user,tenant,db,ifnull(sql_id,'') sql_id,tim | 192.168.56.43  |
+---------------+------------+------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+


在另一个unit较大的租户再次进行测试,验证会预留2个worker不能使用

# 1004 test 租户测试

[root@ocp ~]# sh obm queue
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
| tenant_id | svr_ip        | unit_max_cpu | token_cnt | ass_token_cnt | used_lq_tokens | lq_waiting_workers | req_queue_total_size | queue_0 | queue_1 | queue_2 | queue_3 | queue_4 | queue_5 | large_queued |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
|      1004 | 192.168.56.36 |           12 |        48 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.36 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.38 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.37 |           12 |        48 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.37 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+


create or replace function test_func(wait_second in number) return  number
as
begin
dbms_lock.sleep(wait_second);
return wait_second;
end;
/


for i in {1..24}
do
obclient -h192.168.56.36 -P2881 -usys@test -pEnmo_2024 -c<<EOF &
with t as (select rownum n from     dual connect by rownum<=100) select sum(test_func(5)) from t;
EOF
done;
wait

[root@ocp ~]# sh obm queue
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
| tenant_id | svr_ip        | unit_max_cpu | token_cnt | ass_token_cnt | used_lq_tokens | lq_waiting_workers | req_queue_total_size | queue_0 | queue_1 | queue_2 | queue_3 | queue_4 | queue_5 | large_queued |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
|      1004 | 192.168.56.36 |           12 |        48 |            24 |             14 |                  0 |                    2 |       0 |       0 |       0 |       0 |       2 |       0 |            0 |
|      1006 | 192.168.56.36 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.37 |           12 |        48 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.37 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.38 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+

used_lq_tokens(14)+req_queue_total_size(2)= 16

通过队列无法正确判断并发运行中的worker


-- 22个活动会话,也是最后2个worker无法被使用  ass_token_cnt(24)-活动会话(22)=2   
[root@ocp ~]# sh obm as2
+---------------+------------+------+--------+-----------+----------------------------------+-----------+-----------+---------+--------------------------------------------------------------+----------------+
| svr_ip        | id         | user | tenant | db        | sql_id                           | time      | retry_cnt | command | info                                                         | user_client_ip |
+---------------+------------+------+--------+-----------+----------------------------------+-----------+-----------+---------+--------------------------------------------------------------+----------------+
| 192.168.56.36 | 3221584420 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.043479 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221572879 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.046967 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221576580 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.047764 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221557740 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.047777 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221585289 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A |  28.04987 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221581799 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A |  28.05049 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221579547 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.050655 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221549268 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.050743 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221554868 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.050785 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221562763 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.050881 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221586303 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.050897 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221569728 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.050943 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221552406 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.050945 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221572694 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A |  28.05097 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221577831 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.051235 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221708742 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.051412 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221555810 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.051475 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221581413 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.051669 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221579358 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.051845 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221572829 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.051887 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221720029 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.052159 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.36 | 3221561335 | SYS  | test   | oceanbase | 1B1CB04BA6F21B2561F8367C8BE88F7A | 28.053014 |         0 | Query   | with t as (select rownum n from     dual connect by rownum<= | 192.168.56.43  |
| 192.168.56.38 | 3222522024 | root | sys    | oceanbase | 675CEF94E296A769020C364B4BF14F4A |  0.025835 |         0 | Query   | select svr_ip,id,user,tenant,db,ifnull(sql_id,'') sql_id,tim | 192.168.56.43  |
+---------------+------------+------+--------+-----------+----------------------------------+-----------+-----------+---------+--------------------------------------------------------------+----------------+
可使用以下方法获取trace_id
select svr_ip,trace_id from __all_virtual_processlist where sql_id='xxxx'
select svr_ip,trace_id from __all_virtual_processlist where id='xxxx'
[root@ocp ~]#



[root@ocp ~]# obsys


无法登录

select *  from __all_virtual_dump_tenant_info
 where tenant_id=1004\G
 
 
 
obclient [oceanbase]> select *  from __all_virtual_dump_tenant_info
    ->  where tenant_id=1004\G
*************************** 2. row ***************************
              svr_ip: 192.168.56.36
            svr_port: 2882
           tenant_id: 1004
         compat_mode: 1
        unit_min_cpu: 12
        unit_max_cpu: 12
               slice: 0
        remain_slice: 0
           token_cnt: 48
       ass_token_cnt: 24
           lq_tokens: 14
      used_lq_tokens: 2
             stopped: 0
             idle_us: 3507120
     recv_hp_rpc_cnt: 43055525
     recv_np_rpc_cnt: 45501678
     recv_lp_rpc_cnt: 0
      recv_mysql_cnt: 20066
       recv_task_cnt: 310
  recv_large_req_cnt: 2
  recv_large_queries: 454297
             actives: 24
             workers: 24
  lq_waiting_workers: 0
req_queue_total_size: 0
             queue_0: 0
             queue_1: 0
             queue_2: 0
             queue_3: 0
             queue_4: 0
             queue_5: 0
        large_queued: 0
2 rows in set (0.031 sec)



# 前面的测试是当大查询并发高时,无法登录,如果是已经登录上的长连接,执行小查询是否受影响?




前面的测试是当大查询并发高时,无法登录,如果是已经登录上的长连接,执行小查询是否受影响?

提前登录
obclient [SYS]>


执行并发大查询
[root@ocp test]# sh test.sh


看到已经发生队列等待req_queue_total_size
[root@ocp ~]# sh obm queue
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
| tenant_id | svr_ip        | unit_max_cpu | token_cnt | ass_token_cnt | used_lq_tokens | lq_waiting_workers | req_queue_total_size | queue_0 | queue_1 | queue_2 | queue_3 | queue_4 | queue_5 | large_queued |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
|      1004 | 192.168.56.37 |           12 |        48 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.37 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.36 |           12 |        48 |            24 |             14 |                  0 |                    2 |       0 |       0 |       0 |       0 |       2 |       0 |            0 |
|      1006 | 192.168.56.36 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.38 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+


在原来已经连上库上窗口,执行小查询,同样被阻塞,无法执行,登录队列2增加到3,增加的这个为刚刚发起的小查询

obclient [SYS]> select count(*) from dba_objects;


[root@ocp ~]# sh obm queue
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
| tenant_id | svr_ip        | unit_max_cpu | token_cnt | ass_token_cnt | used_lq_tokens | lq_waiting_workers | req_queue_total_size | queue_0 | queue_1 | queue_2 | queue_3 | queue_4 | queue_5 | large_queued |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+
|      1006 | 192.168.56.38 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.36 |           12 |        48 |            24 |             14 |                  0 |                    3 |       0 |       0 |       0 |       0 |       3 |       0 |            0 |
|      1006 | 192.168.56.36 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1004 | 192.168.56.37 |           12 |        48 |            24 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
|      1006 | 192.168.56.37 |            3 |        12 |             6 |              0 |                  0 |                    0 |       0 |       0 |       0 |       0 |       0 |       0 |            0 |
+-----------+---------------+--------------+-----------+---------------+----------------+--------------------+----------------------+---------+---------+---------+---------+---------+---------+--------------+


[root@ocp ~]# sh obm as2
+---------------+------------+-------------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+
| svr_ip        | id         | user        | tenant | db        | sql_id                           | time       | retry_cnt | command | info                                                         | user_client_ip |
+---------------+------------+-------------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+
| 192.168.56.36 | 3221608445 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.099773 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221604096 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.100249 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221610132 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101039 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221596916 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101338 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221607117 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101355 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221609368 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101423 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221601517 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101603 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221591591 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101705 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221610301 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101782 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221603451 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101818 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221612605 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101933 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221583930 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101952 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221588358 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.101986 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221605576 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.102175 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221607591 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.102185 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221604791 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.102193 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221603794 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.102234 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221610659 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.102287 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221561649 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.102416 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221557310 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.102453 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221598819 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.102535 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.36 | 3221584417 | SYS         | test   | oceanbase | FF0D04A1B1C77C3799B47FF5FB3BAB85 | 320.102605 |         0 | Query   | with t as (select rownum n from   dual connect by rownum<=20 | 192.168.56.43  |
| 192.168.56.37 | 3221865739 | ocp_monitor | sys    | oceanbase | 2A6D6C59E6D86C57519267B5EC2CBEC3 |   1.024772 |         0 | Query   | select /*+ query_timeout(10000000) */ sid as session_id, eve | 127.0.0.1      |
| 192.168.56.38 | 3222525230 | root        | sys    | oceanbase | 675CEF94E296A769020C364B4BF14F4A |   0.026861 |         0 | Query   | select svr_ip,id,user,tenant,db,ifnull(sql_id,'') sql_id,tim | 192.168.56.43  |
+---------------+------------+-------------+--------+-----------+----------------------------------+------------+-----------+---------+--------------------------------------------------------------+----------------+

查看活跃会话,可以发现刚刚执行的小查询并未出现在活动会话中,也就是在队列等待时,SQL还未开始执行,不算活跃会话


总结与思考

  • large_query_worker_percentage` 无法避免慢SQL把worker耗尽,导致ob无法处理新请求
  • 行锁冲突等待时,需要回滚掉当前DML,然后让出线程,避免长时间等待消耗活跃线程,处理能力下降,所以active查询不到
  • 合理的设置相关参数,可以避免过度消耗OS资源导致服务中断,同时也要避免过于保守active worker不足降低处理能力
  • 要合理利用ob sql限流功能,限制慢SQL并行,避免导致active work不足导致无法处理正常业务请求
  • 要理解ob sql限流机制,限流数量基于每个plan_id计数,如某个SQL_ID 限流10,但由于plan cache未能共享,存在3个plan,某该SQL可以达到30并发。
  • cpu_quota_concurrency修改后未能立即生效,token_cnt增加,ass_token_cnt未增加,大查询的worker是基于token_cnt计算,也就是cpu_quota_concurrency调大后,实际可用的worker不会增加,但可用于大查询的worker增加了
最后修改时间:2024-10-23 19:03:44
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论