某客户每天凌晨会有根据前一日分区,多表关连生成临时表(CTAS )批量SQL任务, 数据量每天相差并不大,平时时间也基本相同,此次同一批次中的1个SQL 运行时间从原来半小时左右增加了近3个小时,最终是SQL的table创建成功,需要分析当时的原因。环境oracle 11g 4-node RAC on Linux
分析
这类问题通常可以从以下几个方面分析
- system load
- wait event
- session statistics
- process callstack
GV$ ASH 记录
确认是否有采集快照缺失
SQL> select inst_id,min(sample_time) starttime,max(sample_time)endtime,count(distinct sample_time) snaps from ash0709 group by inst_id;
INST_ID STARTTIME ENDTIME SNAPS
---------- ---------------------------- ---------------------------- ----------
1 09-JUL-22 05.00.00.361 AM 09-JUL-22 08.54.07.608 AM 12298
2 09-JUL-22 05.00.00.904 AM 09-JUL-22 08.54.07.716 AM 13837
4 09-JUL-22 05.00.00.784 AM 09-JUL-22 08.54.08.038 AM 13948
3 09-JUL-22 05.00.00.249 AM 09-JUL-22 08.54.08.085 AM 13948
select inst_id,extract (hour from sample_time) hh,min(sample_time) starttime,max(sample_time)endtime,count(distinct sample_time) snaps
from ash0709
group by inst_id,extract (hour from sample_time)
order by 1,2;
INST_ID HH STARTTIME ENDTIME SNAPS
---------- ---------- ---------------------------- ---------------------------- ----------
1 5 09-JUL-22 05.00.00.361 AM 09-JUL-22 05.59.59.015 AM 3485
1 6 09-JUL-22 06.00.00.025 AM 09-JUL-22 06.59.59.212 AM 3557
1 7 09-JUL-22 07.00.00.222 AM 09-JUL-22 07.59.57.066 AM 2601
1 8 09-JUL-22 08.00.01.116 AM 09-JUL-22 08.54.07.608 AM 2655
2 5 09-JUL-22 05.00.00.904 AM 09-JUL-22 05.59.59.274 AM 3551
2 6 09-JUL-22 06.00.00.284 AM 09-JUL-22 06.59.59.251 AM 3550
2 7 09-JUL-22 07.00.00.261 AM 09-JUL-22 07.59.59.964 AM 3551
2 8 09-JUL-22 08.00.00.964 AM 09-JUL-22 08.54.07.716 AM 3185
3 5 09-JUL-22 05.00.00.249 AM 09-JUL-22 05.59.59.458 AM 3575
3 6 09-JUL-22 06.00.00.468 AM 09-JUL-22 06.59.59.461 AM 3574
3 7 09-JUL-22 07.00.00.461 AM 09-JUL-22 07.59.59.823 AM 3574
3 8 09-JUL-22 08.00.00.833 AM 09-JUL-22 08.54.08.085 AM 3225
4 5 09-JUL-22 05.00.00.784 AM 09-JUL-22 05.59.59.692 AM 3575
4 6 09-JUL-22 06.00.00.692 AM 09-JUL-22 06.59.59.039 AM 3573
4 7 09-JUL-22 07.00.00.039 AM 09-JUL-22 07.59.59.268 AM 3574
4 8 09-JUL-22 08.00.00.268 AM 09-JUL-22 08.54.08.038 AM 3226
-- 检查SQL运行时长
select inst_id,sql_id,min(sample_time) starttime,max(sample_time)endtime,max(sample_time)-min(sample_time) during
from ash0709 where SQL_PLAN_HASH_VALUE=2391750817 group by inst_id,sql_id
3 order by 2,3;
INST_ID SQL_ID STARTTIME ENDTIME DURING
---------- ------------- ---------------------------- ---------------------------- ---------------------------------------------------------------------------
1 0h31pzh4m1rxb 09-JUL-22 05.43.32.089 AM 09-JUL-22 06.15.26.468 AM +000000000 00:31:54.379
2 0h31pzh4m1rxb 09-JUL-22 05.44.50.881 AM 09-JUL-22 06.15.36.270 AM +000000000 00:30:45.389
1 3zr5yfq5039dx 09-JUL-22 05.43.27.019 AM 09-JUL-22 06.27.40.978 AM +000000000 00:44:13.959
3 3zr5yfq5039dx 09-JUL-22 05.44.50.880 AM 09-JUL-22 06.27.34.010 AM +000000000 00:42:43.130
4 3zr5yfq5039dx 09-JUL-22 05.44.51.678 AM 09-JUL-22 06.27.40.259 AM +000000000 00:42:48.581
1 4005sxkg5ag51 09-JUL-22 05.43.14.929 AM 09-JUL-22 06.03.02.400 AM +000000000 00:19:47.471
1 48cbmz2htvjg7 09-JUL-22 05.43.27.019 AM 09-JUL-22 06.15.36.578 AM +000000000 00:32:09.559
2 48cbmz2htvjg7 09-JUL-22 05.44.50.881 AM 09-JUL-22 06.15.46.650 AM +000000000 00:30:55.769
1 4hff619j0w11m 09-JUL-22 05.43.38.129 AM 09-JUL-22 06.15.01.222 AM +000000000 00:31:23.093
4 4hff619j0w11m 09-JUL-22 05.44.51.678 AM 09-JUL-22 06.15.06.303 AM +000000000 00:30:14.625
1 888n9jcb93krn 09-JUL-22 05.43.37.129 AM 09-JUL-22 06.24.18.718 AM +000000000 00:40:41.589
3 888n9jcb93krn 09-JUL-22 05.44.50.880 AM 09-JUL-22 06.24.06.478 AM +000000000 00:39:15.598
2 888n9jcb93krn 09-JUL-22 05.44.50.881 AM 09-JUL-22 06.24.18.516 AM +000000000 00:39:27.635
1 8f9vpq05nzf5b 09-JUL-22 05.43.36.119 AM 09-JUL-22 06.24.05.598 AM +000000000 00:40:29.479
3 8f9vpq05nzf5b 09-JUL-22 05.44.50.880 AM 09-JUL-22 06.23.56.418 AM +000000000 00:39:05.538
2 8f9vpq05nzf5b 09-JUL-22 05.44.50.881 AM 09-JUL-22 06.24.05.366 AM +000000000 00:39:14.485
1 8wcnndfapzzn9 09-JUL-22 05.43.14.929 AM 09-JUL-22 07.16.47.932 AM +000000000 01:33:33.003
4 8wcnndfapzzn9 09-JUL-22 05.44.51.678 AM 09-JUL-22 08.28.10.860 AM +000000000 02:43:19.182
3 8wcnndfapzzn9 09-JUL-22 05.44.52.890 AM 09-JUL-22 08.28.11.872 AM +000000000 02:43:18.982
1 bhw9ycm20dw48 09-JUL-22 05.43.38.129 AM 09-JUL-22 05.44.50.089 AM +000000000 00:01:11.960
3 bhw9ycm20dw48 09-JUL-22 05.44.50.880 AM 09-JUL-22 06.18.46.183 AM +000000000 00:33:55.303
4 bhw9ycm20dw48 09-JUL-22 05.44.51.678 AM 09-JUL-22 06.18.41.011 AM +000000000 00:33:49.333
1 dm6wgxdcs73w1 09-JUL-22 05.43.31.079 AM 09-JUL-22 06.13.21.972 AM +000000000 00:29:50.893
4 dm6wgxdcs73w1 09-JUL-22 05.44.51.678 AM 09-JUL-22 06.13.17.478 AM +000000000 00:28:25.800
NOTE:
因为每天批处理CTAS的执行计划一样,只是分区名区别,可以看到同批次8wcnndfapzzn9
SQLID的运行时间近3小时,其它都在30分钟左右, 并且同一个sql ID是跨实例运行,说明SQL使用了parallel,
并且parallel force local应该是false。从现场得知调度都是连接实例1发起,部分SQL parallel slave
process跨节点完成。
这种部署方式是不建议的,建议以其它业务维度分解单SQL任务在多个实例上,每个实例并行只在本节点完成,减少跨节点的调度,从上面可以看出SQL
ID 4005sxkg5ag51 只在node1 完成,并且是用时最短。
分析问题SQL 8wcnndfapzzn9
-- 时间消耗的执行计划
select inst_id, IS_SQLID_CURRENT,SQL_PLAN_HASH_VALUE,SQL_PLAN_LINE_ID,SQL_PLAN_OPERATION,count(*) es_sec
-- ,SQL_EXEC_ID,SQL_EXEC_START,
-- QC_INSTANCE_ID,QC_SESSION_ID,SEQ# ,
-- SESSION_STATE,TIME_WAITED,BLOCKING_SESSION,
-- CURRENT_OBJ# ,CURRENT_FILE# ,CURRENT_BLOCK# ,
-- TOP_LEVEL_CALL_NAME,IN_PARSE,IN_SQL_EXECUTION,event, time_waited/1000 threshold_in_ms,
-- DELTA_READ_IO_REQUESTS,DELTA_READ_IO_BYTES,PGA_ALLOCATED,TEMP_SPACE_ALLOCATED
from ash0709
where sql_id='8wcnndfapzzn9'
group by inst_id, IS_SQLID_CURRENT,SQL_PLAN_HASH_VALUE,SQL_PLAN_LINE_ID,SQL_PLAN_OPERATION
11 order by 6 desc;
INST_ID I SQL_PLAN_HASH_VALUE SQL_PLAN_LINE_ID SQL_PLAN_OPERATION ES_SEC
---------- - ------------------- ---------------- ------------------------------ ----------
3 Y 2391750817 36 TABLE ACCESS 8931
INST_ID I SQL_PLAN_HASH_VALUE SQL_PLAN_LINE_ID SQL_PLAN_OPERATION ES_SEC
---------- - ------------------- ---------------- ------------------------------ ----------
4 Y 2391750817 36 TABLE ACCESS 8441
Y 2391750817 34 PX SEND 224
Y 2391750817 28 HASH JOIN 195
INST_ID I SQL_PLAN_HASH_VALUE SQL_PLAN_LINE_ID SQL_PLAN_OPERATION ES_SEC
---------- - ------------------- ---------------- ------------------------------ ----------
3 Y 2391750817 28 HASH JOIN 173
select -- inst_id, IS_SQLID_CURRENT,SQL_PLAN_HASH_VALUE,SQL_PLAN_LINE_ID,SQL_PLAN_OPERATION,count(*) es_sec
SQL_EXEC_ID,SQL_EXEC_START, count(*) es_sec
-- QC_INSTANCE_ID,QC_SESSION_ID,SEQ# ,
-- SESSION_STATE,TIME_WAITED,BLOCKING_SESSION,
-- CURRENT_OBJ# ,CURRENT_FILE# ,CURRENT_BLOCK# ,
-- TOP_LEVEL_CALL_NAME,IN_PARSE,IN_SQL_EXECUTION,event, time_waited/1000 threshold_in_ms,
-- DELTA_READ_IO_REQUESTS,DELTA_READ_IO_BYTES,PGA_ALLOCATED,TEMP_SPACE_ALLOCATED
from ash0709
where sql_id='8wcnndfapzzn9'
group by SQL_EXEC_ID,SQL_EXEC_START
11 order by 3 desc;
SQL_EXEC_ID SQL_EXEC_START ES_SEC
----------- ------------------- ----------
16777216 2022-07-09 05:43:17 19033
3
note
最耗时的是执行计划的36# full table scan. 消耗近90%, SQL并且确认运行了一次。
SQL MONITOR
SQL Plan Monitoring Details (Plan Hash Value=2391750817)
=========================================================================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Write | Write | Mem | Temp | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | Reqs | Bytes | (Max) | (Max) | (%) | (# samples) |
=========================================================================================================================================================================================================================================
| 0 | CREATE TABLE STATEMENT | | | | 2138 | +6712 | 5 | 2 | | | | | | | 0.01 | Cpu (1) |
| | | | | | | | | | | | | | | | | enq: TT - contention (1) |
| 1 | PX COORDINATOR | | | | 8849 | +1 | 5 | 2 | | | | | | | 0.02 | enq: KO - fast object checkpoint (2) |
...
| | | | | | | | | | | | | | | | | direct path read (44) |
| 33 | PX RECEIVE | | 305M | 514K | 8231 | +111 | 2 | 428M | | | | | | | 0.39 | Cpu (66) |
| | | | | | | | | | | | | | | | | PX Deq: reap credit (3) |
| 34 | PX SEND HASH | :TQ10005 | 305M | 514K | 8521 | +111 | 2 | 428M | | | | | | | 1.46 | Cpu (254) |
| | | | | | | | | | | | | | | | | PX Deq: reap credit (5) |
| 35 | PX PARTITION LIST ALL | | 305M | 514K | 8521 | +111 | 2 | 428M | | | | | | | 0.31 | Cpu (55) |
| 36 | TABLE ACCESS FULL | MID_xxx_xxx_xxx_DAY | 305M | 514K | 8522 | +110 | 14355 | 428M | 227K | 211GB | | | | | 89.99 | gc cr block lost (1) |
| | | | | | | | | | | | | | | | | gc cr grant 2-way (5) |
| | | | | | | | | | | | | | | | | gc cr multi block request (4) |
| | | | | | | | | | | | | | | | | gc current block 2-way (1) |
| | | | | | | | | | | | | | | | | gc current block 3-way (1) |
| | | | | | | | | | | | | | | | | Cpu (15228) |
| | | | | | | | | | | | | | | | | db file scattered read (13) |
| | | | | | | | | | | | | | | | | db file sequential read (49) |
| | | | | | | | | | | | | | | | | direct path read (656) |
| | | | | | | | | | | | | | | | | read by other session (6) |
=========================================================================================================================================================================================================================================
Global Stats
====================================================================================================================================
| Elapsed | Cpu | IO | Application | Concurrency | Cluster | PL/SQL | Other | Buffer | Read | Read | Write | Write |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Waits(s) | Time(s) | Waits(s) | Gets | Reqs | Bytes | Reqs | Bytes |
====================================================================================================================================
| 17934 | 1367 | 1285 | 0.40 | 0.15 | 12 | 0.01 | 15269 | 7M | 263K | 224GB | 33481 | 10GB |
====================================================================================================================================
note: 从SQLmonitor中的 buffer gets比平时并未增加,时间是消耗在other waits.
OS 负载
----system---- ----total-cpu-usage---- -dsk/total- -net/total- ------memory-usage----- ----swap--- ---load-avg---
time |usr sys idl wai hiq siq| read writ| recv send| used buff cach free| used free| 1m 5m 15m
08-07 05:59:09| 2 2 89 6 0 0|1169M 2917M| 18M 33M| 155G 1213M 267G 587G| 0 17G|21.2 21.8 21.6
08-07 05:59:10| 3 2 93 1 0 0|2908M 200M| 14M 23M| 155G 1213M 267G 587G| 0 17G|21.2 21.8 21.6
08-07 05:59:11| 3 2 94 1 0 0|1650M 136M|9809k 15M| 155G 1213M 267G 587G| 0 17G|20.9 21.7 21.6
08-07 05:59:12| 3 1 94 1 0 0|2287M 368M| 34M 59M| 155G 1213M 267G 587G| 0 17G|20.9 21.7 21.6
08-07 05:59:13| 3 2 93 2 0 0|3091M 296M| 12M 25M| 155G 1213M 267G 587G| 0 17G|20.9 21.7 21.6
08-07 05:59:14| 3 2 94 1 0 0|1632M 168M| 17M 32M| 155G 1213M 267G 587G| 0 17G|20.9 21.7 21.6
08-07 05:59:15| 3 2 94 1 0 0|2370M 363M| 20M 38M| 155G 1213M 267G 587G| 0 17G|20.6 21.6 21.6
08-07 05:59:16| 3 2 94 1 0 0|2408M 153M| 11M 20M| 155G 1213M 267G 587G| 0 17G|20.6 21.6 21.6
08-07 05:59:17| 3 2 94 1 0 0|1492M 149M| 23M 27M| 155G 1213M 267G 587G| 0 17G|20.6 21.6 21.6
08-07 05:59:18| 3 2 94 1 0 0|2755M 483M| 95M 53M| 155G 1213M 267G 587G| 0 17G|20.6 21.6 21.6
08-07 05:59:19| 3 2 94 1 0 0|2749M 339M| 23M 38M| 155G 1213M 267G 587G| 0 17G|20.6 21.6 21.6
08-07 05:59:21| 3 2 93 1 0 0|3370M 473M| 146M 64M| 155G 1213M 267G 587G| 0 17G|22.1 21.9 21.7
08-07 05:59:23| 3 2 93 1 0 0|4604M 593M| 162M 59M| 155G 1213M 267G 587G| 0 17G|22.1 21.9 21.7 missed 3 ticks
08-07 05:59:25| 3 3 93 1 0 0|3835M 525M| 92M 39M| 155G 1213M 267G 587G| 0 17G|22.1 21.9 21.7 missed 2 ticks
08-07 05:59:27| 2 2 94 1 0 0|2336M 216M| 56M 56M| 155G 1213M 267G 587G| 0 17G|22.1 21.9 21.7
08-07 05:59:29| 2 2 94 2 0 0|3497M 395M| 60M 51M| 155G 1213M 267G 587G| 0 17G|22.1 21.9 21.7 missed 3 ticks
08-07 05:59:31| 2 2 94 1 0 0|3217M 456M| 63M 60M| 155G 1213M 267G 587G| 0 17G|22.4 22.0 21.7 missed 2 ticks
08-07 05:59:33| 2 2 94 2 0 0|3823M 461M| 29M 51M| 155G 1213M 267G 587G| 0 17G|22.4 22.0 21.7 missed 2 ticks
08-07 05:59:35| 3 2 93 2 0 0|3131M 376M| 26M 40M| 155G 1213M 267G 587G| 0 17G|23.8 22.3 21.8 missed 2 ticks
----system---- ----total-cpu-usage---- -dsk/total- -net/total- ------memory-usage----- ----swap--- ---load-avg---
time |usr sys idl wai hiq siq| read writ| recv send| used buff cach free| used free| 1m 5m 15m
09-07 05:59:29| 3 1 95 1 0 0|1394M 124M| 100M 120M| 157G 1217M 267G 584G| 0 17G|19.6 19.7 17.9
09-07 05:59:30| 3 1 95 0 0 0|1378M 66M| 102M 82M| 157G 1217M 267G 584G| 0 17G|19.6 19.7 17.9
09-07 05:59:31| 3 2 95 0 0 0|1114M 69M| 97M 96M| 157G 1217M 267G 584G| 0 17G|19.6 19.7 17.9
09-07 05:59:32| 3 2 95 1 0 0|1728M 119M| 147M 172M| 157G 1217M 267G 584G| 0 17G|18.9 19.5 17.8
09-07 05:59:34| 3 2 94 1 0 0|2564M 136M| 212M 189M| 157G 1217M 267G 584G| 0 17G|18.9 19.5 17.8
09-07 05:59:36| 3 2 94 1 0 0|2560M 195M| 65M 86M| 157G 1217M 267G 584G| 0 17G|18.9 19.5 17.8 missed 2 ticks
09-07 05:59:37| 3 2 94 1 0 0|1492M 87M| 163M 140M| 157G 1217M 267G 584G| 0 17G|18.9 19.5 17.8
09-07 05:59:38| 3 2 94 1 0 0|2445M 197M| 199M 206M| 157G 1217M 267G 584G| 0 17G|18.7 19.5 17.8 missed 2 ticks
09-07 05:59:40| 3 3 94 1 0 0|3014M 215M| 191M 161M| 157G 1217M 267G 584G| 0 17G|18.7 19.5 17.8 missed 2 ticks
09-07 05:59:42| 3 2 94 1 0 0|2378M 146M| 83M 88M| 157G 1217M 267G 584G| 0 17G|18.7 19.5 17.8 missed 2 ticks
09-07 05:59:43| 3 2 94 1 0 0|1745M 117M| 247M 205M| 157G 1217M 267G 584G| 0 17G|18.4 19.4 17.8
09-07 05:59:46| 3 2 94 1 0 0|3229M 228M| 232M 218M| 157G 1217M 267G 584G| 0 17G|18.4 19.4 17.8 missed 2 ticks
09-07 05:59:48| 3 3 94 1 0 0|3091M 247M| 90M 80M| 157G 1217M 267G 584G| 0 17G|18.1 19.3 17.8 missed 3 ticks
09-07 05:59:49| 3 2 94 1 0 0|1363M 87M| 141M 125M| 157G 1217M 267G 584G| 0 17G|18.1 19.3 17.8
09-07 05:59:50| 3 2 93 1 0 0|1931M 153M| 145M 114M| 157G 1217M 267G 584G| 0 17G|18.1 19.3 17.8
09-07 05:59:52| 3 2 94 1 0 0|1647M 99M| 53M 63M| 157G 1217M 267G 584G| 0 17G|18.1 19.3 17.8
09-07 05:59:53| 4 2 93 1 0 0| 825M 107M| 72M 72M| 156G 1217M 267G 585G| 0 17G|18.1 19.3 17.8
note : 7/9问题时间和7/8正常时间负载并无太大差异。
top event
select inst_id,
--IS_SQLID_CURRENT,SQL_PLAN_HASH_VALUE,SQL_PLAN_LINE_ID,SQL_PLAN_OPERATION,count(*) es_sec
-- SQL_EXEC_ID,SQL_EXEC_START, count(*) es_sec
--QC_INSTANCE_ID,QC_SESSION_ID,
CASE WHEN h.SESSION_STATE != 'WAITING' THEN 'On CPU / runqueue' ELSE event end as event , SUM(h.wait_time + h.time_waited)/1000000 "Total Wait Time", SESSION_STATE, BLOCKING_SESSION,count(*) es_sec
-- CURRENT_OBJ# ,CURRENT_FILE# ,CURRENT_BLOCK# ,SEQ# ,
-- TOP_LEVEL_CALL_NAME,IN_PARSE,IN_SQL_EXECUTION,event, time_waited/1000 threshold_in_ms,
-- DELTA_READ_IO_REQUESTS,DELTA_READ_IO_BYTES,PGA_ALLOCATED,TEMP_SPACE_ALLOCATED
from ash0709 h
where sql_id='8wcnndfapzzn9'
group by inst_id,event,SESSION_STATE, BLOCKING_SESSION
12 order by 6 desc;
INST_ID EVENT Total Wait Time SESSION BLOCKING_SESSION ES_SEC
---------- -------------------------------- --------------- ------- ---------------- ----------
3 On CPU / runqueue .163804 ON CPU 9061
INST_ID EVENT Total Wait Time SESSION BLOCKING_SESSION ES_SEC
---------- -------------------------------- --------------- ------- ---------------- ----------
4 On CPU / runqueue .73538 ON CPU 8812
INST_ID EVENT Total Wait Time SESSION BLOCKING_SESSION ES_SEC
---------- -------------------------------- --------------- ------- ---------------- ----------
3 direct path read 28.776845 WAITING 458
INST_ID EVENT Total Wait Time SESSION BLOCKING_SESSION ES_SEC
---------- -------------------------------- --------------- ------- ---------------- ----------
4 direct path read 23.998951 WAITING 407
direct path read temp .492271 WAITING 40
-- top call
select inst_id, CASE WHEN h.SESSION_STATE != 'WAITING' THEN 'On CPU / runqueue' ELSE event end as event , SUM(h.wait_time + h.time_waited)/1000000 "Total Wait Time",
TOP_LEVEL_CALL_NAME,IN_PARSE,IN_SQL_EXECUTION
,count(*) es_sec
from ash0709 h
where sql_id='8wcnndfapzzn9' and SQL_PLAN_LINE_ID=36
6 group by inst_id,SESSION_STATE,event, TOP_LEVEL_CALL_NAME,IN_PARSE,IN_SQL_EXECUTION;
INST_ID EVENT Total Wait Time TOP_LEVEL_CALL_NAME I I ES_SEC
---------- ---------------------------------------- --------------- ---------------------------------------------------------------- - - ----------
4 On CPU / runqueue .102406 LOGOFF N Y 8059
INST_ID EVENT Total Wait Time TOP_LEVEL_CALL_NAME I I ES_SEC
---------- ---------------------------------------- --------------- ---------------------------------------------------------------- - - ----------
3 On CPU / runqueue .06867 VERSION2 N Y 8498
direct path read 20.722015 VERSION2 N Y 361
INST_ID EVENT Total Wait Time TOP_LEVEL_CALL_NAME I I ES_SEC
---------- ---------------------------------------- --------------- ---------------------------------------------------------------- - - ----------
4 direct path read 21.756692 LOGOFF N Y 380
INST_ID EVENT Total Wait Time TOP_LEVEL_CALL_NAME I I ES_SEC
---------- ---------------------------------------- --------------- ---------------------------------------------------------------- - - ----------
3 gc cr block lost .77136 VERSION2 N Y 1
INST_ID EVENT Total Wait Time TOP_LEVEL_CALL_NAME I I ES_SEC
---------- ---------------------------------------- --------------- ---------------------------------------------------------------- - - ----------
4 db file sequential read .006389 LOGOFF N Y 1
INST_ID EVENT Total Wait Time TOP_LEVEL_CALL_NAME I I ES_SEC
---------- ---------------------------------------- --------------- ---------------------------------------------------------------- - - ----------
3 db file sequential read 1.40969 VERSION2 N Y 37
NOTE:
大部分时间是 ON CPU. node4 是调用logoff, node3为version2, 查看历史正常时间段top level call name均是version2. 暂不不确认logoff原因(猜测可能是slave进程退出)。
到现在是逻辑读没增加,又存在2个并行slave节点进程大部时间是on cpu, 系统负载并不高,问题出在哪里?
分析OS层进程状态
SQL> r
select inst_id, CASE WHEN h.SESSION_STATE != 'WAITING' THEN 'On CPU / runqueue' ELSE event end as event , SUM(h.wait_time + h.time_waited)/1000000 "Total Wait Time",
program,TOP_LEVEL_CALL_NAME
,count(*) es_sec
from ash0709 h
where sql_id='8wcnndfapzzn9' and SQL_PLAN_LINE_ID=36
group by inst_id,program,event,TOP_LEVEL_CALL_NAME,SESSION_STATE
INST_ID EVENT Total Wait Time PROGRAM TOP_LEVEL_CALL_NAME ES_SEC
---------- ---------------------------------------- --------------- ------------------------------ ------------------------------ ----------
3 On CPU / runqueue .06867 oracle@anbob3 (P027) VERSION2 8498
3 gc cr block lost .77136 oracle@anbob3 (P027) VERSION2 1
3 gc current block 3-way .001514 oracle@anbob3 (P027) VERSION2 1
4 gc current block 2-way .002936 oracle@anbob4 (P009) LOGOFF 1
3 db file sequential read 1.40969 oracle@anbob3 (P027) VERSION2 37
3 gc cr grant 2-way .093752 oracle@anbob3 (P027) VERSION2 6
3 gc current block 2-way .011192 oracle@anbob3 (P027) VERSION2 5
4 On CPU / runqueue .102406 oracle@anbob4 (P009) LOGOFF 8059
4 direct path read 21.756692 oracle@anbob4 (P009) LOGOFF 380
4 db file sequential read .006389 oracle@anbob4 (P009) LOGOFF 1
3 direct path read 20.722015 oracle@anbob3 (P027) VERSION2 361
3 db file scattered read .579141 oracle@anbob3 (P027) VERSION2 16
3 gc cr multi block request .012058 oracle@anbob3 (P027) VERSION2 6
13 rows selected.
NOTE:
节点4的p009进程和节点3的p027进程。
OSW ps 分析
$ egrep '^zzz|ora_p027' anbob3_ps_22.07.09.0500.dat
zzz ***Sat Jul 9 05:42:59 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169854500 poll_s S Jun 30 06:11:01 ora_p027_anbob3
zzz ***Sat Jul 9 05:43:29 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169854972 poll_s S Jun 30 06:11:01 ora_p027_anbob3
zzz ***Sat Jul 9 05:43:59 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169857232 read_e S Jun 30 06:11:03 ora_p027_anbob3
zzz ***Sat Jul 9 05:44:29 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169874284 blockd D Jun 30 06:11:04 ora_p027_anbob3
zzz ***Sat Jul 9 05:44:59 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169955448 synchr D Jun 30 06:11:06 ora_p027_anbob3
zzz ***Sat Jul 9 05:45:29 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169956480 synchr D Jun 30 06:11:09 ora_p027_anbob3
zzz ***Sat Jul 9 05:45:59 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169956356 read_e S Jun 30 06:11:12 ora_p027_anbob3
zzz ***Sat Jul 9 05:46:29 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169956380 read_e S Jun 30 06:11:15 ora_p027_anbob3
zzz ***Sat Jul 9 05:46:59 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169956384 read_e S Jun 30 06:11:18 ora_p027_anbob3
zzz ***Sat Jul 9 05:47:29 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169956516 synchr D Jun 30 06:11:21 ora_p027_anbob3
zzz ***Sat Jul 9 05:47:59 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169956400 poll_s S Jun 30 06:11:24 ora_p027_anbob3
zzz ***Sat Jul 9 05:48:29 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169956404 poll_s S Jun 30 06:11:29 ora_p027_anbob3
zzz ***Sat Jul 9 05:49:00 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169956408 poll_s S Jun 30 06:11:35 ora_p027_anbob3
zzz ***Sat Jul 9 05:49:30 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169958472 ? R Jun 30 06:11:44 ora_p027_anbob3
zzz ***Sat Jul 9 05:50:00 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169958476 ? R Jun 30 06:11:54 ora_p027_anbob3
zzz ***Sat Jul 9 05:50:30 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169958496 poll_s S Jun 30 06:12:03 ora_p027_anbob3
zzz ***Sat Jul 9 05:51:00 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169958688 synchr D Jun 30 06:12:08 ora_p027_anbob3
zzz ***Sat Jul 9 05:51:30 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169958832 synchr D Jun 30 06:12:09 ora_p027_anbob3
zzz ***Sat Jul 9 05:52:00 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169958968 synchr D Jun 30 06:12:10 ora_p027_anbob3
zzz ***Sat Jul 9 05:52:30 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169958916 synchr D Jun 30 06:12:12 ora_p027_anbob3
zzz ***Sat Jul 9 05:53:00 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169959140 synchr D Jun 30 06:12:13 ora_p027_anbob3
$ egrep '^zzz|ora_p027' anbob3_ps_22.07.09.0600.dat
zzz ***Sat Jul 9 06:00:01 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169969468 ? R Jun 30 06:12:28 ora_p027_anbob3
zzz ***Sat Jul 9 06:00:31 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169970548 synchr D Jun 30 06:12:29 ora_p027_anbob3
zzz ***Sat Jul 9 06:01:01 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169971792 synchr D Jun 30 06:12:30 ora_p027_anbob3
zzz ***Sat Jul 9 06:01:31 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169973676 synchr D Jun 30 06:12:31 ora_p027_anbob3
zzz ***Sat Jul 9 06:02:01 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169975644 synchr D Jun 30 06:12:32 ora_p027_anbob3
zzz ***Sat Jul 9 06:02:31 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169980104 synchr D Jun 30 06:12:33 ora_p027_anbob3
zzz ***Sat Jul 9 06:03:01 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169981468 synchr D Jun 30 06:12:33 ora_p027_anbob3
zzz ***Sat Jul 9 06:03:31 CST 2022
oracle 23539 1 19 2.7 16.0 263043928 169982628 blockd D Jun 30 06:12:34 ora_p027_anbob3
zzz ***Sat Jul 9 06:04:02 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169983820 synchr D Jun 30 06:12:35 ora_p027_anbob3
zzz ***Sat Jul 9 06:04:32 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169985096 synchr D Jun 30 06:12:35 ora_p027_anbob3
zzz ***Sat Jul 9 06:05:02 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169987172 synchr D Jun 30 06:12:36 ora_p027_anbob3
zzz ***Sat Jul 9 06:05:32 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169988824 synchr D Jun 30 06:12:37 ora_p027_anbob3
zzz ***Sat Jul 9 06:06:02 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169989524 synchr D Jun 30 06:12:38 ora_p027_anbob3
zzz ***Sat Jul 9 06:06:32 CST 2022
oracle 23539 1 19 2.7 16.0 263044052 169989864 synchr D Jun 30 06:12:40 ora_p027_anbob3
...
行 83871: oracle 119729 1 19 5.2 12.9 263064632 136586392 blockd D Jul 05 04:07:02 ora_p009_anbob4
行 89096: oracle 119729 1 19 5.2 12.9 263064632 136586456 blockd D Jul 05 04:07:08 ora_p009_anbob4
行 94320: oracle 119729 1 19 5.2 12.9 263064632 136586508 blockd D Jul 05 04:07:13 ora_p009_anbob4
行 99543: oracle 119729 1 19 5.2 12.9 263064632 136586568 blockd D Jul 05 04:07:18 ora_p009_anbob4
行 104762: oracle 119729 1 19 5.2 12.9 263064632 136586624 ? R Jul 05 04:07:23 ora_p009_anbob4
行 110001: oracle 119729 1 19 5.2 12.9 263064632 136586700 blockd D Jul 05 04:07:30 ora_p009_anbob4
行 115238: oracle 119729 1 19 5.2 12.9 263064640 136587044 ? R Jul 05 04:07:36 ora_p009_anbob4
行 120458: oracle 119729 1 19 5.2 12.9 263064640 136587220 synchr D Jul 05 04:07:39 ora_p009_anbob4
行 125681: oracle 119729 1 19 5.2 12.9 263064764 136587572 synchr D Jul 05 04:07:39 ora_p009_anbob4
行 130894: oracle 119729 1 19 5.2 12.9 263064632 136587596 blockd D Jul 05 04:07:39 ora_p009_anbob4
行 136106: oracle 119729 1 19 5.2 12.9 263064640 136587844 synchr D Jul 05 04:07:40 ora_p009_anbob4
行 141315: oracle 119729 1 19 5.2 12.9 263064764 136588132 synchr D Jul 05 04:07:41 ora_p009_anbob4
行 146533: oracle 119729 1 19 5.2 12.9 263064764 136588312 synchr D Jul 05 04:07:41 ora_p009_anbob4
行 151745: oracle 119729 1 19 5.2 12.9 263064764 136588488 synchr D Jul 05 04:07:42 ora_p009_anbob4
行 156959: oracle 119729 1 19 5.2 12.9 263064764 136588636 synchr D Jul 05 04:07:42 ora_p009_anbob4
行 162175: oracle 119729 1 19 5.2 12.9 263064764 136589868 synchr D Jul 05 04:07:43 ora_p009_anbob4
行 167398: oracle 119729 1 19 5.2 12.9 263064764 136589940 synchr D Jul 05 04:07:44 ora_p009_anbob4
行 172646: oracle 119729 1 19 5.2 12.9 263064640 136590328 synchr D Jul 05 04:07:45 ora_p009_anbob4
行 177881: oracle 119729 1 19 5.2 12.9 263064640 136590480 synchr D Jul 05 04:07:45 ora_p009_anbob4
行 183109: oracle 119729 1 19 5.2 12.9 263064764 136590852 synchr D Jul 05 04:07:45 ora_p009_anbob4
行 188344: oracle 119729 1 19 5.2 12.9 263064632 136590872 blockd D Jul 05 04:07:46 ora_p009_anbob4
行 193567: oracle 119729 1 19 5.2 12.9 263064764 136591096 synchr D Jul 05 04:07:46 ora_p009_anbob4
行 198787: oracle 119729 1 19 5.2 12.9 263064764 136591264 synchr D Jul 05 04:07:47 ora_p009_anbob4
行 204007: oracle 119729 1 19 5.2 12.9 263064640 136591312 synchr D Jul 05 04:07:47 ora_p009_anbob4
行 209240: oracle 119729 1 19 5.2 12.9 263064764 136591608 synchr D Jul 05 04:07:48 ora_p009_anbob4
行 214467: oracle 119729 1 19 5.2 12.9 263064764 136591780 synchr D Jul 05 04:07:48 ora_p009_anbob4
行 219707: oracle 119729 1 19 5.2 12.9 263064764 136591952 synchr D Jul 05 04:07:49 ora_p009_anbob4
NOTE:
从osw ps每30秒快照可见存在近1-2个小时并行进程连续处于”D” 状态。
“D” 状态进程
A process in Linux can be in several states: running, sleeping, etc. Running process runs on a CPU just now, sleeping process waits for its turn on CPU or for some other event. First big S stands for Sleeping, R stands for running (“+” means that the process is foreground and small “s” means that the process is session leader, but it is not relevant for this article).
D state occurs then the process is in uninterruptible sleep,Processes in a “D” or uninterruptible sleep state are usually waiting on I/O. The ps command shows a “D” on processes in an uninterruptible sleep state. This state is bad, because you can’t do anything with the process in D state. Fortunately, process normally remains in such state not for so long. But if you have a heap of D state processes then some logic in system is disrupt. If that is happening, the very important thing is to determine where this unlucky sleep occurs. It is easy to do with ps command with l option. WCHAN column shows the name of the kernel function where the process is sleeping,You cannot kill “D” state processes, even with SIGKILL or kill -9. As the name implies, they are uninterruptible. You can only clear them by rebooting the server or waiting for the I/O to respond.
“D”状态进程通常是在等待I/O,无法终止只到i/o返回, 出现该问题时通常查看WCHAN列判断内核调用函数。但是像本安例中能确认是synchr开头,显示不完整,查看以synchr开头的所有函数。
[oracle@oel7db1 ~]$ cat /boot/System.map-$(uname -r)|awk '$3 ~ "^synchr"'
ffffffff810fd130 T synchronize_hardirq
ffffffff810fd240 T synchronize_irq
ffffffff81106ed0 t synchronize_rcu
ffffffff81108b30 T synchronize_srcu_expedited
ffffffff81108b60 T synchronize_srcu
ffffffff8110ac80 T synchronize_sched_expedited
ffffffff8110acb0 T synchronize_sched
ffffffff8110ad70 T synchronize_rcu_bh
ffffffff8110adf0 T synchronize_rcu_expedited
ffffffff811f7620 t synchronous_wake_function
ffffffff8170e630 T synchronize_net
[oracle@oel7db1 ~]$ cat /boot/System.map-$(uname -r)|awk '$3 ~ "^blockd"'
ffffffff8264a3f0 D blockdev_superblock
因为历史无法再查看,可以查看当前问题节点是否存在synchr开头进程。
# ps -eo pid,tid,class,state,pcpu,wchan:30,comm|grep ora_|awk '$4="D"&& $6 ~ "sync"'
发现存在几个D状态的oracle server进程,函数为synchronize_sched
-/**
– * synchronize_sched – block until all CPUs have exited any non-preemptive
– * kernel code sequences.
– *
– * This means that all preempt_disable code sequences, including NMI and
– * hardware-interrupt handlers, in progress on entry will have completed
– * before this primitive returns. However, this does not guarantee that
– * softirq handlers will have completed, since in some kernels, these
– * handlers can run in process context, and can block.
– *
– * This primitive provides the guarantees made by the (now removed)
– * synchronize_kernel() API. In contrast, synchronize_rcu() only
– * guarantees that rcu_read_lock() sections will have completed.
– * In “classic RCU”, these two guarantees happen to be one and
– * the same, but can differ in realtime RCU implementations.
– */
-#define synchronize_sched() __synchronize_sched()
cause
不确认当时CPU在做什么, 或是内核存在逻辑缺陷原因, 至少这样可以解释,在数据库层看到的长时间 ON CPU,是因为OS 层的进程处理D 状态,一直在等待内核某个调用完成。
OS层的内核调用未响应,导致DB 层会话为on cpu状态,SQL运行时间增加,直到OS响应后自动恢复。
Solution
建议找主机工程师或存储工程师分析问题时间点的存储性能 或 OS是否有瓶颈。