一、背景
最近在做数据库巡检的时候发现ASH报告里有一个:os thread startup 等待事件,感觉很奇怪,在以前的巡检的报告中未发现此等待事件,避免数据库存储潜在风险。于是做了个分析。
二、事件分析
由SQL_ID:56h2h7wxpnntt引起的,os thread startup 和 DFS lock handle(与序列有关,整个查询未发现与序列有关,此次先不讨论)
看来与并行有关:PX COORDINATOR
三、查询资料
os thread startup 是oracle 等待事件中Concurrency的一个等待,在进行并行分配是向os 申请进程启动,而在正常的系统中,该动作是非常快的,在我们高压力的数据库环境下,os存在响应问题,激发了该等待事件的等待。
MOS上的一篇文档 :
Solaris: database instance hangs intermittently with wait event: ‘os thread startup’ (Doc ID 1909881.1)
‘os thread startup’ indicates some high contention at OS level avoiding even new process startup.
翻译:
Solaris:数据库实例间歇性挂起,等待事件:“os thread startup”(文档ID 1909881.1)
“os thread startup”表示在操作系统级别存在一些高争用,甚至避免了新进程的启动。
四、查询SQL对像的并行度
- 表的并行属性:结果为空
select owner,table_name,degree from dba_tables where degree>1;
- 索引并行属性
select owner, index_name, table_name, degree
from dba_indexes
where degree > 1;
SQL_ID:56h2h7wxpnntt涉及表的索引的并行度为(8或16),因此表为业务大表,应该是在创建索引的时候加并行了,当时没注意。
注:并行创建索引完成后,需要调整并行度,避免在使用查询时数据库开启并行,生产OLTP数据库使用并行存在一定的风险 。如下:
- 重建索引开并行
SQL> alter index t1_idx_id rebuild parallel 4;
Index altered.
SQL> select degree from all_indexes where index_name='T1_IDX_ID';
DEGREE
------------
4
- 索引重建完毕后关闭并行
SQL> alter index T1_IDX_ID noparallel;
Index altered.
SQL> select degree from all_indexes where index_name='T1_IDX_ID';
DEGREE
------------
1
五、关闭索引并行度
select 'alter index '||owner||'.'||index_name||' noparallel;'
from dba_indexes
where degree > 1;
六、执行计划对比:
- 开并行的SQL执行计划:COST:553
COUNT(*)
----------
50997
Elapsed: 00:00:04.46
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 100 | 553 | 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | 100 | | |
| 2 | PX COORDINATOR | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10003 | 1 | 100 | | |
| 4 | SORT AGGREGATE | | 1 | 100 | | |
| 5 | NESTED LOOPS | | 43978 | 4397800 | 553 | 00:00:07 |
| 6 | NESTED LOOPS OUTER | | 43978 | 3870064 | 549 | 00:00:07 |
| * 7 | HASH JOIN | | 43978 | 3034482 | 547 | 00:00:07 |
| 8 | BUFFER SORT | | | | | |
| 9 | PX RECEIVE | | 50551 | 1162673 | 444 | 00:00:06 |
| 10 | PX SEND HASH | :TQ10001 | 50551 | 1162673 | 444 | 00:00:06 |
| * 11 | TABLE ACCESS FULL | A**_***_****T | 50551 | 1162673 | 444 | 00:00:06 |
| 12 | PX RECEIVE | | 43979 | 2023034 | 94 | 00:00:02 |
| 13 | PX SEND HASH | :TQ10002 | 43979 | 2023034 | 94 | 00:00:02 |
| * 14 | HASH JOIN | | 43979 | 2023034 | 94 | 00:00:02 |
| 15 | BUFFER SORT | | | | | |
| 16 | PX RECEIVE | | 65 | 1040 | 4 | 00:00:01 |
| 17 | PX SEND BROADCAST | :TQ10000 | 65 | 1040 | 4 | 00:00:01 |
| * 18 | TABLE ACCESS FULL | B**_*****T | 65 | 1040 | 4 | 00:00:01 |
| 19 | PX BLOCK ITERATOR | | 51110 | 1533300 | 81 | 00:00:01 |
| 20 | INDEX FAST FULL SCAN | B**_IDX1 | 51110 | 1533300 | 81 | 00:00:01 |
| * 21 | INDEX RANGE SCAN | D***_IDX1 | 1 | 19 | 0 | 00:00:01 |
| * 22 | INDEX UNIQUE SCAN | PK_C***_C***_I*** | 1 | 12 | 0 | 00:00:01 |
----------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
96 recursive calls
0 db block gets
26115 consistent gets
0 physical reads
0 redo size
528 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
32 sorts (memory)
0 sorts (disk)
1 rows processed
- 去掉并行度后:COST:536
COUNT(*)
----------
50997
Elapsed: 00:00:00.21
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 100 | 536 | 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | 100 | | |
| 2 | NESTED LOOPS | | 43978 | 4397800 | 536 | 00:00:07 |
| * 3 | HASH JOIN RIGHT OUTER | | 43978 | 3870064 | 532 | 00:00:07 |
| 4 | INDEX FULL SCAN | D***_IDX1 | 87 | 1653 | 1 | 00:00:01 |
| * 5 | HASH JOIN | | 43978 | 3034482 | 530 | 00:00:07 |
| * 6 | TABLE ACCESS FULL | A**_***_****T | 50551 | 1162673 | 444 | 00:00:06 |
| * 7 | HASH JOIN | | 43979 | 2023034 | 85 | 00:00:02 |
| 8 | VIEW | index$_join$_003 | 65 | 1040 | 3 | 00:00:01 |
| * 9 | HASH JOIN | | | | | |
| 10 | INDEX FAST FULL SCAN | BK_P*****T_PK | 65 | 1040 | 1 | 00:00:01 |
| * 11 | INDEX FAST FULL SCAN | B***_IDX1 | 65 | 1040 | 1 | 00:00:01 |
| 12 | INDEX FAST FULL SCAN | B***_IDX1 | 51110 | 1533300 | 81 | 00:00:01 |
| * 13 | INDEX UNIQUE SCAN | PK_C***_C***_I*** | 1 | 12 | 0 | 00:00:01 |
------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2119 consistent gets
0 physical reads
0 redo size
528 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
查询时间由原来的:00:00:04.46 降至:00:00:00.21 提高了4秒多。
逻辑读:26115 降至:2119
递归调用:96 降至:0
内存排序:32 降至:0
整体的查询效率有了很大的提高。
七、总结
1、os thread startup 等待事件一般与并行相关;
2、在创建索引与重建索引加并行时,索引调整完后,需要关闭并行度;
3、从最后的执行计划来看,并行不一定会使查询变快,可能会起到相反的作用,OLTP数据库慎用并行;
Oracle相关文章推荐
Oracle: | URL |
---|---|
《Oracle 自动收集统计信息机制》 | https://www.modb.pro/db/403670 |
《Oracle_索引重建—优化索引碎片》 | https://www.modb.pro/db/399543 |
《DBA_TAB_MODIFICATIONS表的刷新策略测试》 | https://www.modb.pro/db/414692 |
《FY_Recover_Data.dbf》 | https://www.modb.pro/doc/74682 |
《Oracle RAC 集群迁移文件操作.pdf》 | https://www.modb.pro/doc/72985 |
《Oracle Date 字段索引使用测试.dbf》 | https://www.modb.pro/doc/72521 |
《Oracle 诊断案例 :因应用死循环导致的CPU过高》 | https://www.modb.pro/db/483047 |
《Oracle 慢SQL监控脚本》 | https://www.modb.pro/db/479620 |
《Oracle 慢SQL监控测试及监控脚本.pdf》 | https://www.modb.pro/doc/76068 |
《Oracle 脚本实现简单的审计功能》 | https://www.modb.pro/db/450052 |
《记录一起索引rebuild与收集统计信息的事故》 | https://www.modb.pro/db/408934 |
《RAC DG删除备库redo时报ORA-01623》 | https://www.modb.pro/db/515939 |