问题描述
近日抓到一条Top SQL,描述如下:
1、SQL消耗如下:
ELAPSED_TIME CPU_TIME DISK_READS BUFFER_GETS ROWS_PROCESSED EXECUTIONS
------------ ----------- ----------- ---------------- ---------------- ------------
9,760,345 9,454,331 45 1,755,570,498 2,561 2,722
粗略计算下,单次执行消耗122w个逻辑读;
2、SQL文本如下:
SELECT * FROM TABLE1 WHERE COL1 = '0' AND COL2 = :1 AND PARTITION_ID BETWEEN :2 AND :3 AND SYSDATE BETWEEN VALID_DATE AND EXPIRE_DATE AND COL3 = :4;
执行计划如下:
---------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 9 (100)| | | |
|* 1 | FILTER | | | | | | | |
|* 2 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| TABLE1 | 1 | 107 | 9 (0)| 00:00:01 | ROWID | ROWID |
|* 3 | INDEX RANGE SCAN | IDX_TABLE1_COL2 | 6 | | 4 (0)| 00:00:01 | | |
---------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(:3>=:2)
2 - filter(("COL1"='0' AND "PARTITION_ID">=:2 AND "PARTITION_ID"<=:3 AND
"COL3"=TO_NUMBER(:4) AND "EXPIRE_DATE">=SYSDATE@! AND "VALID_DATE"<=SYSDATE@!))
3 - access("COL2"=:1)
该条SQL存在的问题为:执行计划为索引范围扫描,取数据量少,消耗为什么这么高?
排查过程
1、查询表table1的数据情况,如下:
table1为按PARTITION_ID进行range分区的分区表,有10个分区,总数据量一亿五千万,每个分区数据量大概在一千五百万左右;
统计信息上次收集时间是在2024-01-10 22:09,对表数据做了count(*),和统计的偏差不超过5%,判断统计信息是准确的。
Table Number Empty Chain Average Global Sample
Name of Rows Blocks Blocks Count Row Len Stats Size LAST_ANALYZED
------------------------------ -------------- ---------------- ------------ -------- ------- ------ -------------- --------------------
TABLE1 154,539,714 5,471,467 0 0 231 YES 154,539,714 2024-01-10 22:09
Column Distinct Null Number Number Sample
Name DATA_TYPE Values able Density Buckets Nulls AVG_COL_LEN Size LAST_ANALYZED
------------------------------ --------------- -------------- ---- ----------- ------- -------------- ----------- -------------- --------------------
COL1 VARCHAR2 231 N .00000000 231 0 11 154,539,714 2024-01-10 22:09
COL2 VARCHAR2 16,346,112 Y .00000000 254 60,641,059 8 9,874 2024-01-10 22:09
COL3 NUMBER 39,440,384 N .00000003 1 0 10 154,539,714 2024-01-10 22:09
EXPIRE_DATE DATE 762,112 N .00000100 254 0 8 16,315 2024-01-10 22:09
PARTITION_ID NUMBER 10,000 N .00010000 1 0 4 154,539,714 2024-01-10 22:09
VALID_DATE DATE 72,466,432 N .00000000 254 0 8 16,314 2024-01-10 22:09
Index Index Leaf Distinct Number AV Av Cluster
Name Type BLV Blks Keys of Rows PAR STATUS LEA Data Factor LAST_ANALYZED
------------------------------ --------------------------- --- ------------ -------------- -------------- --- -------- ---------- ---------- ------------ --------------------
IDX_TABLE1_COL2 NORMAL 3 446,835 16,346,112 93,899,298 NO VALID 1 4 67,805,986 2024-01-10 22:34
IDX_TABLE1_COL3_1 NORMAL 2 1,025,207 84,930,782 154,540,146 YES N/A 1 1 136,067,940 2024-01-10 22:28
Index Column Col Column
Name Name Pos Details
------------------------------ ------------------------------ ---- ------------------------
IDX_TABLE1_COL2 COL2 1 VARCHAR2(500)
IDX_TABLE1_COL3_1 COL3 1 NUMBER(16,0) NOT NULL
COL1 2 VARCHAR2(20) NOT NULL
问题分析
1、SQL中的谓词是同时具有COL1、COL2、COL3的条件,而表中索引包括:COL2的全局索引,(COL3,COL1)的组合本地索引;
2、根据执行计划可知SQL目前走的是COL2的全局索引,进行索引范围扫描后,再进行分区条件过滤。表数据量较大,假定COL2列的数据分布比较均匀,那么一个唯一的COL2列对应表中大约10条数据(154,539,714/16,346,112),IDX_TABLE1_COL2索引选择性一般,但全局索引需要扫描全部数据,所以会在这步消耗大量逻辑读;
3、而(COL3,COL1)的组合本地索引,在数据分布均匀情况下,一对唯一的(COL3,COL1)列对应表中大约2条数据(154,539,714/84,930,782),显然选择性更好,而且SQL中带有分区列条件,使用本地索引,可以直接进行分支修剪,索引扫描范围内选定的分区即可;
分析结论
由以上判断,该条SQL走(COL3,COL1)的组合本地索引性能会更好。
该条SQL走错执行计划的原因是Oracle CBO对COST估算错误,具体原因还需取10053 trace进行分析,为了尽快优化,采取加HINT或构造执行计划绑定的方式皆可。
性能前后对比
- 优化前SQL性能
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 107 | 9 (0)| 00:00:01 | | |
|* 1 | FILTER | | | | | | | |
|* 2 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| TABLE1 | 1 | 107 | 9 (0)| 00:00:01 | ROWID | ROWID |
|* 3 | INDEX RANGE SCAN | IDX_TABLE1_COL2 | 6 | | 4 (0)| 00:00:01 | | |
--------------------------------------------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1243461 consistent gets
0 physical reads
0 redo size
2006 bytes sent via SQL*Net to client
1499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
- 优化后SQL性能
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 107 | 20 (0)| 00:00:01 | | |
|* 1 | FILTER | | | | | | | |
| 2 | PARTITION RANGE ITERATOR | | 1 | 107 | 20 (0)| 00:00:01 | KEY | KEY |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TABLE1 | 1 | 107 | 20 (0)| 00:00:01 | KEY | KEY |
|* 4 | INDEX RANGE SCAN | IDX_TABLE1_COL3_1 | 1 | | 19 (0)| 00:00:01 | KEY | KEY |
--------------------------------------------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
42 consistent gets
0 physical reads
0 redo size
2006 bytes sent via SQL*Net to client
1536 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL脚本分享
#sql脚本 tabstat.sql
--获取表信息,表列信息等(与统计信息相关)
select
TABLE_NAME,
NUM_ROWS,
BLOCKS,
EMPTY_BLOCKS,
CHAIN_CNT,
AVG_ROW_LEN,
GLOBAL_STATS,
SAMPLE_SIZE,
to_char(t.last_analyzed,'YYYY-MM-DD hh24:mi') last_analyzed
from all_tables t
where
owner = upper(nvl('&&Owner',user))
and table_name = upper('&&Table_name')
/
select
COLUMN_NAME,DATA_TYPE,
NUM_DISTINCT,
NULLABLE,
DENSITY,
NUM_BUCKETS,
NUM_NULLS,
AVG_COL_LEN,
SAMPLE_SIZE,
to_char(t.last_analyzed,'YYYY-MM-DD hh24:mi') last_analyzed
from all_tab_cols t
where
table_name = upper('&Table_name')
and owner = upper(nvl('&Owner',user))
order by case when VIRTUAL_COLUMN='NO' then 'A' else 'B' end, COLUMN_NAME
/
最后修改时间:2024-05-24 10:26:25
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。