暂无图片
暂无图片
9
暂无图片
暂无图片
暂无图片

SQL优化案例(一)

原创 1001 2024-05-24
1126

问题描述

近日抓到一条Top SQL,描述如下:

1、SQL消耗如下: ELAPSED_TIME CPU_TIME DISK_READS BUFFER_GETS ROWS_PROCESSED EXECUTIONS ------------ ----------- ----------- ---------------- ---------------- ------------ 9,760,345 9,454,331 45 1,755,570,498 2,561 2,722 粗略计算下,单次执行消耗122w个逻辑读; 2、SQL文本如下: SELECT * FROM TABLE1 WHERE COL1 = '0' AND COL2 = :1 AND PARTITION_ID BETWEEN :2 AND :3 AND SYSDATE BETWEEN VALID_DATE AND EXPIRE_DATE AND COL3 = :4; 执行计划如下: --------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | Pstart| Pstop | --------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 9 (100)| | | | |* 1 | FILTER | | | | | | | | |* 2 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| TABLE1 | 1 | 107 | 9 (0)| 00:00:01 | ROWID | ROWID | |* 3 | INDEX RANGE SCAN | IDX_TABLE1_COL2 | 6 | | 4 (0)| 00:00:01 | | | --------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(:3>=:2) 2 - filter(("COL1"='0' AND "PARTITION_ID">=:2 AND "PARTITION_ID"<=:3 AND "COL3"=TO_NUMBER(:4) AND "EXPIRE_DATE">=SYSDATE@! AND "VALID_DATE"<=SYSDATE@!)) 3 - access("COL2"=:1)

该条SQL存在的问题为:执行计划为索引范围扫描,取数据量少,消耗为什么这么高?

排查过程

1、查询表table1的数据情况,如下: table1为按PARTITION_ID进行range分区的分区表,有10个分区,总数据量一亿五千万,每个分区数据量大概在一千五百万左右; 统计信息上次收集时间是在2024-01-10 22:09,对表数据做了count(*),和统计的偏差不超过5%,判断统计信息是准确的。 Table Number Empty Chain Average Global Sample Name of Rows Blocks Blocks Count Row Len Stats Size LAST_ANALYZED ------------------------------ -------------- ---------------- ------------ -------- ------- ------ -------------- -------------------- TABLE1 154,539,714 5,471,467 0 0 231 YES 154,539,714 2024-01-10 22:09 Column Distinct Null Number Number Sample Name DATA_TYPE Values able Density Buckets Nulls AVG_COL_LEN Size LAST_ANALYZED ------------------------------ --------------- -------------- ---- ----------- ------- -------------- ----------- -------------- -------------------- COL1 VARCHAR2 231 N .00000000 231 0 11 154,539,714 2024-01-10 22:09 COL2 VARCHAR2 16,346,112 Y .00000000 254 60,641,059 8 9,874 2024-01-10 22:09 COL3 NUMBER 39,440,384 N .00000003 1 0 10 154,539,714 2024-01-10 22:09 EXPIRE_DATE DATE 762,112 N .00000100 254 0 8 16,315 2024-01-10 22:09 PARTITION_ID NUMBER 10,000 N .00010000 1 0 4 154,539,714 2024-01-10 22:09 VALID_DATE DATE 72,466,432 N .00000000 254 0 8 16,314 2024-01-10 22:09 Index Index Leaf Distinct Number AV Av Cluster Name Type BLV Blks Keys of Rows PAR STATUS LEA Data Factor LAST_ANALYZED ------------------------------ --------------------------- --- ------------ -------------- -------------- --- -------- ---------- ---------- ------------ -------------------- IDX_TABLE1_COL2 NORMAL 3 446,835 16,346,112 93,899,298 NO VALID 1 4 67,805,986 2024-01-10 22:34 IDX_TABLE1_COL3_1 NORMAL 2 1,025,207 84,930,782 154,540,146 YES N/A 1 1 136,067,940 2024-01-10 22:28 Index Column Col Column Name Name Pos Details ------------------------------ ------------------------------ ---- ------------------------ IDX_TABLE1_COL2 COL2 1 VARCHAR2(500) IDX_TABLE1_COL3_1 COL3 1 NUMBER(16,0) NOT NULL COL1 2 VARCHAR2(20) NOT NULL

问题分析

1、SQL中的谓词是同时具有COL1、COL2、COL3的条件,而表中索引包括:COL2的全局索引,(COL3,COL1)的组合本地索引;
2、根据执行计划可知SQL目前走的是COL2的全局索引,进行索引范围扫描后,再进行分区条件过滤。表数据量较大,假定COL2列的数据分布比较均匀,那么一个唯一的COL2列对应表中大约10条数据(154,539,714/16,346,112),IDX_TABLE1_COL2索引选择性一般,但全局索引需要扫描全部数据,所以会在这步消耗大量逻辑读;
3、而(COL3,COL1)的组合本地索引,在数据分布均匀情况下,一对唯一的(COL3,COL1)列对应表中大约2条数据(154,539,714/84,930,782),显然选择性更好,而且SQL中带有分区列条件,使用本地索引,可以直接进行分支修剪,索引扫描范围内选定的分区即可;

分析结论

由以上判断,该条SQL走(COL3,COL1)的组合本地索引性能会更好。
该条SQL走错执行计划的原因是Oracle CBO对COST估算错误,具体原因还需取10053 trace进行分析,为了尽快优化,采取加HINT或构造执行计划绑定的方式皆可。

性能前后对比

  • 优化前SQL性能
-------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | -------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 107 | 9 (0)| 00:00:01 | | | |* 1 | FILTER | | | | | | | | |* 2 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| TABLE1 | 1 | 107 | 9 (0)| 00:00:01 | ROWID | ROWID | |* 3 | INDEX RANGE SCAN | IDX_TABLE1_COL2 | 6 | | 4 (0)| 00:00:01 | | | -------------------------------------------------------------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 1243461 consistent gets 0 physical reads 0 redo size 2006 bytes sent via SQL*Net to client 1499 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
  • 优化后SQL性能
-------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | -------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 107 | 20 (0)| 00:00:01 | | | |* 1 | FILTER | | | | | | | | | 2 | PARTITION RANGE ITERATOR | | 1 | 107 | 20 (0)| 00:00:01 | KEY | KEY | |* 3 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TABLE1 | 1 | 107 | 20 (0)| 00:00:01 | KEY | KEY | |* 4 | INDEX RANGE SCAN | IDX_TABLE1_COL3_1 | 1 | | 19 (0)| 00:00:01 | KEY | KEY | -------------------------------------------------------------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 42 consistent gets 0 physical reads 0 redo size 2006 bytes sent via SQL*Net to client 1536 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed

SQL脚本分享

#sql脚本 tabstat.sql --获取表信息,表列信息等(与统计信息相关) select TABLE_NAME, NUM_ROWS, BLOCKS, EMPTY_BLOCKS, CHAIN_CNT, AVG_ROW_LEN, GLOBAL_STATS, SAMPLE_SIZE, to_char(t.last_analyzed,'YYYY-MM-DD hh24:mi') last_analyzed from all_tables t where owner = upper(nvl('&&Owner',user)) and table_name = upper('&&Table_name') / select COLUMN_NAME,DATA_TYPE, NUM_DISTINCT, NULLABLE, DENSITY, NUM_BUCKETS, NUM_NULLS, AVG_COL_LEN, SAMPLE_SIZE, to_char(t.last_analyzed,'YYYY-MM-DD hh24:mi') last_analyzed from all_tab_cols t where table_name = upper('&Table_name') and owner = upper(nvl('&Owner',user)) order by case when VIRTUAL_COLUMN='NO' then 'A' else 'B' end, COLUMN_NAME /
最后修改时间:2024-05-24 10:26:25
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论