背景:
SELECTROUND( SUM (COALESCE(station_month.maintain_labor_cost,0) + ............COALESCE(station_month.employee_compensation,0)), 2) AS totalCost,ROUND( SUM ( info.ast_original_value ), 2 ) AS totalAstValueFROMpowerX.cost_XXXXX_pl_info infoINNER JOIN powerX.cost_XXXXX_pl_month station_month ON info.psr_id = station_month.psr_idAND info.ast_id = station_month.ast_idwhere info.professional_type = '01'AND info.equip_type = 'xl'AND station_month.sun_year_month in ( '202301', '202302','202303','202304' )
Aggregate (cost=4938.02..4938.03 rows=1 width=64) (actual time=85.167..85.169 rows=1 loops=1)Buffers: shared hit=44893-> Nested Loop (cost=13.38..4937.97 rows=1 width=126) (actual time=0.720..82.459 rows=1343 loops=1)Buffers: shared hit=44893-> Bitmap Heap Scan on cost_XXXXX_pl_info info (cost=12.96..1420.40 rows=443 width=78) (actual time=0.695..2.500 rows=7013 loops=1)Recheck Cond: (((equip_type)::text = 'xl'::text) AND ((professional_type)::text = '01'::text))Heap Blocks: exact=764Buffers: shared hit=786-> Bitmap Index Scan on idx_cost_XXXXX_pl_info_equip_type_professional (cost=0.00..12.85 rows=443 width=0) (actual time=0.614..0.614 rows=7013 loops=1)Index Cond: (((equip_type)::text = 'xl'::text) AND ((professional_type)::text = '01'::text))Buffers: shared hit=22-> Index Scan using idx_cost_XXXXX_pl_month_idx_prsid_astid on cost_XXXXX_pl_month station_month (cost=0.42..7.93 rows=1 width=176) (actual time=0.011..0.011 rows=0 loops=7013)Index Cond: (((psr_id)::text = (info.psr_id)::text) AND ((ast_id)::text = (info.ast_id)::text))Filter: ((sun_year_month)::text = ANY ('{202301,202302,202303,202304}'::text[]))Rows Removed by Filter: 3Buffers: shared hit=44107Planning Time: 0.667 ms
发现“隐患”
Bitmap Heap Scan on cost_XXXXX_pl_info info (cost=12.96..1420.40 rows=443 width=78) (actual time=0.695..2.500 rows=7013 loops=1)Recheck Cond: (((equip_type)::text = 'xl'::text) AND ((professional_type)::text = '01'::text))Heap Blocks: exact=764Buffers: shared hit=786
其中 rows=443 VS rows=7013,隐藏比较深。毕竟就是2个数字很难直接定位到。其实源SQL执行计划更长,需要再一屏甚至更长的执行计划中发现隐患还是比较困难的。
疑问:为什么估算的rows会差异如此之多?是不是本次统计信息过期导致rows评估偏差过多?经过排查后发现统计信息并没有过期。根据SQL条件测试分别单独测试两个条件:
(((equip_type)::text = 'xl'::text) AND ((professional_type)::text = '01'::text))
执行计划1
Bitmap Heap Scan on cost_XXXXX_pl_info info (cost=129.15..7439.87 rows=6804 width=723) (actual time=219.819..221.361 rows=7013 loops=1)Recheck Cond: ((equip_type)::text = 'xl'::text)Heap Blocks: exact=764Buffers: shared hit=764 read=22I/O Timings: read=219.260-> Bitmap Index Scan on idx_cost_XXXXX_pl_info_equip_type_professional (cost=0.00..127.45 rows=6804 width=0) (actual time=219.735..219.736 rows=7013 loops=1)Index Cond: ((equip_type)::text = 'xl'::text)Buffers: shared read=22I/O Timings: read=219.260Planning Time: 0.126 msExecution Time: 221.680 ms
Seq Scan on cost_XXXXX_pl_info info (cost=0.00..8328.08 rows=6804 width=723) (actual time=0.007..22.252 rows=7014 loops=1)Filter: ((professional_type)::text = '01'::text)Rows Removed by Filter: 97392Buffers: shared hit=7023Planning Time: 0.117 msExecution Time: 22.569 ms
发现估算的rows 精确度都比较高,为啥两个一起组合rows偏差如此之大呢?
with taba as (select schemaname, tablename, attname, null_frac, array_to_string( most_common_vals ,',') most_common_vals ,array_to_string( most_common_freqs ,',') most_common_freqsfrom pg_stats where tablename = 'cost_XXXXX_pl_info' and attname in( 'professional_type','equip_type')) select schemaname, tablename, t.reltuples, attname, null_frac,regexp_split_to_table(most_common_vals, ',') vals, regexp_split_to_table(most_common_freqs, ',') cardinality,round( regexp_split_to_table(most_common_freqs, ',')::float * (t.reltuples - null_frac)) rowssfrom tabainner join pg_class t on taba.tablename = t.relname and t.relnamespace::regnamespace::text = taba.schemaname;
查询结果:
Schemaname | tablename |reltuples |attname | null_frac | vals | most_common_freqs |rowsspowerX cost_XXXXX_pl_info 104406 equip_type 0 dytq 0.7334 76571powerX cost_XXXXX_pl_info 104406 equip_type 0 dkx 0.1727 18031powerX cost_XXXXX_pl_info 104406 equip_type 0 xl 0.06516667 6804powerX cost_XXXXX_pl_info 104406 equip_type 0 zf01 0.019 1984powerX cost_XXXXX_pl_info 104406 equip_type 0 zf06 0.0059666666 623powerX cost_XXXXX_pl_info 104406 equip_type 0 zf08 0.0025 261powerX cost_XXXXX_pl_info 104406 equip_type 0 zf04 0.0010333334 108powerX cost_XXXXX_pl_info 104406 professional_type 0 03 0.91566664 95601powerX cost_XXXXX_pl_info 104406 professional_type 0 01 0.06516667 6804powerX cost_XXXXX_pl_info 104406 professional_type 0 02 0.0191 1994
而结果 6804 正是单一条件的执行计划中 估算rows。
计算方式:(reltuples - null_frac ) * most_common_freqs
104406 * 0.06516667 = 6804
同理两个条件的rows估算逻辑
104406 * 0.06516667 * 0.06516667 = 443 正是两个过滤条件的rows
Bitmap Heap Scan on cost_XXXXX_pl_info info (cost=12.96..1420.40 rows=443 width=78) (actual time=0.695..2.500 rows=7013 loops=1)Recheck Cond: (((equip_type)::text = 'xl'::text) AND ((professional_type)::text = '01'::text))Heap Blocks: exact=764Buffers: shared hit=786
优化方式:
如何纠正rows评估差异问题?根据官方文档通过创建扩展统计信息 创建扩展统计信息:
create statistics stat_p_e ( dependencies ) on professional_type, equip_type from powerX.cost_XXXXX_pl_info;
收集统计信息:Analyse powerX.cost_XXXXX_pl_info;
查看两个过滤条件的执行计划:之后发现执行计划符合预期。
---- 正确的执行计划Bitmap Heap Scan on cost_stationline_pl_info info (cost=146.09..7472.62 rows=6797 width=723) (actual time=0.730..2.339 rows=7013 loops=1)Recheck Cond: (((equip_type)::text = 'xl'::text) AND ((professional_type)::text = '01'::text))Heap Blocks: exact=764Buffers: shared hit=786-> Bitmap Index Scan on idx_cost_stationline_pl_info_equip_type_professional (cost=0.00..144.39 rows=6797 width=0) (actual time=0.648..0.649 rows=7013 loops=1)Index Cond: (((equip_type)::text = 'xl'::text) AND ((professional_type)::text = '01'::text))Buffers: shared hit=22Planning Time: 0.144 msExecution Time: 2.657 ms
优化成果: Hash Join
Aggregate (cost=14662.17..14662.18 rows=1 width=64) (actual time=16.336..16.339 rows=1 loops=1)Buffers: shared hit=1236-> Hash Join (cost=7574.99..14662.12 rows=1 width=126) (actual time=11.282..13.865 rows=1343 loops=1)Hash Cond: (((station_month.psr_id)::text = (info.psr_id)::text) AND ((station_month.ast_id)::text = (info.ast_id)::text))Buffers: shared hit=1236-> Index Scan using cost_stationline_pl_month_idx_sun_year_month on cost_stationline_pl_month station_month (cost=0.42..7070.73 rows=3202 width=176) (actual time=0.063..1.184 rows=3235 loops=1)Index Cond: ((sun_year_month)::text = ANY ('{202301,202302,202303,202304}'::text[]))Buffers: shared hit=450I/O Timings: read=290.886-> Hash (cost=7472.62..7472.62 rows=6797 width=78) (actual time=11.190..11.191 rows=7013 loops=1)Buckets: 8192 Batches: 1 Memory Usage: 620kBBuffers: shared hit=786-> Bitmap Heap Scan on cost_stationline_pl_info info (cost=146.09..7472.62 rows=6797 width=78) (actual time=2.371..8.550 rows=7013 loops=1)Recheck Cond: (((equip_type)::text = 'xl'::text) AND ((professional_type)::text = '01'::text))Heap Blocks: exact=764Buffers: shared hit=786-> Bitmap Index Scan on idx_cost_stationline_pl_info_equip_type_professional (cost=0.00..144.39 rows=6797 width=0) (actual time=0.615..0.615 rows=7013 loops=1)Index Cond: (((equip_type)::text = 'xl'::text) AND ((professional_type)::text = '01'::text))Buffers: shared hit=22Planning Time: 1.853 msExecution Time: 16.453 ms
效果对比:
优化前:(cost=12.96..1420.40 rows=443 width=78) (actual time=0.695..2.500 rows=7013 loops=1)
优化后:(cost=0.00..144.39 rows=6797 width=0) (actual time=0.615..0.615 rows=7013 loops=1)
优化前:执行计划:Nested Loop
优化后:执行计划:Hash Join
优化前:执行成本: shared hit=44893 85ms
优化后:执行计划: shared hit=1236 16ms
Rows评估 | 执行计划 | 执行成本 | |
优化前 | (cost=12.96..1420.40 rows=443 width=78) (actual time=0.695..2.500 rows=7013 loops=1) | Nested Loop | shared hit=44893 85ms |
优化后 | (cost=0.00..144.39 rows=6797 width=0) (actual time=0.615..0.615 rows=7013 loops=1) | Hash join | shared hit=1236 16ms |




