概述
使用过ORACLE的都知道,当SQL查询的字段、条件字段都在INDEX中可以找到时,SQL执行过程可以不用回表,大大提高查询效率。
postgresql中也有这样的机制,但有所不同。
由于postgresql index中不保存事物信息,因此postgresql 中的仅仅通过INDEX不能判断数据可见性。
postgresql 中的 INDEX ONLY SCAN,扫描的index tuple需要通过VM去判断可见性,当发现扫描的tuple所在的heap block 有dead tuple时,需要回表扫描来判断事务可见性。 INDEX ONLY SCAN并不是仅仅字面上的只通过INDEX查找数据。
VM是用BIT位标识page中是否有dead tuple的,占用空间小,判断可见性时效率很高。
实验记录
test=# prepare p(varchar(20)) as select name,setting,reset_val from pg_settings where name like '%'||$1||'%'; PREPARE test=# execute p('scan'); name | setting | reset_val ------------------------------+---------+----------- enable_bitmapscan | on | on enable_indexonlyscan | on | on enable_indexscan | on | on enable_seqscan | on | on enable_tidscan | on | on min_parallel_index_scan_size | 64 | 64 min_parallel_table_scan_size | 1024 | 1024 synchronize_seqscans | on | on (8 rows) test=# create extension pageinspect; CREATE EXTENSION create table tab4 (id int,s varchar(100)); create index idx_tab4 on tab4(id); insert into tab4 select generate_series,'a' from generate_series(1,400); select substring(ctid::text from '\(([0-9]+),[0-9]+\)')as page,min(id),max(id) from tab4 group by substring(ctid::text from '\(([0-9]+),[0-9]+\)'); page | min | max ------+-----+----- 0 | 1 | 226 1 | 227 | 400 test=# select *from get_raw_page('tab4','vm',0) ; ERROR: could not open file "base/16447/54036_vm": No such file or directory vacuum tab4; select substr(get_raw_page,1,100) from get_raw_page('tab4','vm',0) ; ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ \x18000000b8e3de860000000018000020002004200000000005000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 (1 row) test=# explain analyze select count(id) from tab4 where id between 100 and 300; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=12.65..12.66 rows=1 width=8) (actual time=0.066..0.066 rows=1 loops=1) -> Index Only Scan using idx_tab4 on tab4 (cost=0.15..12.15 rows=200 width=4) (actual time=0.027..0.044 rows=201 loops=1) Index Cond: ((id >= 100) AND (id <= 300)) Heap Fetches: 0 <<< Planning time: 0.126 ms Execution time: 0.101 ms (6 rows) delete from tab4 where id=110; select substr(get_raw_page,1,100) from get_raw_page('tab4','vm',0) ; ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ \x1800000028ffb3870000000018000020002004200000000004000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 (1 row) test=# explain analyze select count(id) from tab4 where id between 100 and 300; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=12.65..12.66 rows=1 width=8) (actual time=0.102..0.102 rows=1 loops=1) -> Index Only Scan using idx_tab4 on tab4 (cost=0.15..12.15 rows=200 width=4) (actual time=0.051..0.079 rows=200 loops=1) Index Cond: ((id >= 100) AND (id <= 300)) Heap Fetches: 127 <<< 226-99=127 Planning time: 0.092 ms Execution time: 0.136 ms (6 rows) delete from tab4 where id=250; select substr(get_raw_page,1,100) from get_raw_page('tab4','vm',0) ; ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ \x1800000028ffb3870000000018000020002004200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 (1 row) test=# explain analyze select count(id) from tab4 where id between 100 and 300; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=12.65..12.66 rows=1 width=8) (actual time=0.096..0.096 rows=1 loops=1) -> Index Only Scan using idx_tab4 on tab4 (cost=0.15..12.15 rows=200 width=4) (actual time=0.028..0.073 rows=199 loops=1) Index Cond: ((id >= 100) AND (id <= 300)) Heap Fetches: 200 <<<<< Planning time: 0.089 ms Execution time: 0.141 ms (6 rows) vacuum tab4; select substr(get_raw_page,1,100) from get_raw_page('tab4','vm',0) ; ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ \x1800000060185a880000000018000020002004200000000005000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 (1 row) select '0 no delete' as t,x'05'::bit(8) union select '1 delete id=110' as t,x'04'::bit(8) union select '2 delete id=250' as t,x'00'::bit(8); t | bit -----------------+---------- 0 no delete | 00000101 1 delete id=110 | 00000100 2 delete id=250 | 00000000 (3 rows) /* Number of bits for one heap page */ #define BITS_PER_HEAPBLOCK 2 /* Flags for bit map */ #define VISIBILITYMAP_ALL_VISIBLE 0x01 #define VISIBILITYMAP_ALL_FROZEN 0x02 #define VISIBILITYMAP_VALID_BITS 0x03 /* OR of all valid visibilitymap * flags bits */
复制
实验结论
1、 vacuum 清理dead tuple更新VM可见性,同步清理了index中的dead tuple指针
2、 dml时会更新VM,置heap page存在dead tuple的标志位
3、 初步推测VM里面用2bit标识一个heap page里是否有dead tuple;
最后修改时间:2021-11-24 15:02:54
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。
评论
您好,您的文章已入选合格奖,10墨值奖励已经到账请查收!
❤️我们还会实时派发您的流量收益。
3年前

评论
相关阅读
外国CTO也感兴趣的开源数据库项目——openHalo
小满未满、
1650次阅读
2025-04-21 16:58:09
9.9 分高危漏洞,尽快升级到 pgAdmin 4 v9.2 进行修复
严少安
386次阅读
2025-04-11 10:43:23
3月“墨力原创作者计划”获奖名单公布
墨天轮编辑部
372次阅读
2025-04-15 14:48:05
转发有奖 | PostgreSQL 16 PGCM高级认证课程直播班招生中!
墨天轮小教习
184次阅读
2025-04-14 15:58:34
中国PostgreSQL培训认证体系新增PGAI应用工程师方向
开源软件联盟PostgreSQL分会
182次阅读
2025-05-06 10:21:13
PG生态赢得资本市场青睐:Databricks收购Neon,Supabase融资两亿美元,微软财报点名PG
老冯云数
142次阅读
2025-05-07 10:06:22
SQL 优化之 OR 子句改写
xiongcc
136次阅读
2025-04-21 00:08:06
告别老旧mysql_fdw,升级正当时
NickYoung
121次阅读
2025-04-29 11:15:18
PostgreSQL中文社区亮相于第八届数字中国峰会
PostgreSQL中文社区
105次阅读
2025-05-07 10:06:20
PostgreSQL的dblink扩展模块使用方法
szrsu
103次阅读
2025-04-24 17:39:30