【每日分享】2023.02.13 Bitmap scan VS. Index Scan

原创 Maleah 2023-02-13

635

PG Bitmap scan 和 Index Scan 区别

位图扫描：https://www.modb.pro/db/611186

索引：https://www.modb.pro/db/405851

	Index Scan	Bitmap Index Scan
前提	索引	索引
扫描过程	一次取一条数据	一次性的全部取出
需注意的问题	随机I/O	recheck的花销
适用场景		离散度非常高，并且离散IO的成本很高的场景；And/Or

index scan: index -> ctid (heap tuple id) -> read heap tuple

索引扫描：找出匹配的ctid，读取heap表
bitmap scan: index -> heap blockid -> heap block scan -> recheck (filter tuple)

位图扫描：bitmap index scan阶段取得符合条件的所有行号, 获得对应的block id，构建位图；bitmap heap scan阶段根据bitmap扫描block id顺序从heap 表搜索数据，最后recheck, 根据查询条件再做一次 recheck , 过滤放大的无效记录.

bitmap scan的作用就是通过建立位图的方式，将回表过程中的访问随机IO的转换为顺序扫描行为，从而减少查询过程中IO的消耗。但是引入了recheck。

适用场景：

- I thought "Bitmap Index Scan" was only used when there are two or more applicable indexes in the plan, so I don't understand why is it used now?

True, we can combine multiple bitmaps via AND/OR operations to merge results from multiple indexes before visiting the table ... but it's still potentially worthwhile even for one index.  A rule of thumb is that plain indexscan wins for fetching a small number of tuples, bitmap scan wins for a somewhat larger number of tuples, and seqscan wins if you're fetching a large percentage of the whole table.

👆引用自： https://www.postgresql.org/message-id/12553.1135634231@sss.pgh.pa.us

pg_stat_all_indexes的idx_tup_read和idx_tup_fetch字段：

idx_tup_read：从指定索引的leap page获取tid的条数（垃圾版本计算）

idx_tup_fetch：从索引的tid获取到的有效HEAP tuple的记数(垃圾版本不算)（每扫一条tuple计1）

bitmap scan不计算idx_tup_fetch。因为不是从leaf page的tid去查询heap tuple获取的，而是从HEAP BLOCK id去查询heap block然后再过滤出有效tuple的

👆摘自于：https://github.com/digoal/blog/blob/master/201610/20161018_03.md

墨力计划 pg

最后修改时间：2023-02-13 18:51:40

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

文章被以下合辑收录

PG每日分享（共22篇）

PostgreSQL 每日分享

【每日分享】2023.02.13 Bitmap scan VS. Index Scan

PG Bitmap scan 和 Index Scan 区别

文章被以下合辑收录

评论