
1.背景
今天大家伙讨论问题比较热闹,看到一位朋友提出一个问题,说整片删完数据的空间vacuum之后不能有效复用。感觉蛮有意思。于是几个人纷纷尝试着重现,都没有重现或者完全重现出来。
具体问题如下图: (这里直接引用原图)

2.复现及分析
仔细小心追问下,得知提问题的朋友使用的是一个相对低版本的PostgreSQL,用的是PG12.8。难怪大家都难以重现。我们几个人试的都是14.X。
2.1 先看14.X中的实验结果:
postgres=# select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 14.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
(1 row)
postgres=# create table t(id int primary key, col2 varchar(32));
CREATE TABLE
postgres=# insert into t select n, 'test' || n from generate_series(1, 200000) as n;
INSERT 0 200000
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
4513792 | 13402112 | 8863744
(1 row)
postgres=# delete from t where id < 190001;
DELETE 190000
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
4513792 | 13402112 | 8863744
(1 row)
postgres=# vacuum verbose t;
INFO: vacuuming "public.t"
INFO: index "t_pkey" now contains 10000 row versions in 551 pages
DETAIL: 0 index row versions were removed.
0 index pages were newly deleted.
520 index pages are currently deleted, of which 520 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: table "t": found 0 removable, 15 nonremovable row versions in 1 out of 1082 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 401209
Skipped 0 pages due to buffer pins, 1027 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
4538368 | 13434880 | 8863744
(1 row)
postgres=# insert into t select n, 'test' || n from generate_series(1, 190000) as n;
INSERT 0 190000
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
7962624 | 16859136 | 8863744
(1 row)
postgres=# vacuum verbose t;
INFO: vacuuming "public.t"
INFO: table "t": found 0 removable, 15 nonremovable row versions in 1 out of 1082 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 401211
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
7962624 | 16859136 | 8863744
(1 row)
postgres=# insert into t select n, 'test' || n from generate_series(200001, 210000) as n;
INSERT 0 10000
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
8192000 | 17530880 | 9306112
(1 row)
如果我们细心一点会发现,relation的空间是可以复用的。它前后没什么变化。但是index所占空间还是有一些增长。索引空间从:4513792->4538368(vacuum后) -> 7962624。
2.2 PostgreSQL-12.14中的验证复现过程:
postgres=# select version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 12.14 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
(1 row)
postgres=# create table t(id int primary key, col2 varchar(32));
CREATE TABLE
postgres=# insert into t select n, 'test' || n from generate_series(1, 200000) as n;
INSERT 0 200000
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
4513792 | 13402112 | 8863744
(1 row)
postgres=# delete from t where id < 190001;
DELETE 190000
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
4513792 | 13402112 | 8863744
(1 row)
能看到,delete前边19万条记录之扣,表的空间占用情况并没有改变。在做了下这的vacuum之后,依然如此。这个也基本符合预期。然后再看看后续的insert操作,索引空间在涨,而表本身的占用是能复用的 (relation size全程没有变化)。
postgres=# vacuum verbose t;
INFO: vacuuming "public.t"
INFO: scanned index "t_pkey" to remove 190000 row versions
DETAIL: CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s
INFO: "t": removed 190000 row versions in 1028 pages
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: index "t_pkey" now contains 10000 row versions in 551 pages
DETAIL: 190000 index row versions were removed.
520 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "t": found 190000 removable, 10000 nonremovable row versions in 1082 out of 1082 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 505
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.03 s, system: 0.00 s, elapsed: 0.03 s.
VACUUM
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
4513792 | 13410304 | 8863744
(1 row)
postgres=# insert into t select n, 'test' || n from generate_series(1, 190000) as n;
INSERT 0 190000
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
12197888 | 21094400 | 8863744
(1 row)
postgres=# insert into t select n, 'test' || n from generate_series(200001, 210000) as n;
INSERT 0 10000
postgres=# select pg_indexes_size('t') index_size, pg_total_relation_size('t') as total, pg_relation_size('t') as relation;
index_size | total | relation
------------+----------+----------
12427264 | 21766144 | 9306112
(1 row)

如果有遗忘,total_relation_size, index_size, relation_size的关系可以参考上图。
在重新插入数据以后,索引空间从: 4513792-> 4513792(vacuum事)-> 12197888.
对比PG14, 索引空间从:4513792->4538368(vacuum后) -> 7962624。PG14里,效果明显比12要好很多。
但至少有一点可以肯定的是,纯数据行的表所占的空间确实是能够重复利用的。问题出在索引所占的空间那块儿。
cc大侠解释说是"准确来说,应该是13以后,索引的de-duplicate才开始像样。" 总之,上述现象的根源全在于索引空间的重复利用上。
感觉这个蛮有意思的。可以等cc后续发文详细解释其中的原理,期待中......
最后看来,在生产环境,周期性的使用pg_repack等进行完全回收还是很有意义的。
参考
有关表膨胀的相关话题,可以参考 PostgreSQL学徒 里的大量文章。





