作者
digoal
日期
2019-05-21
标签
PostgreSQL , 原生分区表 , ppas , lock , cache , plan , 性能 , pg_pathman
背景
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=428b260f87e8861ba8e58807b69d433db491c4f4
```
Speed up planning when partitions can be pruned at plan time.
Previously, the planner created RangeTblEntry and RelOptInfo structs
for every partition of a partitioned table, even though many of them
might later be deemed uninteresting thanks to partition pruning logic.
This incurred significant overhead when there are many partitions.
Arrange to postpone creation of these data structures until after
we've processed the query enough to identify restriction quals for
the partitioned table, and then apply partition pruning before not
after creation of each partition's data structures. In this way
we need not open the partition relations at all for partitions that
the planner has no real interest in.
For queries that can be proven at plan time to access only a small
number of partitions, this patch improves the practical maximum
number of partitions from under 100 to perhaps a few thousand.
Amit Langote, reviewed at various times by Dilip Kumar, Jesper Pedersen,
Yoshikazu Imai, and David Rowley
Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
```
对比单表、分区表(1024个分区)的查询、upsert性能。
查询性能提升469倍,UPSERT性能提升85倍。
测试详情
一、查询
1、单表1亿
```
drop table test;
create unlogged table test (id int primary key, info text, crt_time timestamp);
insert into test select generate_series(1,100000000), md5(random()::text), clock_timestamp();
```
2、分区表1024分区1亿记录
```
drop table p;
create unlogged table p (id int primary key, info text, crt_time timestamp) partition by hash (id);
do language plpgsql $$
declare
begin
for i in 0..1023 loop
execute format('create unlogged table p_%s partition of p for values WITH (MODULUS 1024, REMAINDER %s)', i, i);
end loop;
end;
$$;
insert into p select generate_series(1,100000000), md5(random()::text), clock_timestamp();
```
3、测试脚本
```
vi test.sql
\set id random(1,100000000)
select * from test where id=:id;
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 120
vi p.sql
\set id random(1,100000000)
select * from p where id=:id;
pgbench -M prepared -n -r -P 1 -f ./p.sql -c 64 -j 64 -T 120
```
二、upsert
1、单表
```
drop table test;
create unlogged table test (id int primary key, info text, crt_time timestamp);
```
2、分区表1024分区
```
drop table p;
create unlogged table p (id int primary key, info text, crt_time timestamp) partition by hash (id);
do language plpgsql $$
declare
begin
for i in 0..1023 loop
execute format('create unlogged table p_%s partition of p for values WITH (MODULUS 1024, REMAINDER %s)', i, i);
end loop;
end;
$$;
```
3、测试脚本
```
vi test.sql
\set id random(1,1000000000)
insert into test values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 120
vi p.sql
\set id random(1,1000000000)
insert into p values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
pgbench -M prepared -n -r -P 1 -f ./p.sql -c 64 -j 64 -T 120
```
三、单表测试结果
查询
1、PG 11
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 139429505
latency average = 0.055 ms
latency stddev = 0.028 ms
tps = 1161617.190409 (including connections establishing)
tps = 1161671.658375 (excluding connections establishing)
statement latencies in milliseconds:
0.001 \set id random(1,100000000)
0.054 select * from test where id=:id;
2、PG 12 beta1
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 139315103
latency average = 0.055 ms
latency stddev = 0.026 ms
tps = 1160834.845959 (including connections establishing)
tps = 1160909.508230 (excluding connections establishing)
statement latencies in milliseconds:
0.001 \set id random(1,100000000)
0.054 select * from test where id=:id;
upsert
1、PG 11
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 38097708
latency average = 0.202 ms
latency stddev = 0.427 ms
tps = 317363.894102 (including connections establishing)
tps = 317379.871026 (excluding connections establishing)
statement latencies in milliseconds:
0.001 \set id random(1,1000000000)
0.201 insert into test values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
2、PG 12 beta1
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 39910689
latency average = 0.192 ms
latency stddev = 0.468 ms
tps = 332535.544935 (including connections establishing)
tps = 332552.090926 (excluding connections establishing)
statement latencies in milliseconds:
0.001 \set id random(1,1000000000)
0.192 insert into test values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
四、分区表测试结果
查询
1、PG 11
transaction type: ./p.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 139751
latency average = 54.965 ms
latency stddev = 29.741 ms
tps = 1163.142894 (including connections establishing)
tps = 1163.219206 (excluding connections establishing)
statement latencies in milliseconds:
0.002 \set id random(1,100000000)
54.962 select * from p where id=:id;
2、PG 12 beta1
transaction type: ./p.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 65520035
latency average = 0.117 ms
latency stddev = 1.828 ms
tps = 545568.561868 (including connections establishing)
tps = 545602.048702 (excluding connections establishing)
statement latencies in milliseconds:
0.001 \set id random(1,100000000)
0.119 select * from p where id=:id;
upsert
1、PG 11
transaction type: ./p.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 346441
latency average = 22.169 ms
latency stddev = 3.068 ms
tps = 2885.225962 (including connections establishing)
tps = 2885.448815 (excluding connections establishing)
statement latencies in milliseconds:
0.002 \set id random(1,1000000000)
22.168 insert into p values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
1、PG 12 beta1
transaction type: ./p.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 120 s
number of transactions actually processed: 29607930
latency average = 0.259 ms
latency stddev = 0.272 ms
tps = 246612.996979 (including connections establishing)
tps = 246627.432651 (excluding connections establishing)
statement latencies in milliseconds:
0.001 \set id random(1,1000000000)
0.258 insert into p values (:id, md5(random()::text), now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
小结
PG 12以前,优化器不管SQL是操作单一分区还是多个分区,统统把所有的分区元数据都创建RangeTblEntry and RelOptInfo,引入较多的OVERHEAD。分区很多时性能影响较大。
case | PG 11 | PG 12 beta1 | 性能提升倍数
---|---|---|---
单表查询 qps | 1161671 | 1160909 | 持平
单表upsert qps | 317379 | 332552 | 持平
分区表(1024分区)查询 qps | 1163 | 545602 | 469倍
分区表(1024分区)upsert qps | 2885 | 246627 | 85倍
《PostgreSQL 12 preview - partitions pruned at plan time》
PostgreSQL 许愿链接
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.