Range-Partitioned Tables in PostgreSQL

IT那活儿 2022-08-03


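PostgreSQL is being adopted more and more widely. Declarative partitioning, available since version 10, replaced the older table-inheritance approach, and partitioned-table performance improved considerably in version 12, so partitions have become much friendlier to work with. Below are some notes from using partitioned tables.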

Range partitioning, commonly used for monthly tables

Creating the table
CREATE TABLE yxptest (
    id serial ,
    peaktemp int,
    logdate date not null
) PARTITION BY RANGE (logdate);
CREATE TABLE yxptest_p202201 PARTITION OF yxptest FOR VALUES FROM ('2022-01-01') TO ('2022-02-01');
CREATE TABLE yxptest_p202202 PARTITION OF yxptest FOR VALUES FROM ('2022-02-01') TO ('2022-03-01');
-- ... partitions yxptest_p202203 through yxptest_p202210 follow the same pattern ...
CREATE TABLE yxptest_p202211 PARTITION OF yxptest FOR VALUES FROM ('2022-11-01') TO ('2022-12-01');
CREATE TABLE yxptest_p202212 PARTITION OF yxptest FOR VALUES FROM ('2022-12-01') TO ('2023-01-01');
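
Writing twelve nearly identical statements by hand is error-prone, so the monthly partitions can also be generated in a loop. The following is a minimal sketch, assuming the parent table yxptest above already exists; the variable name month_start and the use of IF NOT EXISTS are illustrative choices, not part of the original setup.

```sql
-- Sketch: create the twelve monthly partitions for 2022 in a loop.
DO $$
DECLARE
    month_start date := date '2022-01-01';  -- hypothetical helper variable
BEGIN
    FOR i IN 0..11 LOOP
        -- %I quotes the partition name as an identifier, %L quotes the bounds as literals.
        EXECUTE format(
            'CREATE TABLE IF NOT EXISTS %I PARTITION OF yxptest FOR VALUES FROM (%L) TO (%L)',
            'yxptest_p' || to_char(month_start, 'YYYYMM'),
            month_start,
            (month_start + interval '1 month')::date
        );
        month_start := (month_start + interval '1 month')::date;
    END LOOP;
END
$$;
```

The upper bound of each range is exclusive, which is why every partition ends on the first day of the following month.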

Generating test data
insert into yxptest (peaktemp,logdate)
select round(100000000*random()),generate_series('2022-01-01'::date,'2022-12-31'::date,'1 minute');
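
To confirm that the generated rows were routed to the expected partitions, the row count per partition can be checked. This is a small sketch that relies only on the system column tableoid, which identifies the partition each row is physically stored in; the column aliases are illustrative.

```sql
-- Sketch: count rows per partition of yxptest, in date order.
SELECT tableoid::regclass AS partition_name,
       count(*)           AS row_count
FROM yxptest
GROUP BY tableoid
ORDER BY min(logdate);
```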

Problem: a primary key on a partitioned table must include the partition key; otherwise the following error is raised:

testyxp=# ALTER TABLE yxptest ADD PRIMARY KEY (id);
ERROR: insufficient columns in PRIMARY KEY constraint definition
DETAIL: PRIMARY KEY constraint on table "yxptest" lacks column "logdate" which is part of the partition key.
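
The fix is to include the partition key column in the constraint. The statement below is a sketch of that; it matches the primary key (id, logdate) that appears in the \d output later in this article, although the article does not show the exact command that was run.

```sql
-- Sketch: the primary key must contain the partition key column logdate.
ALTER TABLE yxptest ADD PRIMARY KEY (id, logdate);
```

Note that this makes the pair (id, logdate) unique, not id alone.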

Changes to partitioned indexes since PostgreSQL 11

testyxp=# create index idx_1_yxptest on yxptest (peaktemp);
CREATE INDEX
testyxp=# \d yxptest
Partitioned table "public.yxptest"
Column | Type | Collation | Nullable |               Default
----------+---------+-----------+----------+-------------------------------------
id | integer |           | not null | nextval('yxptest_id_seq'::regclass)
peaktemp | integer |           | |
logdate | date |           | not null |
Partition key: RANGE (logdate)
Indexes:
"yxptest_pkey" PRIMARY KEY, btree (id, logdate)
"idx_1_yxptest" btree (peaktemp)
Number of partitions: 12 (Use \d+ to list them.)

testyxp=# \d yxptest_p202211
Table "public.yxptest_p202211"
Column | Type | Collation | Nullable |               Default
----------+---------+-----------+----------+-------------------------------------
id | integer |           | not null | nextval('yxptest_id_seq'::regclass)
peaktemp | integer |           | |
logdate | date |           | not null |
Partition of: yxptest FOR VALUES FROM ('2022-11-01') TO ('2022-12-01')
Indexes:
"yxptest_p202211_pkey" PRIMARY KEY, btree (id, logdate)
"yxptest_p202211_peaktemp_idx" btree (peaktemp)

testyxp=# CREATE TABLE yxptest_p202301 PARTITION OF yxptest FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
CREATE TABLE
testyxp=# \d yxptest_p202301
Table "public.yxptest_p202301"
Column | Type | Collation | Nullable |               Default
----------+---------+-----------+----------+-------------------------------------
id | integer |           | not null | nextval('yxptest_id_seq'::regclass)
peaktemp | integer |           | |
logdate | date |           | not null |
Partition of: yxptest FOR VALUES FROM ('2023-01-01') TO ('2023-02-01')
Indexes:
"yxptest_p202301_pkey" PRIMARY KEY, btree (id, logdate)
"yxptest_p202301_peaktemp_idx" btree (peaktemp)

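As the output shows, indexes on a partitioned table are maintained automatically: an index created on the parent is also created on each existing partition and on any partition added afterwards. Before PostgreSQL 11, each new partition needed its index created separately after the partition was added.

This is far more convenient and greatly reduces the maintenance work.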
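
The link between the parent index and the per-partition indexes can also be inspected in the catalog. The query below is a sketch, assuming the index idx_1_yxptest created above; it reads pg_inherits, where PostgreSQL records which partition indexes are attached to a partitioned index.

```sql
-- Sketch: list the partition indexes attached to the parent index idx_1_yxptest.
SELECT i.inhrelid::regclass AS partition_index
FROM pg_inherits AS i
WHERE i.inhparent = 'idx_1_yxptest'::regclass
ORDER BY i.inhrelid::regclass::text;
```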

Splitting a partition

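The PostgreSQL version used here has no command for splitting a partition, so the data has to be moved by hand:

1) Detach the child table from the parent and rename it;

2) Create a new table that is itself partitioned, and create its sub-partitions;

3) Attach the new table to the original parent;

4) Insert the data back.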

testyxp=# alter table yxptest DETACH partition yxptest_p202201;
ALTER TABLE
testyxp=# alter table yxptest_p202201 rename to yxptest_p202201_b;
ALTER TABLE
testyxp=# CREATE TABLE yxptest_p202201 (
testyxp(# id serial ,
testyxp(# peaktemp int,
testyxp(# logdate date not null
testyxp(# ) PARTITION BY RANGE (logdate);
CREATE TABLE
testyxp=# CREATE TABLE yxptest_p202201_1 partition of yxptest_p202201 FOR VALUES FROM ('2022-01-01') TO ('2022-01-10');
CREATE TABLE
testyxp=# CREATE TABLE yxptest_p202201_2 partition of yxptest_p202201 FOR VALUES FROM ('2022-01-10') TO ('2022-01-20');
CREATE TABLE
testyxp=# CREATE TABLE yxptest_p202201_3 partition of yxptest_p202201 FOR VALUES FROM ('2022-01-20') TO ('2022-02-01');
CREATE TABLE
testyxp=# alter table yxptest attach partition yxptest_p202201 for values from ('2022-01-01') TO ('2022-02-01');
ALTER TABLE
testyxp=# insert into yxptest_p202201 select * from yxptest_p202201_b;
INSERT 0 44640
testyxp=# \d yxptest_p202201
Partitioned table "public.yxptest_p202201"
Column | Type | Collation | Nullable |                   Default
----------+---------+-----------+----------+---------------------------------------------
id | integer |           | not null | nextval('yxptest_p202201_id_seq'::regclass)
peaktemp | integer |           | |
logdate | date |           | not null |
Partition of: yxptest FOR VALUES FROM ('2022-01-01') TO ('2022-02-01')
Partition key: RANGE (logdate)
Indexes:
"yxptest_p202201_pkey1" PRIMARY KEY, btree (id, logdate)
"yxptest_p202201_peaktemp_idx1" btree (peaktemp)
Number of partitions: 3 (Use \d+ to list them.)

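Checking the indexes after the operation shows that the newly attached table automatically picked up the indexes defined on the parent table.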
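
Between the detach and the final insert, the January rows are not visible through yxptest. One way to hide that gap is to run the four steps in a single transaction. The following is a sketch under that assumption, reusing the names from the session above; the LIKE ... INCLUDING DEFAULTS clause is a substitution for the hand-written column list, so that the new table keeps the same columns, NOT NULL constraints and id default as the old partition.

```sql
-- Sketch: split the January 2022 partition into three sub-partitions atomically.
BEGIN;

-- 1) Detach the old partition and keep it under a backup name.
ALTER TABLE yxptest DETACH PARTITION yxptest_p202201;
ALTER TABLE yxptest_p202201 RENAME TO yxptest_p202201_b;

-- 2) Create a partitioned replacement and its sub-partitions.
--    Assumption: LIKE is used here instead of repeating the column definitions.
CREATE TABLE yxptest_p202201 (LIKE yxptest_p202201_b INCLUDING DEFAULTS)
    PARTITION BY RANGE (logdate);
CREATE TABLE yxptest_p202201_1 PARTITION OF yxptest_p202201
    FOR VALUES FROM ('2022-01-01') TO ('2022-01-10');
CREATE TABLE yxptest_p202201_2 PARTITION OF yxptest_p202201
    FOR VALUES FROM ('2022-01-10') TO ('2022-01-20');
CREATE TABLE yxptest_p202201_3 PARTITION OF yxptest_p202201
    FOR VALUES FROM ('2022-01-20') TO ('2022-02-01');

-- 3) Attach the replacement to the original parent with the original bounds.
ALTER TABLE yxptest ATTACH PARTITION yxptest_p202201
    FOR VALUES FROM ('2022-01-01') TO ('2022-02-01');

-- 4) Move the rows back; each row is routed to the matching sub-partition.
INSERT INTO yxptest_p202201 SELECT * FROM yxptest_p202201_b;

COMMIT;

-- Once the data has been verified, the backup can be dropped:
-- DROP TABLE yxptest_p202201_b;
```

The locks taken by DETACH and ATTACH are held until COMMIT, so on a busy table other sessions may block while the rows are copied.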

Queries on a partitioned table should filter on the partition key; otherwise every partition is scanned

testyxp=# explain select * from yxptest where peaktemp=70552623;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Append (cost=0.29..131.20 rows=24 width=12)
-> Index Scan using yxptest_p202201_1_peaktemp_idx on yxptest_p202201_1 (cost=0.29..8.30 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202201_2_peaktemp_idx on yxptest_p202201_2 (cost=0.29..8.30 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202201_3_peaktemp_idx on yxptest_p202201_3 (cost=0.29..8.30 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202202_peaktemp_idx on yxptest_p202202 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202203_peaktemp_idx on yxptest_p202203 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202204_peaktemp_idx on yxptest_p202204 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202205_peaktemp_idx on yxptest_p202205 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202206_peaktemp_idx on yxptest_p202206 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202207_peaktemp_idx on yxptest_p202207 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202208_peaktemp_idx on yxptest_p202208 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202209_peaktemp_idx on yxptest_p202209 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202210_peaktemp_idx on yxptest_p202210 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202211_peaktemp_idx on yxptest_p202211 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Index Scan using yxptest_p202212_peaktemp_idx on yxptest_p202212 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->  Bitmap Heap Scan on yxptest_p202301 (cost=4.23..14.79 rows=10 width=12)
Recheck Cond: (peaktemp = 70552623)
->  Bitmap Index Scan on yxptest_p202301_peaktemp_idx (cost=0.00..4.23 rows=10 width=0)
Index Cond: (peaktemp = 70552623)
(33 rows)

testyxp=# explain select * from yxptest where logdate='2022-01-01' and peaktemp=70552623;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Index Scan using yxptest_p202201_1_peaktemp_idx on yxptest_p202201_1 (cost=0.29..8.30 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
Filter: (logdate = '2022-01-01'::date)
(3 rows)
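
Partition pruning applies to range conditions on the partition key as well as to equality. The following sketch uses an illustrative date range for March 2022 together with the same peaktemp value as above; with pruning enabled, the plan would be expected to touch only yxptest_p202203, though the exact plan depends on the server version and statistics.

```sql
-- Sketch: a range predicate on the partition key lets the planner skip other partitions.
EXPLAIN (COSTS OFF)
SELECT *
FROM yxptest
WHERE logdate >= date '2022-03-01'
  AND logdate <  date '2022-04-01'
  AND peaktemp = 70552623;

-- Pruning is controlled by this setting, which is on by default.
SHOW enable_partition_pruning;
```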