暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Postgresql之range分区表

IT那活儿 2022-08-03
1574
点击上方“IT那活儿”公众号,关注后了解更多内容,不管IT什么活儿,干就完了!!!

  

Postgresql在市场的运用范围越来越广,分区表的性能在12版本已经得到很大提升,不再采用原来的表继承方式,对使用者越来越友好以下是分区表使用的一些心得。


range分区,常用于月份表
建表
CREATE TABLE yxptest (
    id serial ,
    peaktemp int,
    logdate date not null
) PARTITION BY RANGE (logdate);
CREATE TABLE yxptest_p202201 PARTITION OF yxptest FOR VALUES FROM ('2022-01-01') TO ('2022-02-01');
CREATE TABLE yxptest_p202202 PARTITION OF yxptest FOR VALUES FROM ('2022-02-01') TO ('2022-03-01');



CREATE TABLE yxptest_p202211 PARTITION OF yxptest FOR VALUES FROM ('2022-11-01') TO ('2022-12-01');
CREATE TABLE yxptest_p202212 PARTITION OF yxptest FOR VALUES FROM ('2022-12-01') TO ('2023-01-01');

造数据
insert into yxptest (peaktemp,logdate)
select round(100000000*random()),generate_series('2022-01-01'::date,'2022-12-31'::date,'1 minute');

问题:分区表添加主键必须带上分区键,否则会出现如下报错:
testyxp=# ALTER TABLE yxptest ADD PRIMARY KEY (id);
ERROR: insufficient columns in PRIMARY KEY constraint definition
DETAIL: PRIMARY KEY constraint on table "yxptest" lacks column "logdate" which is part of the partition key.


分区索引在pg11之后的变化
testyxp=# create index idx_1_yxptest on yxptest (peaktemp);
CREATE INDEX
testyxp=# \d yxptest
Partitioned table "public.yxptest"
Column | Type | Collation | Nullable |               Default
----------+---------+-----------+----------+-------------------------------------
id | integer |           | not null | nextval('yxptest_id_seq'::regclass)
peaktemp | integer |           | |
logdate | date |           | not null |
Partition key: RANGE (logdate)
Indexes:
"yxptest_pkey" PRIMARY KEY, btree (id, logdate)
"idx_1_yxptest" btree (peaktemp)
Number of partitions: 12 (Use \d+ to list them.)

testyxp=# \d yxptest_p202211
Table "public.yxptest_p202211"
Column | Type | Collation | Nullable |               Default
----------+---------+-----------+----------+-------------------------------------
id | integer |           | not null | nextval('yxptest_id_seq'::regclass)
peaktemp | integer |           | |
logdate | date |           | not null |
Partition of: yxptest FOR VALUES FROM ('2022-11-01') TO ('2022-12-01')
Indexes:
"yxptest_p202211_pkey" PRIMARY KEY, btree (id, logdate)
"yxptest_p202211_peaktemp_idx" btree (peaktemp)

testyxp=# CREATE TABLE yxptest_p202301 PARTITION OF yxptest FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
CREATE TABLE
testyxp=# \d yxptest_p202301
Table "public.yxptest_p202301"
Column | Type | Collation | Nullable |               Default
----------+---------+-----------+----------+-------------------------------------
id | integer |           | not null | nextval('yxptest_id_seq'::regclass)
peaktemp | integer |           | |
logdate | date |           | not null |
Partition of: yxptest FOR VALUES FROM ('2023-01-01') TO ('2023-02-01')
Indexes:
"yxptest_p202301_pkey" PRIMARY KEY, btree (id, logdate)
"yxptest_p202301_peaktemp_idx" btree (peaktemp)

可以看到分区表的索引是会自行维护的,在此之前是加完分区,还要加索引继承的方式使用索引。

现在来说已经方便很多,大大减少维护量。

拆分分区
Pg没有split功能,只能移。
1)先把子表从父表移出,并改名;
2)建立新的表,并做子分区;
3)把新表挂到原父表上;
4)回插数据。
testyxp=# alter table yxptest DETACH partition yxptest_p202201;
ALTER TABLE
testyxp=# alter table yxptest_p202201 rename to yxptest_p202201_b;
ALTER TABLE
testyxp=# CREATE TABLE yxptest_p202201 (
testyxp(# id serial ,
testyxp(# peaktemp int,
testyxp(# logdate date not null
testyxp(# ) PARTITION BY RANGE (logdate);
CREATE TABLE
testyxp=# CREATE TABLE yxptest_p202201_1 partition of yxptest_p202201 FOR VALUES FROM ('2022-01-01') TO ('2022-01-10');
CREATE TABLE
testyxp=# CREATE TABLE yxptest_p202201_2 partition of yxptest_p202201 FOR VALUES FROM ('2022-01-10') TO ('2022-01-20');
CREATE TABLE
testyxp=# CREATE TABLE yxptest_p202201_3 partition of yxptest_p202201 FOR VALUES FROM ('2022-01-20') TO ('2022-02-01');
CREATE TABLE
testyxp=# alter table yxptest attach partition yxptest_p202201 for values from ('2010-01-01') to ('2011-01-01');
ALTER TABLE
testyxp=# alter table yxptest DETACH partition yxptest_p202201;
ALTER TABLE
testyxp=# alter table yxptest attach partition yxptest_p202201 for values from ('2022-01-01') TO ('2022-02-01');;
ALTER TABLE
testyxp=# insert into yxptest_p202201 select * from yxptest_p202201_b;
INSERT 0 44640
testyxp=# \d yxptest_p202201
Partitioned table "public.yxptest_p202201"
Column | Type | Collation | Nullable |                   Default
----------+---------+-----------+----------+---------------------------------------------
id | integer |           | not null | nextval('yxptest_p202201_id_seq'::regclass)
peaktemp | integer |           | |
logdate | date |           | not null |
Partition of: yxptest FOR VALUES FROM ('2022-01-01') TO ('2022-02-01')
Partition key: RANGE (logdate)
Indexes:
"yxptest_p202201_pkey1" PRIMARY KEY, btree (id, logdate)
"yxptest_p202201_peaktemp_idx1" btree (peaktemp)
Number of partitions: 3 (Use \d+ to list them.)

做完操作以后检查索引,发现索引是自动引用父表。

查分区表一定要先用分区键,不然会全分区扫描
testyxp=# explain select * from yxptest where peaktemp=70552623;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Append (cost=0.29..131.20 rows=24 width=12)
-> Index Scan using yxptest_p202201_1_peaktemp_idx on yxptest_p202201_1 (cost=0.29..8.30 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202201_2_peaktemp_idx on yxptest_p202201_2 (cost=0.29..8.30 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202201_3_peaktemp_idx on yxptest_p202201_3 (cost=0.29..8.30 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202202_peaktemp_idx on yxptest_p202202 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202203_peaktemp_idx on yxptest_p202203 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202204_peaktemp_idx on yxptest_p202204 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202205_peaktemp_idx on yxptest_p202205 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202206_peaktemp_idx on yxptest_p202206 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202207_peaktemp_idx on yxptest_p202207 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202208_peaktemp_idx on yxptest_p202208 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202209_peaktemp_idx on yxptest_p202209 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202210_peaktemp_idx on yxptest_p202210 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202211_peaktemp_idx on yxptest_p202211 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Index Scan using yxptest_p202212_peaktemp_idx on yxptest_p202212 (cost=0.29..8.31 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
->
  Bitmap Heap Scan on yxptest_p202301 (cost=4.23..14.79 rows=10 width=12)
Recheck Cond: (peaktemp = 70552623)
->
  Bitmap Index Scan on yxptest_p202301_peaktemp_idx (cost=0.00..4.23 rows=10 width=0)
Index Cond: (peaktemp = 70552623)
(33 rows)

testyxp=# explain select * from yxptest
where logdate='2022-01-01' and peaktemp=70552623;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Index Scan using yxptest_p202201_1_peaktemp_idx on yxptest_p202201_1 (cost=0.29..8.30 rows=1 width=12)
Index Cond: (peaktemp = 70552623)
Filter: (
logdate = '2022-01-01'::date)
(3 rows)

总结:

  • 主键要带上分区键,建组合主键,否则会失败。
  • 不支持在线分区split,就是我们常说的分区表拆分子分区,pg没有类似的功能,但是可以取消挂载的方式重新建立分区再挂到父表。
  • 查询使用分区键,where条件后先写分区键,否则是会查询所有分区,造成不必要的查询成本。

本文作者:叶小普(上海新炬王翦团队)

本文来源:“IT那活儿”公众号

文章转载自IT那活儿,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

暂无图片
获得了256次点赞
暂无图片
内容获得92次评论
暂无图片
获得了70次收藏