暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL lsm3 index access method

digoal 2020-08-07
324

作者

digoal

日期

2020-08-07

标签

PostgreSQL , lsm3


背景

lsm3, 2个top index, 1个basic index.

top index包括当前正在写入的(满足快速写入需求)index, 同时包含一个可以后天在线merge的index.

worker process负责merge, 每个lsm3 index需要1个worker processer进程.

当前还没有pg内置btree效果好, 也许是custom wal接口问题, 或者其他问题. 可以使用oprofile分析一下

https://github.com/postgrespro/lsm3

LSM tree implemented using standard Postgres B-Tree indexes.
Top index is used to perform fast inserts and on overflow it is merged
with base index. To perform merge operation concurrently
without blocking other operations with index, two top indexes are used:
active and merged. So totally there are three B-Tree indexes:
two top indexes and one base index.
When performing index scan we have to merge scans of all this three indexes.

This extension needs to create data structure in shared memory and this is why it should be loaded through
"shared_preload_library" list. Once extension is created, you can define indexes using lsm3 access method:

sql create extension lsm3; create table t(id integer, val text); create index idx on t using lsm3(id);

Lsm3 provides for the same types and set of operations as standard B-Tree.

Current restrictions of Lsm3:
- Parallel index scan is not supported.
- Array keys are not supported.
- Lsm3 index can not be declared as unique.

Lsm3 extension can be configured using the following parameters:
- lsm3.max_indexes: maximal number of Lsm3 indexes (default 1024).
- lsm3.top_index_size: size (kb) of top index (default 64Mb).

It is also possible to specify size of top index in relation options - this value will override lsm3.top_index_size GUC.

Although unique constraint can not be enforced using Lsm3 index, it is still possible to mark index as unique to
optimize index search. If index is marked as unique and searched key is found in active
top index, then lookup in other two indexes is not performed. As far as application is most frequently
searching for last recently inserted data, we can speedup this search by performing just one index lookup instead of 3.
Index can be marked as unique using index options:

sql create index idx on t using lsm3(id) with (unique=true);

Please notice that Lsm3 creates bgworker merge process for each Lsm3 index.
So you may need to adjust max_worker_processes in postgresql.conf to be large enough.

```
postgres=# create index idx on t using lsm3(id) with (unique=true);
postgres=# insert into t select generate_series(1,10000000), md5(random()::text);

postgres=# \dt+ t
List of relations
Schema | Name | Type | Owner | Persistence | Size | Description
--------+------+-------+----------+-------------+--------+-------------
public | t | table | postgres | permanent | 651 MB |

public | idx | index | postgres | t | permanent | 69 GB |
```

lsm3索引膨胀比较厉害.

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论