暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 14 preview - gist和sp-gist索引AM支持sort接口, 大幅加速GiST和SP-GiST 索引build速度

digoal 2021-01-04
238

作者

digoal

日期

2021-04-08

标签

PostgreSQL , GiST , SP-GiST , sortsupport , Z-order (aka Morton Code)


背景

gist和sp-gist索引AM支持sort接口, 加速GiST和SP-GiST 索引build速度

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=16fa9b2b30

```
Add support for building GiST index by sorting.
author Heikki Linnakangas heikki.linnakangas@iki.fi
Thu, 17 Sep 2020 08:33:40 +0000 (11:33 +0300)
committer Heikki Linnakangas heikki.linnakangas@iki.fi
Thu, 17 Sep 2020 08:33:40 +0000 (11:33 +0300)
commit 16fa9b2b30a357b4aea982bd878ec2e5e002dbcc
tree d1ee3378a010ad424ce41db8af503d534566cf6e tree
parent 089da3c4778fdc1931f721a265caa0c6fca38584 commit | diff
Add support for building GiST index by sorting.

This adds a new optional support function to the GiST access method:
sortsupport. If it is defined, the GiST index is built by sorting all data
to the order defined by the sortsupport's comparator function, and packing
the tuples in that order to GiST pages. This is similar to how B-tree
index build works, and is much faster than inserting the tuples one by
one. The resulting index is smaller too, because the pages are packed more
tightly, upto 'fillfactor'. The normal build method works by splitting
pages, which tends to lead to more wasted space.

The quality of the resulting index depends on how good the opclass-defined
sort order is. A good order preserves locality of the input data.

As the first user of this facility, add 'sortsupport' function to the
point_ops opclass. It sorts the points in Z-order (aka Morton Code), by
interleaving the bits of the X and Y coordinates.

Author: Andrey Borodin
Reviewed-by: Pavel Borisov, Thomas Munro
Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru
```

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9f984ba6d23dc6eecebf479ab1d3f2e550a4e9be

```
Add sortsupport for gist_btree opclasses, for faster index builds.
author Heikki Linnakangas heikki.linnakangas@iki.fi
Wed, 7 Apr 2021 10:22:05 +0000 (13:22 +0300)
committer Heikki Linnakangas heikki.linnakangas@iki.fi
Wed, 7 Apr 2021 10:22:05 +0000 (13:22 +0300)
commit 9f984ba6d23dc6eecebf479ab1d3f2e550a4e9be
tree 1ec0fd0b4721f3c89960a2a0699cc398e6a659b3 tree
parent dd13ad9d39a1ba41cf329b6fe408b49be57c7b88 commit | diff
Add sortsupport for gist_btree opclasses, for faster index builds.

Commit 16fa9b2b30 introduced a faster way to build GiST indexes, by
sorting all the data. This commit adds the sortsupport functions needed
to make use of that feature for btree_gist.

Author: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/2F3F7265-0D22-44DB-AD71-8554C743D943@yandex-team.ru
```

+ The <function>sortsupport</function> method is also optional and is used to + speed up building a <acronym>GiST</acronym> index.

+ + <varlistentry> + <term><function>sortsupport</function></term> + <listitem> + <para> + Returns a comparator function to sort data in a way that preserves + locality. It is used by <command>CREATE INDEX</command> and + <command>REINDEX</command> commands. The quality of the created index + depends on how well the sort order determined by the comparator function + preserves locality of the inputs. + </para> + <para> + The <function>sortsupport</function> method is optional. If it is not + provided, <command>CREATE INDEX</command> builds the index by inserting + each tuple to the tree using the <function>penalty</function> and + <function>picksplit</function> functions, which is much slower. + </para> + + <para> + The <acronym>SQL</acronym> declaration of the function must look like + this: + +<programlisting> +CREATE OR REPLACE FUNCTION my_sortsupport(internal) +RETURNS void +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT; +</programlisting> + + The argument is a pointer to a <structname>SortSupport</structname> + struct. At a minimum, the function must fill in its comparator field. + The comparator takes three arguments: two Datums to compare, and + a pointer to the <structname>SortSupport</structname> struct. The + Datums are the two indexed values in the format that they are stored + in the index; that is, in the format returned by the + <function>compress</function> method. The full API is defined in + <filename>src/include/utils/sortsupport.h</filename>. + </para> + + <para> + The matching code in the C module could then follow this skeleton: + +<programlisting> +PG_FUNCTION_INFO_V1(my_sortsupport); + +static int +my_fastcmp(Datum x, Datum y, SortSupport ssup) +{ + /* establish order between x and y by computing some sorting value z */ + + int z1 = ComputeSpatialCode(x); + int z2 = ComputeSpatialCode(y); + + return z1 == z2 ? 0 : z1 > z2 ? 1 : -1; +} + +Datum +my_sortsupport(PG_FUNCTION_ARGS) +{ + SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0); + + ssup->comparator = my_fastcmp; + PG_RETURN_VOID(); +} +</programlisting> + </para> + </listitem> + </varlistentry>

+Tuplesortstate * +tuplesort_begin_index_gist(Relation heapRel, + Relation indexRel, + int workMem, + SortCoordinate coordinate, + bool randomAccess) +{ + Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate, + randomAccess); + MemoryContext oldcontext; + int i; + + oldcontext = MemoryContextSwitchTo(state->sortcontext); + +#ifdef TRACE_SORT + if (trace_sort) + elog(LOG, + "begin index sort: workMem = %d, randomAccess = %c", + workMem, randomAccess ? 't' : 'f'); +#endif + + state->nKeys = IndexRelationGetNumberOfKeyAttributes(indexRel); + + state->comparetup = comparetup_index_btree; + state->copytup = copytup_index; + state->writetup = writetup_index; + state->readtup = readtup_index; + + state->heapRel = heapRel; + state->indexRel = indexRel; + + /* Prepare SortSupport data for each column */ + state->sortKeys = (SortSupport) palloc0(state->nKeys * + sizeof(SortSupportData)); + + for (i = 0; i < state->nKeys; i++) + { + SortSupport sortKey = state->sortKeys + i; + + sortKey->ssup_cxt = CurrentMemoryContext; + sortKey->ssup_collation = indexRel->rd_indcollation[i]; + sortKey->ssup_nulls_first = false; + sortKey->ssup_attno = i + 1; + /* Convey if abbreviation optimization is applicable in principle */ + sortKey->abbreviate = (i == 0); + + AssertState(sortKey->ssup_attno != 0); + + /* Look for a sort support function */ + PrepareSortSupportFromGistIndexRel(indexRel, sortKey); + } + + MemoryContextSwitchTo(oldcontext); + + return state; +}

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论