PolarDB-PG原理解读——ANALYZE 源码解读（一）

PolarDB农夫山泉 2023-08-29

152

PolarDB PostgreSQL版（以下简称 PolarDB-PG）是一款阿里云自主研发的企业级数据库产品，采用计算存储分离架构，兼容 PostgreSQL 与 Oracle。PolarDB-PG 的存储与计算能力均可横向扩展，具有高可靠、高可用、弹性扩展等企业级数据库特性。同时，PolarDB-PG 具有大规模并行计算能力，可以应对 OLTP 与 OLAP 混合负载；还具有时空、向量、搜索、图谱等多模创新特性，可以满足企业对数据处理日新月异的新需求。

背景

PostgreSQL 在优化器中为一个查询树输出一个执行效率最高的物理计划树。其中，执行效率高低的衡量是通过代价估算实现的。比如通过估算查询返回元组的条数，和元组的宽度，就可以计算出 I/O 开销；也可以根据将要执行的物理操作估算出可能需要消耗的 CPU 代价。优化器通过系统表 pg_statistic 获得这些在代价估算过程需要使用到的关键统计信息，而 pg_statistic 系统表中的统计信息又是通过自动或手动的 ANALYZE 操作（或 VACUUM）计算得到的。ANALYZE 将会扫描表中的数据并按列进行分析，将得到的诸如每列的数据分布、最常见值、频率等统计信息写入系统表。

本文从源码的角度分析一下 ANALYZE 操作的实现机制。源码使用目前 PostgreSQL 最新的稳定版本 PostgreSQL 14。

统计信息

首先，我们应当搞明白分析操作的输出是什么。所以我们可以看一看 pg_statistic 中有哪些列，每个列的含义是什么。这个系统表中的每一行表示其它数据表中 每一列的统计信息。

postgres=# \d+ pg_statistic
                                 Table "pg_catalog.pg_statistic"
   Column    |   Type   | Collation | Nullable | Default | Storage  | Stats target | Description
-------------+----------+-----------+----------+---------+----------+--------------+-------------
 starelid    | oid      |           | not null |         | plain    |              |
 staattnum   | smallint |           | not null |         | plain    |              |
 stainherit  | boolean  |           | not null |         | plain    |              |
 stanullfrac | real     |           | not null |         | plain    |              |
 stawidth    | integer  |           | not null |         | plain    |              |
 stadistinct | real     |           | not null |         | plain    |              |
 stakind1    | smallint |           | not null |         | plain    |              |
 stakind2    | smallint |           | not null |         | plain    |              |
 stakind3    | smallint |           | not null |         | plain    |              |
 stakind4    | smallint |           | not null |         | plain    |              |
 stakind5    | smallint |           | not null |         | plain    |              |
 staop1      | oid      |           | not null |         | plain    |              |
 staop2      | oid      |           | not null |         | plain    |              |
 staop3      | oid      |           | not null |         | plain    |              |
 staop4      | oid      |           | not null |         | plain    |              |
 staop5      | oid      |           | not null |         | plain    |              |
 stanumbers1 | real[]   |           |          |         | extended |              |
 stanumbers2 | real[]   |           |          |         | extended |              |
 stanumbers3 | real[]   |           |          |         | extended |              |
 stanumbers4 | real[]   |           |          |         | extended |              |
 stanumbers5 | real[]   |           |          |         | extended |              |
 stavalues1  | anyarray |           |          |         | extended |              |
 stavalues2  | anyarray |           |          |         | extended |              |
 stavalues3  | anyarray |           |          |         | extended |              |
 stavalues4  | anyarray |           |          |         | extended |              |
 stavalues5  | anyarray |           |          |         | extended |              |
Indexes:
    "pg_statistic_relid_att_inh_index" UNIQUE, btree (starelid, staattnum, stainherit)
复制

/* ----------------
 *      pg_statistic definition.  cpp turns this into
 *      typedef struct FormData_pg_statistic
 * ----------------
 */
CATALOG(pg_statistic,2619,StatisticRelationId)
{
    /* These fields form the unique key for the entry: */
    Oid         starelid BKI_LOOKUP(pg_class);  /* relation containing
                                                 * attribute */
    int16       staattnum;      /* attribute (column) stats are for */
    bool        stainherit;     /* true if inheritance children are included */

    /* the fraction of the column's entries that are NULL: */
    float4      stanullfrac;

    /*
     * stawidth is the average width in bytes of non-null entries.  For
     * fixed-width datatypes this is of course the same as the typlen, but for
     * var-width types it is more useful.  Note that this is the average width
     * of the data as actually stored, post-TOASTing (eg, for a
     * moved-out-of-line value, only the size of the pointer object is
     * counted).  This is the appropriate definition for the primary use of
     * the statistic, which is to estimate sizes of in-memory hash tables of
     * tuples.
     */
    int32       stawidth;

    /* ----------------
     * stadistinct indicates the (approximate) number of distinct non-null
     * data values in the column.  The interpretation is:
     *      0       unknown or not computed
     *      > 0     actual number of distinct values
     *      < 0     negative of multiplier for number of rows
     * The special negative case allows us to cope with columns that are
     * unique (stadistinct = -1) or nearly so (for example, a column in which
     * non-null values appear about twice on the average could be represented
     * by stadistinct = -0.5 if there are no nulls, or -0.4 if 20% of the
     * column is nulls).  Because the number-of-rows statistic in pg_class may
     * be updated more frequently than pg_statistic is, it's important to be
     * able to describe such situations as a multiple of the number of rows,
     * rather than a fixed number of distinct values.  But in other cases a
     * fixed number is correct (eg, a boolean column).
     * ----------------
     */
    float4      stadistinct;

    /* ----------------
     * To allow keeping statistics on different kinds of datatypes,
     * we do not hard-wire any particular meaning for the remaining
     * statistical fields.  Instead, we provide several "slots" in which
     * statistical data can be placed.  Each slot includes:
     *      kind            integer code identifying kind of data (see below)
     *      op              OID of associated operator, if needed
     *      coll            OID of relevant collation, or 0 if none
     *      numbers         float4 array (for statistical values)
     *      values          anyarray (for representations of data values)
     * The ID, operator, and collation fields are never NULL; they are zeroes
     * in an unused slot.  The numbers and values fields are NULL in an
     * unused slot, and might also be NULL in a used slot if the slot kind
     * has no need for one or the other.
     * ----------------
     */

    int16       stakind1;
    int16       stakind2;
    int16       stakind3;
    int16       stakind4;
    int16       stakind5;

    Oid         staop1 BKI_LOOKUP_OPT(pg_operator);
    Oid         staop2 BKI_LOOKUP_OPT(pg_operator);
    Oid         staop3 BKI_LOOKUP_OPT(pg_operator);
    Oid         staop4 BKI_LOOKUP_OPT(pg_operator);
    Oid         staop5 BKI_LOOKUP_OPT(pg_operator);

    Oid         stacoll1 BKI_LOOKUP_OPT(pg_collation);
    Oid         stacoll2 BKI_LOOKUP_OPT(pg_collation);
    Oid         stacoll3 BKI_LOOKUP_OPT(pg_collation);
    Oid         stacoll4 BKI_LOOKUP_OPT(pg_collation);
    Oid         stacoll5 BKI_LOOKUP_OPT(pg_collation);

#ifdef CATALOG_VARLEN           /* variable-length fields start here */
    float4      stanumbers1[1];
    float4      stanumbers2[1];
    float4      stanumbers3[1];
    float4      stanumbers4[1];
    float4      stanumbers5[1];

    /*
     * Values in these arrays are values of the column's data type, or of some
     * related type such as an array element type.  We presently have to cheat
     * quite a bit to allow polymorphic arrays of this kind, but perhaps
     * someday it'll be a less bogus facility.
     */
    anyarray    stavalues1;
    anyarray    stavalues2;
    anyarray    stavalues3;
    anyarray    stavalues4;
    anyarray    stavalues5;
#endif
} FormData_pg_statistic;
复制

从数据库命令行的角度和内核 C 代码的角度来看，统计信息的内容都是一致的。所有的属性都以 sta 开头。其中：

starelid 表示当前列所属的表或索引
staattnum 表示本行统计信息属于上述表或索引中的第几列
stainherit 表示统计信息是否包含子列
stanullfrac 表示该列中值为 NULL 的行数比例
stawidth 表示该列非空值的平均宽度
stadistinct 表示列中非空值的唯一值数量
- 0 表示未知或未计算
- > 0 表示唯一值的实际数量
- < 0 表示 negative of multiplier for number of rows

由于不同数据类型所能够被计算的统计信息可能会有一些细微的差别，在接下来的部分中，PostgreSQL 预留了一些存放统计信息的 槽（slots）。目前的内核里暂时预留了五个槽：

#define STATISTIC_NUM_SLOTS  5
复制

每一种特定的统计信息可以使用一个槽，具体在槽里放什么完全由这种统计信息的定义自由决定。每一个槽的可用空间包含这么几个部分（其中的 N 表示槽的编号，取值为 1 到 5）：

stakindN：标识这种统计信息的整数编号
staopN：用于计算或使用统计信息的运算符 OID
stacollN：排序规则 OID
stanumbersN：浮点数数组
stavaluesN：任意值数组

PostgreSQL 内核中规定，统计信息的编号 1 至 99 被保留给 PostgreSQL 核心统计信息使用，其它部分的编号安排如内核注释所示：

/*
 * The present allocation of "kind" codes is:
 *
 *  1-99:       reserved for assignment by the core PostgreSQL project
 *              (values in this range will be documented in this file)
 *  100-199:    reserved for assignment by the PostGIS project
 *              (values to be documented in PostGIS documentation)
 *  200-299:    reserved for assignment by the ESRI ST_Geometry project
 *              (values to be documented in ESRI ST_Geometry documentation)
 *  300-9999:   reserved for future public assignments
 *
 * For private use you may choose a "kind" code at random in the range
 * 10000-30000.  However, for code that is to be widely disseminated it is
 * better to obtain a publicly defined "kind" code by request from the
 * PostgreSQL Global Development Group.
 */
复制

目前可以在内核代码中看到的 PostgreSQL 核心统计信息有 7 个，编号分别从 1 到 7。我们可以看看这 7 种统计信息分别如何使用上述的槽。

最后修改时间：2023-08-29 10:35:50

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

文章被以下合辑收录

PolarDB-PG介绍（共76篇）

关于阿里云开源PolarDB—PG的内容介绍

PolarDB-PG原理解读——ANALYZE 源码解读（一）

背景

统计信息

文章被以下合辑收录

评论

相关阅读