暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PgSQL - 16新特性 - 查询优化器改进

yanzongshuaiDBA 2024-02-19
780
PgSQL - 16新特性 - 查询优化器改进
PgSQL16对查询优化器进行了一些改进,使得很多查询比之前版本执行更快。PgSQL16分为10个大的改进。

1、DISTINCT查询的增量排序

PgSQL13中首次引入增量排序。增量排序减少了获取排序结果的代价。怎么做到?根据一个或者多个leading列已排好序的结果集只在剩余的列上执行排序。
举个例子:列a上有一个btree索引,我们需要在a,b列上进行排序,然后我们可以使用这个btree索引(提供列a的有序值),仅当a列值更改时才对已看到的行进行排序。相对于对一个大group来说,PgSQL使用的快速排序算法对很多小groups进行排序更加高效。
PgSQL16查询优化器对SELECT DISTINCT查询使用增量排序,之前版本仅执行一个全量排序:
    -- Setup
    CREATE TABLE distinct_test (a INT, b INT);
    INSERT INTO distinct_test
    SELECT x,1 FROM generate_series(1,1000000)x;
    CREATE INDEX on distinct_test(a);
    VACUUM ANALYZE distinct_test;
    EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
    SELECT DISTINCT a,b FROM distinct_test;
    PgSQL15explain输出
                                QUERY PLAN
      ---------------------------------------------------------------
      HashAggregate (actual rows=1000000 loops=1)
      Group Key: a, b
      Batches: 81 Memory Usage: 11153kB Disk Usage: 31288kB
      -> Seq Scan on distinct_test (actual rows=1000000 loops=1)
      Planning Time: 0.065 ms
      Execution Time: 414.226 ms
      (6 rows)
      PgSQL16explain输出
                                  QUERY PLAN
        ------------------------------------------------------------------
        Unique (actual rows=1000000 loops=1)
        -> Incremental Sort (actual rows=1000000 loops=1)
        Sort Key: a, b
        Presorted Key: a
        Full-sort Groups: 31250 Sort Method: quicksort Average Memory: 26kB Peak Memory: 26kB
        -> Index Scan using distinct_test_a_idx on distinct_test (actual rows=1000000 loops=1)
        Planning Time: 0.108 ms
        Execution Time: 263.167 ms
        (8 rows)
        PgSQL16explain输出中可以看到优化器使用了a列上的索引distinct_test_a_idx,然后执行了Incremental_Sorta的值使用b列进行排序。Presorted Key: a表示的就是这个意思。上面的INSERT语句插入(a,b),a递增,b值为1.增量排序的一批tuple仅包含一个单行。
        Explain中显示Incremental SortPeak Memory仅使用26KB,而PgSQL15需要更多内存,需要生成30MB的临时数据到磁盘。PgSQL1663%
        PgSQL13增量排序patch可阅读:
        https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=d2d8a229bc58a2014dce1c7a4fcdb6c5ab9fb8da

        2、更快的 ORDER BY/DISTINCT聚合

        PgSQL15之前,包含ORDER BY或者DISTINCT的聚合在执行时,Aggregate算子内部需要包含一个排序。因为这个排序总会只能跟,所以优化器不会形成一个提供预排序以输出有序值给聚合。
        PgSQL16产生一个计划给Aggregate节点以合适的顺序提供行记录。执行器更加智能识别出这种情况后,当行记录已经以合适的顺序排好后就放弃执行排序。
          -- Setup
          CREATE TABLE aggtest (a INT, b text);
          INSERT INTO aggtest SELECT a,md5((b%100)::text) FROM generate_series(1,10) a, generate_series(1,100000)b;
          CREATE INDEX ON aggtest(a,b);
          VACUUM FREEZE ANALYZE aggtest;
          EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF, BUFFERS)
          SELECT a,COUNT(DISTINCT b) FROM aggtest GROUP BY a;
          PgSQL15explain输出
                                      QUERY PLAN
            ---------------------------------------------------------------
            GroupAggregate (actual rows=10 loops=1)
            Group Key: a
            Buffers: shared hit=892, temp read=4540 written=4560
            -> Index Only Scan using aggtest_a_b_idx on aggtest (actual rows=1000000 loops=1)
            Heap Fetches: 0
            Buffers: shared hit=892
            Planning Time: 0.122 ms
            Execution Time: 302.693 ms
            (8 rows)
            PgSQL16explain输出
                                        QUERY PLAN
              ---------------------------------------------------------------
              GroupAggregate (actual rows=10 loops=1)
              Group Key: a
              Buffers: shared hit=892
              -> Index Only Scan using aggtest_a_b_idx on aggtest (actual rows=1000000 loops=1)
              Heap Fetches: 0
              Buffers: shared hit=892
              Planning Time: 0.061 ms
              Execution Time: 115.534 ms
              (8 rows)
              PgSQL16152倍。PgSQL15中出现的temp read=4540 written=4560(会溢出到磁盘)并没有出现在PgSQL16中。

              3、UNION ALL查询的Memoize算子

              PgSQL14中首次引入Memoize计划节点。Memoize充当参数化Nested LoopNested Loop内表的cache层。Memoize可以很好地提升性能,因为当所需要的行已被查询并被缓存后,可以跳过子节点的执行。
              UNION ALL出现在参数化Nested Loop的内表端时,PgSQL16现在考虑使用Memoize
                -- Setup
                CREATE TABLE t1 (a INT PRIMARY KEY);
                CREATE TABLE t2 (a INT PRIMARY KEY);
                CREATE TABLE lookup (a INT);
                INSERT INTO t1 SELECT x FROM generate_Series(1,10000) x;
                INSERT INTO t2 SELECT x FROM generate_Series(1,10000) x;
                INSERT INTO lookup SELECT x%10+1 FROM generate_Series(1,1000000)x;
                ANALYZE t1,t2,lookup;
                EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
                SELECT * FROM (SELECT * FROM t1 UNION ALL SELECT * FROM t2) t
                INNER JOIN lookup l ON l.a = t.a;
                PgSQL15explain输出
                                                    QUERY PLAN
                  -------------------------------------------------------------------------------
                  Nested Loop (actual rows=2000000 loops=1)
                  -> Seq Scan on lookup l (actual rows=1000000 loops=1)
                  -> Append (actual rows=2 loops=1000000)
                  -> Index Only Scan using t1_pkey on t1 (actual rows=1 loops=1000000)
                  Index Cond: (a = l.a)
                  Heap Fetches: 1000000
                  -> Index Only Scan using t2_pkey on t2 (actual rows=1 loops=1000000)
                  Index Cond: (a = l.a)
                  Heap Fetches: 1000000
                  Planning Time: 0.223 ms
                  Execution Time: 1926.151 ms
                  (11 rows)
                  PgSQL16explain输出
                                                       QUERY PLAN
                    ---------------------------------------------------------------------------------
                    Nested Loop (actual rows=2000000 loops=1)
                    -> Seq Scan on lookup l (actual rows=1000000 loops=1)
                    -> Memoize (actual rows=2 loops=1000000)
                    Cache Key: l.a
                    Cache Mode: logical
                    Hits: 999990 Misses: 10 Evictions: 0 Overflows: 0 Memory Usage: 2kB
                    -> Append (actual rows=2 loops=10)
                    -> Index Only Scan using t1_pkey on t1 (actual rows=1 loops=10)
                    Index Cond: (a = l.a)
                    Heap Fetches: 10
                    -> Index Only Scan using t2_pkey on t2 (actual rows=1 loops=10)
                    Index Cond: (a = l.a)
                    Heap Fetches: 10
                    Planning Time: 0.229 ms
                    Execution Time: 282.120 ms
                    (15 rows)
                    Append节点的父节点为Memoize节点,可以减少Append的循环次数:PgSQL151 millionPgSQL1610millionMemoize节点缓存命中时,不需要执行Append获取数据。该查询性能提升了6倍。(和Material物化算子有啥区别?)

                    4、支持Right Anti Join

                    INNER JOINHASH JOIN执行时,对小表构建hash表。小hash表构建代价较小,并且对于CPU来说更加cache-friendlyCPU在等待主存中数据到达时发生停滞的可能性更小。
                    PgSQL16版本前,如果查询中使用NOT EXISTSAnti Join会将NOT EXISTS的表放到join的内表端。这就意味着无法灵活地对小表进行hash,有可能对一个大表构建hash表。
                    PgSQL16可以选择小表,因为它支持了Right Anti Join
                      -- Setup
                      CREATE TABLE small(a int);
                      CREATE TABLE large(a int);
                      INSERT INTO small
                      SELECT a FROM generate_series(1,100) a;
                      INSERT INTO large
                      SELECT a FROM generate_series(1,1000000) a;
                      VACUUM ANALYZE small,large;
                      EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
                      SELECT * FROM small s
                      WHERE NOT EXISTS(SELECT 1 FROM large l WHERE s.a = l.a);
                      PgSQL15explain输出
                                                  QUERY PLAN
                        ---------------------------------------------------------------
                        Hash Anti Join (actual rows=0 loops=1)
                        Hash Cond: (s.a = l.a)
                        -> Seq Scan on small s (actual rows=100 loops=1)
                        -> Hash (actual rows=1000000 loops=1)
                        Buckets: 262144 Batches: 8 Memory Usage: 6446kB
                        -> Seq Scan on large l (actual rows=1000000 loops=1)
                        Planning Time: 0.103 ms
                        Execution Time: 139.023 ms
                        (8 rows)
                        PgSQL16explain输出
                                                  QUERY PLAN
                          -----------------------------------------------------------
                          Hash Right Anti Join (actual rows=0 loops=1)
                          Hash Cond: (l.a = s.a)
                          -> Seq Scan on large l (actual rows=1000000 loops=1)
                          -> Hash (actual rows=100 loops=1)
                          Buckets: 1024 Batches: 1 Memory Usage: 12kB
                          -> Seq Scan on small s (actual rows=100 loops=1)
                          Planning Time: 0.094 ms
                          Execution Time: 77.076 ms
                          (8 rows)
                          可以看到,PgSQL16使用了Hash Right Anti JoinMemory Usage和执行时间比PgSQL15更少。

                          5、并行Hash Full和Right Joins

                          PgSQL11引入了Parallel Hash Join。允许多个并行进程辅助构建一个单独的hash表。11版本前,每个worker构建自己独立的hash表,导致额外的内存消耗。
                          PgSQL16Parallel Hash Join进行了改进,支持FULLRIGHT joinFULL OUTER JOIN允许并行执行并且允许并行执行RIGHT JOIN
                            -- Setup
                            CREATE TABLE odd (a INT);
                            CREATE TABLE even (a INT);
                            INSERT INTO odd
                            SELECT a FROM generate_series(1,1000000,2) a;
                            INSERT INTO even
                            SELECT a FROM generate_series(2,1000000,2) a;
                            VACUUM ANALYZE odd, even;
                            EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
                            SELECT COUNT(o.a),COUNT(e.a) FROM odd o FULL JOIN even e ON o.a = e.a;
                            PgSQL15explain输出:
                                                          QUERY PLAN
                              -------------------------------------------------------------------
                              Aggregate (actual rows=1 loops=1)
                              -> Hash Full Join (actual rows=1000000 loops=1)
                              Hash Cond: (o.a = e.a)
                              -> Seq Scan on odd o (actual rows=500000 loops=1)
                              -> Hash (actual rows=500000 loops=1)
                              Buckets: 262144 Batches: 4 Memory Usage: 6439kB
                              -> Seq Scan on even e (actual rows=500000 loops=1)
                              Planning Time: 0.079 ms
                              Execution Time: 220.677 ms
                              (9 rows)
                              PgSQL16explain输出:
                                                                    QUERY PLAN
                                --------------------------------------------------------------------------------
                                Finalize Aggregate (actual rows=1 loops=1)
                                -> Gather (actual rows=2 loops=1)
                                Workers Planned: 1
                                Workers Launched: 1
                                -> Partial Aggregate (actual rows=1 loops=2)
                                -> Parallel Hash Full Join (actual rows=500000 loops=2)
                                Hash Cond: (o.a = e.a)
                                -> Parallel Seq Scan on odd o (actual rows=250000 loops=2)
                                -> Parallel Hash (actual rows=250000 loops=2)
                                Buckets: 262144 Batches: 4 Memory Usage: 6976kB
                                -> Parallel Seq Scan on even e (actual rows=250000 loops=2)
                                Planning Time: 0.161 ms
                                Execution Time: 129.769 ms
                                (13 rows)

                                6、优化窗口函数frame子句

                                当一个查询包含窗口函数比如row_number()rank()dense_rank()percent_rankcume_dist()ntile()时,如果窗口子句没有指定ROWS选项,PgSQL将会使用默认的RANGE选项。RANGE选项会造成执行器向前执行直到找到第一个“non-peer”行。Peerwindow frame中根据window子句ORDER BY进行同等比较的行。如果没有ORDER BYwindow frame中的所有行都是peers。当处理具有许多根据窗口子句同等排序的行的记录时ORDER BY,识别这些对等行的额外处理可能成本高昂。
                                无论是否在window子句中指定ROWS或者RANGE上面提到的窗口函数都没有任何不同。然而16 之前的 PostgreSQL 版本中的执行器并不知道这一点,并且由于某些窗口函数确实关心ROWS/RANGE选项,因此执行器必须在所有情况下对对等行执行检查。
                                PostgreSQL 16 查询规划器知道哪些窗口函数关心ROWS/RANGE选项,并将此信息传递给执行器,以便它可以跳过不必要的额外处理。
                                row_number()当用于限制查询中的结果数量时,此优化效果特别好,如下例所示
                                  -- Setup
                                  CREATE TABLE scores (id INT PRIMARY KEY, score INT);
                                  INSERT INTO scores SELECT s,random()*10 FROM generate_series(1,1000000)s;
                                  CREATE INDEX ON scores(score);
                                  VACUUM ANALYZE scores;
                                  EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
                                  SELECT * FROM (
                                  SELECT id,ROW_NUMBER() OVER (ORDER BY score) rn,score
                                  FROM scores
                                  ) m WHERE rn <= 10;
                                  PgSQL15输出
                                                                     QUERY PLAN
                                    -------------------------------------------------------------------------------
                                    WindowAgg (actual rows=10 loops=1)
                                    Run Condition: (row_number() OVER (?) <= 10)
                                    -> Index Scan using scores_score_idx on scores (actual rows=50410 loops=1)
                                    Planning Time: 0.096 ms
                                    Execution Time: 29.775 ms
                                    (5 rows)
                                    PgSQL16输出
                                                                       QUERY PLAN
                                      ----------------------------------------------------------------------------
                                      WindowAgg (actual rows=10 loops=1)
                                      Run Condition: (row_number() OVER (?) <= 10)
                                      -> Index Scan using scores_score_idx on scores (actual rows=11 loops=1)
                                      Planning Time: 0.191 ms
                                      Execution Time: 0.058 ms
                                      (5 rows)
                                      PgSQL15中的Index Scan节点需要从scores_score_idx索引读取50410行才停止。而PgSQL16仅读取11行,一旦row_number到达11后,执行器就会意识到没有更多行符合<=10条件了。该查询和使用ROWSwindow 子句选项的执行器都导致该查询在 PostgreSQL 16 上的运行速度提高了 500 倍以上。

                                      7、优化各种窗口函数

                                      此更改扩展了 PostgreSQL 15 中完成的工作。在 PG15 中,修改了查询规划器以允许执行器提前停止处理WindowAgg执行器节点。当子句中的一项WHERE以一旦条件变为假就永远不会再为真的方式过滤窗口函数时,就可以完成此操作。
                                      row_number()是一个函数的示例,它可以提供这样的保证,因为它是单调递增函数,即同一分区中的后续行的 row_number 永远不会低于前一行。
                                      PostgreSQL 16 查询规划器扩展了此优化的覆盖范围,还涵盖ntile(),cume_dist()percent_rank()。在 PostgreSQL 15 中,这仅适用于row_number()rank()dense_rank()和。count()count(*)
                                        -- Setup
                                        CREATE TABLE marathon (id INT PRIMARY KEY, time INTERVAL NOT NULL);
                                        INSERT INTO marathon
                                        SELECT id,'03:00:00'::interval + (CAST(RANDOM() * 3600 AS INT) || 'secs')::INTERVAL - (CAST(RANDOM() * 3600 AS INT) || ' secs')::INTERVAL
                                        FROM generate_series(1,50000) id;
                                        CREATE INDEX ON marathon (time);
                                        VACUUM ANALYZE marathon;
                                        EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
                                        SELECT * FROM (SELECT *,percent_rank() OVER (ORDER BY time) pr
                                        FROM marathon) m WHERE pr <= 0.01;
                                        PgSQL15输出
                                                                        QUERY PLAN
                                          -----------------------------------------------------------------------
                                          Subquery Scan on m (actual rows=500 loops=1)
                                          Filter: (m.pr <= '0.01'::double precision)
                                          Rows Removed by Filter: 49500
                                          -> WindowAgg (actual rows=50000 loops=1)
                                          -> Index Scan using marathon_time_idx on marathon (actual rows=50000 loops=1)
                                          Planning Time: 0.108 ms
                                          Execution Time: 84.358 ms
                                          (7 rows)
                                          PgSQL16输出
                                                                          QUERY PLAN
                                            -----------------------------------------------------------------------
                                            WindowAgg (actual rows=500 loops=1)
                                            Run Condition: (percent_rank() OVER (?) <= '0.01'::double precision)
                                            -> Index Scan using marathon_time_idx on marathon (actual rows=50000 loops=1)
                                            Planning Time: 0.180 ms
                                            Execution Time: 19.454 ms
                                            (5 rows)
                                            PgSQL16中,可以将pr<=0.01作为Run Condition,而PgSQL15中这个为谓词作为子查询的FilterPgSQL16run condition可以提取终止WindowAgg的执行,相对于PgSQL15性能提升4倍。

                                            8、分区表的join消除

                                            很长一段时间以来,查询不请求左表的列并且join不可能重复任何行时,PgSQL可以删除一个LEFT JOIN。但是PgSQL16之前版本,不支持分区表上的左连接裁剪。为什么?因为规划器用来确定任何内侧行是否有可能重复任何外侧行的证据对于分区表来说并不存在。PostgreSQL 16 查询规划器现在允许LEFT JOIN对分区表进行删除优化。这种连接消除优化更有可能对视图有所帮助,因为通常并不总是查询视图中存在的所有列:
                                              -- Setup
                                              CREATE TABLE part_tab (id BIGINT PRIMARY KEY, payload TEXT) PARTITION BY HASH(id);
                                              CREATE TABLE part_tab_p0 PARTITION OF part_tab FOR VALUES WITH (modulus 2, remainder 0);
                                              CREATE TABLE part_tab_p1 PARTITION OF part_tab FOR VALUES WITH (modulus 2, remainder 1);
                                              CREATE TABLE normal_table (id INT, part_tab_id BIGINT);
                                              EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
                                              SELECT nt.* FROM normal_table nt LEFT JOIN part_tab pt ON nt.part_tab_id = pt.id;
                                              PgSQL15输出
                                                                            QUERY PLAN
                                                -------------------------------------------------------------------
                                                Merge Right Join (actual rows=0 loops=1)
                                                Merge Cond: (pt.id = nt.part_tab_id)
                                                -> Merge Append (actual rows=0 loops=1)
                                                Sort Key: pt.id
                                                -> Index Only Scan using part_tab_p0_pkey on part_tab_p0 pt_1 (actual rows=0 loops=1)
                                                Heap Fetches: 0
                                                -> Index Only Scan using part_tab_p1_pkey on part_tab_p1 pt_2 (actual rows=0 loops=1)
                                                Heap Fetches: 0
                                                -> Sort (actual rows=0 loops=1)
                                                Sort Key: nt.part_tab_id
                                                Sort Method: quicksort Memory: 25kB
                                                -> Seq Scan on normal_table nt (actual rows=0 loops=1)
                                                Planning Time: 0.325 ms
                                                Execution Time: 0.037 ms
                                                (14 rows)
                                                PgSQL16输出
                                                                       QUERY PLAN
                                                  -----------------------------------------------------
                                                  Seq Scan on normal_table nt (actual rows=0 loops=1)
                                                  Planning Time: 0.244 ms
                                                  Execution Time: 0.015 ms
                                                  (3 rows)

                                                  9、DISTINCT查询

                                                  PostgreSQL 查询计划程序检测到所有行都包含相同值时,它能够避免包含计划节点来消除重复结果。检测到这一点很简单,当可以应用优化时,它可以带来巨大的性能提升。
                                                    -- Setup
                                                    CREATE TABLE abc (a int, b int, c int);
                                                    INSERT INTO abc SELECT a%10,a%10,a%10 FROM generate_series(1,1000000)a;
                                                    VACUUM ANALYZE abc;
                                                    EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
                                                    SELECT DISTINCT a,b,c FROM abc WHERE a = 5 AND b = 5 AND c = 5;
                                                    PgSQL15输出
                                                                                     QUERY PLAN
                                                      ------------------------------------------------------------------------
                                                      Unique (actual rows=1 loops=1)
                                                      -> Gather (actual rows=3 loops=1)
                                                      Workers Planned: 2
                                                      Workers Launched: 2
                                                      -> Unique (actual rows=1 loops=3)
                                                      -> Parallel Seq Scan on abc (actual rows=33333 loops=3)
                                                      Filter: ((a = 5) AND (b = 5) AND (c = 5))
                                                      Rows Removed by Filter: 300000
                                                      Planning Time: 0.114 ms
                                                      Execution Time: 30.381 ms
                                                      (10 rows)
                                                      PgSQL16输出
                                                                            QUERY PLAN
                                                        ---------------------------------------------------
                                                        Limit (actual rows=1 loops=1)
                                                        -> Seq Scan on abc (actual rows=1 loops=1)
                                                        Filter: ((a = 5) AND (b = 5) AND (c = 5))
                                                        Rows Removed by Filter: 4
                                                        Planning Time: 0.109 ms
                                                        Execution Time: 0.025 ms
                                                        (6 rows)
                                                        如果仔细查看 SQL 查询,您会发现子句中的每一列在子句DISTINCT中也有一个相等条件WHERE。这意味着查询中的所有输出行的每一列都将具有相同的值。PostgreSQL 16 查询规划器能够利用这些知识并将LIMIT查询结果简单地保留为 1 行。PostgreSQL 15 通过读取整个结果并使用Unique运算符将所有行减少为一行来生成相同的查询结果。PostgreSQL 16 的速度Execution TimePostgreSQL 15 1200 多倍。

                                                        10、MergeJoin后增量排序

                                                        PgSQL16之前,当查询优化器考虑执行一个Merge Join时,会检查merge的排序顺序是否符合父节点(DISTINCTGROUP BYORDER BY)的要求,仅当顺序匹配父节点时才会使用这个排序顺序。对于Incremental Sorts来说这种策略有点过时了。现在Incremental Sorts可以用于父节点操作,可以充分利用预排序结果。
                                                        PgSQL16的查询优化器可以调整这种规则:“行的顺序严格匹配”到“至少1leading列的顺序匹配”。这就允许优化器使用Incremental Sorts将行按照正确的顺序进行上级操作。增量排序比全量排序所需的工作更少,因为增量排序能够利用部分排序的输入,以较小的批量进行排序,从而减少内存消耗和排序比较。
                                                          -- Setup
                                                          CREATE TABLE a (a INT, b INT);
                                                          CREATE TABLE b (x INT, y INT);
                                                          INSERT INTO a SELECT a,a FROM generate_series(1,1000000) a;
                                                          INSERT INTO b SELECT a,a FROM generate_series(1,1000000) a;
                                                          VACUUM ANALYZE a, b;
                                                          SET enable_hashjoin=0;
                                                          SET max_parallel_workers_per_gather=0;
                                                          EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF)
                                                          SELECT a,b,count(*) FROM a INNER JOIN b ON a.a = b.x GROUP BY a,b ORDER BY a DESC, b;
                                                          PgSQL15explain输出
                                                                                            QUERY PLAN
                                                            ---------------------------------------------------------------------------
                                                            GroupAggregate (actual rows=1000000 loops=1)
                                                            Group Key: a.a, a.b
                                                            -> Sort (actual rows=1000000 loops=1)
                                                            Sort Key: a.a DESC, a.b
                                                            Sort Method: external merge Disk: 17664kB
                                                            -> Merge Join (actual rows=1000000 loops=1)
                                                            Merge Cond: (a.a = b.x)
                                                            -> Sort (actual rows=1000000 loops=1)
                                                            Sort Key: a.a
                                                            Sort Method: external merge Disk: 17664kB
                                                            -> Seq Scan on a (actual rows=1000000 loops=1)
                                                            -> Materialize (actual rows=1000000 loops=1)
                                                            -> Sort (actual rows=1000000 loops=1)
                                                            Sort Key: b.x
                                                            Sort Method: external merge Disk: 11768kB
                                                            -> Seq Scan on b (actual rows=1000000 loops=1)
                                                            Planning Time: 0.175 ms
                                                            Execution Time: 1010.738 ms
                                                            (18 rows)
                                                            PgSQL16explain输出
                                                                                              QUERY PLAN
                                                              ---------------------------------------------------------------------------
                                                              GroupAggregate (actual rows=1000000 loops=1)
                                                              Group Key: a.a, a.b
                                                              -> Incremental Sort (actual rows=1000000 loops=1)
                                                              Sort Key: a.a DESC, a.b
                                                              Presorted Key: a.a
                                                              Full-sort Groups: 31250 Sort Method: quicksort Average Memory: 26kB Peak Memory: 26kB
                                                              -> Merge Join (actual rows=1000000 loops=1)
                                                              Merge Cond: (a.a = b.x)
                                                              -> Sort (actual rows=1000000 loops=1)
                                                              Sort Key: a.a DESC
                                                              Sort Method: external merge Disk: 17672kB
                                                              -> Seq Scan on a (actual rows=1000000 loops=1)
                                                              -> Materialize (actual rows=1000000 loops=1)
                                                              -> Sort (actual rows=1000000 loops=1)
                                                              Sort Key: b.x DESC
                                                              Sort Method: external merge Disk: 11768kB
                                                              -> Seq Scan on b (actual rows=1000000 loops=1)
                                                              Planning Time: 0.140 ms
                                                              Execution Time: 915.589 ms
                                                              (19 rows)

                                                              原文

                                                              https://www.citusdata.com/blog/2024/02/08/whats-new-in-postgres-16-query-planner-optimizer/#distinct-queries

                                                              文章转载自yanzongshuaiDBA,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

                                                              评论