暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 15 preview - recovery(包括崩溃恢复、逻辑流复制、物理流复制、归档恢复) 加速, 支持异步prefetch 预读接下来要恢复的wal record相关的data block到shared buffer

原创 digoal 2022-01-20
1193

作者

digoal

日期

2022-04-08

标签

PostgreSQL , recovery , 包括崩溃恢复 , 逻辑流复制 , 物理流复制 , 归档恢复 , 异步 , prefetch


《PostgreSQL 14 preview - recovery 加速, 支持prefetch 预读接下来要恢复的wal record相关的data block到shared buffer, 加速wal record+data block的合并过程》

14被打回的特性, 15又放进来了. 目的是提高recovery速度(包括崩溃恢复、逻辑流复制、物理流复制、归档恢复), 异步预读datafile blocks(从wal解析到的需要恢复的数据文件对应blocks)到buffer pool.

相关参数 recovery_prefetch、wal_decode_buffer_size、maintenance_io_concurrency.

相关系统视图 pg_stat_recovery_prefetch.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=dafae9707ab7e7079ce1ba22cebda4557d0cbaf3

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5dc0418fab281d017a61a5756240467af982bdfd

Prefetch data referenced by the WAL, take II.  
author  Thomas Munro <tmunro@postgresql.org>      
Thu, 7 Apr 2022 07:28:40 +0000 (19:28 +1200)  
committer   Thomas Munro <tmunro@postgresql.org>      
Thu, 7 Apr 2022 07:42:14 +0000 (19:42 +1200)  
commit  5dc0418fab281d017a61a5756240467af982bdfd  
tree    cdcfda92621a9a7cd458999ede8f3974ef2a0bc1    tree  
parent  9553b4115f1879f66935f42fff0b798ef91866d0    commit | diff  
Prefetch data referenced by the WAL, take II.  
Introduce a new GUC recovery_prefetch.  When enabled, look ahead in the  
WAL and try to initiate asynchronous reading of referenced data blocks  
that are not yet cached in our buffer pool.  For now, this is done with  
posix_fadvise(), which has several caveats.  Since not all OSes have  
that system call, "try" is provided so that it can be enabled where  
available.  Better mechanisms for asynchronous I/O are possible in later  
work.  
Set to "try" for now for test coverage.  Default setting to be finalized  
before release.  
The GUC wal_decode_buffer_size limits the distance we can look ahead in  
bytes of decoded data.  
The existing GUC maintenance_io_concurrency is used to limit the number  
of concurrent I/Os allowed, based on pessimistic heuristics used to  
infer that I/Os have begun and completed.  We'll also not look more than  
maintenance_io_concurrency * 4 block references ahead.  
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>  
Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>  
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version)  
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)  
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version)  
Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version)  
Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version)  
Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version)  
Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version)  
Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com  
复制

https://www.postgresql.org/docs/devel/runtime-config-wal.html#GUC-RECOVERY-PREFETCH

This section describes the settings that apply to recovery in general, affecting crash recovery, streaming replication and archive-based replication.

recovery_prefetch (enum)
Whether to try to prefetch blocks that are referenced in the WAL that are not yet in the buffer pool, during recovery. Valid values are off (the default), on and try. The setting try enables prefetching only if the operating system provides the posix_fadvise function, which is currently used to implement prefetching. Note that some operating systems provide the function, but it doesn't do anything.

Prefetching blocks that will soon be needed can reduce I/O wait times during recovery with some workloads. See also the wal_decode_buffer_size and maintenance_io_concurrency settings, which limit prefetching activity.

wal_decode_buffer_size (integer)
A limit on how far ahead the server can look in the WAL, to find blocks to prefetch. If this value is specified without units, it is taken as bytes. The default is 512kB.

maintenance_io_concurrency (integer)
Similar to effective_io_concurrency, but used for maintenance work that is done on behalf of many client sessions.

The default is 10 on supported systems, otherwise 0. This value can be overridden for tables in a particular tablespace by setting the tablespace parameter of the same name (see ALTER TABLESPACE).

https://www.postgresql.org/docs/devel/monitoring-stats.html#PG-STAT-RECOVERY-PREFETCH-VIEW

pg_stat_recovery_prefetch

The pg_stat_recovery_prefetch view will contain only one row. It is filled with nulls if recovery has not run or recovery_prefetch is not enabled. The columns wal_distance, block_distance and io_depth show current values, and the other columns show cumulative counters that can be reset with the pg_stat_reset_shared function.

Table 28.17. pg_stat_recovery_prefetch View

Column Type Description
prefetch bigint Number of blocks prefetched because they were not in the buffer pool
hit bigint Number of blocks not prefetched because they were already in the buffer pool
skip_init bigint Number of blocks not prefetched because they would be zero-initialized
skip_new bigint Number of blocks not prefetched because they didn't exist yet
skip_fpw bigint Number of blocks not prefetched because a full page image was included in the WAL
skip_rep bigint Number of blocks not prefetched because they were already recently prefetched
wal_distance integer How many bytes ahead the prefetcher is looking
block_distance integer How many blocks ahead the prefetcher is looking
io_depth integer How many prefetches have been initiated but are not yet known to have completed

期望 PostgreSQL 增加什么功能?

PolarDB for PostgreSQL云原生分布式开源数据库

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论