暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 14 preview - recovery 性能增强 - recovery_init_sync_method=syncfs

digoal 2021-01-03
857

作者

digoal

日期

2021-03-20

标签

PostgreSQL , recovery , open file , crash性能问题


背景

当数据库实例表很多时, 由于每个表包括main, fsm, init, toast等多个file fork, 如果有索引, 每个索引有对应的main, toast文件, 所以数据库实例的表很多时, 数据库里面会有很多文件, 在crash recovery 时, 需要递归open所有file, 导致recovery非常慢.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2ce439f3

```
Recursively fsync() the data directory after a crash.
author Robert Haas rhaas@postgresql.org
Mon, 4 May 2015 18:13:53 +0000 (14:13 -0400)
committer Robert Haas rhaas@postgresql.org
Mon, 4 May 2015 18:13:53 +0000 (14:13 -0400)
commit 2ce439f3379aed857517c8ce207485655000fc8e
tree 196ccf42f41939b101aa6d4587ca93c8060b51a2 tree
parent ec3d976bce7e322c29f1007d19b63b7a3a1a6ee4 commit | diff
Recursively fsync() the data directory after a crash.

Otherwise, if there's another crash, some writes from after the first
crash might make it to disk while writes from before the crash fail
to make it to disk. This could lead to data corruption.

Back-patch to all supported versions.

Abhijit Menon-Sen, reviewed by Andres Freund and slightly revised
by me.
```

PostgreSQL 14 增加参数recovery_init_sync_method, 使用syncfs方法加速, 仅支持Linux.

默认 recovery_init_sync_method=fsync

syncfs (Linux 5.8+)

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=61752afb26404dfc99a535c7a53f7f04dc110263

```
Provide recovery_init_sync_method=syncfs. master github/master

author Thomas Munro tmunro@postgresql.org
Fri, 19 Mar 2021 22:46:32 +0000 (11:46 +1300)
committer Thomas Munro tmunro@postgresql.org
Fri, 19 Mar 2021 23:07:28 +0000 (12:07 +1300)
commit 61752afb26404dfc99a535c7a53f7f04dc110263
tree dbb477a1f01f495a180e891028e3d1545532881d tree
parent b822ae13ea93c18326d58d47829bbc66d36fae5c commit | diff
Provide recovery_init_sync_method=syncfs.

Since commit 2ce439f3 we have opened every file in the data directory
and called fsync() at the start of crash recovery. This can be very
slow if there are many files, leading to field complaints of systems
taking minutes or even hours to begin crash recovery.

Provide an alternative method, for Linux only, where we call syncfs() on
every possibly different filesystem under the data directory. This is
equivalent, but avoids faulting in potentially many inodes from
potentially slow storage.

The new mode comes with some caveats, described in the documentation, so
the default value for the new setting is "fsync", preserving the older
behavior.

Reported-by: Michael Brown michael.brown@discourse.org
Reviewed-by: Fujii Masao masao.fujii@oss.nttdata.com
Reviewed-by: Paul Guo guopa@vmware.com
Reviewed-by: Bruce Momjian bruce@momjian.us
Reviewed-by: Justin Pryzby pryzby@telsasoft.com
Reviewed-by: David Steele david@pgmasters.net
Discussion: https://postgr.es/m/11bc2bb7-ecb5-3ad0-b39f-df632734cd81%40discourse.org
Discussion: https://postgr.es/m/CAEET0ZHGnbXmi8yF3ywsDZvb3m9CbdsGZgfTXscQ6agcbzcZAw%40mail.gmail.com
```

+#ifdef HAVE_SYNCFS + if (recovery_init_sync_method == RECOVERY_INIT_SYNC_METHOD_SYNCFS) + { + DIR *dir; + struct dirent *de; + + /* + * On Linux, we don't have to open every single file one by one. We + * can use syncfs() to sync whole filesystems. We only expect + * filesystem boundaries to exist where we tolerate symlinks, namely + * pg_wal and the tablespaces, so we call syncfs() for each of those + * directories. + */ + + /* Sync the top level pgdata directory. */ + do_syncfs("."); + /* If any tablespaces are configured, sync each of those. */ + dir = AllocateDir("pg_tblspc"); + while ((de = ReadDirExtended(dir, "pg_tblspc", LOG))) + { + char path[MAXPGPATH]; + + if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0) + continue; + + snprintf(path, MAXPGPATH, "pg_tblspc/%s", de->d_name); + do_syncfs(path); + } + FreeDir(dir); + /* If pg_wal is a symlink, process that too. */ + if (xlog_is_symlink) + do_syncfs("pg_wal"); + return; + } +#endif /* !HAVE_SYNCFS */ +

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论