作者
digoal
日期
2021-02-19
标签
PostgreSQL , 异步提交
背景
https://www.postgresql.org/docs/current/wal-async-commit.html
1、异步提交指事务提交后, wal日志可能还没有持久化, 通过参数synchronous_commit控制
2、异步提交可能丢失事务信息, 现象类似于: 已成功提交的事务 被 回滚, 类似没有发生过
3、异步提交只破坏持久性, 不破坏原子、隔离、一致性.
4、异步提交最多丢失 3倍wal_writer_delay时间 产生的wal, 并且小于wal_buffer
5、异步提交参数可以在会话, 事务, 用户, db, 集群层面设置, 所以一个数据库中可以共存同步提交、异步提交等多种提交方式. 需要同步提交的设置为同步提交, 不需要同步提交的, 可以设置为异步提交.
alter role set synchronous_commit
alter database set synchronous_commit
6、OOM时, wal buffer内容丢失, 如果事务采用异步提交, 可能丢失. 未来PG内核可能改进, 因为wal有checksum, 而且有序, 没有必要在crash restart时丢弃所有wal buffer的内容.
7、shutdown immediate相当于数据库crash, 如果事务采用异步提交, 可能丢失.
8、数据库crash时, wal buffer内容丢失, 如果事务采用异步提交, 可能丢失.
9、服务器crash时, wal buffer内容丢失, 如果事务采用异步提交, 可能丢失.
10、ddl强制使用同步提交,即使你使用了异步提交参数. 2PC事务也强制使用同步提交.
11、任何情况下都不要使用fsync off, 这个可不是异步提交. 是会破坏一致性的, 导致数据库数据crruption的.
12、使用了异步提交后, 某些依赖关系可能需要注意, crash后已发生的事务可能会变成没有发生, 业务有逻辑依赖的一定要注意.
建议:
1、如果wal目录采用IO延迟低的块设备, 不需要使用异步提交.
2、如果不是高并发小事务, 不需要使用异步提交.
3、如果是高并发小事务, 并且wal目录采用延迟高的块设备, 并且要求RPO=0, 并且需要高TPS吞吐, 可以使用批量提交(commit_siblings, commit_delay).
4、如果业务不需要rpo=0, 或者可以容忍3倍wal_writer_delay时间(最低30毫秒)的wal丢失, 并且是高并发小事务, 可以开启异步提交.
5、如果HA使用了异步流复制, 这种架构本身就已经无法保证RPO=0时, 可以默认开启异步提交.
如果HA使用了异步流复制, 并且开启异步提交, 并且主库crash时产生了大量wal, 并且有大量wal buffer内容没有持久化, 并且没有持久化的wal内容已经发送给从库了, 并且主库重启后依旧是主库, 此时从库可能比主库的wal更前吗?
不可能!!! 因为PG只会将已经持久化的wal内容发送给从库(可以看代码证明这一点, 当然这个有利有弊, 弊端就是从库的延迟可能更大, 好处是从库不会因为使用了异步提交而导致crash后wal超前)
29.3. Asynchronous Commit
Asynchronous commit is an option that allows transactions to complete more quickly, at the cost that the most recent transactions may be lost if the database should crash. In many applications this is an acceptable trade-off.
As described in the previous section, transaction commit is normally synchronous: the server waits for the transaction's WAL records to be flushed to permanent storage before returning a success indication to the client. The client is therefore guaranteed that a transaction reported to be committed will be preserved, even in the event of a server crash immediately after. However, for short transactions this delay is a major component of the total transaction time. Selecting asynchronous commit mode means that the server returns success as soon as the transaction is logically completed, before the WAL records it generated have actually made their way to disk. This can provide a significant boost in throughput for small transactions.
Asynchronous commit introduces the risk of data loss. There is a short time window between the report of transaction completion to the client and the time that the transaction is truly committed (that is, it is guaranteed not to be lost if the server crashes). Thus asynchronous commit should not be used if the client will take external actions relying on the assumption that the transaction will be remembered. As an example, a bank would certainly not use asynchronous commit for a transaction recording an ATM's dispensing of cash. But in many scenarios, such as event logging, there is no need for a strong guarantee of this kind.
The risk that is taken by using asynchronous commit is of data loss, not data corruption. If the database should crash, it will recover by replaying WAL up to the last record that was flushed. The database will therefore be restored to a self-consistent state, but any transactions that were not yet flushed to disk will not be reflected in that state. The net effect is therefore loss of the last few transactions. Because the transactions are replayed in commit order, no inconsistency can be introduced — for example, if transaction B made changes relying on the effects of a previous transaction A, it is not possible for A's effects to be lost while B's effects are preserved.
The user can select the commit mode of each transaction, so that it is possible to have both synchronous and asynchronous commit transactions running concurrently. This allows flexible trade-offs between performance and certainty of transaction durability. The commit mode is controlled by the user-settable parameter synchronous_commit, which can be changed in any of the ways that a configuration parameter can be set. The mode used for any one transaction depends on the value of synchronous_commit when transaction commit begins.
Certain utility commands, for instance DROP TABLE, are forced to commit synchronously regardless of the setting of synchronous_commit. This is to ensure consistency between the server's file system and the logical state of the database. The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous.
If the database crashes during the risk window between an asynchronous commit and the writing of the transaction's WAL records, then changes made during that transaction will be lost. The duration of the risk window is limited because a background process (the “WAL writer”) flushes unwritten WAL records to disk every wal_writer_delay milliseconds. The actual maximum duration of the risk window is three times wal_writer_delay because the WAL writer is designed to favor writing whole pages at a time during busy periods.
Caution
An immediate-mode shutdown is equivalent to a server crash, and will therefore cause loss of any unflushed asynchronous commits.
Asynchronous commit provides behavior different from setting fsync = off. fsync is a server-wide setting that will alter the behavior of all transactions. It disables all logic within PostgreSQL that attempts to synchronize writes to different portions of the database, and therefore a system crash (that is, a hardware or operating system crash, not a failure of PostgreSQL itself) could result in arbitrarily bad corruption of the database state. In many scenarios, asynchronous commit provides most of the performance improvement that could be obtained by turning off fsync, but without the risk of data corruption.
commit_delay also sounds very similar to asynchronous commit, but it is actually a synchronous commit method (in fact, commit_delay is ignored during an asynchronous commit). commit_delay causes a delay just before a transaction flushes WAL to disk, in the hope that a single flush executed by one such transaction can also serve other transactions committing at about the same time. The setting can be thought of as a way of increasing the time window in which transactions can join a group about to participate in a single flush, to amortize the cost of the flush among multiple transactions.
PostgreSQL 许愿链接
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.