暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

数据库实例崩溃恢复-PostgreSQL

DBally 2021-04-20
2066

什么是crash recover?

由于故障导致Instance异常关闭或由于执行了pg_ctl stop -m immediate都会导致数据库实例在重启时执行实例恢复,由后台进程完成,不需要人工干预,恢复的时间通过max_wal_size控制。

为什么要进行crash recover

正常运行期间,里面存在很多脏块,如果实例异常,导致脏块未写入磁盘,如果没有什么手段把崩溃后的脏块找回来,肯定会出现数据不一致的情况。当然关系数据库设计之初,这种情况就已经考虑了,通过重做日志,完全可以实现实例crash不丢数据。
总而言之,所有已提交的事务,都是可以恢复的,这样才能保证数据的一致性。

实例恢复的过程

LSN(Log Sequence Number
wal日志为了replay的有序性需要加上编号。实现的时候,是按日志的产生的顺序写入磁盘的,即使是写到磁盘缓冲区中,也是按产生的顺序一次写到日志缓冲区中,再将日志顺序写到磁盘。
因此采用日志在日志文件中的偏移来代替这个日志编号,可以通过日志编号迅速定位到日志。这个日志编号就叫做lsn。

恢复过程
1、初始化内存,启动后台进程。
2、pg在启动时读取pg_control文件内容。如果state为’in production’,PostgreSQL将进入恢复模式,因为这意味着数据库没有正常停止;如果为’shutdown’,将进入正常启动模式。
3、pg从相应的WAL段文件中读取最新的检查点记录(位于pg_control文件中),并从记录中获取重做点。如果最新的检查点记录无效(invalid),pg将读取前一个检查点的记录。如果两个记录都不可读,将放弃恢复。注意,从11版本开始不会再存储前一个检查点的记录信息。
4、使用合适的资源管理器从重做点开始按顺序读取和重放XLOG记录,直到最新WAL文件的最后位置。当遇到备份块时,无论其LSN如何,都会将覆盖相应表的页面。否则仅当此xlog记录LSN>相应页面的pd_lsn时,才会重放该XLOG记录。

强制关闭实例
pg_ctl stop -m immediate

2020-10-10 21:25:08.995 CST [2172] LOG:  received immediate shutdown request
2020-10-10 21:25:09.008 CST [2300] WARNING: terminating connection because of crash of another server process
2020-10-10 21:25:09.008 CST [2300] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-10-10 21:25:09.008 CST [2300] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-10-10 21:25:09.008 CST [2178] WARNING: terminating connection because of crash of another server process
2020-10-10 21:25:09.008 CST [2178] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-10-10 21:25:09.008 CST [2178] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-10-10 21:25:09.023 CST [2172] LOG: database system is shut down

复制

检查控制文件状态
检查点位置为:1/7F8D4B10

[postgres@qinghui-pc ~]$ pg_controldata
pg_control version number: 1100
Catalog version number: 201809051
Database system identifier: 6842561910620900147
Database cluster state: in production
pg_control last modified: Sat 10 Oct 2020 09:23:00 PM CST
Latest checkpoint location: 1/7F8D4B10
Latest checkpoint's REDO location: 1/7F8D4B10
Latest checkpoint's REDO WAL file: 00000001000000010000007F
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:749
Latest checkpoint's NextOID: 33179
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 561
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 13287
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Sat 10 Oct 2020 06:01:16 PM CST
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: off
max_connections setting: 100
max_worker_processes setting: 8
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Mock authentication nonce: b76ef4cc1642cbf88e6fff44c8a02a341834729c30f8ae683fd563312244c061

复制

启动数据库
redo应用的LSN为:1/7F8D4B80(大于1/7F8D4B10(检查点LSN),现象表明日志应用不一定从检查开始,也可能延后一些)

2020-10-10 21:28:36.812 CST [2433] LOG:  database system was interrupted; last known up at 2020-10-10 21:23:00 CST
2020-10-10 21:28:36.857 CST [2433] LOG: database system was not properly shut down; automatic recovery in progress
2020-10-10 21:28:36.858 CST [2433] LOG: redo starts at 1/7F8D4B80
2020-10-10 21:28:37.883 CST [2433] LOG: invalid record length at 1/89375790: wanted 24, got 0
2020-10-10 21:28:37.883 CST [2433] LOG: redo done at 1/89375518
2020-10-10 21:28:37.883 CST [2433] LOG: last completed transaction was at log time 2020-10-10 21:25:05.16881+08
2020-10-10 21:28:38.060 CST [2431] LOG: database system is ready to accept connections

复制

注意
recover过程非备份区块重做不是幂等的,多次重放将会导致不一致,另外recover过程不会做数据文件一致性校验,只要LSN大于数据块的LSN就需要应用日志,如果把一个老版本的数据文件拷贝到一个crash的实例下,检查点之后的日志都会重做一遍。

注:在数学与计算机学中,幂等操作是指其执行任意多次所产生的影响均与执行一次相同,即f(f(x))=f(x)。

总结

1、由于recover过程并不是全部幂等的,多次应用日志会导致数据不一致
2、恢复的起点是检查点的LSN,但是恢复日志中表现出有时比检查点的LSN大一些。
3、PG启动没有验证数据文件的一致性,较老的数据文件也可以正常读取。


文章转载自DBally,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论