Author
digoal
Date
2022-03-18
Tags
PostgreSQL , pg_verifybackup , pg_waldump , detection , validity
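Use pg_waldump to check whether WAL files and archived WAL files are corrupted.
Use pg_verifybackup to check whether the data files of a pg_basebackup backup are corrupted.
pg_waldump detects WAL corruption by verifying the checksum (CRC) of every record, which works because WAL checksums are always enabled.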
pg_verifybackup will use that information to invoke pg_waldump to parse those write-ahead log records. The --quiet flag will be used, so that pg_waldump will only report errors, without producing any other output.
While this level of verification is sufficient to detect obvious problems such as a missing file or one whose internal checksums do not match, they aren't extensive enough to detect every possible problem that might occur when attempting to recover.
For instance, a server bug that produces write-ahead log records that have the correct checksums but specify nonsensical actions can't be detected by this method.
https://www.postgresql.org/docs/13/app-pgverifybackup.html
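As a usage illustration of the behaviour quoted above, here is a minimal shell sketch that wraps pg_verifybackup. The function name `verify_backup` and the `PG_VERIFYBACKUP` override variable are hypothetical, introduced only for this example. `--wal-directory` is the documented option for pointing at WAL stored outside the backup's own pg_wal directory.

```shell
#!/bin/sh
# Hypothetical wrapper: verify a pg_basebackup directory and, optionally,
# parse WAL kept in a separate directory instead of <backup_dir>/pg_wal.
verify_backup() (
    backup_dir=$1
    wal_dir=${2:-}
    # PG_VERIFYBACKUP is an assumed override, handy for testing with a stub.
    verify=${PG_VERIFYBACKUP:-pg_verifybackup}
    if [ -n "$wal_dir" ]; then
        "$verify" --wal-directory="$wal_dir" "$backup_dir"
    else
        "$verify" "$backup_dir"
    fi
)
```

For example, `verify_backup /path/to/backup` checks the data files against the backup manifest and then parses the WAL in the backup's pg_wal. The backup must contain a backup_manifest, which pg_basebackup writes by default since PostgreSQL 13.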
src/backend/access/transam/xlogreader.c
    /*
     * CRC-check an XLOG record.  We do not believe the contents of an XLOG
     * record (other than to the minimal extent of computing the amount of
     * data to read in) until we've checked the CRCs.
     *
     * We assume all of the record (that is, xl_tot_len bytes) has been read
     * into memory at *record.  Also, ValidXLogRecordHeader() has accepted the
     * record's header, which means in particular that xl_tot_len is at least
     * SizeOfXLogRecord.
     */
    static bool
    ValidXLogRecord(XLogReaderState *state, XLogRecord *record, XLogRecPtr recptr)
    {
        pg_crc32c   crc;

        /* Calculate the CRC */
        INIT_CRC32C(crc);
        COMP_CRC32C(crc, ((char *) record) + SizeOfXLogRecord, record->xl_tot_len - SizeOfXLogRecord);
        /* include the record header last */
        COMP_CRC32C(crc, (char *) record, offsetof(XLogRecord, xl_crc));
        FIN_CRC32C(crc);

        if (!EQ_CRC32C(record->xl_crc, crc))
        {
            report_invalid_record(state,
                                  "incorrect resource manager data checksum in record at %X/%X",
                                  LSN_FORMAT_ARGS(recptr));
            return false;
        }

        return true;
    }
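To make the check concrete, here is a toy shell sketch of the same idea. It computes a CRC-32C bit by bit, covering the payload first and the header fields last, then compares the result with a stored value. It works on plain strings, not on real WAL records, and `crc32c_hex` and `record_is_valid` are hypothetical names. PostgreSQL itself uses optimized C code for this.

```shell
#!/bin/sh
# Toy CRC-32C (Castagnoli), computed bit by bit over the bytes of a string.
# Assumes the shell does arithmetic in at least 64 bits (true for common
# builds of bash and dash on 64-bit systems).
crc32c_hex() (
    crc=$((0xFFFFFFFF))                      # same start value as INIT_CRC32C
    for byte in $(printf '%s' "$1" | od -An -v -tu1); do
        crc=$((crc ^ byte))
        j=0
        while [ "$j" -lt 8 ]; do
            if [ $((crc & 1)) -eq 1 ]; then
                crc=$(((crc >> 1) ^ 0x82F63B78))   # reflected CRC-32C polynomial
            else
                crc=$((crc >> 1))
            fi
            j=$((j + 1))
        done
    done
    printf '%08x\n' $((crc ^ 0xFFFFFFFF))    # final inversion, as in FIN_CRC32C
)

# Hypothetical record check: payload bytes are fed first, then the header
# fields that precede the stored CRC, mirroring the order in ValidXLogRecord().
record_is_valid() (
    header=$1
    payload=$2
    stored_crc=$3
    [ "$(crc32c_hex "$payload$header")" = "$stored_crc" ]
)
```

`crc32c_hex 123456789` prints `e3069283`, the standard CRC-32C check value. Changing any single byte of the header or payload makes `record_is_valid` fail, which is how a damaged record is detected.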
Example
    cd $PGDATA/pg_wal
    pg_waldump -q -p ./ 000000010000000100000045 000000010000000100000076
    pg_waldump: fatal: error in WAL record at 1/47000028: invalid record length at 1/47000060: wanted 24, got 0
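WAL files are renamed and recycled, so a segment in pg_wal may still contain leftover data that was never re-initialized. pg_waldump therefore reports an error when it reads such a file. This is expected behavior, not corruption.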
    postgres=# show wal_init_zero;
     wal_init_zero
    ---------------
     off
    (1 row)
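One way to sidestep this false alarm when checking a live pg_wal directory is to look only at segments that sort before the segment currently being written. The sketch below assumes that all segments are on a single timeline. `completed_segments` is a hypothetical helper, and the current segment name is expected to come from `pg_walfile_name(pg_current_wal_lsn())`.

```shell
#!/bin/sh
# Hypothetical helper: list WAL segment names in a directory that sort
# strictly before the current segment (single timeline assumed).
completed_segments() (
    wal_dir=$1
    current=$2
    # A WAL segment name is exactly 24 uppercase hex digits; this skips
    # archive_status, .history, .backup and .partial entries.
    ls "$wal_dir" | grep -E '^[0-9A-F]{24}$' | LC_ALL=C sort |
        # Appending "" forces awk to compare as strings, not as numbers.
        LC_ALL=C awk -v cur="$current" '($0 "") < (cur "")'
)

# Usage sketch (requires a running server, so it is left as a comment):
#   current=$(psql -Atc "select pg_walfile_name(pg_current_wal_lsn())")
#   completed_segments "$PGDATA/pg_wal" "$current"
```

The first and last names printed can then be passed to pg_waldump as the start and end segments, so that recycled and not yet written segments stay out of the range.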
Checking archived WAL files with pg_waldump does not run into this problem.
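For an archive directory, the same range check can be scripted. This is a minimal sketch that assumes uncompressed segments on a single timeline. `check_wal_range` and the `PG_WALDUMP` override variable are hypothetical names used only for illustration. The pg_waldump options are the same `-q` and `-p` used in the example above.

```shell
#!/bin/sh
# Hypothetical helper: run pg_waldump quietly over the whole range of WAL
# segments found in a directory, from the lowest to the highest name.
check_wal_range() (
    wal_dir=$1
    # Keep only names made of exactly 24 uppercase hex digits.
    segments=$(ls "$wal_dir" | grep -E '^[0-9A-F]{24}$' | LC_ALL=C sort)
    if [ -z "$segments" ]; then
        echo "no WAL segments found in $wal_dir" >&2
        exit 2
    fi
    first=$(printf '%s\n' "$segments" | head -n 1)
    last=$(printf '%s\n' "$segments" | tail -n 1)
    # PG_WALDUMP is an assumed override, handy for testing with a stub.
    "${PG_WALDUMP:-pg_waldump}" -q -p "$wal_dir" "$first" "$last"
)
```

Calling `check_wal_range /path/to/archive` exits with pg_waldump's status, so a non-zero result means that some record in the range could not be read, or that no segments were found.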