[[toc]]
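Scope
If a csnlog file is lost or corrupted, for example because of a file system fault, and no backup or more suitable remedy is available, the method in this document can be tried to start the database as an emergency measure.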
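Problem Overview
When a csnlog file is lost, the database can still start normally and serve reads, but a new transaction raises an error that terminates the database instance.
As shown below, the server reports that the csnlog file pg_csnlog/000000000000 does not exist.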
test=# create table t3(id int);
WARNING: AbortTransaction while in COMMIT state
ERROR: could not access status of transaction 27085 , nextXid is 27086
DETAIL: Could not open file "pg_csnlog/000000000000": No such file or directory.
PANIC: could not access status of transaction 27085 , nextXid is 27086
DETAIL: Could not open file "pg_csnlog/000000000000": No such file or directory.
ERROR: could not access status of transaction 27085 , nextXid is 27086
DETAIL: Could not open file "pg_csnlog/000000000000": No such file or directory.
PANIC: could not access status of transaction 27085 , nextXid is 27086
DETAIL: Could not open file "pg_csnlog/000000000000": No such file or directory.
The connection to the server was lost. Attempting reset: Failed.
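Before repairing anything, it can help to confirm which file the server is actually complaining about. The following is a minimal sketch, not an official tool: `list_missing_files` is a hypothetical helper, and it assumes the messages above were also written to a server log file (whose location depends on the instance) using plain double quotes around the path. It prints each path named in a "Could not open file" message that is absent under the given data directory.

```shell
# Hypothetical helper: list files named in "Could not open file ... No such
# file or directory" messages that do not exist under the data directory.
# Arguments: $1 = server log file (path is an assumption), $2 = data directory.
list_missing_files() {
    log_file=$1
    data_dir=$2
    # Pull the quoted relative path out of each matching DETAIL line,
    # de-duplicate, then keep only the paths that are really missing.
    sed -n 's/.*Could not open file "\([^"]*\)": No such file or directory.*/\1/p' "$log_file" |
        sort -u |
        while IFS= read -r rel; do
            [ -e "$data_dir/$rel" ] || printf '%s\n' "$rel"
        done
}
```

It could be called as `list_missing_files /path/to/server.log data2`, where the log path is a placeholder for the instance's real log file.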
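Root Cause
The csnlog stores the logical commit timestamp of each transaction (the CSN). It is an indispensable part of the transaction system, and losing it prevents the database from running normally.
For the role of the CSN, see this article: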
https://support.enmotech.com/article/1430/publish
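Solution
Only transactions in the range between latestCompletedXid and nextXid need the CSN to determine visibility.
latestCompletedXid: the ID of the last committed transaction
nextXid: the next transaction ID the transaction system will assign (similar to a sequence's last value)
After the database is restarted, latestCompletedXid = nextXid, so historical data is unlikely to need the CSN for visibility checks.
It is therefore enough to manually create a csnlog file filled entirely with zeros; each file is 256 KB.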
dd if=/dev/zero of=data2/pg_csnlog/000000000000 bs=1024 count=256
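The same step can be wrapped with a couple of safety checks. This is a sketch under stated assumptions rather than a supported procedure: `create_zero_csnlog` is a hypothetical helper name, the owner-only file mode is an assumption about what the server expects, and the data directory and segment name must be taken from the actual error message. It refuses to overwrite an existing file and verifies that the result is exactly 256 KB.

```shell
# Hypothetical helper: create a zero-filled 256 KB csnlog segment if absent.
# Arguments: $1 = data directory, $2 = segment file name from the error.
create_zero_csnlog() {
    data_dir=$1
    segment=$2
    target="$data_dir/pg_csnlog/$segment"

    # Never clobber a file that is still there; it may hold valid CSN data.
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing file: $target" >&2
        return 1
    fi

    mkdir -p "$data_dir/pg_csnlog" || return 1
    # Same dd invocation as in the text: 256 blocks of 1024 zero bytes.
    dd if=/dev/zero of="$target" bs=1024 count=256 2>/dev/null || return 1
    # Assumption: restrict the file to its owner, like other data files.
    chmod 600 "$target" || return 1

    # Succeed only if the file has the expected size (256 * 1024 bytes).
    size=$(($(wc -c < "$target")))
    [ "$size" -eq 262144 ]
}
```

For the case described here it would be run as the database OS user with `create_zero_csnlog data2 000000000000`, while the instance is stopped.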
Start the database again:
gs_ctl start -D data2
test=# create table t3(id int);
CREATE TABLE
test=# \dt
                        List of relations
 Schema | Name | Type  | Owner |             Storage
--------+------+-------+-------+----------------------------------
 public | t1   | table | omm   | {orientation=row,compression=no}
 public | t2   | table | omm   | {orientation=row,compression=no}
 public | t3   | table | omm   | {orientation=row,compression=no}
(3 rows)
References
https://support.enmotech.com/article/1430/publish
The source code can also be consulted.