Emergency Recovery When MogDB Loses Its DW (Double Write) Files

Original article by 范计杰, 2022-09-05

[[toc]]

Scope

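When a file system fault causes the double write files to be lost or corrupted, and there is no backup or other more suitable remedy, the method in this document can be tried as an emergency way to start the database.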

Problem Overview

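When the double write feature is enabled (it is enabled by default), the database cannot start once the double write files are lost.
Startup fails with the log below. It reports that the DW file does not exist, and the backtrace shows that the failure occurs during DW initialization (dw_init).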
2022-06-20 16:48:06.687 [unknown] [unknown] localhost 140123442810048 0 0 [BACKEND] LOG: start create thread!
2022-06-20 16:48:06.687 [unknown] [unknown] localhost 140123442810048 0 0 [BACKEND] LOG: create thread end!
2022-06-20 16:48:06.693 [unknown] [unknown] localhost 140122435614464 0 0 [BACKEND] LOG: [Alarm Module]alarm checker started.
2022-06-20 16:48:06.694 [unknown] [unknown] localhost 140122416674560 0 0 [BACKEND] LOG: reaper backend started.
2022-06-20 16:48:06.717 [unknown] [unknown] localhost 140122357491456 0 0 [REDO] LOG: [mpfl_ulink_file]: unlink global/max_page_flush_lsn sucessfully! ret:4294967295
2022-06-20 16:48:06.717 [unknown] [unknown] localhost 140122357491456 0 0 [BACKEND] LOG: StartupXLOG: biggest_lsn_in_page is set to FFFFFFFF/FFFFFFFF, enable_update_max_page_flush_lsn:0
2022-06-20 16:48:06.717 [unknown] [unknown] localhost 140122357491456 0 0 [BACKEND] LOG: database system timeline: 17
2022-06-20 16:48:06.717 [unknown] [unknown] localhost 140122357491456 0 0 [BACKEND] LOG: database system was shut down at 2022-06-20 16:47:32 CST
2022-06-20 16:48:06.720 [unknown] [unknown] localhost 140122357491456 0 0 [DBL_WRT] - [ ] PANIC: batch flush DW file does not exist <<<<<
2022-06-20 16:48:06.720 [unknown] [unknown] localhost 140122357491456 0 0 [DBL_WRT] BACKTRACELOG: tid[2843]'s backtrace:
/opt/og/bin/gaussdb(+0x9f16e2) [0x5625215106e2]
/opt/og/bin/gaussdb(_Z9errfinishiz+0x31c) [0x56252150274c]
/opt/og/bin/gaussdb(_Z25dw_file_check_and_rebuildv+0x128) [0x562521da4c68]
/opt/og/bin/gaussdb(_Z7dw_initb+0x7d) [0x562521da870d]
/opt/og/bin/gaussdb(_Z11StartupXLOGv+0x177b) [0x562521dddfbb]
/opt/og/bin/gaussdb(_Z18StartupProcessMainv+0x1ac) [0x5625219add4c]
/opt/og/bin/gaussdb(_Z26GaussDbAuxiliaryThreadMainIL15knl_thread_role26EEiP14knl_thread_arg+0xe0) [0x5625219a8f20]
/opt/og/bin/gaussdb(_Z17GaussDbThreadMainIL15knl_thread_role26EEiP14knl_thread_arg+0x245) [0x5625219a9185]
/opt/og/bin/gaussdb(+0xe6dc25) [0x56252198cc25]
/lib64/libpthread.so.0(+0x7e65) [0x7f70ffbebe65]
/lib64/libc.so.6(clone+0x6d) [0x7f70ff91488d]
Use addr2line to get pretty function name and line
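
To confirm that a failed startup matches this case, the server log can be searched for the PANIC line shown above. The following is a minimal sketch. The function name find_dw_panic is hypothetical, and the match string is taken from this one log sample, so other versions or the single-page flush path may word the message differently.

```shell
#!/bin/sh
# Hypothetical helper: print PANIC lines that report a missing DW file.
# Exit status is 0 if at least one such line is found, non-zero otherwise.
find_dw_panic() {
    logfile="$1"
    # Fixed-string matching (-F) so the text is not treated as a regex.
    grep -F 'PANIC' "$logfile" | grep -F 'DW file does not exist'
}

# Usage (the log path is an assumption, adjust to your deployment):
# find_dw_panic /path/to/pg_log/latest.log
```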

Root Cause

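At startup the database must initialize double write and, when necessary, use the DW files to recover pages. Because the DW files are missing, this initialization fails and startup aborts.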
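
Before changing anything, it can help to check which DW files are actually absent from the data directory. This is a minimal sketch, assuming the two file names (pg_dw and pg_dw_single under global/) that appear in this article's listing of a freshly initialized cluster. Other versions may use additional or different files, and the function name check_dw_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report DW files that are missing or empty.
# Returns 0 when both files exist and are non-empty, 1 otherwise.
check_dw_files() {
    datadir="$1"
    missing=0
    # File names assumed from the gs_initdb listing in this article.
    for name in pg_dw pg_dw_single; do
        if [ ! -s "$datadir/global/$name" ]; then
            echo "missing or empty: $datadir/global/$name"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_dw_files ./data
```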

Solution

1. Initialize a new temporary cluster and copy the freshly generated DW files into the cluster that needs repair.
$ gs_initdb -D ./tmpdata --nodename test
$ ls tmpdata/global/pg_dw*
tmpdata/global/pg_dw tmpdata/global/pg_dw_single
$ cp ./tmpdata/global/pg_dw* data/global/
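2. Edit the parameter file (postgresql.conf) and set enable_double_write = off.
The database can now be started normally.
Note:
If any pages are fractured (torn by a partial write) and would have needed the DW files for repair, data may be lost.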
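
Step 2 can also be scripted. The sketch below is one possible way to do it, not a vendor-provided tool: it assumes the parameter file is a postgresql.conf-style file of `name = value` lines, and the function name disable_double_write is hypothetical. It keeps a backup copy, drops any active enable_double_write line, and appends the new setting.

```shell
#!/bin/sh
# Hypothetical helper: set enable_double_write = off in a parameter file.
disable_double_write() {
    conf="$1"
    if [ ! -f "$conf" ]; then
        echo "config file not found: $conf" >&2
        return 1
    fi
    # Keep the original so the change can be reverted later.
    cp -p "$conf" "$conf.bak" || return 1
    # Drop uncommented enable_double_write lines; commented ones are kept.
    grep -v -E '^[[:space:]]*enable_double_write[[:space:]=]' "$conf.bak" > "$conf"
    echo "enable_double_write = off" >> "$conf"
}

# Usage (path follows the data directory used in step 1):
# disable_double_write data/postgresql.conf
```

Since this only works around the missing files, the backup copy makes it easy to restore the original setting once the instance is healthy again.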

References

https://docs.mogdb.io/zh/mogdb/v2.1/2-checkpoints#enable_double_write
