暂无图片
暂无图片
8
暂无图片
暂无图片
暂无图片

ADG基于SCN恢复备库(修复GAP)

原创 virvle 2024-05-16
1336

环境:主备基于11.2.0.4版本

一、备库遇到未同步问题:

前几天查看备库的数据,发现数据不是最新的,最新的还停留在4月中,而查询同步延时为0分钟,难怪一直没有同步异常告警
进一步排查,发现是归档日志未从主库传输到备库,发现LOG_ARCHIVE_DEST_STATE_2=defer 。。。。我的天啊

于是尝试利用主库备份的归档日志,传输到备库进行应用,而这过程中,发现缺失了一部分归档日志,触发GAP

Mon May 13 14:02:06 2024 Fetching gap sequence in thread 1, gap sequence 452894-452993 Mon May 13 14:04:00 2024 FAL[client]: Failed to request gap sequence GAP - thread 1 sequence 452894-452993 DBID 2198540012 branch 909675799 FAL[client]: All defined FAL servers have been attempted. ------------------------------------------------------------ Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that's sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps. ------------------------------------------------------------

至此,使用归档日志应用恢复,已经无望了。
那么接下来,通常做法是要么备库铲掉重做,要么是基于SCN增量恢复,各有优缺点:
(1)铲掉重做:若是数据量特别大,历时较长
(2)基于SCN恢复:速度快,但增备可能对主库造成一定的性能影响

二、综合考虑,使用基于SCN 恢复备库

1. 关闭备库应用,并查询当前备库SCN

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; Database altered. SQL> select to_char(current_scn) from v$database; TO_CHAR(CURRENT_SCN) ---------------------------------------- 17626998155431

2. 重启备库到nomount【一定要确认IP,谨慎!再谨慎!】

sqlplus / as sysdba shutdown immediate; startup nomount;

3. 在主库查询备库的SCN到当前时间,是否有新增数据文件 (使用第一步获取到的SCN)

select file#,to_char(creation_time,'yyyy-mm-dd hh24:mi:ss' ) creation_time,status,last_time,name from v$datafile where creation_change#>17626998155431;

4. 备份控制文件

alter database create standby controlfile as '/databak/scnbak/standby211_20240515.ctl';

5. 主库基于scn增量备份(需考虑备份对服务器的影响)

5.1 备份脚本:

#!/bin/bash . $HOME/.bash_profile DATE=`date +%y%m%d%H%M`;export DATE RMAN_LOG_FILE=/databak/scnbak/dbbackup_$DATE.log rman target / nocatalog msglog $RMAN_LOG_FILE <<EOF run { allocate channel ch1 type disk; allocate channel ch2 type disk; allocate channel ch3 type disk; allocate channel ch4 type disk; backup as backupset incremental from scn 17626998155431 database filesperset 5 tag='incr_scn_bak' format '/databak/scnbak/incre_%U_%T'; release channel ch1; release channel ch2; release channel ch3; release channel ch4; } EOF DATE=`date +%y%m%d`;export DATE

5.2 为了让其在业务低峰期执行,定了个计划任务执行shell脚本

crontab -l 01 04 * * * /bin/sh /mnt/incre_dbbackup.sh

5.3 查看备份日志情况 (备份耗时1小时15分钟)

$ cat dbbackup_2405160928.log|grep "comp" channel ch4: backup set complete, elapsed time: 00:08:35 channel ch2: backup set complete, elapsed time: 00:12:12 channel ch3: backup set complete, elapsed time: 00:13:27 channel ch1: backup set complete, elapsed time: 00:15:13 channel ch4: backup set complete, elapsed time: 00:07:22 channel ch2: backup set complete, elapsed time: 00:08:32 channel ch1: backup set complete, elapsed time: 00:07:16 channel ch4: backup set complete, elapsed time: 00:06:56 channel ch3: backup set complete, elapsed time: 00:10:23 channel ch4: backup set complete, elapsed time: 00:05:52 channel ch2: backup set complete, elapsed time: 00:08:18 channel ch1: backup set complete, elapsed time: 00:08:38 channel ch3: backup set complete, elapsed time: 00:07:31 channel ch2: backup set complete, elapsed time: 00:07:15 channel ch4: backup set complete, elapsed time: 00:07:39 channel ch3: backup set complete, elapsed time: 00:05:48 channel ch1: backup set complete, elapsed time: 00:08:18 channel ch4: backup set complete, elapsed time: 00:08:55 channel ch2: backup set complete, elapsed time: 00:09:47 channel ch3: backup set complete, elapsed time: 00:09:51 channel ch1: backup set complete, elapsed time: 00:11:12 channel ch4: backup set complete, elapsed time: 00:06:42 channel ch3: backup set complete, elapsed time: 00:05:58 channel ch2: backup set complete, elapsed time: 00:07:50 channel ch1: backup set complete, elapsed time: 00:06:04 channel ch3: backup set complete, elapsed time: 00:06:55 channel ch2: backup set complete, elapsed time: 00:08:06 channel ch4: backup set complete, elapsed time: 00:10:44 channel ch1: backup set complete, elapsed time: 00:08:43 channel ch2: backup set complete, elapsed time: 00:06:57 channel ch4: backup set complete, elapsed time: 00:08:47 channel ch3: backup set complete, elapsed time: 00:11:43 channel ch4: backup set complete, elapsed time: 00:00:03 channel ch2: backup set complete, elapsed time: 00:05:32 channel ch1: backup set complete, elapsed time: 00:10:07 Recovery Manager complete.

6. 传输主库备份的控制文件及增量备份文件到备库

scp -rp -P22 /databak/scnbak/* oracle@10.10.10.2:/databak/scnbak/

7. 备库恢复控制文件

rman target / RMAN> restore controlfile from '/databak/scnbak/standby211_20240515.ctl'; Starting restore at 2024:05:16 11:08:14 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=1135 device type=DISK channel ORA_DISK_1: copied control file copy output file name=/mnt/oradata/control01.ctl output file name=/mnt/oradata/control02.ctl Finished restore at 2024:05:16 11:08:21

8. 启动备库到mount状态,并注册备份文件

rman target / alter database mount; catalog start with '/databak/scnbak';

9备库恢复新增文件,若第3部位查询无新增文件,跳过此步(本次恢复中未涉及,跳过此步骤)

run{ allocate channel ch1 type disk; allocate channel ch2 type disk; backup datafile 200,201,202 format '/databak/scnbak/incre_dbfile_%U'; release channel ch1; release channel ch2; }

10. 查看备份集

report schema; RMAN> list incarnation; List of Database Incarnations DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time ------- ------- -------- ---------------- --- ---------- ---------- 1 1 ORCL 2198540012 PARENT 1 2009:08:15 00:16:43 2 2 ORCL 2198540012 CURRENT 945184 2016:04:20 15:43:19

11. 恢复备库,鉴于资源充足,直接开8通道,即时运行(耗时13分钟)

11.1 恢复脚本 rman_recover_incre_scn.sh

#!/bin/bash . $HOME/.bash_profile DATE=`date +%y%m%d%H`;export DATE RMAN_RECOVER_LOG_FILE=/mnt/script/recover_script/recover_incre_scn_$DATE.log rman target / nocatalog msglog $RMAN_RECOVER_LOG_FILE <<EOF run { allocate channel ch1 type disk; allocate channel ch2 type disk; allocate channel ch3 type disk; allocate channel ch4 type disk; allocate channel ch5 type disk; allocate channel ch6 type disk; allocate channel ch7 type disk; allocate channel ch8 type disk; recover database noredo; release channel ch1; release channel ch2; release channel ch3; release channel ch5; release channel ch6; release channel ch7; release channel ch8; } EOF DATE=`date +%y%m%d`;export DATE

11.2 执行恢复脚本

nohup sh rman_recover_incre_scn.sh &

PS: 到此,若直接open数据库,会提示报错

12. 启动备库 mgr 应用日志

12.1 启动日志实时应用

alter database recover managed standby database using current logfile disconnect from session;

12.2 查看redo日志,观察日志应用情况

Thu May 16 12:33:58 2024 Managed Standby Recovery starting Real Time Apply Parallel Media Recovery started with 8 slaves Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Media Recovery Waiting for thread 1 sequence 453056 Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION Thu May 16 12:37:01 2024 Primary database is in MAXIMUM PERFORMANCE mode SRL log 20 needs clearing because log has not been created SRL log 21 needs clearing because log has not been created SRL log 22 needs clearing because log has not been created RFS[1]: Assigned to RFS process 19706 RFS[1]: Selected log 23 for thread 1 sequence 453058 dbid -2108412354branch 909675799 Thu May 16 12:37:01 2024 SRL log 20 needs clearing because log has not been created SRL log 21 needs clearing because log has not been created SRL log 22 needs clearing because log has not been created RFS[2]: Assigned to RFS process 19712 RFS[2]: Selected log 24 for thread 1 sequence 453057 dbid -2108412354 branch 909675799 Thu May 16 12:37:03 2024 Fetching gap sequence in thread 1, gap sequence 453056-453056 Thu May 16 12:37:04 2024 RFS[3]: Assigned to RFS process 19714 RFS[3]: Opened log for thread 1 sequence 453056 dbid -2108412354branch 909675799 Thu May 16 12:37:06 2024 Archived Log entry 1 added for thread 1 sequence 453057 ID 0x825fe694 dest 1: Thu May 16 12:37:14 2024 SRL log 20 needs clearing because log has not been created SRL log 21 needs clearing because log has not been created SRL log 22 needs clearing because log has not been created RFS[1]: Selected log 24 for thread 1 sequence 453059 dbid -2108412354 branch 909675799 Thu May 16 12:37:14 2024 Archived Log entry 2 added for thread 1 sequence 453058 ID 0x825fe694 dest 1: SRL log 20 needs clearing because log has not been created SRL log 21 needs clearing because log has not been created SRL log 22 needs clearing because log has not been created RFS[1]: Selected log 23 for thread 1 sequence 453060 dbid -2108412354 branch 909675799 Thu May 16 12:37:19 2024 Archived Log entry 3 added for thread 1 sequence 453059 ID 0x825fe694 dest 1: Thu May 16 12:37:49 2024 Archived Log entry 4 added for thread 1 sequence 453056 rlc 909675799 ID 0x825fe694 dest 2: Thu May 16 12:37:50 2024 Media Recovery Log /mnt/archivelog/ARC0000453056_0909675799.0001.arc Thu May 16 12:38:20 2024 Media Recovery Log /mnt/archivelog/ARC0000453057_0909675799.0001.arc Media Recovery Log /mnt/archivelog/ARC0000453058_0909675799.0001.arc Media Recovery Log /mnt/archivelog/ARC0000453059_0909675799.0001.arc Media Recovery Waiting for thread 1 sequence 453060 (in transit) Recovery of Online Redo Log: Thread 1 Group 23 Seq 453060 Reading mem 0 Mem# 0: /mnt/oradata/standby_redo23.log

12.3 当再行日志到达当前最新时,取消日志应用,打开数据库

SQL> alter database recover managed standby database cancel; Database altered. SQL> alter database open; Database altered. SQL> alter database recover managed standby database using current logfile disconnect; Database altered. SQL> select open_mode from v$database; OPEN_MODE -------------------- READ ONLY WITH APPLY

13. 主库重新打开归档同步日志同步

sqlplus / as sysdba alter system set log_archive_dest_state_2=defer; alter system set log_archive_dest_state_2=enable;

14. 查看主备同步情况

select * from v$dataguard_stats;

image.png

三、常用SQL

1. 查询备库的进程状态

select process, status,sequence#,thread# from v$managed_standby; select message_num,error_code,timestamp,message from v$dataguard_status;

2. 查询备库的角色

set linesize 450; column DBNAME format a10; column DBUNAME format a10; column cftype format a10; column OPEN_MODE format a25; column DATABASE_ROLE format a20; select name dbname,db_unique_name dbuname,controlfile_type cftype,database_role,open_mode from v$database;

3. 查询备库的日志应用模式

SQL> select recovery_mode from v$archive_dest_status where dest_id=1; RECOVERY_MODE ----------------------- MANAGED REAL TIME APPLY

4. 查看恢复进程

set lines 450 set pages 900 col type for a15 col item for a20 col units for a15 select to_char(start_time, 'DD-MON-RR HH24:MI:SS') start_time, type, item, units, sofar, total, timestamp from v$recovery_progress;

5. 查看当前归档日志应用情况

select a.thread#, b.max_available, a.max_applied, b.max_available - a.max_applied from (select thread#, max(sequence#) max_applied from gv$archived_log where applied = 'YES' group by thread#) a, (select thread#, max(sequence#) max_available from gv$archived_log group by thread#) b where a.thread# = b.thread#;

四. 常见报错

1. 主库生成用于备库的控制文件报错,提示目录不存在

SQL> alter database create standby controlfile as '/scnbak/standby211_20240515.ctl'; alter database create standby controlfile as '/scnbak/standby211_20240515.ctl' * ERROR at line 1: ORA-01580: error creating control backup file /scnbak/standby211_20240515.ctl ORA-27040: file create error, unable to create file Linux-x86_64 Error: 2: No such file or directory Additional information: 1

处理措施:新建目录,或选择正确的目录存放

2. 注册备份目录遇到报错

File Name: /databak/scnbak/standby211_20240515.ctl

List of Files Which Where Not Cataloged
=======================================
File Name: /databak/scnbak/dbbackup_24051609.log
  RMAN-07517: Reason: The file header is corrupted
File Name: /databak/scnbak/dbbackup_2405160928.log
  RMAN-07517: Reason: The file header is corrupted

处理措施:查看非备份文件,忽略即可

3. recover完成后,打开报错(先应用归档日志,再开启即可)

SQL> alter database open ; alter database open * ERROR at line 1: ORA-10458: standby database requires recovery ORA-01152: file 1 was not restored from a sufficiently old backup ORA-01110: data file 1: '/mnt/oradata/system01.dbf'

处理措施:先应用日志,开启mgr,待同步后,再行打开
– 查看归档日志

Errors in file /mnt/oracle/diag/rdbms/vir_st/vir/trace/vir_ora_17866.trc: ORA-16016: archived log for thread 1 sequence# 453056 unavailable Recovery interrupted! Completed Standby Crash Recovery. Signalling error 1152 for datafile 1! Errors in file /mnt/oracle/diag/rdbms/vir_st/vir/trace/vir_ora_17866.trc: ORA-10458: standby database requires recovery ORA-01152: file 1 was not restored from a sufficiently old backup ORA-01110: data file 1: '/mnt/oradata/system01.dbf' ORA-10458 signalled during: alter database open ...

到此,已恢复备库的实时。定期检查很重要

最后修改时间:2024-05-17 10:37:51
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论