暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

新鲜出炉—oracle在线重做日志redo损坏后的处理办法

原创 cqiwen 2021-12-13
3788
数据库未开归档,current redo损坏,数据库无法启动的解决方法:
oracle中redo文件损坏时有几种情形
1:如果损坏的是非当前redo文件而且已经归档:
此时可以使用alter database clear logfile group xx 清空该日志文件。

2:如果损坏的是非当前redo文件,并且该文件组没有归档:
此时使用alter database clear unarchived logfile group xx;  强行清空日志

3:归档模式下当前日志的损坏有两种情况:
3.1 数据库是正常关闭,日志文件中没有未决的事务需要实例恢复,当前日志组的损坏就可以直接用alter database clear unarchived logfile group n来重建。

3.2 日志组中有活动的事务,数据库需要媒体恢复,日志组需要用来同步,有两种补救办法
A:最好的办法就是通过不完全恢复,可以保证数据库的一致性,但是这种办法要求在归档方式下,并且有可用的备份
B:通过强制性恢复,但是可能导致数据库不一致

下面的情况假定当前日志损坏,并且没有备份,所以只有采取强制性恢复:

环境模拟
SQL> alter tablespace tbs_cqiwen02 add datafile '/opt/oracle/app/oradata/orcl/cqiwen02.dbf' size 100m autoextend on;
Tablespace altered.

SQL> select name,checkpoint_change# from v$datafile_header;

NAME                                                   CHECKPOINT_CHANGE#
-------------------------------------------------- ------------------
/opt/oracle/app/oradata/orcl/system01.dbf                      1010862
/opt/oracle/app/oradata/orcl/sysaux01.dbf                      1010862
/opt/oracle/app/oradata/orcl/undotbs01.dbf                      1010862
/opt/oracle/app/oradata/orcl/users01.dbf                      1010862
/opt/oracle/app/oradata/orcl/cqiwen02.dbf                      1017522

6 rows selected.

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE          MEMBERS ARC STATUS               FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ ---------
         1            1               1   52428800           512                1 NO  CURRENT                     1010450 13-DEC-21         2.8147E+14
         2            1               0   52428800           512                1 YES UNUSED                           0                          0
         3            1               0   52428800           512                1 YES UNUSED                           0                          0

SQL> select * from v$logfile;

3           ONLINE  /opt/oracle/app/oradata/orcl/redo03.log  NO
2           ONLINE  /opt/oracle/app/oradata/orcl/redo02.log  NO
1           ONLINE  /opt/oracle/app/oradata/orcl/redo01.log  NO

人为破坏当前redo日志文件:
SQL> !mv /opt/oracle/app/oradata/orcl/redo01.log /opt/oracle/app/oradata/orcl/redo01.log.bak
SQL> shutdown immediate
SQL> startup
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                    2254952 bytes
Variable Size                  671090584 bytes
Database Buffers         1459617792 bytes
Redo Buffers                    4923392 bytes
Database mounted.
ORA-03113: end-of-file on communication channel
Process ID: 3795
Session ID: 580 Serial number: 5

alter日志中报错:
ORA-00313: open failed for members of log group 1 of thread
ORA-00312: online log 1 thread 1: '/opt/oracle/app/oradata/orcl/redo01.log'
USER (ospid: 3795): terminating the instance due to error 313
System state dump requested by (instance=1, osid=3795), summary=[abnormal instance termination]

SQL> startup mount
SQL> alter database backup controlfile to trace as '/tmp/control.trc';
SQL> !less /tmp/control.trc

重建归档文件(由于redo损坏了,所以重建时要用RESETLOGS,以使其自动重建redo):
将CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS NOARCHIVELOG 中的NORESETLOGS 改为 RESETLOGS。
shutdown immediate
startup nomount

CREATE CONTROLFILE REUSE DATABASE "ORCL" RESETLOGS NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 '/opt/oracle/app/oradata/orcl/redo01.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 '/opt/oracle/app/oradata/orcl/redo02.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 '/opt/oracle/app/oradata/orcl/redo03.log'  SIZE 50M BLOCKSIZE 512
-- STANDBY LOGFILE
DATAFILE
  '/opt/oracle/app/oradata/orcl/system01.dbf',
  '/opt/oracle/app/oradata/orcl/sysaux01.dbf',
  '/opt/oracle/app/oradata/orcl/undotbs01.dbf',
  '/opt/oracle/app/oradata/orcl/users01.dbf',
  '/opt/oracle/app/oradata/orcl/cqiwen02.dbf'
CHARACTER SET AL32UTF8;

1)首先尝试使用隐含参数跳过检测
startup mount;
ALTER SYSTEM SET "_allow_resetlogs_corruption"=TRUE SCOPE=SPFILE;
shutdown immediate
startup
发现仍然启动失败

2)恢复数据库:
SQL> recover database using backup controlfile until cancel;
ORA-00279: change 1004329 generated at 12/10/2021 17:04:42 needed for thread 1
ORA-00289: suggestion : /opt/oracle/app/fast_recovery_area/ORCL/archivelog/2021_12_13/o1_mf_1_7_%u_.arc
ORA-00280: change 1004329 for thread 1 is in sequence #7

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/opt/oracle/app/oradata/orcl/system01.dbf'
ORA-01112: media recovery not started

resetlogs启动数据库:
SQL> alter database open resetlogs;
有时候到此步就已经成功了,数据库能正常open。

但我在实验时出现启动数据库失败:
alter database open resetlogs
*
ERROR at line 1:
ORA-01248: file 6 was created in the future of incomplete recovery
ORA-01110: data file 6: '/opt/oracle/app/oradata/orcl/cqiwen02.dbf'
因为cqiwen02.dbf创建好后没有做checkpoint,导致这个数据文件以及文件头中的的SCN比其它数据文件的都大。
因此系统认为其它数据文件也需要做recovery来追平至最大的SCN。

3)同时由于本人误操作,先将数据文件6 offline,然后进行了resetlogs,正常open后,我尝试将数据文件6 online,此时导致新的问题出现:
ERROR at line 1:
ORA-01190: control file or data file 6 is from before the last RESETLOGS
ORA-01110: data file 6: '/opt/oracle/app/oradata/orcl/cqiwen02.dbf'
意思是数据文件6是在resetlogs之前的文件,无法使用。网上有人说可以重新设置参照物为上一个版本。但由于我这个库没有开归档,最早的参照物是一年前的,已经无法使用。
因此,需要使用其它方法。

启动数据库:
SQL> startup mount;
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                    2254952 bytes
Variable Size                  671090584 bytes
Database Buffers         1459617792 bytes
Redo Buffers                    4923392 bytes
Database mounted.

SQL> RECOVER UNTIL CANCEL;
Media recovery complete.

SQL> alter database datafile 6 online;
Database altered.

SQL> ALTER DATABASE OPEN RESETLOGS;
ALTER DATABASE OPEN RESETLOGS
*
ERROR at line 1:
ORA-01190: control file or data file 6 is from before the last RESETLOGS
ORA-01110: data file 6: '/opt/oracle/app/oradata/orcl/cqiwen02.dbf'

4)将数据文件6离线,开启隐含参数并追SCN
重建控制文件,在控制文件中不指定datafile 6:
CREATE CONTROLFILE REUSE DATABASE "ORCL" RESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 '/opt/oracle/app/oradata/orcl/redo01.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 '/opt/oracle/app/oradata/orcl/redo02.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 '/opt/oracle/app/oradata/orcl/redo03.log'  SIZE 50M BLOCKSIZE 512
-- STANDBY LOGFILE
DATAFILE
  '/opt/oracle/app/oradata/orcl/system01.dbf',
  '/opt/oracle/app/oradata/orcl/sysaux01.dbf',
  '/opt/oracle/app/oradata/orcl/undotbs01.dbf',
  '/opt/oracle/app/oradata/orcl/users01.dbf'
CHARACTER SET AL32UTF8;

SQL> alter database datafile 6 offline;
alter database datafile 6 offline
*
ERROR at line 1:
ORA-01145: offline immediate disallowed unless media recovery enabled

SQL> alter database datafile 6 offline drop;
Database altered.

SQL> ALTER SYSTEM SET "_allow_resetlogs_corruption"=TRUE SCOPE=SPFILE;
System altered.

SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                    2254952 bytes
Variable Size                  671090584 bytes
Database Buffers         1459617792 bytes
Redo Buffers                    4923392 bytes
Database mounted.
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open

SQL> alter database open noresetlogs;
Database altered.

SQL> ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME ADJUST_SCN LEVEL 1';  --追SCN
Session altered.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup mount
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                    2254952 bytes
Variable Size                  671090584 bytes
Database Buffers         1459617792 bytes
Redo Buffers                    4923392 bytes
Database mounted.

SQL> SELECT * FROM V$RECOVER_FILE;

FILE# ONLINE    ONLINE_ERROR                              CHANGE#   TIME
----- ------  ------------------------  -------  -------
6    OFFLINE  OFFLINE UNKNOWN ERROR                        1004679  10-DEC-21

SQL> RECOVER UNTIL CANCEL;
Media recovery complete.

SQL> ALTER DATABASE DATAFILE 6 online;
Database altered.
如果提示文件不存在,则需要将6号文件rename到正确的路径
alter database datafile 'XXX' rename to 'XXX';

SQL> ALTER DATABASE OPEN RESETLOGS;   --注:一定要在6号文件online的情况下才能做resetlogs,否则又会出现之前的问题。
Database altered.

查询文件头SCN(启动SCN)和数据文件SCN:
col name for a50
set lin 200 pagesize 300
select name,checkpoint_change# from v$datafile_header;
select name,checkpoint_change# from v$datafile;

将修改的隐含参数改回默认值:
ALTER SYSTEM SET "_allow_resetlogs_corruption"=false SCOPE=SPFILE;
最后修改时间:2021-12-13 17:36:04
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

文章被以下合辑收录

评论