控制文件中记录着数据库的数据文件、日志文件、备份数据等信息,更为重要的,控制文件中还记录了数据库的检查点和SCN信息,这些信息在数据恢复的过程中将起到关键性作用。
一个正常运行的数据库,通常控制文件都存在多份镜像,这些镜像的内容是完全相同的,Oracle缺省就创建多份控制文件更说明了控制文件的重要:
SQL> select name from v$controlfile; NAME --------------------------------------------------------------- /data2/ora10g/oradata/mars/control01.ctl /data2/ora10g/oradata/mars/control02.ctl /data2/ora10g/oradata/mars/control03.ctl
复制
可以通过如下一条命令将控制文件的创建语句备份到跟踪文件中:
SQL> alter database backup controlfile to trace; Database altered. SQL> @gettrcname TRACE_FILE_NAME ---------------------------------------------------------------- /data2/ora10g/admin/mars/udump/mars_ora_22847.trc
复制
此时跟踪文件中会记录控制文件的创建脚本,脚本包含两个主要的段落,其中一段如下所示:
STARTUP NOMOUNT CREATE CONTROLFILE REUSE DATABASE "MARS" RESETLOGS ARCHIVELOG MAXLOGFILES 16 MAXLOGMEMBERS 3 MAXDATAFILES 100 MAXINSTANCES 8 MAXLOGHISTORY 454 LOGFILE GROUP 1 '/data2/ora10g/oradata/mars/redo01.log' SIZE 10M, GROUP 2 '/data2/ora10g/oradata/mars/redo02.log' SIZE 10M, GROUP 3 '/data2/ora10g/oradata/mars/redo03.log' SIZE 10M -- STANDBY LOGFILE DATAFILE '/data2/ora10g/oradata/mars/system01.dbf', '/data2/ora10g/oradata/mars/undotbs01.dbf', '/data2/ora10g/oradata/mars/sysaux01.dbf', '/data2/ora10g/oradata/mars/users01.dbf' CHARACTER SET ZHS16GBK ;
复制
当在数据库Nomount状态下,可以通过运行这段脚本创建控制文件,控制文件会自动创建到参数文件中记录的控制文件位置(原来的控制文件在创建过程中会被覆盖)。这里需要理解的一个主要选项是:NORESETLOGS/RESETLOGS。
在跟踪文件中包含如下注释,详细解释了这两个选项的含义:
-- Below are two sets of SQL statements, each of which creates a new -- control file and uses it to open the database. The first set opens -- the database with the NORESETLOGS option and should be used only if -- the current versions of all online logs are available. The second -- set opens the database with the RESETLOGS option and should be used -- if online logs are unavailable.
复制
当数据库当前的Redo Log都可用时,可以通过NORESETLOGS参数重建控制文件,此时Oracle能够从日志文件中读取Redo信息,记录到控制文件中,由于Redo中记录的信息足以重演所有提交成功的事务,所以最终能够实现完全恢复,成功打开数据库,这时的数据库就如同进行了一次断电之后的实例恢复,数据没有损失,重做日志可以继续向前写入。
下面通过一个测试来看一下这个过程。
首先在数据库正常运行状态,可以执行一次控制文件转储:
SQL> alter session set events 'immediate trace name CONTROLF level 8'; Session altered.
复制
这个转储文件中将包含数据库的检查点、Redo Thread信息、数据文件等信息,引用一点LOG FILE RECORDS内容:
*************************************************************************** LOG FILE RECORDS *************************************************************************** (size = 72, compat size = 72, section max = 16, section in-use = 3, last-recid= 0, old-recno = 0, last-recno = 0) (extent = 1, blkno = 9, numrecs = 16) LOG FILE #1: (name #3) /data2/ora10g/oradata/mars/redo01.log Thread 1 redo log links: forward: 2 backward: 0 siz: 0x5000 seq: 0x00001225 hws: 0x3 bsz: 512 nab: 0x3749 flg: 0x1 dup: 1 Archive links: fwrd: 0 back: 0 Prev scn: 0x0004.6c273a84 Low scn: 0x0004.6c279db0 03/28/2007 16:15:38 Next scn: 0x0004.6c27b252 03/28/2007 16:15:50 LOG FILE #2: (name #2) /data2/ora10g/oradata/mars/redo02.log Thread 1 redo log links: forward: 3 backward: 1 siz: 0x5000 seq: 0x00001226 hws: 0x1 bsz: 512 nab: 0xffffffff flg: 0x8 dup: 1 Archive links: fwrd: 0 back: 0 Prev scn: 0x0004.6c279db0 Low scn: 0x0004.6c27b252 03/28/2007 16:15:50 Next scn: 0xffff.ffffffff 01/01/1988 00:00:00 LOG FILE #3: (name #1) /data2/ora10g/oradata/mars/redo03.log Thread 1 redo log links: forward: 0 backward: 2 siz: 0x5000 seq: 0x00001224 hws: 0x5 bsz: 512 nab: 0x1eb0 flg: 0x1 dup: 1 Archive links: fwrd: 0 back: 0 Prev scn: 0x0004.6c270b1c Low scn: 0x0004.6c273a84 03/28/2007 14:00:41 Next scn: 0x0004.6c279db0 03/28/2007 16:15:38
复制
注意redo02.log文件是当前的日志文件。
接下来通过shutdown abort模拟一个数据库故障:
SQL> shutdown abort; ORACLE instance shut down.
复制
然后启动数据库到nomount状态,重建控制文件:
SQL> startup nomount; ORACLE instance started. SQL> CREATE CONTROLFILE REUSE DATABASE "MARS" NORESETLOGS ARCHIVELOG 2 MAXLOGFILES 16 3 MAXLOGMEMBERS 3 4 MAXDATAFILES 100 5 MAXINSTANCES 8 6 MAXLOGHISTORY 454 7 LOGFILE 8 GROUP 1 '/data2/ora10g/oradata/mars/redo01.log' SIZE 10M, 9 GROUP 2 '/data2/ora10g/oradata/mars/redo02.log' SIZE 10M, 10 GROUP 3 '/data2/ora10g/oradata/mars/redo03.log' SIZE 10M 11 -- STANDBY LOGFILE 12 DATAFILE 13 '/data2/ora10g/oradata/mars/system01.dbf', 14 '/data2/ora10g/oradata/mars/undotbs01.dbf', 15 '/data2/ora10g/oradata/mars/sysaux01.dbf', 16 '/data2/ora10g/oradata/mars/users01.dbf' 17 CHARACTER SET ZHS16GBK 18 ;
复制
此时再来转储一次控制文件,检查LOG FILE RECORD部分:
*************************************************************************** LOG FILE RECORDS *************************************************************************** (size = 72, compat size = 72, section max = 16, section in-use = 3, last-recid= 0, old-recno = 0, last-recno = 0) (extent = 1, blkno = 9, numrecs = 16) LOG FILE #1: (name #2) /data2/ora10g/oradata/mars/redo01.log Thread 1 redo log links: forward: 2 backward: 0 siz: 0x5000 seq: 0x00001225 hws: 0x3 bsz: 512 nab: 0x3749 flg: 0x0 dup: 1 Archive links: fwrd: 2 back: 3 Prev scn: 0x0004.6c273a84 Low scn: 0x0004.6c279db0 03/28/2007 16:15:38 Next scn: 0x0004.6c27b252 03/28/2007 16:15:50 LOG FILE #2: (name #1) /data2/ora10g/oradata/mars/redo02.log Thread 1 redo log links: forward: 3 backward: 1 siz: 0x5000 seq: 0x00001226 hws: 0x1 bsz: 512 nab: 0xffffffff flg: 0xa dup: 1 Archive links: fwrd: 0 back: 1 Prev scn: 0x0004.6c279db0 Low scn: 0x0004.6c27b252 03/28/2007 16:15:50 Next scn: 0xffff.ffffffff 01/01/1988 00:00:00 LOG FILE #3: (name #3) /data2/ora10g/oradata/mars/redo03.log Thread 1 redo log links: forward: 0 backward: 2 siz: 0x5000 seq: 0x00001224 hws: 0x5 bsz: 512 nab: 0x1eb0 flg: 0x0 dup: 1 Archive links: fwrd: 1 back: 0 Prev scn: 0x0000.00000000 Low scn: 0x0004.6c273a84 03/28/2007 14:00:41 Next scn: 0x0004.6c279db0 03/28/2007 16:15:38
复制
注意到重建的控制文件能够从当前的日志文件获得正确的SCN及时间点等信息。同样地,控制文件也能够从数据文件中获得详细的检查点信息等:
DATA FILE #1: (name #7) /data2/ora10g/oradata/mars/system01.dbf creation size=0 block size=8192 status=0x12 head=7 tail=7 dup=1 tablespace 0, index=1 krfil=1 prev_file=0 unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00 Checkpoint cnt:4673 scn: 0x0004.6c27b252 03/28/2007 16:15:50 Stop scn: 0xffff.ffffffff 03/28/2007 16:22:38 Creation Checkpointed at scn: 0x0000.00000006 06/09/2005 14:39:29 thread:0 rba:(0x0.0.0)
复制
由于数据库是异常关闭的,所以数据文件的Stop Scn为无穷大:Stop scn: 0xffff.ffffffff。接下来对数据库执行恢复,然后再来记录控制文件信息:
SQL> RECOVER DATABASE; Media recovery complete. SQL> alter session set events 'immediate trace name CONTROLF level 8'; Session altered.
复制
观察此时的数据文件信息:
*************************************************************************** DATA FILE RECORDS *************************************************************************** (size = 180, compat size = 180, section max = 100, section in-use = 4, last-recid= 0, old-recno = 0, last-recno = 0) (extent = 1, blkno = 10, numrecs = 100) DATA FILE #1: (name #7) /data2/ora10g/oradata/mars/system01.dbf creation size=0 block size=8192 status=0x2 head=7 tail=7 dup=1 tablespace 0, index=1 krfil=1 prev_file=0 unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00 Checkpoint cnt:4674 scn: 0x0004.6c27c3b4 03/28/2007 16:21:09 Stop scn: 0x0004.6c27c3b4 03/28/2007 16:21:09 Creation Checkpointed at scn: 0x0000.00000006 06/09/2005 14:39:29 thread:0 rba:(0x0.0.0)
复制
经过恢复之后,数据文件达到了一致状态,checkpoint scn (0x0004.6c27c3b4)和Stop SCN(0x0004.6c27c3b4)达到了一致。此时数据库就完成了恢复,数据库可以顺利启动:
SQL> alter database open; Database altered.
复制
再来看一下使用RESETLOGS的方式重建控制文件:
SQL> CREATE CONTROLFILE REUSE DATABASE "MARS" RESETLOGS ARCHIVELOG 2 MAXLOGFILES 16 3 MAXLOGMEMBERS 3 4 MAXDATAFILES 100 5 MAXINSTANCES 8 6 MAXLOGHISTORY 454 7 LOGFILE 8 GROUP 1 '/data2/ora10g/oradata/mars/redo01.log' SIZE 10M, 9 GROUP 2 '/data2/ora10g/oradata/mars/redo02.log' SIZE 10M, 10 GROUP 3 '/data2/ora10g/oradata/mars/redo03.log' SIZE 10M 11 -- STANDBY LOGFILE 12 DATAFILE 13 '/data2/ora10g/oradata/mars/system01.dbf', 14 '/data2/ora10g/oradata/mars/undotbs01.dbf', 15 '/data2/ora10g/oradata/mars/sysaux01.dbf', 16 '/data2/ora10g/oradata/mars/users01.dbf' 17 CHARACTER SET ZHS16GBK 18 ; Control file created.
复制
此时观察控制文件的信息,摘录一段LOG FILE RECORD的信息:
*************************************************************************** LOG FILE RECORDS *************************************************************************** (size = 72, compat size = 72, section max = 16, section in-use = 3, last-recid= 0, old-recno = 0, last-recno = 0) (extent = 1, blkno = 9, numrecs = 16) LOG FILE #1: (name #3) /data2/ora10g/oradata/mars/redo01.log Thread 1 redo log links: forward: 2 backward: 0 siz: 0x5000 seq: 0x00000000 hws: 0x0 bsz: 512 nab: 0x0 flg: 0x1 dup: 1 Archive links: fwrd: 0 back: 0 Prev scn: 0x0000.00000000 Low scn: 0x0000.00000000 01/01/1988 00:00:00 Next scn: 0x0000.00000000 01/01/1988 00:00:00 LOG FILE #2: (name #2) /data2/ora10g/oradata/mars/redo02.log Thread 1 redo log links: forward: 3 backward: 1 siz: 0x5000 seq: 0x00000000 hws: 0x0 bsz: 512 nab: 0x0 flg: 0x1 dup: 1 Archive links: fwrd: 0 back: 0 Prev scn: 0x0000.00000000 Low scn: 0x0000.00000000 01/01/1988 00:00:00 Next scn: 0x0000.00000000 01/01/1988 00:00:00 LOG FILE #3: (name #1) /data2/ora10g/oradata/mars/redo03.log Thread 1 redo log links: forward: 0 backward: 2 siz: 0x5000 seq: 0x00000000 hws: 0x0 bsz: 512 nab: 0x2 flg: 0xb dup: 1 Archive links: fwrd: 0 back: 0 Prev scn: 0x0000.00000000 Low scn: 0x0000.00000000 01/01/1988 00:00:00 Next scn: 0x0000.00000000 01/01/1988 00:00:00
复制
注意到此时控制文件中日志信息都是空的,Oracle认为RESETLOGS方式下,当前的日志文件已经损失,那么就意味着Oracle可能丢失提交成功的数据,恢复将是一次不完全的介质恢复。
不完全恢复最终要求数据库通过resetlogs方式打开,resetlogs将会强制清空或重建在线日志文件。此时执行恢复必须使用backup controlfile选项,否则将会报出如下错误:
SQL> recover database; ORA-00283: recovery session canceled due to errors ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
复制
使用如下命令恢复数据库:
SQL> recover database using backup controlfile until cancel; ORA-00279: change 18994483512 generated at 03/29/2007 10:33:29 needed for thread 1 ORA-00289: suggestion : /data2/ora10g/10.2.0/dbs/arch1_4655_560529557.dbf ORA-00280: change 18994483512 for thread 1 is in sequence #4655 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel Media recovery cancelled.
复制
如果在线日志损坏,那么恢复到最后,Oracle提示的最后一个归档日志将是不存在的(如果在线日志未损坏,则可以指定在线日志文件执行恢复),此时可以输入CANCEL取消恢复,然后可以强制打开数据库:
SQL> alter database open resetlogs; Database altered.
复制
最后需要了解的是在以下几种常见的情况下,完成恢复后需要以Resetlogs方式打开数据库:
- 在执行了不完全恢复之后;
- 在使用了备份的控制文件进行恢复后;
- 使用带有resetlogs选项创建的控制文件恢复之后。