最近遇到一个 case,在关闭数据库的时候报 ORA-01210: data file header is media corrupt
sys@ORCL 09:11:53> shutdown immediate
ORA-01122: database file 5 failed verification check
ORA-01110: data file 5: '/oradata/ORCL/tbs01.dbf'
ORA-01210: data file header is media corrupt
复制
通过查看 alter 日志发现,报错的数据文件在两个数据库实例中都有创建tbs成功的语句中,类似如下:
Completed: create tablespace tbs DataFile '/oradata/ORCL/tbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M;
复制
可以看到,都没有使用 reuse 关键字。以下是在我本地环境模拟。
数据库的版本是 19.5,由于目前19.5ru需要密码才能下载,我在本地19.3 和19.17 环境分别模拟如下:
–19.3
在 hhhh 实例上创建 hhtbs01 表空间,数据文件是 /u01/hhdata/HHHH/datafile/hhtbs01.dbf
create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M;
hhhh:sys@(301_HHHH)>
hhhh:sys@(301_HHHH)> alter system switch logfile;
System altered.
hhhh:sys@(301_HHHH)> create table h (i int) tablespace hhtbs01;
Table created.
hhhh:sys@(301_HHHH)> insert into h values(100);
1 row created.
hhhh:sys@(301_HHHH)> c/100/101
1* insert into h values(101)
hhhh:sys@(301_HHHH)> /
1 row created.
hhhh:sys@(301_HHHH)> commit;
Commit complete.
hhhh:sys@(301_HHHH)> insert into h select * from h;
2 rows created.
hhhh:sys@(301_HHHH)> /
4 rows created.
hhhh:sys@(301_HHHH)> /
8 rows created.
hhhh:sys@(301_HHHH)> /
16 rows created.
hhhh:sys@(301_HHHH)>
复制
并插入一些数据。
ORADB01 instance 也创建 hhtbs01 表空间,数据文件是 /u01/hhdata/HHHH/datafile/hhtbs01.dbf
ORADB01:sys@(209_CDB$ROOT)> create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M;
create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
*
ERROR at line 1:
ORA-01119: error in creating database file '/u01/hhdata/HHHH/datafile/hhtbs01.dbf'
ORA-27038: created file already exists
Additional information: 1
复制
很显然由于没有reuse
关键字,无法创建成功的。但是根据 case 的alert 日志,很明显两个实例在不同的时间点,都执行创建表空间成功了,且都是同一个数据文件。
create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
Completed: create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
复制
这里推测应该是手工将之前的文件进行了rm,不然第二个实例是无法创建成功的,即使通过 touch 一个空的数据文件,create tbs 不使用reuse
关键字也是无法创建成功的。
oggscdg:sys@(963_CDB$ROOT)> !touch /u01/hhdata/HHHH/datafile/htouch.dbf
oggscdg:sys@(963_CDB$ROOT)> !ls -l /u01/hhdata/HHHH/datafile/htouch.dbf
-rw-r--r--. 1 oracle oinstall 0 Dec 21 09:35 /u01/hhdata/HHHH/datafile/htouch.dbf
oggscdg:sys@(963_CDB$ROOT)> create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M autoextend on next 1000M maxsize 32767M;
ORADB01:sys@(963_CDB$ROOT)> create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M autoextend on next 1000M maxsize 32767M;
create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M autoextend on next 1000M maxsize 32767M
*
ERROR at line 1:
ORA-01119: error in creating database file '/u01/hhdata/HHHH/datafile/htouch.dbf'
ORA-27038: created file already exists
Additional information: 1
复制
文件已经存在的情况下,只有使用 reuse
关键字才能创建成功
ORADB01:sys@(963_CDB$ROOT)> create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M reuse autoextend on next 1000M maxsize 32767M;
Tablespace created.
复制
同时查看 alert 日志里清晰的记录 有reuse 关键字
Completed: create tablespace htouch DataFile '/u01/hhdata/HHHH/datafile/htouch.dbf' size 100M reuse autoextend on next 1000M maxsize 32767M
复制
这里通过 rm 掉hhhh 实例的数据文件之后,在 ORADB01 实例上,创建成功。
ORADB01:sys@(209_CDB$ROOT)> ! ls -l /u01/hhdata/HHHH/datafile/hhtbs01.dbf
-rw-r-----. 1 oracle oinstall 104865792 Dec 21 08:27 /u01/hhdata/HHHH/datafile/hhtbs01.dbf
ORADB01:sys@(209_CDB$ROOT)> ! rm /u01/hhdata/HHHH/datafile/hhtbs01.dbf
ORADB01:sys@(209_CDB$ROOT)> ! ls -l /u01/hhdata/HHHH/datafile/hhtbs01.dbf
ls: cannot access /u01/hhdata/HHHH/datafile/hhtbs01.dbf: No such file or directory
ORADB01:sys@(209_CDB$ROOT)> create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M;
Tablespace created.
ORADB01:sys@(209_CDB$ROOT)>
复制
问题出现了,我的版本是 19.3 ,在过了大概 1分钟左右,还没进行 shutdown 测试呢,hhhh 实例 crash 了。
2022-12-21T08:27:03.855326+08:00
create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
Completed: create tablespace hhtbs01 DataFile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' size 100M autoextend on next 1000M maxsize 32767M
2022-12-21T08:27:54.223384+08:00
Thread 1 advanced to log sequence 13 (LGWR switch)
Current log# 1 seq# 13 mem# 0: /u01/hhdata/HHHH/onlinelog/redo01.log
2022-12-21T08:27:54.227725+08:00
ARC3 (PID:11362): Archived Log entry 8 added for T-1.S-12 ID 0xd4895537 LAD:1
2022-12-21T08:32:08.183283+08:00
Read of datafile '/u01/hhdata/HHHH/datafile/hhtbs01.dbf' (fno 10) header failed with ORA-01210
Hex dump of (file 10, block 1) in trace file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc
Corrupt block relative dba: 0x02800001 (file 10, block 1)
Bad header found during datafile header read
Data in bad block:
type: 11 format: 2 rdba: 0x04c00001
last change scn: 0x0000.0000.00000000 seq: 0x1 flg: 0x04
spare3: 0x0
consistency value in tail: 0x00000b01
check value in block header: 0xd9c0
computed block checksum: 0x0
Rereading datafile 10 header failed with ORA-01210
2022-12-21T08:32:08.265924+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc:
ORA-63999: data file suffered media failure
ORA-01122: database file 10 failed verification check
ORA-01110: data file 10: '/u01/hhdata/HHHH/datafile/hhtbs01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:08.266337+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc:
ORA-63999: data file suffered media failure
ORA-01122: database file 10 failed verification check
ORA-01110: data file 10: '/u01/hhdata/HHHH/datafile/hhtbs01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:08.299887+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 2: '/u01/hhdata/HHHH/datafile/htest01.dbf'
ORA-01210: data file header is media corrupt
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc (incident=37033):
ORA-63999 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /u01/app/oracle/diag/rdbms/hhhh/hhhh/incident/incdir_37033/hhhh_ckpt_10899_i37033.trc
2022-12-21T08:32:08.479859+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 5: '/u01/hhdata/HHHH/datafile/tbs01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:08.626372+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 8: '/u01/hhdata/HHHH/datafile/oth01.dbf'
2022-12-21T08:32:08.783624+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 9: '/u01/hhdata/HHHH/datafile/coth01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:08.927298+08:00
Errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ORA-01110: data file 10: '/u01/hhdata/HHHH/datafile/hhtbs01.dbf'
ORA-01210: data file header is media corrupt
2022-12-21T08:32:09.252599+08:00
USER (ospid: ): terminating the instance due to ORA error
2022-12-21T08:32:09.350800+08:00
System state dump requested by (instance=1, osid=10899 (CKPT)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_diag_10880.trc
2022-12-21T08:32:10.954050+08:00
Dumping diagnostic data in directory=[cdmp_20221221083209], requested by (instance=1, osid=10899 (CKPT)), summary=[abnormal instance termination].
2022-12-21T08:32:12.176920+08:00
Instance terminated by USER, pid = 10899
复制
这个库我测试了多次,alert 日志有多个报 ORA-01210: data file header is media corrupt 的信息,可以忽略。但是 case 遇到情况是:库可以正常运行,关闭库的时候无法正常关闭提示 :ORA-01210: data file header is media corrupt。
由于找不到 19.5 的 ru 包,索性就升级到最新的19.17 进行模拟测试。
19.17 进行以上相同步骤的测试,创建之后原库正常,可以读写,过了几分钟也没有crash,查看alert日志也都正常,于是我就进行一些插入操作。在插入了 2621440 条数据之后,同样报了 ORA-01122和 ORA-01110但是没有报ORA-01210,这次报的是 ORA-01203: wrong incarnation of this file - wrong creation SCN
,库的状态仍然是 open,这个表空间上的表仍然可以读。
sys@ORCL 20-DEC-22> create table a( i int) tablespace tbs;
Table created.
Elapsed: 00:00:00.38
sys@ORCL 20-DEC-22> insert into a values(1);
1 row created.
Elapsed: 00:00:00.01
sys@ORCL 20-DEC-22> c/1/2
1* insert into a values(2)
sys@ORCL 20-DEC-22> /
1 row created.
Elapsed: 00:00:00.00
sys@ORCL 20-DEC-22> c/2/3
1* insert into a values(3)
sys@ORCL 20-DEC-22> /
1 row created.
Elapsed: 00:00:00.00
sys@ORCL 20-DEC-22> commit;
Commit complete.
Elapsed: 00:00:00.00
sys@ORCL 20-DEC-22> alter system switch logfile;
sys@ORCL 07:19:44> /
/
2621440 rows created.
Elapsed: 00:00:00.54
sys@ORCL 07:19:45>
/
insert into a select * from a
*
ERROR at line 1:
ORA-01122: database file 5 failed verification check
ORA-01110: data file 5: '/oradata/ORCL/tbs01.dbf'
ORA-01203: wrong incarnation of this file - wrong creation SCN
复制
关闭数据库测试,出现如下错误信息:
sys@ORCL 09:59:39> shutdown immediate ORA-01122: database file 5 failed verification check ORA-01110: data file 5: '/oradata/ORCL/tbs01.dbf' ORA-01203: wrong incarnation of this file - wrong creation SCN sys@ORCL 09:59:44>
复制
这里基本可以肯定是通过 rm 掉数据文件,然后再另一个数据库实例进行了表空间的重建。这里可以总结下,19.3 的时候,数据库直接crash,都无法提供服务了。19.5 会报 ORA-01210: data file header is media corrupt ,数据库仍然是open状态能够提供服务(case 环境)。到了19.17 之后,报错更详细了:ORA-01203: wrong incarnation of this file - wrong creation SCN 而不仅仅是报文件头被损坏。
当然通过rm 之后,数据库还没关闭的情况下,句柄是仍然存在的,通过 lsof 可以查看,可以看到 dbf文件仍然存在,理论可以恢复的,下次文章再测试下如何恢复。
[oracle@dsmart:/home/oracle]$ lsof |grep delete|grep dbf ora_dbw0_ 10767 oracle 263uW REG 253,1 104865792 917524 /oradata/ORCL/tbs01.dbf (deleted) ora_ckpt_ 10775 oracle 261u REG 253,1 104865792 917524 /oradata/ORCL/tbs01.dbf (deleted) [oracle@dsmart:/home/oracle]$
复制