暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

不用洪荒之力也可以修复ORA-00704 ORA-01555

东方宏达EHONDA 2016-08-11
1751

环境说明:RedHat Linux 5.10 16G 8C,Oracle 11.2.0.4,在一个P to v的迁移过程中,没有停机迁移,出现了数据库内部不一致现象

故障现象:数据库无法正常open,报错ORA-00704 ORA-01555

 

SQL>alter database open;

alter database open

*

ERROR at line 1:

ORA-01092:ORACLE instance terminated. Disconnection forced

ORA-00704:bootstrap process failure

ORA-00704:bootstrap process failure

ORA-00604:error occurred at recursive SQL level 1

ORA-01555:snapshot too old: rollback segment number 4 with name

"_SYSSMU4_3204654454$"too small

Process ID: 17876

Session ID: 292 Serial number: 7

在碰到该问题时,同事已使用各种方法,比如:

设置undo_management=manual,

oradebug手工推进SCN,

设置隐含参数_allow_resetlogs_corruption,_offline_rollback_segments,_corrupted_rollback_segments屏蔽回滚段

10513 event,均无法启动。

 

首先检查数据库的环境:

SQL>select current_scn,to_char(current_scn,'XXXXXXXXXXXXXXXXXX')cs_16,checkpoint_change#,to_char(checkpoint_change#,'XXXXXXXXXXXXXXXXXX') cc_16 from v$database;

  CURRENT_SCN CS_16                CHECKPOINT_CHANGE# CC_16

------------------- ------------------- ------------------- ---------

   0                   0      10396391515119         974993923EF

SQL>select a.file#,a.name,a.checkpoint_change#,a.last_change#,status from v$datafile a;

FILE#   NAME                  CHECKPOINT_CHANGE# LAST_CHANGE# STATUS

----- ------------------------------------------------ --------------

    1 oradata/training/system01.dbf    10396391515119        SYSTEM

    2 oradata/training/sysaux01.dbf    10396391515119        ONLINE

    3 oradata/training/undotbs01.dbf   10396391515119        ONLINE

    4 oradata/training/users01.dbf     10396391515119        ONLINE

    5 oradata/training/xxxxxxx01.dbf   10396391515119        ONLINE

  ......

   34 oradata/training/xxxxxxx30.dbf   10396391515119        ONLINE

   35 oradata/training/tcsdata01.dbf   10396391515119        ONLINE

   36 oradata/training/gdciq01.dbf     10396391515119        ONLINE

36 rows selected.

 

SQL>select a.file#,a.name,a.recover,a.checkpoint_change#,status fromv$datafile_header a;

FILE# NAME                    REC  CHECKPOINT_CHANGE# STATUS

---------------------------------- --- ------------------- -------

  1 oradata/training/system01.dbf   NO 10396391515119 ONLINE  2 oradata/training/sysaux01.dbf   NO 10396391515119 ONLINE  3 oradata/training/undotbs01.dbf  NO 10396391515119 ONLINE  4 oradata/training/users01.dbf    NO 10396391515119 ONLINE  5 oradata/training/xxxxxxx01.dbf  NO 10396391515119 ONLINE ......

 34 oradata/training/xxxxxxx30.dbf  NO 10396391515119 ONLINE 35 oradata/training/tcsdata01.dbf  NO 10396391515119 ONLINE 36 oradata/training/gdciq01.dbf    NO 10396391515119 ONLINE36 rows selected.

 

可以看到数据文件头都是处于一致的SCN,推进SCN是没有必要且无效的,那么什么原因导致无法open,做10046 event可以看到,

=====================

PARSINGIN CURSOR #47646723591832 len=142 dep=3 uid=0 oct=3 lid=0 tim=1470896704109860 hv=361892850 ad='25ee1a9b0' sqlid='7bd391hat42zk'

select *+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1

ENDOF STMT

PARSE #47646723591832:c=1000,e=603,p=0,cr=0,cu=0,mis=1,r=0,dep=3,og=3,plh=0,tim=1470896704109859

BINDS #47646723591832:

 Bind#0

  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00

  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0

  kxsbbbfp=2b559e08ca50  bln=22 avl=02  flg=05 value=4

EXEC #47646723591832:c=1000,e=884,p=0,cr=0,cu=0,mis=1,r=0,dep=3,og=3,plh=906473769,tim=1470896704110869

WAIT #47646723591832: nam='db file sequential read' ela= 12 file#=1 block#=321 blocks=1 obj#=34 tim=1470896704110933

WAIT #47646723591832: nam='db file sequential read' ela= 8 file#=1 block#=225 blocks=1 obj#=15 tim=1470896704110999

FETCH #47646723591832:c=0,e=146,p=2,cr=2,cu=0,mis=0,r=1,dep=3,og=3,plh=906473769,tim=1470896704111050

STAT #47646723591832 id=1 cnt=1 pid=0 pos=1 obj=15 op='TABLE ACCESS BY INDEX ROWIDUNDO$ (cr=2 pr=2 pw=0 time=128 us)'

STAT #47646723591832 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=1 pr=1 pw=0 time=70 us)'

CLOSE #47646723591832:c=0,e=5,dep=3,type=0,tim=1470896704111158

WAIT #47646723178048: nam='db file sequential read' ela= 13 file#=3 block#=176 blocks=1 obj#=0 tim=1470896704111207

WAIT #47646723178048: nam='db file sequential read' ela= 13 file#=3 block#=38674 blocks=1 obj#=0 tim=1470896704111366

FETCH #47646723178048:c=3000,e=2718,p=7,cr=7,cu=0,mis=0,r=0,dep=2,og=4,plh=2853959010,tim=1470896704111762

STAT #47646723178048 id=1 cnt=0 pid=0 pos=1 obj=18 op='TABLE ACCESS BY INDEX ROWIDOBJ$ (cr=0 pr=0 pw=0 time=11 us)'

STAT #47646723178048 id=2 cnt=0 pid=1 pos=1 obj=37 op='INDEX RANGE SCAN I_OBJ2 (cr=0pr=0 pw=0 time=9 us)'

=====================

PARSEERROR #47646723179568:len=60 dep=1 uid=0 oct=3 lid=0 tim=1470896704111888 err=604

SELECT NULL FROM PROPS$ WHERE NAME='BOOTSTRAP_UPGRADE_ERROR'

ORA-00704:bootstrap process failure

ORA-00604:error occurred at recursive SQL level 2

ORA-01555:snapshot too old: rollback segment number 4 with name
"_SYSSMU4_3204654454$" too small

ORA-00704:bootstrap process failure

ORA-00604:error occurred at recursive SQL level 2

ORA-01555:snapshot too old: rollback segment number 4 with name
"_SYSSMU4_3204654454$" too small
 

 

尝试使用非常规手段gdb方法启动数据库无效

[oracle@training]$ ps -ef|grep LOCAL

oracle   19562 19561 0 15:24 ?        00:00:00 oracletraining (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

oracle   19564 19080 0 15:24 pts/2    00:00:00 grep LOCAL

[oracle@training]$  gdb $ORACLE_HOME/bin/oracle 19562

GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-45.el5)

......

 (gdb) break kokiasg

Breakpoint 1 at 0x15036a4

(gdb) continue

Continuing.Program received signal SIGSEGV, Segmentation fault. 0x0000000009805715 in slaac_int ()

 

Thu Aug 11 15:26:21 2016 SMON:enabling cache recovery

ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, SCN: 0x0d17.ebce9d23):select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null

Errors in file /oracle/app/oracle/diag/rdbms/training/training/trace/training_ora_19562.trc:

ORA-00704: bootstrap process failure

ORA-00604: error occurred at recursive SQL level 2

ORA-01555: snapshot too old: rollback segment number 4 with name
"_SYSSMU4_3204654454$" too small

Errors in file /oracle/app/oracle/diag/rdbms/training/training/trace/training_ora_19562.trc:

ORA-00704: bootstrap process failure

ORA-00604: error occurred at recursive SQL level 2

ORA-01555: snapshot too old: rollback segment number 4 with name
"_SYSSMU4_3204654454$" too small
Error 704 happened during db open, shutting down database

USER (ospid: 19562): terminating the instance due to error 704

 

此时只能使用终极武器,使用BBED上古神器,将rollback segment置为offline状态,首先将10g Linux平台bbed文件拷贝到相应的目录,并且编译

[oracle@training lib]$ ./bbed

Password:

BBED: Release 2.0.0.0.0 - Limited Production on Thu Aug 11 15:35:56 2016 Copyright (c) 1982, 2011, Oracle and/or its affiliates. 
All rights reserved.
*************
!!! For Oracle Internal Use only !!! ***************

BBED>set filename "/oradata/training/system01.dbf"

        FILENAME        /oradata/training/system01.dbf

BBED>set mode edit

        MODE            Edit

BBED>set block 225

        BLOCK#          225

BBED>map 

 File: /oradata/training/system01.dbf (0)

 Block: 225      Dba:0x00000000

------------------------------------------------------------

 KTB Data Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0      

 struct ktbbh, 48 bytes                     @20     

 struct kdbh, 14 bytes                      @68     

 struct kdbt[1], 4 bytes                    @82     

 sb2 kdbr[24]                               @86     

 ub1 freespace[3690]                        @134    

 ub1 rowdata[4364]                          @3824   

 ub4 tailchk                                @8188   

 

BBED>p *kdbr[4]

rowdata[1872]

-------------

ub1 rowdata[1872]                           @5696     0x2c

 

BBED>x  /rncnnnnn

rowdata[1872]                               @5696   

-------------

flag@5696:0x2c (KDRHFL, KDRHFF, KDRHFH)

lock@5697:0x00

cols@5698:   17


 col    0[2] @5699: 4

col   1[20] @5702: _SYSSMU4_3204654454$

col    2[2] @5723: 1

col    3[2] @5726: 3

col    4[3] @5729: 176

col    5[6] @5733: 2527430713

col    6[3] @5740: 2420

col    7[4] @5744: 156153

col    8[3] @5749: 2475

col    9[1] @5753: 0

col   10[2] @5755: 3

col   11[2] @5758: 2

col   12[0] @5761: *NULL*

col   13[0] @5762: *NULL*

col   14[0] @5763: *NULL*

col   15[0] @5764: *NULL*

col   16[2] @5765: 2

 

BBED>dump /v offset 5757 count 64

 File: /oradata/training/system01.dbf (0)

 Block: 225   Offsets: 5757 to 5820 Dba:0x00000000

-------------------------------------------------------

 0402c103 ffffffff 02c1032c 001102c1 l...........,....

 06135f53 5953534d 55355f33 38373635 l.._SYSSMU5_38765

 32343033 2402c102 02c10403 c2025d06 l2403$.........].

 c51a1c2c 081e03c2 191504c3 103e3803 l...,.........>8.

 <16 bytes per line>

 

BBED>m /x 02 offset 5757

Warning:contents of previous BIFILE will be lost. Proceed? (Y/N) y

 File: /oradata/training/system01.dbf (0)

 Block: 225              Offsets: 5757 to 5820         Dba:0x00000000

---------------------------------------------------------------------

  0202c103 ffffffff 02c1032c 001102c1 06135f53 5953534d 55355f33 38373635  32343033 2402c102 02c10403 c2025d06 c51a1c2c 081e03c2 191504c3 103e3803

 <32 bytes per line>

 

BBED>sum apply

Check value for File 0, Block 225:

current = 0x9d05, required = 0x9d05

 

修改完成后,open时又报出_SYSSMU8 ORA-01555,以同样的方式处理

Thu Aug 11 15:38:53 2016 SMON:enabling cache recovery

......

ORA-00704: bootstrap process failure

ORA-00604: error occurred at recursive SQL level 1

ORA-01555: snapshot too old: rollback segment number 8 with name
"_SYSSMU8_1016684853$" too small

Errors in file /oracle/app/oracle/diag/rdbms/training/training/trace/training_ora_19648.trc:

ORA-00704: bootstrap process failure

ORA-00604: error occurred at recursive SQL level 1

ORA-01555: snapshot too old: rollback segment number 8 with name
"_SYSSMU8_1016684853$" too small

Error 704 happened during db open, shutting down database

USER (ospid: 19648): terminating the instance due to error 704

Instance terminated by USER, pid = 19648

ORA-1092 signalled during: alter database open...

opiodr  aborting process unknown ospid (19648) as a result of ORA-1092

Thu Aug 11 15:38:56 2016 ORA-1092 : opitsk aborting process

 

BBED> p *kdbr[8]

rowdata[2159]

-------------

ub1 rowdata[2159]                          
@5983     0x2c

 

BBED>x  /rncnnnnn

rowdata[2159]                               @5983   

-------------

flag@5983:0x2c (KDRHFL, KDRHFF, KDRHFH)

lock@5984:0x00 

cols@5985:   17

 

col    0[2] @5986: 8

col   1[20] @5989: _SYSSMU8_1016684853$

col    2[2] @6010: 1

col    3[2] @6013: 3

col    4[3] @6016: 240

col    5[6] @6020: 2527430702

col    6[3] @6027: 2420

col    7[4] @6031: 156228

col    8[3] @6036: 2562

col    9[1] @6040: 0

col   10[2] @6042: 3

col   11[2] @6045: 2

col   12[0] @6048: *NULL*

col   13[0] @6049: *NULL*

col   14[0] @6050: *NULL*

col   15[0] @6051: *NULL*

col   16[2] @6052: 2

 

BBED>dump /v offset 6044 count 64

 File: /oradata/training/system01.dbf (0)

 Block: 225  Offsets: 6044 to 6107 
 Dba:0x00000000

-------------------------------------------------------

 0402c103 ffffffff 02c1032c 001102c1 l...........,....

 0a145f53 5953534d 55395f31 37323439 l.._SYSSMU9_17249

 31363134 342402c1 0202c104 03c20339 l16144$.........9

 06c51a1c 2c081903 c2191504 c3104902 l....,.........I.

 <16 bytes per line>

 

BBED>  m /x 02 offset 6044

 File: /oradata/training/system01.dbf (0)

 Block: 225              Offsets: 6044 to 6107        Dba:0x00000000

---------------------------------------------------------------------

 0202c103 ffffffff 02c1032c 001102c1 0a145f53 5953534d 55395f31 37323439  31363134 342402c1 0202c104 03c20339 06c51a1c 2c081903 c2191504 c3104902

 <32 bytes per line>

 

BBED>sum apply

Check value for File 0, Block 225:

current = 0x9d03, required = 0x9d03

 

SQL>conn / as sysdba

Connected to an idle instance.

SQL> startup mount

ORACLE instance started.

Total System Global Area 8551575552 bytes

Fixed Size                  2270360 bytes

Variable Size            1342180200 bytes

Database Buffers         7197425664 bytes

Redo Buffers                9699328 bytes

Database mounted.

 

SQL> alter database open;

Database altered.

SQL> select status from v$instance;

STATUS

------------

OPEN

 

后续处理,重建undo表空间:

Thu Aug 11 16:20:06 2016 Errors in file /oracle/app/oracle/diag/rdbms/training/training/trace/training_j000_20028.trc:

ORA-12012:error on auto execute of job 101

ORA-12008:error in materialized view refresh path

ORA-01552:cannot use system rollback segment for non-system tablespace 'SINOAIR'

 

SQL>drop tablespace UNDOTBS1 including contents and datafiles;

drop tablespace UNDOTBS1 including contents and datafiles

*

ERROR at line 1:

ORA-01561:failed to remove all objects in the tablespace specified

 

SQL>select * from v$tablespace;

       TS# NAME                           INC BIG FLA ENC

---------- ------------------------------ --- --- --- ---

         0 SYSTEM                         YES NO  YES

         1 SYSAUX                         YES NO  YES

         2 UNDOTBS1                       YES NO  YES

         3 TEMP                           NO  NO  YES

         4 USERS                          YES NO  YES

         5 SINOAIR                        YES NO  YES

         6 TCSTEST_DATA                   YES NO  YES

         7 GDCIQ                          YES NO  YES

         8 UNDOTBS2                       YES NO  YES
9 rows selected.

SQL>update seg$ set type# = 3 where ts#=2;
3 rows updated.

SQL>commit;

Commit complete.

SQL>drop tablespace UNDOTBS1 including contents and datafiles;

Tablespace dropped.

文章转载自东方宏达EHONDA,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论