暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

连环bug-18c pdb插拔升级至19c 并更换存储

Vagrant Boy Awy 2019-10-31
451
  1. 背景介绍:

            这个变更做完有段时间了,终于有机会把之前处理的故障小做整合,废话不多说了下面开始。

            某客户因为技术一直比较前沿,18刚出的时候生产就已经上了18,所以19稳定版出来后肯定不会放过这个机会,借此升级,由于之前,为了综合管理和资源充分利用一个18c的库都用了pdb的模式而且每台服务器上的pdb比较多,升级db就面临一个问题,业务对外宣称停运。但是pdb较多直接dbua风险较大并且停机时间过长,而且会大范围停止运行,所以采取了在同一用户下创建19c db,用pdb插拔的方式升级,这样避免大范围停运和控制停机时间。在此次变更中由于新老存储替换,并且掺杂了磁盘组的变更,并借此机会统一磁盘组的名称。

2.环境介绍

    版本介绍:

原库版本:18.5

目标版本:19.4

旧磁盘组:+SASDATA

新规划磁盘组:+DATA

3.进行pdb插拔升级

 3.1生成预升级检查

$ORACLE_HOME_18/jdk/bin/java -jar $ORACLE_HOME_19/rdbms/admin/preupgrade.jar dir /tmp -c test


CONNECT / AS SYSDBA

ALTER SESSION SET CONTAINER=test

SQL>@/tmp/preupgrade_fixups_test.sql


3.2关闭pdb,并拔出

SQL>ALTER PLUGGABLE DATABASE test CLOSE immediate instances=all;

SQL>ALTER PLUGGABLE DATABASE test UNPLUG INTO '/home/oracle/test.xml';


 3.3   删除在原库中的pdb资源保留数据文件

SQL>DROP PLUGGABLE DATABASE test KEEP DATAFILES;


3.4 插入目标端

SQL> CREATE PLUGGABLE DATABASE test USING '/home/oracle/test.xml';

CREATE PLUGGABLE DATABASE test USING '/home/oracle/test.xml'

*

ERROR at line 1:

ORA-65169: error encountered while attempting to copy file

+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/undo_4.308.1016896059

ORA-19501: read error on file

"+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896

059", block number 768 (block size=8192)

ORA-17512: Block Verification Failed



发现不能读取文件,看下alter日志发现报错如下:

Errors in file /u01/app/oracle/diag/rdbms/xtcdb/xtcdb1/trace/xtcdb1_p03b_118170.trc:

ORA-17512: Block Verification Failed

ORA-19501: read error on file "+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/undo_4.308.1016896059", block number 1298816 (block size=8192)

ORA-17512: Block Verification Failed

Reread resulted in same corrupt block on file '+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059' blockno=2463232 from mirror side 1 (HDS1SASDATA1) logical extent number 0

2019-08-21T15:45:45.761015+08:00

Reread resulted in same corrupt block on file '+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/undo_4.308.1016896059' blockno=1716096 from mirror side 2 (HDS1SASDATA1) logical extent number 1

2019-08-21T15:45:45.762202+08:00

Errors in file /u01/app/oracle/diag/rdbms/xtcdb/xtcdb1/trace/xtcdb1_p033_118150.trc:

ORA-17512: Block Verification Failed

ORA-19501: read error on file "+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059", block number 2462961 (block size=8192)

ORA-17512: Block Verification Failed


说文件不可读,到这你的思路又会是什么?


我使用了如下方式逐一排查

  1. 查看用户组和oracle文件权限

[root@foticdb1 ~]# id grid

uid=54322(grid) gid=54321(oinstall) groups=54321(oinstall),54322(dba),54323(oper),54331(asmadmin),54325(dgdba),54326(kmdba),

54330(racdba),54332(asmdba)[root@foticdb1 ~]# id oracle

uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),54322(dba),54323(oper),54331(asmadmin),54324(backupdba),54325(d

gdba),54326(kmdba),54330(racdba),54332(asmdba)



[foticdb1:oracle]:/home/oracle>ls -l $ORACLE_HOME/bin/oracle

-rwsr-s--x 1 oracle asmadmin 441762200 Aug 21 14:58 /u01/app/oracle/product/19c/dbhome/bin/oracle

[root@foticdb1 ~]# su - grid

Last login: Wed Aug 21 21:42:26 CST 2019

[foticdb1:grid]:/home/grid>ls -l $ORACLE_HOME/bin/oracle

-rwsr-s--x 1 grid oinstall 441678776 Aug 10 00:03 /u01/app/19c/grid/bin/oracle



发现并没有问题。

2.查看asm是否有这个文件

ASMCMD> ls -l +SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/undo_4.308.1016896059

Type Redund Striped Time Sys Name

DATAFILE MIRROR COARSE AUG 21 16:00:00 Y undo_4.308.1016896059



3.用dbv校验一下这个文件

[foticdb1:oracle]:/home/oracle>dbv file=+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059

DBVERIFY: Release 19.0.0.0.0 - Production on Thu Aug 22 10:46:14 2019

Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.

DBVERIFY - Verification starting : FILE = +SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059

There is no extent in file = +SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059

DBVERIFY - Verification complete

Total Pages Examined : 0

Total Pages Processed (Data) : 0

Total Pages Failing (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing (Index): 0

Total Pages Processed (Other): 0

Total Pages Processed (Seg) : 0

Total Pages Failing (Seg) : 0

Total Pages Empty : 0

Total Pages Marked Corrupt : 0

Total Pages Influx : 0

Total Pages Encrypted : 0

Highest block SCN : 1219771586 (0.1219771586)


[foticdb1:oracle]:/home/oracle>dbv file=+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/undo_4.308.1016896059

DBVERIFY: Release 19.0.0.0.0 - Production on Thu Aug 22 10:49:18 2019

Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.

DBVERIFY - Verification starting : FILE = +SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/undo_4.308.1016896059

DBVERIFY - Verification complete

Total Pages Examined : 2967680

Total Pages Processed (Data) : 0

Total Pages Failing (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing (Index): 0

Total Pages Processed (Other): 2963314

Total Pages Processed (Seg) : 20

Total Pages Failing (Seg) : 0

Total Pages Empty : 4366

Total Pages Marked Corrupt : 0

Total Pages Influx : 0

Total Pages Encrypted : 0

Highest block SCN : 1224378194 (0.1224378194)


发现还是没有问题,那接下来如何做?到此问题陷入僵局。

既然没问题我再插回去,看能不能启动,发现能启动,重新再来一遍

再做一遍发现报错如下:

SQL> CREATE PLUGGABLE DATABASE test USING '/home/oracle/test.xml';

CREATE PLUGGABLE DATABASE test USING '/home/oracle/test.xml'

*

ERROR at line 1:

ORA-65169: error encountered while attempting to copy file

+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/crm_hindex.311.101689

6057

ORA-19501: read error on file

"+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/crm_hindex.311.10168

96057", block number 768 (block size=8192)

ORA-17512: Block Verification Failed


再次dbv校验文件

[foticdb1:oracle]:/tmp/soft/29332763>dbv file=+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/crm_hindex.311.101689

DBVERIFY: Release 19.0.0.0.0 - Production on Thu Aug 22 11:39:11 2019

Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.

DBVERIFY - Verification starting : FILE = +SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/crm_hindex.311.101689

Segmentation fault (core dumped)

[foticdb1:oracle]:/tmp/soft/29332763>dbv file=+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059

DBVERIFY: Release 19.0.0.0.0 - Production on Thu Aug 22 11:42:54 2019

Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.

DBVERIFY - Verification starting : FILE = +SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059

There is no extent in file = +SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059

DBVERIFY - Verification complete

Total Pages Examined : 0

Total Pages Processed (Data) : 0

Total Pages Failing (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing (Index): 0

Total Pages Processed (Other): 0

Total Pages Processed (Seg) : 0

Total Pages Failing (Seg) : 0

Total Pages Empty : 0

Total Pages Marked Corrupt : 0

Total Pages Influx : 0

Total Pages Encrypted : 0

Highest block SCN : 1219771586 (0.1219771586)


[foticdb1:oracle]:/tmp/soft/29332763>dbv file=+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059

DBVERIFY: Release 19.0.0.0.0 - Production on Thu Aug 22 11:44:57 2019

Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.

DBVERIFY - Verification starting : FILE = +SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059

There is no extent in file = +SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059

DBVERIFY - Verification complete

Total Pages Examined : 0

Total Pages Processed (Data) : 0

Total Pages Failing (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing (Index): 0

Total Pages Processed (Other): 0

Total Pages Processed (Seg) : 0

Total Pages Failing (Seg) : 0

Total Pages Empty : 0

Total Pages Marked Corrupt : 0

Total Pages Influx : 0

Total Pages Encrypted : 0

Highest block SCN : 1219771586 (0.1219771586)


发现没有问题,查看alter如下报错

ORA-17512: Block Verification Failed

ORA-19501: read error on file "+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059", block number 793984 (block size=8192)

ORA-17512: Block Verification Failed

2019-08-22T11:35:27.340202+08:00

Errors in file /u01/app/oracle/diag/rdbms/xtcdb/xtcdb1/trace/xtcdb1_p02v_18361.trc:

ORA-17512: Block Verification Failed

ORA-19501: read error on file "+SASDATA/FOTICDB/909C3A504301DE84E0533352070A3C2D/DATAFILE/fund_table.315.1016896059", block number 793984 (block size=8192)

ORA-17512: Block Verification Failed


在mos上搜索相关报错,发现有类似的bug,但已经在19c修复,无奈提了sr,上传了前面的所有步骤和日志,很快就得到了确认是未公布的bug29332763,并有现成的补丁包。


 打完补丁,重复了如上操作,心想这下终于可以搞定了,但是真的到这就完了么?


在进行到插入的时候又报了如下错误:

SQL> CREATE PLUGGABLE DATABASE test USING '/home/oracle/test.xml';

CREATE PLUGGABLE DATABASE test USING '/home/oracle/test.xml'

*

ERROR at line 1:

ORA-65169: error encountered while attempting to copy file

ORA-12801: error signaled in parallel query server


相关alter如下:

OERI_INCIDENT - Incident trace containing an ORA-600, ORA-700 or ORA-7445 (1 file)

- 2019-08-21 08:31:32 xtcdb1_p001_91833_i246741.trc [5Mb] (ORA-600 [2901] incident continued from xtcdb1_p001_91833.trc)

^--> File parsed: BugtagMatch/htm(ORA-600 [2901] on 19.4.0.0;)


INITIAL_TRACE - 11g tracefile referencing an 'incident' (2 files)

- 2019-08-21 09:02:37 xtcdb1_p000_89222.trc [10Kb] (ORA-00600 [2901] occurred with incident tracefile xtcdb1_p000_89222_i328705.trc)

- 2019-08-21 09:02:37 xtcdb1_p001_89239.trc [10Kb] (ORA-00600 [2901] occurred with incident tracefile xtcdb1_p001_89239_i328713.trc)



这个就不用再多看了,又触发了bug。经过和oracle sr沟通,说这个bug已经在19.3修复,而我现在的版本为19.4,原因可能是前面的补丁导致,具体问题还待oracle开发确认,而且这个时间不短,可是变更已经提上日程,如果是你你会等待补丁还是开辟新的道路?


回顾上面的操作,问题出在更换磁盘组那一步,如果我先更换再插拔会不会报同样的报错?


接下来的操作如下:

先克隆一个其他名称的pdb

SQL>CREATE PLUGGABLE DATABASE testbak from test create_file_dest='+data';

启动并关闭

SQL>ALTER PLUGGABLE DATABASE testbak open instances=all;

SQL>ALTER PLUGGABLE DATABASE testbak CLOSE immediate instances=all;

拔出pdb

SQL>ALTER PLUGGABLE DATABASE testbak UNPLUG INTO '/home/oracle/testbak.xml';

删除资源保留文件

SQL>DROP PLUGGABLE DATABASE test KEEP DATAFILES;

插入目标库,前面创建已经更换磁盘组不拷贝数据文件

SQL> CREATE PLUGGABLE DATABASE testbak USING '/home/oracle/test.xml' nocopy;

这次操作异常顺利,完美解决前面的bug

接下来跑升级脚本

SQL> ALTER PLUGGABLE DATABASE testbak OPEN UPGRADE;

SQL> exit


$ORACLE_HOME/perl/bin/perl $ORACLE_HOME/rdbms/admin/catctl.pl -d \

$ORACLE_HOME/rdbms/admin -c 'testbak' -l $ORACLE_BASE catupgrd.sql


跑完脚本启动数据库,重新编译失效对象,至此插拔升级完成。


文章转载自Vagrant Boy Awy,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论