暂无图片
暂无图片
2
暂无图片
暂无图片
1
暂无图片

ORACLE duplicate 搭建DG 错误排查

原创 心在梦在 2024-09-13
586

ORACLE duplicate 搭建DG 错误排查

[TOC]

一、环境描述

11.2.0.4 单机搭建DG 到 RAC 环境,采用duplicate from active database 方式搭建。

二、错误信息

[oracle@orcl1 ~]$ rman target sys/oracle@orcl auxiliary sys/oracle@orcldg Recovery Manager: Release 11.2.0.4.0 - Production on Thu Sep 12 09:59:14 2024 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. connected to target database: orcl (DBID=2999707776) connected to auxiliary database: orcl (not mounted) RMAN> RMAN> run { allocate channel cl1 type disk; allocate channel cl2 type disk; allocate auxiliary channel c1 type disk; allocate auxiliary channel c2 type disk; duplicate target database for standby from active database nofilenamecheck dorecover; release channel c1; release channel c2; release channel cl1; release channel cl2; } 2> 3> 4> 5> 6> 7> 8> 9> 10> 11> 12> using target database control file instead of recovery catalog allocated channel: cl1 channel cl1: SID=2312 device type=DISK allocated channel: cl2 channel cl2: SID=2406 device type=DISK allocated channel: c1 channel c1: SID=2053 device type=DISK allocated channel: c2 channel c2: SID=2107 device type=DISK Starting Duplicate Db at 12-SEP-24 contents of Memory Script: { backup as copy reuse passwordfile auxiliary format '/u01/app/oracle/product/11.2.0/db/dbs/orapworcldg' ; } executing Memory Script Starting backup at 12-SEP-24 Finished backup at 12-SEP-24 contents of Memory Script: { backup as copy current controlfile for standby auxiliary format '+DATA/orcl/control01.ctl'; } executing Memory Script Starting backup at 12-SEP-24 channel cl1: starting datafile copy copying standby control file released channel: cl1 released channel: cl2 released channel: c1 released channel: c2 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of Duplicate Db command at 09/12/2024 09:59:31 RMAN-05501: aborting duplication of target database RMAN-03015: error occurred in stored script Memory Script RMAN-03009: failure of backup command on cl1 channel at 09/12/2024 09:59:31 ORA-17628: Oracle error 19505 returned by remote Oracle server
复制

👉 可以看到,抛出了ORA-17628: Oracle error 19505错误信息,百度了一下,和mos文档Doc ID 1331986.1 上,比较多的说法有:

1)主库数据文件存在非OMF命名,建议使用DB_FILE_NAME_CONVERT参数进行转换。

2)备库参数文件中的定义的路径存在错误。

但是,经过检查,都不是上面的原因造成的。而且,仔细查看可以发现,是在控制文件恢复期间就抛出了错误,还没有到数据文件恢复的过程。

三、原因排查

1) 手动执行 controlfile恢复

  • 根据上面输出信息,是在执行controlfile恢复的时候报错的,我们手动执行一下该语句
--1) 报错信息和duplicate是一样的 RMAN> backup as copy current controlfile for standby auxiliary format '+DATA/orcl/control01.ctl'; Starting backup at 12-SEP-24 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=2312 device type=DISK channel ORA_DISK_1: starting datafile copy copying standby control file RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03009: failure of backup command on ORA_DISK_1 channel at 09/12/2024 10:01:18 ORA-17628: Oracle error 19505 returned by remote Oracle server --2) 修改命令,直接 format '+DATA' ,报错也一样 RMAN> backup as copy current controlfile for standby auxiliary format '+DATA'; .... ORA-17628: Oracle error 19505 returned by remote Oracle server
复制

2) 尝试rman备份恢复控制文件

  • 主库备份controlfile
[oracle@orcl dbs]$ rman target / Recovery Manager: Release 11.2.0.4.0 - Production on Thu Sep 12 10:02:03 2024 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. connected to target database: orcl (DBID=2999707776) RMAN> backup current controlfile for standby format '/home/oracle/forStandby_controlfile.bak'; Starting backup at 12-SEP-24 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=1485 device type=DISK channel ORA_DISK_1: starting full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set including standby control file in backup set channel ORA_DISK_1: starting piece 1 at 12-SEP-24 channel ORA_DISK_1: finished piece 1 at 12-SEP-24 piece handle=/home/oracle/forStandby_controlfile.bak tag=TAG20240912T100225 comment=NONE channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01 Finished backup at 12-SEP-24 -- 拷贝到备库 [oracle@orcl ~]$ scp forStandby_controlfile.bak oracle@xx.xx.xx.xx:/home/oracle/ oracle@10.7.250.91's password: forStandby_controlfile.bak 100% 23MB 160.4MB/s 00:00
复制
  • 备库恢复controlfile
--1) 报错:但是抛出了和duplicate不一样的错误 [oracle@orcl1:/home/oracle]$ rman target / Recovery Manager: Release 11.2.0.4.0 - Production on Thu Sep 12 10:03:09 2024 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. connected to target database: orcl (not mounted) RMAN> restore standby controlfile from '/home/oracle/forStandby_controlfile.bak'; Starting restore at 12-SEP-24 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=2431 device type=DISK channel ORA_DISK_1: restoring control file RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of restore command at 09/12/2024 10:03:17 ORA-19870: error while restoring backup piece /home/oracle/forStandby_controlfile.bak ORA-19504: failed to create file "+DATA/orcl/control01.ctl" ORA-17502: ksfdcre:3 Failed to create file +DATA/orcl/control01.ctl ORA-15001: diskgroup "DATA" does not exist or is not mounted ORA-15040: diskgroup is incomplete ORA-15040: diskgroup is incomplete ORA-15040: diskgroup is incomplete ORA-15040: diskgroup is incomplete ORA-15040: diskgroup is incomplete --2) 检查ASM磁盘组,实际是mount状态,没有问题 SQL> select name,state,type,total_mb ,free_mb from v$asm_diskgroup; NAME STATE TYPE TOTAL_MB FREE_MB ---------- --------------------------------- ------------------ ---------- ---------- DATA MOUNTED EXTERN 30720000 30719622 OCR MOUNTED NORMAL 30720 29794
复制
  • 同时查看备库的alert日志,可以看到没有权限读取asm的disk,通常都是$ORACLE_HOME/bin/oracle权限问题。
.... ORA-15025: could not open disk "/dev/3par8200_data13" ORA-27041: unable to open file Linux-x86_64 Error: 13: Permission denied Additional information: 9 ORA-15025: could not open disk "/dev/3par8200_data14" ORA-27041: unable to open file Linux-x86_64 Error: 13: Permission denied Additional information: 9 ORA-15025: could not open disk "/dev/3par8200_data15" ORA-27041: unable to open file Linux-x86_64 Error: 13: Permission denied Additional information: 9 Thu Sep 12 09:44:46 2024 SUCCESS: diskgroup DATA was dismounted ERROR: diskgroup DATA was not mounted Errors in file /u01/app/oracle/diag/rdbms/orcldg/orcldg/trace/orcldg_ora_42428.trc: ORA-19505: failed to identify file "+DATA/orcl/control01.ctl" ORA-17502: ksfdcre:3 Failed to create file +DATA/orcl/control01.ctl ORA-15001: diskgroup "DATA" does not exist or is not mounted ORA-15040: diskgroup is incomplete ORA-15040: diskgroup is incomplete ORA-15040: diskgroup is incomplete ORA-15040: diskgroup is incomplete
复制

四、解决办法

1. 修复权限

  • 2个节点都需要操作
--1) 检查 $ORACLE_HOME/bin/oracle文件权限:属于oracle:oinstall ,权限确实是不对的,正确的应该是oracle:asmadmin [oracle@orcl1:/home/oracle]$ ll $ORACLE_HOME/bin/oracle -rwsr-s--x 1 oracle oinstall 239862192 Sep 11 15:03 /u01/app/oracle/product/11.2.0/db/bin/oracle --2) 修改权限 -- root用户更改属组 [root@orcl1 ~]# chgrp asmadmin /u01/app/oracle/product/11.2.0/db/bin/oracle -- grid执行 [grid@orcl1:/home/grid]$ /u01/app/11.2.0/grid/bin/setasmgidwrap o=/u01/app/oracle/product/11.2.0/db/bin/oracle --3) 再次检查:属于oracle:asmadmin [oracle@orcl1:/home/oracle]$ ll /u01/app/oracle/product/11.2.0/db/bin/oracle -rwsr-s--x 1 oracle asmadmin 239862192 Sep 11 15:03 /u01/app/oracle/product/11.2.0/db/bin/oracle
复制

2. 再次恢复controlfile

--1) rman 连接错误,报权限不对 [oracle@orcl1:/home/oracle]$ rman target / Recovery Manager: Release 11.2.0.4.0 - Production on Thu Sep 12 10:27:32 2024 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-00554: initialization of internal recovery manager package failed RMAN-04005: error from target database: ORA-27140: attach to post/wait facility failed ORA-27300: OS system dependent operation:invalid_egid failed with status: 1 ORA-27301: OS failure message: Operation not permitted ORA-27302: failure occurred at: skgpwinit6 ORA-27303: additional information: startup egid = 54321 (oinstall), current egid = 54329 (asmadmin) --2) 重启一下数据库 [oracle@orcl1:/home/oracle]$ ps -ef|grep smon root 4654 1 3 Sep11 ? 00:38:08 /u01/app/11.2.0/grid/bin/osysmond.bin grid 5302 1 0 Sep11 ? 00:00:01 asm_smon_+ASM1 oracle 10525 9444 0 10:29 pts/0 00:00:00 grep --color=auto smon oracle 50209 1 0 09:59 ? 00:00:00 ora_smon_orcldg [oracle@orcl1:/home/oracle]$ kill -9 50209 [oracle@orcl1:/home/oracle]$ ps -ef|grep smon root 4654 1 3 Sep11 ? 00:38:08 /u01/app/11.2.0/grid/bin/osysmond.bin grid 5302 1 0 Sep11 ? 00:00:01 asm_smon_+ASM1 oracle 10568 9444 0 10:29 pts/0 00:00:00 grep --color=auto smon [oracle@orcl1:/home/oracle]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.4.0 Production on Thu Sep 12 10:29:14 2024 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to an idle instance. SQL> startup nomount ORACLE instance started. Total System Global Area 3.2068E+11 bytes Fixed Size 2273888 bytes Variable Size 2.2012E+10 bytes Database Buffers 2.9850E+11 bytes Redo Buffers 170278912 bytes SQL> --3) 再次恢复:OK [oracle@orcl1:/home/oracle]$ rman target / Recovery Manager: Release 11.2.0.4.0 - Production on Thu Sep 12 10:30:03 2024 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. connected to target database: orcl (not mounted) RMAN> restore standby controlfile from '/home/oracle/forStandby_controlfile.bak'; Starting restore at 12-SEP-24 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=1999 device type=DISK channel ORA_DISK_1: restoring control file channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 output file name=+DATA/orcl/control01.ctl Finished restore at 12-SEP-24
复制

3. 再次执行duplicate

--1) 很遗憾报错和最开始的的一样,难道修改了权限不行,不是这个问题???🤣 [oracle@orcl1 ~]$ rman target sys/oracle@orcl auxiliary sys/oracle@orcldg .... .... ORA-17628: Oracle error 19505 returned by remote Oracle server --2) 再次手动执行 RMAN> backup as copy current controlfile for standby auxiliary format '+DATA/orcl/control01.ctl'; Starting backup at 12-SEP-24 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=1485 device type=DISK channel ORA_DISK_1: starting datafile copy copying standby control file RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03009: failure of backup command on ORA_DISK_1 channel at 09/12/2024 10:41:36 ORA-17628: Oracle error 19505 returned by remote Oracle server --3) 修改脚本,再次执行Ok RMAN> backup as copy current controlfile for standby auxiliary format '+DATA'; Starting backup at 12-SEP-24 using channel ORA_DISK_1 channel ORA_DISK_1: starting datafile copy copying standby control file output file name=/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_orcl.f tag=TAG20240912T104146 channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01 Finished backup at 12-SEP-24
复制

根据 Doc ID 1302857.1,修改参数文件
图片.png

[oracle@orcl1:/u01/app/oracle/product/11.2.0/db/dbs]$ cat initorcldg.ora |grep control *.control_files='+DATA'
复制
  • 重启备库,再次执行duplicate ,没有报错,问题最终解决,DG搭建完成。
最后修改时间:2024-10-22 15:19:18
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

手机用户0815
暂无图片
11天前
评论
暂无图片 0
有用,手工恢复的controlfile
11天前
暂无图片 点赞
评论