暂无图片
暂无图片
6
暂无图片
暂无图片
1
暂无图片

ADG日志无法传输和应用报错处理

原创 潇湘秦 2024-10-21
555

问题1:日志名错误问题

Symptoms

周一刚上班看到邮件报警,ADG异常,SCN未推进;ADG监控脚本可以参考如下链接。

 https://www.modb.pro/db/1836634539498684416

图片

环境介绍:
主库:oracle 11.2.0.4,2*node RAC  Redhat 6.8

备库:oracle 11.2.0.4 single instance Centos 7.9

这个ADG是最近新建的一台。

1.检查日志应用情况

发现日志正常被传至备库,但是未能应用

SQL> select 'Last Applied : ' Logs,       to_char(next_time, 'DD-MON-YY HH24:MI:SS') Time  from v$archived_log where sequence# =       (select max(sequence#) from v$archived_log where applied = 'YES')unionselect 'Last Received : ' Logs,       to_char(next_time, 'DD-MON-YY HH24:MI:SS') Time  from v$archived_log where sequence# = (select max(sequence#) from v$archived_log);  2    3    4    5    6    7    8    9   10  
LOGS TIME---------------- ---------------------------Last Applied : 20-OCT-24 05:14:16Last Received : 21-OCT-24 09:18:10
复制

2.尝试重新取消应用和重新应用

但是查询SCN 还是未能正常推进

alter database recover managed standby database cancel;alter database recover managed standby database using current logfile disconnect; select to_char(current_scn) from v$database;
复制

3.查看alert log有如下报错

发现有+DATA的log?难道是日志没有被convert?

DDE: Problem Key 'ORA 312' was flood controlled (0x5) (no incident)ORA-00312: online log 13 thread 4: '+DATA/orcl/redo13.log'ORA-17503: ksfdopn:2 Failed to open file +DATA/orcl/redo13.logORA-15001: diskgroup "DATA" does not exist or is not mountedORA-15077: could not locate ASM instance serving a required diskgroupORA-29701: unable to connect to Cluster Synchronization ServiceDDE: Problem Key 'ORA 313' was flood controlled (0x5) (no incident)ORA-00313: open failed for members of log group 13 of thread 4ORA-00312: online log 13 thread 4: '+DATA/orcl/redo13.log'ORA-17503: ksfdopn:2 Failed to open file +DATA/orcl/redo13.logORA-15001: diskgroup "DATA" does not exist or is not mountedORA-15077: could not locate ASM instance serving a required diskgroupORA-29701: unable to connect to Cluster Synchronization ServiceORA-00313: open failed for members of log group 13 of thread 4ORA-00312: online log 13 thread 4: '+DATA/orcl/redo13.log'ORA-17503: ksfdopn:2 Failed to open file +DATA/orcl/redo13.logORA-15001: diskgroup "DATA" does not exist or is not mountedORA-15077: could not locate ASM instance serving a required diskgroupORA-29701: unable to connect to Cluster Synchronization ServiceORA-00313: open failed for members of log group 13 of thread 4ORA-00312: online log 13 thread 4: '+DATA/orcl/redo13.log'ORA-17503: ksfdopn:2 Failed to open file +DATA/orcl/redo13.logORA-15001: diskgroup "DATA" does not exist or is not mountedORA-15077: could not locate ASM instance serving a required diskgroupORA-29701: unable to connect to Cluster Synchronization ServiceWARNING: ASM communication error: op 9 state 0x40 (15077)ERROR: slave communication error with ASMWARNING: ASM communication error: op 9 state 0x40 (15077)ERROR: slave communication error with ASMError 349 creating/clearing online redo logfile 13*** 2024-10-21 09:10:52.918 4329 krsh.cClearing online redo logfile 13 completeWARNING: Failed to start ASM background process (ASMB)
复制

4.检查备库转换参数

看着dbfile和logfile的转换参数是没有问题的

 SQL> show parameter convert
NAME TYPE VALUE------------------------------------ ----------- ------------------------------db_file_name_convert string +DATA/orcl/datafile/, /data/df mesdg/, +DATA/orcl/TEMPFILE/, /data/dfmesdg/log_file_name_convert string +DATA/orcl/onlinelog/, /data/d fmesdg/SQL> show parameter standby_file_management
NAME TYPE VALUE------------------------------------ ----------- ------------------------------standby_file_management string auto
复制

5.检查日志文件

确实发现有多个日志文件的文件名未被正常转换

SQL>set line 300SQL>col member for a50SQL> select member from v$logfile;
MEMBER------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+DATA/orcl/redo09.log+DATA/orcl/redo10.log+DATA/orcl/redo11.log+DATA/orcl/redo12.log+DATA/orcl/redo13.log+DATA/orcl/redo14.log+DATA/orcl/redo15.log+DATA/orcl/redo16.log/data/dfmesdg/group_1.4912.1181838819/data/dfmesdg/group_2.6204.1181838819/data/dfmesdg/group_3.6845.1181838819
MEMBER------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------/data/dfmesdg/group_4.5697.1181838821/data/dfmesdg/group_6.4525.1181838821/data/dfmesdg/group_7.5169.1181838821/data/dfmesdg/group_8.4187.1181838821/data/dfmesdg/group_5.3748.1181838823
16 rows selected.
复制

Solution

应该是因为新加的四组八个standby log没有正常转换文件名,手动修改为正常的名字即可。具体步骤如下

  1. 取消日志应用

  2. 修改standby_file_management参数manual

  3. shutdown 数据库并启动到mount

  4. rename日志名

  5. open 数据库

  6. 修改standby_file_management参数auto

  7. 开启日志应用,检查SCN是否正常推进


具体session log 如下

#取消日志应用SQL> alter database recover managed standby database cancel;
Database altered.#修改stb file management参数SQL> alter system set standby_file_management='MANUAL';#关闭数据库并启动到mountSystem altered.SQL> shutdown immediate;Database closed.Database dismounted.ORACLE instance shut down.SQL> startup mount;ORACLE instance started.
Total System Global Area 1.0055E+10 bytesFixed Size 2261888 bytesVariable Size 2181041280 bytesDatabase Buffers 7851737088 bytesRedo Buffers 19742720 bytesDatabase mounted.#rename log file nameALTER DATABASE RENAME FILE '+DATA/orcl/redo09.log' TO '/data/dfmesdg/redo09.log';ALTER DATABASE RENAME FILE '+DATA/orcl/redo10.log' TO '/data/dfmesdg/redo10.log';ALTER DATABASE RENAME FILE '+DATA/orcl/redo11.log' TO '/data/dfmesdg/redo11.log';ALTER DATABASE RENAME FILE '+DATA/orcl/redo12.log' TO '/data/dfmesdg/redo12.log';ALTER DATABASE RENAME FILE '+DATA/orcl/redo13.log' TO '/data/dfmesdg/redo13.log';ALTER DATABASE RENAME FILE '+DATA/orcl/redo14.log' TO '/data/dfmesdg/redo14.log';ALTER DATABASE RENAME FILE '+DATA/orcl/redo15.log' TO '/data/dfmesdg/redo15.log';ALTER DATABASE RENAME FILE '+DATA/orcl/redo16.log' TO '/data/dfmesdg/redo16.log';#确认log file name正常SQL> SELECT GROUP#, STATUS, MEMBERFROM V$LOGFILE; 2
GROUP# STATUS MEMBER---------- ------- -------------------------------------------------- 9 /data/dfmesdg/redo09.log 10 /data/dfmesdg/redo10.log 11 /data/dfmesdg/redo11.log 12 /data/dfmesdg/redo12.log 13 /data/dfmesdg/redo13.log 14 /data/dfmesdg/redo14.log 15 /data/dfmesdg/redo15.log 16 /data/dfmesdg/redo16.log 1 /data/dfmesdg/group_1.4912.1181838819 2 /data/dfmesdg/group_2.6204.1181838819 3 /data/dfmesdg/group_3.6845.1181838819
GROUP# STATUS MEMBER---------- ------- -------------------------------------------------- 4 /data/dfmesdg/group_4.5697.1181838821 6 /data/dfmesdg/group_6.4525.1181838821 7 /data/dfmesdg/group_7.5169.1181838821 8 /data/dfmesdg/group_8.4187.1181838821 5 /data/dfmesdg/group_5.3748.1181838823
16 rows selected.
#开启数据库SQL>SQL> alter database open;
Database altered.#开启日志实时应用
SQL> alter database recover managed standby database using current logfile disconnect;
Database altered.#检查SCN是否正常推进
SQL> select to_char(current_scn) from v$database;
TO_CHAR(CURRENT_SCN)----------------------------------------65552389442SQL> /
TO_CHAR(CURRENT_SCN)----------------------------------------65552389744
SQL> SQL> /
TO_CHAR(CURRENT_SCN)----------------------------------------65552484849#修改STANDBY_FILE_MANAGEMENT=AUTO
SQL> ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO scope=both;
System altered.
复制


问题2 主库存在logon trigger问题

Symptoms

 备库alert log出现如下报错(看着是想写入什么 因为read only报错)

 Errors in file /u01/app/oracle/diag/rdbms/dfmesdg/dfmesdg/trace/dfmesdg_ora_26039.trc:

ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
ORA-06512: at line 44
Mon Oct 21 12:29:08 2024
Errors in file /u01/app/oracle/diag/rdbms/dfmesdg/dfmesdg/trace/dfmesdg_ora_26045.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
ORA-06512: at line 44
复制

查看trace 文件 可以看到是在执行trigger时报错

[oracle@YCDFLDG01 ~]$ more /u01/app/oracle/diag/rdbms/dfmesdg/dfmesdg/trace/dfmesdg_ora_26039.trc
Trace file /u01/app/oracle/diag/rdbms/dfmesdg/dfmesdg/trace/dfmesdg_ora_26039.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name:    Linux
Node name:      YCDFLDG01
Release:        3.10.0-1160.el7.x86_64
Version:        #1 SMP Mon Oct 19 16:18:59 UTC 2020
Machine:        x86_64
VM name:        VMWare Version: 6
Instance name: dfmesdg
Redo thread mounted by this instance: 3
Oracle process number: 20
Unix process pid: 26039, image: oracle@YCDFLDG01 (TNS V1-V3)


*** 2024-10-21 12:29:03.609
*** SESSION ID:(570.843) 2024-10-21 12:29:03.609
*** CLIENT ID:() 2024-10-21 12:29:03.609
*** SERVICE NAME:(SYS$USERS) 2024-10-21 12:29:03.609
*** MODULE NAME:(sqlplus@YCDFLDG01 (TNS V1-V3)) 2024-10-21 12:29:03.609
*** ACTION NAME:() 2024-10-21 12:29:03.609
 
Error in executing triggers on connect internal ##这里是关键
*** 2024-10-21 12:29:03.609
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=0, mask=0x0)
----- Error Stack Dump -----
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
ORA-06512: at line 44
复制

想到我的主库是配置了logon trigger 来记录登陆的session信息


SQL> select dest_name, status, error from v$archive_dest WHERE DEST_ID = 2;

DEST_NAME
--------------------------------------------------------------------------------
STATUS    ERROR
--------- -----------------------------------------------------------------
LOG_ARCHIVE_DEST_2
ERROR     ORA-16191: Primary log shipping client not logged on standbySQL>  select OWNER,TRIGGER_NAME,status  from dba_triggers where owner in ('SYS','SYSTEM');

OWNER                          TRIGGER_NAME                   STATUS
------------------------------ ------------------------------ --------
SYSTEM                         DEF$_PROPAGATOR_TRIG           ENABLED
SYSTEM                         REPCATLOGTRIG                  ENABLED
SYS                            LOGMNRGGC_TRIGGER              DISABLED
SYS                            AW_TRUNC_TRG                   ENABLED
SYS                            AW_REN_TRG                     ENABLED
SYS                            AW_DROP_TRG                    ENABLED
SYS                            CDC_ALTER_CTABLE_BEFORE        DISABLED
SYS                            CDC_CREATE_CTABLE_AFTER        DISABLED
SYS                            CDC_CREATE_CTABLE_BEFORE       DISABLED
SYS                            CDC_DROP_CTABLE_BEFORE         DISABLED
SYS                            XDB_PI_TRIG                    ENABLED

OWNER                          TRIGGER_NAME                   STATUS
------------------------------ ------------------------------ --------
SYS                            A_SYSTEM_DDL_TRIGGER           ENABLED
SYSTEM                         A_DB_SUCCESSFUL_LOGIN_TRG      ENABLED
复制

备库需要禁用触发器,防止违法写入 

再次检查备库状态 已经恢复正常,日志可以被正常传输和应用

SQL> alter system set "_system_trig_enabled"=false;

System altered.

SQL> 
SQL>  alter database recover managed standby database cancel;

Database altered.

SQL> alter database recover managed standby database using current logfile disconnect;

Database altered.

SQL> select to_char(current_scn) from v$database;

TO_CHAR(CURRENT_SCN)
----------------------------------------
65614566304
复制




「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

筱悦星辰
暂无图片
4月前
评论
暂无图片 0
世上也许有些事不能从头来过,有些失去的东西也无法再拥有。但我们却可以吸取经验和教训,不断提升自己,从而让自己未来的人生变得更加美好。
4月前
暂无图片 点赞
评论