问题1:日志名错误问题
Symptoms
周一刚上班看到邮件报警,ADG异常,SCN未推进;ADG监控脚本可以参考如下链接。
https://www.modb.pro/db/1836634539498684416
环境介绍:
主库:oracle 11.2.0.4,2*node RAC Redhat 6.8
备库:oracle 11.2.0.4 single instance Centos 7.9
这个ADG是最近新建的一台。
1.检查日志应用情况
发现日志正常被传至备库,但是未能应用
SQL> select 'Last Applied : ' Logs,
to_char(next_time, 'DD-MON-YY HH24:MI:SS') Time
from v$archived_log
where sequence# =
(select max(sequence#) from v$archived_log where applied = 'YES')
union
select 'Last Received : ' Logs,
to_char(next_time, 'DD-MON-YY HH24:MI:SS') Time
from v$archived_log
where sequence# = (select max(sequence#) from v$archived_log); 2 3 4 5 6 7 8 9 10
LOGS TIME
---------------- ---------------------------
Last Applied : 20-OCT-24 05:14:16
Last Received : 21-OCT-24 09:18:10
复制
2.尝试重新取消应用和重新应用
但是查询SCN 还是未能正常推进
alter database recover managed standby database cancel;
alter database recover managed standby database using current logfile disconnect;
select to_char(current_scn) from v$database;
复制
3.查看alert log有如下报错
发现有+DATA的log?难道是日志没有被convert?
DDE: Problem Key 'ORA 312' was flood controlled (0x5) (no incident)
ORA-00312: online log 13 thread 4: '+DATA/orcl/redo13.log'
ORA-17503: ksfdopn:2 Failed to open file +DATA/orcl/redo13.log
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-15077: could not locate ASM instance serving a required diskgroup
ORA-29701: unable to connect to Cluster Synchronization Service
DDE: Problem Key 'ORA 313' was flood controlled (0x5) (no incident)
ORA-00313: open failed for members of log group 13 of thread 4
ORA-00312: online log 13 thread 4: '+DATA/orcl/redo13.log'
ORA-17503: ksfdopn:2 Failed to open file +DATA/orcl/redo13.log
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-15077: could not locate ASM instance serving a required diskgroup
ORA-29701: unable to connect to Cluster Synchronization Service
ORA-00313: open failed for members of log group 13 of thread 4
ORA-00312: online log 13 thread 4: '+DATA/orcl/redo13.log'
ORA-17503: ksfdopn:2 Failed to open file +DATA/orcl/redo13.log
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-15077: could not locate ASM instance serving a required diskgroup
ORA-29701: unable to connect to Cluster Synchronization Service
ORA-00313: open failed for members of log group 13 of thread 4
ORA-00312: online log 13 thread 4: '+DATA/orcl/redo13.log'
ORA-17503: ksfdopn:2 Failed to open file +DATA/orcl/redo13.log
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-15077: could not locate ASM instance serving a required diskgroup
ORA-29701: unable to connect to Cluster Synchronization Service
WARNING: ASM communication error: op 9 state 0x40 (15077)
ERROR: slave communication error with ASM
WARNING: ASM communication error: op 9 state 0x40 (15077)
ERROR: slave communication error with ASM
Error 349 creating/clearing online redo logfile 13
*** 2024-10-21 09:10:52.918 4329 krsh.c
Clearing online redo logfile 13 complete
WARNING: Failed to start ASM background process (ASMB)
复制
4.检查备库转换参数
看着dbfile和logfile的转换参数是没有问题的
SQL> show parameter convert
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert string +DATA/orcl/datafile/, /data/df
mesdg/, +DATA/orcl/TEMPFILE/,
/data/dfmesdg/
log_file_name_convert string +DATA/orcl/onlinelog/, /data/d
fmesdg/
SQL> show parameter standby_file_management
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
standby_file_management string auto
复制
5.检查日志文件
确实发现有多个日志文件的文件名未被正常转换
SQL>set line 300
SQL>col member for a50
SQL> select member from v$logfile;
MEMBER
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+DATA/orcl/redo09.log
+DATA/orcl/redo10.log
+DATA/orcl/redo11.log
+DATA/orcl/redo12.log
+DATA/orcl/redo13.log
+DATA/orcl/redo14.log
+DATA/orcl/redo15.log
+DATA/orcl/redo16.log
/data/dfmesdg/group_1.4912.1181838819
/data/dfmesdg/group_2.6204.1181838819
/data/dfmesdg/group_3.6845.1181838819
MEMBER
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/data/dfmesdg/group_4.5697.1181838821
/data/dfmesdg/group_6.4525.1181838821
/data/dfmesdg/group_7.5169.1181838821
/data/dfmesdg/group_8.4187.1181838821
/data/dfmesdg/group_5.3748.1181838823
16 rows selected.
复制
Solution
应该是因为新加的四组八个standby log没有正常转换文件名,手动修改为正常的名字即可。具体步骤如下
取消日志应用
修改standby_file_management参数manual
shutdown 数据库并启动到mount
rename日志名
open 数据库
修改standby_file_management参数auto
开启日志应用,检查SCN是否正常推进
具体session log 如下
#取消日志应用
SQL> alter database recover managed standby database cancel;
Database altered.
#修改stb file management参数
SQL> alter system set standby_file_management='MANUAL';
#关闭数据库并启动到mount
System altered.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1.0055E+10 bytes
Fixed Size 2261888 bytes
Variable Size 2181041280 bytes
Database Buffers 7851737088 bytes
Redo Buffers 19742720 bytes
Database mounted.
#rename log file name
ALTER DATABASE RENAME FILE '+DATA/orcl/redo09.log' TO '/data/dfmesdg/redo09.log';
ALTER DATABASE RENAME FILE '+DATA/orcl/redo10.log' TO '/data/dfmesdg/redo10.log';
ALTER DATABASE RENAME FILE '+DATA/orcl/redo11.log' TO '/data/dfmesdg/redo11.log';
ALTER DATABASE RENAME FILE '+DATA/orcl/redo12.log' TO '/data/dfmesdg/redo12.log';
ALTER DATABASE RENAME FILE '+DATA/orcl/redo13.log' TO '/data/dfmesdg/redo13.log';
ALTER DATABASE RENAME FILE '+DATA/orcl/redo14.log' TO '/data/dfmesdg/redo14.log';
ALTER DATABASE RENAME FILE '+DATA/orcl/redo15.log' TO '/data/dfmesdg/redo15.log';
ALTER DATABASE RENAME FILE '+DATA/orcl/redo16.log' TO '/data/dfmesdg/redo16.log';
#确认log file name正常
SQL> SELECT GROUP#, STATUS, MEMBER
FROM V$LOGFILE; 2
GROUP# STATUS MEMBER
---------- ------- --------------------------------------------------
9 /data/dfmesdg/redo09.log
10 /data/dfmesdg/redo10.log
11 /data/dfmesdg/redo11.log
12 /data/dfmesdg/redo12.log
13 /data/dfmesdg/redo13.log
14 /data/dfmesdg/redo14.log
15 /data/dfmesdg/redo15.log
16 /data/dfmesdg/redo16.log
1 /data/dfmesdg/group_1.4912.1181838819
2 /data/dfmesdg/group_2.6204.1181838819
3 /data/dfmesdg/group_3.6845.1181838819
GROUP# STATUS MEMBER
---------- ------- --------------------------------------------------
4 /data/dfmesdg/group_4.5697.1181838821
6 /data/dfmesdg/group_6.4525.1181838821
7 /data/dfmesdg/group_7.5169.1181838821
8 /data/dfmesdg/group_8.4187.1181838821
5 /data/dfmesdg/group_5.3748.1181838823
16 rows selected.
#开启数据库
SQL>
SQL> alter database open;
Database altered.
#开启日志实时应用
SQL> alter database recover managed standby database using current logfile disconnect;
Database altered.
#检查SCN是否正常推进
SQL> select to_char(current_scn) from v$database;
TO_CHAR(CURRENT_SCN)
----------------------------------------
65552389442
SQL> /
TO_CHAR(CURRENT_SCN)
----------------------------------------
65552389744
SQL>
SQL> /
TO_CHAR(CURRENT_SCN)
----------------------------------------
65552484849
#修改STANDBY_FILE_MANAGEMENT=AUTO
SQL> ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO scope=both;
System altered.
复制
问题2 主库存在logon trigger问题
Symptoms
备库alert log出现如下报错(看着是想写入什么 因为read only报错)
Errors in file /u01/app/oracle/diag/rdbms/dfmesdg/dfmesdg/trace/dfmesdg_ora_26039.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
ORA-06512: at line 44
Mon Oct 21 12:29:08 2024
Errors in file /u01/app/oracle/diag/rdbms/dfmesdg/dfmesdg/trace/dfmesdg_ora_26045.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
ORA-06512: at line 44
复制
查看trace 文件 可以看到是在执行trigger时报错
[oracle@YCDFLDG01 ~]$ more /u01/app/oracle/diag/rdbms/dfmesdg/dfmesdg/trace/dfmesdg_ora_26039.trc
Trace file /u01/app/oracle/diag/rdbms/dfmesdg/dfmesdg/trace/dfmesdg_ora_26039.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name: Linux
Node name: YCDFLDG01
Release: 3.10.0-1160.el7.x86_64
Version: #1 SMP Mon Oct 19 16:18:59 UTC 2020
Machine: x86_64
VM name: VMWare Version: 6
Instance name: dfmesdg
Redo thread mounted by this instance: 3
Oracle process number: 20
Unix process pid: 26039, image: oracle@YCDFLDG01 (TNS V1-V3)
*** 2024-10-21 12:29:03.609
*** SESSION ID:(570.843) 2024-10-21 12:29:03.609
*** CLIENT ID:() 2024-10-21 12:29:03.609
*** SERVICE NAME:(SYS$USERS) 2024-10-21 12:29:03.609
*** MODULE NAME:(sqlplus@YCDFLDG01 (TNS V1-V3)) 2024-10-21 12:29:03.609
*** ACTION NAME:() 2024-10-21 12:29:03.609
Error in executing triggers on connect internal ##这里是关键
*** 2024-10-21 12:29:03.609
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=0, mask=0x0)
----- Error Stack Dump -----
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
ORA-06512: at line 44
复制
想到我的主库是配置了logon trigger 来记录登陆的session信息
SQL> select dest_name, status, error from v$archive_dest WHERE DEST_ID = 2;
DEST_NAME
--------------------------------------------------------------------------------
STATUS ERROR
--------- -----------------------------------------------------------------
LOG_ARCHIVE_DEST_2
ERROR ORA-16191: Primary log shipping client not logged on standby
SQL> select OWNER,TRIGGER_NAME,status from dba_triggers where owner in ('SYS','SYSTEM');
OWNER TRIGGER_NAME STATUS
------------------------------ ------------------------------ --------
SYSTEM DEF$_PROPAGATOR_TRIG ENABLED
SYSTEM REPCATLOGTRIG ENABLED
SYS LOGMNRGGC_TRIGGER DISABLED
SYS AW_TRUNC_TRG ENABLED
SYS AW_REN_TRG ENABLED
SYS AW_DROP_TRG ENABLED
SYS CDC_ALTER_CTABLE_BEFORE DISABLED
SYS CDC_CREATE_CTABLE_AFTER DISABLED
SYS CDC_CREATE_CTABLE_BEFORE DISABLED
SYS CDC_DROP_CTABLE_BEFORE DISABLED
SYS XDB_PI_TRIG ENABLED
OWNER TRIGGER_NAME STATUS
------------------------------ ------------------------------ --------
SYS A_SYSTEM_DDL_TRIGGER ENABLED
SYSTEM A_DB_SUCCESSFUL_LOGIN_TRG ENABLED
复制
备库需要禁用触发器,防止违法写入
再次检查备库状态 已经恢复正常,日志可以被正常传输和应用
SQL> alter system set "_system_trig_enabled"=false;
System altered.
SQL>
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> alter database recover managed standby database using current logfile disconnect;
Database altered.
SQL> select to_char(current_scn) from v$database;
TO_CHAR(CURRENT_SCN)
----------------------------------------
65614566304
复制
评论
