环境是自己Oracle 11g ogg测试环境,原有进程上今天新增了一个同步用户和一套抽取进程、pump辅助进程和读取进程,info all查看进程运行信息,首先发现是ogg源端EXTRACT进程EORA_1和辅助进程 PORA_1报错OGG-01496;报错信息如下
2024-09-23 16:46:08 ERROR OGG-01496 Failed to open target trail file /ogg/dirdat/aa000147, at RBA 1231.
2024-09-23 16:46:08 ERROR OGG-01668 PROCESS ABENDING.
抽取进程EORA_1和辅助进程PORA_1都是相同的错误
原因是之前在源端删除了trail文件,在$OGG_HOME/dirdat/下没有了EORA_1进程原本对应的aaxxxx trail文件
解决办法是向前滚重新生成进程对应的trail队列文件
1.源端目标端登录ogg用户
GGSCI (ogg1) 54> dblogin USERID ogg_source,PASSWORD ogg
Successfully logged into database.
2.源端重置前滚重新生成队列文件
GGSCI (ogg1) 56> alter EXTRACT EORA_1 etrollover
2024-09-23 16:51:33 INFO OGG-01520 Rollover performed. For each affected output trail of Version 10 or higher format, after starting the source extract, issue ALTER EX
TSEQNO for that trail's reader (either pump EXTRACT or REPLICAT) to move the reader's scan to the new trail file; it will not happen automatically.
EXTRACT altered.
GGSCI (ogg1) 57> alter EXTRACT PORA_1 etrollover
2024-09-23 16:53:22 INFO OGG-01520 Rollover performed. For each affected output trail of Version 10 or higher format, after starting the source extract, issue ALTER EX
TSEQNO for that trail's reader (either pump EXTRACT or REPLICAT) to move the reader's scan to the new trail file; it will not happen automatically.
EXTRACT altered.
3.目标端重置读取进程
GGSCI (ogg2) 63> dblogin USERID ogg_target,PASSWORD ogg
Successfully logged into database.
GGSCI (ogg2) 64> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING RORA_1 00:00:00 00:00:08
REPLICAT RUNNING RORA_2 00:00:00 00:00:08
GGSCI (ogg2) 65> stop RORA_1
Sending STOP request to REPLICAT RORA_1 ...
Request processed.
GGSCI (ogg2) 66> alter replicat RORA_1 extseqno 0,extrba 0
REPLICAT altered.
4.重启进程
源端:
GGSCI (ogg1) 58> start EORA_1
Sending START request to MANAGER ...
EXTRACT EORA_1 starting
目标端:
GGSCI (ogg2) 67> start RORA_1
Sending START request to MANAGER ...
REPLICAT RORA_1 starting
info all 查看进程
看到EORA_1 PORA_1进程状态已经是运行
验证表同步状态在源端插入数据时,目标端发现没有完成同步,也没有产生新的trail文件,查询进程诊断详细信息发现抽取进程EORA_1有异常
GGSCI (ogg1) 68> view report EORA_1
***********************************************************************
Oracle GoldenGate Capture for Oracle
Version 11.2.1.0.3 14400833 OGGCORE_11.2.1.0.3_PLATFORMS_120823.1258_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Aug 23 2012 20:32:12
Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.
Starting at 2024-09-23 16:55:33
***********************************************************************
Operating System Version:
Linux
Version #1 SMP Wed Oct 19 11:24:13 EDT 2016, Release 3.10.0-514.el7.x86_64
Node: ogg1
Machine: x86_64
soft limit hard limit
Address Space Size : unlimited unlimited
Heap Size : unlimited unlimited
File Size : unlimited unlimited
CPU Time : unlimited unlimited
Process id: 7965
Description:
***********************************************************************
** Running with the following parameters **
***********************************************************************
2024-09-23 16:55:33 INFO OGG-03035 Operating system character set identified as UTF-8. Locale: en_US, LC_ALL:.
EXTRACT EORA_1
SETENV (NLS_LANG=AMERICAN_AMERICA.AL32UTF8)
Set environment variable (NLS_LANG=AMERICAN_AMERICA.AL32UTF8)
USERID ogg_source, PASSWORD ***
EXTTRAIL /ogg/dirdat/aa
TABLE ogg_source.test_ogg;
TABLE ogg_source.test_ogg1;
TABLE ogg_source.test3;
2024-09-23 16:55:34 INFO OGG-01635 BOUNDED RECOVERY: reset to initial or altered checkpoint.
2024-09-23 16:55:34 INFO OGG-01815 Virtual Memory Facilities for: BR
anon alloc: mmap(MAP_ANON) anon free: munmap
file alloc: mmap(MAP_SHARED) file free: munmap
target directories:
/ogg/BR/EORA_1.
Bounded Recovery Parameter:
Options = BRRESET
BRINTERVAL = 4HOURS
BRDIR = /ogg
2024-09-23 16:55:34 INFO OGG-01815 Virtual Memory Facilities for: COM
anon alloc: mmap(MAP_ANON) anon free: munmap
file alloc: mmap(MAP_SHARED) file free: munmap
target directories:
/ogg/dirtmp.
CACHEMGR virtual memory values (may have been adjusted)
CACHESIZE: 64G
CACHEPAGEOUTSIZE (normal): 8M
PROCESS VM AVAIL FROM OS (min): 128G
CACHESIZEMAX (strict force to disk): 96G
Database Version:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE 11.2.0.4.0 Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production
Database Language and Character Set:
NLS_LANG = "AMERICAN_AMERICA.AL32UTF8"
NLS_LANGUAGE = "AMERICAN"
NLS_TERRITORY = "AMERICA"
NLS_CHARACTERSET = "AL32UTF8"
2024-09-23 16:55:34 INFO OGG-01513 Positioning to Sequence 130, RBA 48083984, SCN 0.0.
2024-09-23 16:55:34 INFO OGG-01516 Positioned to Sequence 130, RBA 48083984, SCN 0.0, Sep 23, 2024 4:35:00 PM.
2024-09-23 16:55:34 INFO OGG-01052 No recovery is required for target file /ogg/dirdat/aa000148, at RBA 0 (file not opened).
2024-09-23 16:55:34 INFO OGG-01478 Output file /ogg/dirdat/aa is using format RELEASE 11.2.
***********************************************************************
** Run Time Messages **
***********************************************************************
2024-09-23 16:55:34 INFO OGG-01517 Position of first record processed Sequence 130, RBA 48083984, SCN 0.3382016, Sep 23, 2024 4:35:24 PM.
TABLE resolved (entry ogg_source.test_ogg):
TABLE "OGG_SOURCE"."TEST_OGG";
Using the following key columns for source table OGG_SOURCE.TEST_OGG: ID.
TABLE resolved (entry ogg_source.test_ogg1):
TABLE "OGG_SOURCE"."TEST_OGG1";
Using the following key columns for source table OGG_SOURCE.TEST_OGG1: ID.
TABLE resolved (entry ogg_source.test3):
TABLE "OGG_SOURCE"."TEST3";
Using the following key columns for source table OGG_SOURCE.TEST3: ID.
由上述信息和ggserr.log提取出OGG-01052 No recovery is required for target file /ogg/dirdat/aa000148, at RBA 0 (file not opened). 新的问题还是和队列文件相关,通过重置读取进程的时候是重置为0,而重置的dirdat目录下的队列文件是148结尾,应该是上一个处理中重置为0导致的原因(第一次碰到这个问题,如果能从图2中报错ERROR OGG-01496 Failed to open target trail file /ogg/dirdat/aa000147, at RBA 1231发现aa000147丢失,直接从148重置就OK解决了.)
GGSCI (ogg1) 71> info EORA_1 detail
EXTRACT EORA_1 Last Started 2024-09-23 16:55 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:01 ago)
Log Read Checkpoint Oracle Redo Logs
2024-09-23 21:00:11 Seqno 130, RBA 53547008
SCN 0.3386374 (3386374)
Target Extract Trails:
Remote Trail Name Seqno RBA Max MB
/ogg/dirdat/aa 148 121266 5
Extract Source Begin End
/u01/app/oracle/oradata/ORACLE11G/onlinelog/redo5_1.log 2024-09-23 16:35 2024-09-23 21:00
如图:
停止抽取进程,pump辅助进程,和读取进程,重新在目标端从148重置读取进程RORA_1
GGSCI (ogg2) 77> alter replicat RORA_1 extseqno 148 extrba 0
REPLICAT altered.
GGSCI (ogg2) 78> start RORA_1
Sending START request to MANAGER ...
REPLICAT RORA_1 starting
重启进程后再次查询抽取进程EORA_1的report信息,提示Recovery completed for target file /ogg/dirdat/aa000149, at RBA 1082.,即已经在目标端产生了149号的新的trail队列文件,此时状态已经正常。
进行数据同步验证
源端
目标端
over~
所以问题复现过程重要,加深自己对数据库的学习。