1、报告概述
本篇文档对OracleOGG进程与数据库session对应关系做了较为详细的测试。
首先我们要了解什么是进程?什么是session。
进程是什么?进程就是一个执行中程序的实例,即一个正在执行的程序。我更喜欢将进程理解为分配系统资源的一个单位。那什么是session呢?将session和connection放在一起讲就更好理解了。session顾名思义,会话。意思是用户和服务器建立连接时的逻辑会话。而connection是用户与服务器的物理通信链路。简单来说connection为连接两地的桥梁,而session是通过桥梁给两地运送物资的卡车。
那做这个测试的目的是解决在OGG复制进程出现异常的时候,不知如何定位问题的情况。
我们知道定位问题的最终目标是要找到造成该问题的SQL或者event事务。那我们又知道通过该session信息,可以得到ash视图。而ash视图可以查询一段时间内某进程session都在执行什么SQL。
由此我们知道,如果OGG进程出现异常,如果弄清楚OracleOGG进程与数据库session对应关系,即可以快速通过ash视图定位到故障问题。
该文档不仅仅讲解了OracleOGG进程与数据库session之间的对应关系,还对OGG复制进程参数parallelism x做了一个补充学习。了解了这个参数的含义,作用,使用方法和一些可能造成的副作用。
2、环境准备
源库:oracle 11g rac
目标库:oracle 19c rac
ogg版本:ogg 19.1.0.0.4
3、问题测试
3.1.OGG进程与操作系统进程,与数据库session的管理关系
首先在OGG执行info 复制进程名称查询该复制进程的Process ID,
GGSCI (oel7n01) 107> info REPDEMO REPLICAT REPDEMO Last Started 2022-08-02 10:20 Status RUNNING INTEGRATED Checkpoint Lag 00:00:00 (updated 00:00:03 ago) Process ID 58335 Log Read Checkpoint File /ogg/dirdat/lt000000033 2022-08-02 09:53:27.208488 RBA 1566
复制
得到了该复制进程的Process ID后,在Oracle数据库内根据v$ session视图查询该复制进程具体信息,即得到OracleOGG进程与数据库session之间的对应关系。(v$session中的process字段代表操作系统客户端进程 ID。)
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where process='58335'; SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE ------ -------- ---------- -------- -------------- ------------------------------ ---------- 315 24494 OGG INACTIVE 58335 replicat@oel7n01 (TNS V1-V3) GoldenGate
复制
看完这个方法,那如果在操作系统ps -ef|grep 复制进程名查询呢,看看会得到什么结果?
[oracle@oel7n01 ~]$ ps -ef|grep REPDEMO oracle 58335 13402 0 10:20 ? 00:00:01 /ogg/replicat PARAMFILE /ogg/dirprm/repdemo.prm REPORTFILE /ogg/dirrpt/REPDEMO.rpt PROCESSID REPDEMO oracle 90003 85042 0 10:46 pts/3 00:00:00 grep --color=auto REPDEMO
复制
我们可以知道ps -ef|grep复制进程名查询到的的进程对应的是OGG复制进程
那,该进程可不可以在V$PROCESS中查询到呢?
答案是否定的。因为V$PROCESS是显示有关当前活动进程的信息,但由上述表述可知该进程状态STATUS为INACTIVE,所以并不能在V$PROCESS中查询到。
那除了这个方法外,还有没有办法能够得到OGG进程与数据库session对应关系呢?
答案是有的,可以通过v$session中username字段值等于OGG来查询。
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG'; SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE ---------- ---------- ---------- -------- ---------- -------------------- -------------------- 69 6024 OGG ACTIVE 58350 oracle@oel7n01 (AS02 GoldenGate ) 75 50126 OGG ACTIVE 58348 oracle@oel7n01 (AS01 GoldenGate ) 106 4139 OGG ACTIVE 58352 oracle@oel7n01 (AS03 GoldenGate ) 109 64175 OGG ACTIVE 58360 oracle@oel7n01 (AS07 GoldenGate ) 356 39224 OGG INACTIVE 58335 replicat@oel7n01 (TN GoldenGate S V1-V3) 361 19510 OGG ACTIVE 58354 oracle@oel7n01 (AS04 GoldenGate ) 392 32134 OGG ACTIVE 58358 oracle@oel7n01 (AS06 GoldenGate ) 393 39260 OGG ACTIVE 58356 oracle@oel7n01 (AS05 GoldenGate ) 8 rows selected.
复制
3.2.OGG性能分析,当前session和历史session mode的性能分析
我们已经在3.1学习了解了OracleOGG进程与数据库session之间的对应关系,那现在我们来模拟一下目标库OGG复制进程Lag at Chkpt过高的情况,看能不能成功定位到故障问题。
首先制造同步数据量150w条,模拟OGG复制进程Lag at Chkpt过高现象:
源库创建表t_parallel:
SQL> create table t_parallel(a int,b int);
复制
然后插入数据,模拟处理数据量过大的情况:
SQL> insert into t_parallel select level,level from dual connect by level<=5e5; SQL> commit; SQL> insert into t_parallel select level+500000,level+500000 from dual connect by level<5e5; SQL> commit; SQL> insert into t_parallel select level+1000000,level+1000000 from dual connect by level<=5e5; SQL> commit;
复制
我们去目标库OGG查询下进程信息,发现复制进程Lag at Chkpt确实异于正常值
GGSCI (oel7n01) 89> ! info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED EXTPDB 312:21:17 08:04:51 EXTRACT STOPPED PUMPPDB 00:00:00 08:04:43 REPLICAT RUNNING REPDEMO 00:00:38 00:00:06
复制
根据3.1我们知道该如何通过OGG复制进程定位数据库session。
首先查询复制进程信息,找到复制进程的Process ID
GGSCI (oel7n01) 97> info REPDEMO REPLICAT REPDEMO Last Started 2022-07-26 16:35 Status RUNNING INTEGRATED Checkpoint Lag 00:01:12 (updated 00:00:32 ago) Process ID 104409 Log Read Checkpoint File /ogg/dirdat/lt000000029 2022-07-26 16:37:33.498043 RBA 476690848
复制
然后在数据库内定位session
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where process='104409'; SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE ---------- ---------- ---------- -------- ------------------------ -------------------- --------------- 315 5513 OGG INACTIVE 104409 replicat@oel7n01 (TN GoldenGateS V1-V3)
复制
再通过ash视图,查询1天内该OGG进程session都在执行什么SQL。
SQL> select sql_id,event,BLOCKING_SESSION,CURRENT_OBJ#,count(*) from v$active_session_history where SAMPLE_TIME>sysdate-1 and SESSION_ID=315 and SESSION_SERIAL#=5513 group by sql_id,event,BLOCKING_SESSION,CURRENT_OBJ# order by 5,4; SQL_ID EVENT BLOCKING_SESSION CURRENT_OBJ# COUNT(*) ------------- ---------------------------------------------------------------- ---------------- ------------ ---------- REPL Capture/Apply: miscellaneous 774 1 -1 2 766 6 361x11wcsfp5u 766 6 361x11wcsfp5u REPL Capture/Apply: memory 766 29 REPL Capture/Apply: memory 766 36 REPL Capture/Apply: memory -1 37 7 rows selected.
复制
以此来定位OGG复制进程Lag at Chkpt延迟过高原因是在等待内存资源分配。
3.3.OGG并行参数调整,对应数据库的session ,process的影响,以及并行参数的特征
那最后,我们来讨论下OGG并行参数度参数parallelism x是干啥的呢?改变这个参数,到底真的对并行度的改变有没有直接关系呢?
要了解这个参数我们要先了解OGG在12c版本中复制进程的新特性:集成复制。
OGG经典复制架构
读Oracle GoldenGate的踪迹。
执行数据筛选、映射和转换。
构造表示源数据库DML或DDL事务的SQL语句(按提交的顺序)。
通过Oracle调用接口(OCI)将SQL应用于目标。
OGG集成复制架构
阅读Oracle GoldenGate的踪迹。
执行数据筛选、映射和转换。
构造表示源数据库DML事务(按提交的顺序)的逻辑更改记录(LCR)。DDL由Replicat直接应用。
通过轻量级流媒体接口连接到目标数据库中的后台进程(称为数据库入站服务器)。
将LCRs传输到入站服务器,后者将数据应用到目标数据库。
由上述两个架构图可知集成复制和经典复制的差别在于经典复制是将Trail文件解析成SQL语句之后再对SQL语句进行解析,而集成复制是将Trail文件解析成LCRs。
但这并不在我们今天的讨论范畴。我们今天要讨论的是集成复制的并行度参数parallelism x。也就是下面这张图。
显而易见,由图可知集成复制参数parallelism x能够增加目标库OGG复制进程并行度。也就是将进程分为主进程和子进程,主进程操作子进程,子进程进行工作。
但真实的情况到底是不是像OGG讲的这样呢?还是说多个子进程看着主进程工作呢?
我们看看OGG12.3版本最新推出的复制进程并行复制模式架构。我们可以发现,这好像跟并行度参数parallelism x很相似???为什么复制进程parallelism x并行度参数能解决的并行问题,还需要另外推出新的模式来解决呢?
OGG并行复制架构
我们先来聊聊,parallelism x到底是什么?
PARALLELISM: Sets a minimum number of apply servers that can be used under normal conditions. Setting PARALLELISM to 1 disables apply parallelism, and transactions are applied with a single apply server process. The default for Oracle GoldenGate is 4. For Oracle Standard Edition, this must be set to 1.
集成复制参数PARALLELISM:设置在正常情况下可以使用的应用服务器的最小数量。设置PARALLELISM为 1 将禁用应用并行性,事务通过单个应用服务器进程应用。Oracle GoldenGate 的默认值为4. 对于 Oracle 标准版,这必须设置为1。
那么这个参数有没有上限呢,如果有的话是多少呢?
查阅官方文档,我们可以知道
MAX_PARALLELISM: Limits the number of apply servers that can be used when the load is heavy. This number is reduced again when the workload subsides. The automatic tuning of the number of apply servers is effective only if PARALLELISM is greater than 1 and MAX_PARALLELISM is greater than PARALLELISM. If PARALLELISM is equal to MAX_PARALLELISM, the number of apply servers remains constant during the workload. The default for Oracle GoldenGate is 50.
MAX_PARALLELISM:限制负载较重时可以使用的应用服务器的数量。当工作量减少时,这个数字会再次减少。应用服务器数量的自动调优仅在PARALLELISM大于1且MAX_PARALLELISM大于时有效PARALLELISM。如果PARALLELISM等于MAX_PARALLELISM,则应用服务器的数量在工作负载期间保持不变。Oracle GoldenGate的默认值为50。
所以我们可以通过调整MAX_PARALLELISM参数的值来控制PARALLELISM x的上限。
那了解完parallelism x参数是什么及其工作原理后,我们就要开始测试他到底有何作用。究竟能不能有真正并行的效果。
首先我们先设置上限值为60,即MAX_PARALLELISM 60.然后我们将parallelism参数分别设置为1、6、24、60,重复3.2的操作,源库插入150w条数据,查看目标库复制进程延迟时间。
- parallelism 1:
GGSCI (oel7n01) 246> ! info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED EXTPDB 312:21:17 10:16:46 EXTRACT STOPPED PUMPPDB 00:00:00 10:16:38 REPLICAT RUNNING REPDEMO 00:01:21 00:00:06
复制
- parallelism 6:
GGSCI (oel7n01) 378> ! info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED EXTPDB 312:21:17 10:21:40 EXTRACT STOPPED PUMPPDB 00:00:00 10:21:32 REPLICAT RUNNING REPDEMO 00:01:44 00:00:01
复制
- parallelism 24:
GGSCI (oel7n01) 542> ! info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED EXTPDB 312:21:17 10:25:51 EXTRACT STOPPED PUMPPDB 00:00:00 10:25:43 REPLICAT RUNNING REPDEMO 00:01:50 00:00:10
复制
- parallelism 60:
GGSCI (oel7n01) 931> ! info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED EXTPDB 312:21:17 10:55:50 EXTRACT STOPPED PUMPPDB 00:00:00 10:55:42 REPLICAT RUNNING REPDEMO 00:02:08 00:00:10
复制
为什么会出现以上情况,不应该Lag at Chkpt时间会随着parallelism参数增大而随之减少嘛?为什么不但没有减少,反而还增加了呢?
我们查看下OGG并行参数调整,对应数据库的session ,process的影响。在parallelism为1,6,24,60时对应数据库的session ,process情况是如何呢?
- parallelism 1:
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG'; SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE ---------- ---------- ---------- -------- ------------------------ ------------------------------ ---------- 315 26453 OGG INACTIVE 35511 replicat@oel7n01 (TNS V1-V3) GoldenGate 318 545 OGG ACTIVE 35523 oracle@oel7n01 (AS02) GoldenGate 322 42855 OGG ACTIVE 35521 oracle@oel7n01 (AS01) GoldenGate
复制
- parallelism 6:
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG'; SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE ---------- ---------- ---------- -------- ------------------------ ------------------------------ ---------- 79 56748 OGG ACTIVE 94165 oracle@oel7n01 (AS06) GoldenGate 95 29469 OGG ACTIVE 94156 oracle@oel7n01 (AS02) GoldenGate 106 51330 OGG ACTIVE 94160 oracle@oel7n01 (AS04) GoldenGate 313 29309 OGG ACTIVE 94158 oracle@oel7n01 (AS03) GoldenGate 315 38455 OGG INACTIVE 94143 replicat@oel7n01 (TNS V1-V3) GoldenGate 318 6775 OGG ACTIVE 94163 oracle@oel7n01 (AS05) GoldenGate 322 26403 OGG ACTIVE 94154 oracle@oel7n01 (AS01) GoldenGate 343 54221 OGG ACTIVE 94167 oracle@oel7n01 (AS07) GoldenGate 8 rows selected.
复制
- parallelism 24:
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG'; SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE ---------- ---------- --------------- -------- ---------- ------------------------------ --------------- 74 57388 OGG ACTIVE 55488 oracle@oel7n01 (AS08) GoldenGate 80 41229 OGG ACTIVE 55502 oracle@oel7n01 (AS0A) GoldenGate 82 24225 OGG ACTIVE 55510 oracle@oel7n01 (AS0E) GoldenGate 95 21231 OGG ACTIVE 55514 oracle@oel7n01 (AS0G) GoldenGate 122 10319 OGG ACTIVE 55478 oracle@oel7n01 (AS03) GoldenGate 123 25208 OGG ACTIVE 55522 oracle@oel7n01 (AS0K) GoldenGate 127 54472 OGG ACTIVE 55526 oracle@oel7n01 (AS0M) GoldenGate 136 1374 OGG ACTIVE 55484 oracle@oel7n01 (AS06) GoldenGate 139 249 OGG ACTIVE 55480 oracle@oel7n01 (AS04) GoldenGate 143 6124 OGG ACTIVE 55530 oracle@oel7n01 (AS0O) GoldenGate 151 58685 OGG ACTIVE 55506 oracle@oel7n01 (AS0C) GoldenGate 153 57419 OGG ACTIVE 55518 oracle@oel7n01 (AS0I) GoldenGate 336 6182 OGG INACTIVE 55463 replicat@oel7n01 (TNS V1-V3) GoldenGate 344 62207 OGG ACTIVE 55474 oracle@oel7n01 (AS01) GoldenGate 348 55359 OGG ACTIVE 55524 oracle@oel7n01 (AS0L) GoldenGate 375 20671 OGG ACTIVE 55516 oracle@oel7n01 (AS0H) GoldenGate 380 51920 OGG ACTIVE 55492 oracle@oel7n01 (AS09) GoldenGate 388 51902 OGG ACTIVE 55476 oracle@oel7n01 (AS02) GoldenGate 391 44261 OGG ACTIVE 55512 oracle@oel7n01 (AS0F) GoldenGate 392 28049 OGG ACTIVE 55508 oracle@oel7n01 (AS0D) GoldenGate 394 35863 OGG ACTIVE 55520 oracle@oel7n01 (AS0J) GoldenGate 396 42997 OGG ACTIVE 55486 oracle@oel7n01 (AS07) GoldenGate 400 20070 OGG ACTIVE 55532 oracle@oel7n01 (AS0P) GoldenGate 401 43042 OGG ACTIVE 55528 oracle@oel7n01 (AS0N) GoldenGate 411 63191 OGG ACTIVE 55504 oracle@oel7n01 (AS0B) GoldenGate 416 35222 OGG ACTIVE 55482 oracle@oel7n01 (AS05) GoldenGate 26 rows selected.
复制
- parallelism 60:
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG'; SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE ---------- ---------- ---------- -------- ---------- ---------------------------------------- --------------- 69 49323 OGG ACTIVE 98389 oracle@oel7n01 (AS17) GoldenGate 70 13168 OGG ACTIVE 98306 oracle@oel7n01 (AS03) GoldenGate 72 64405 OGG ACTIVE 98326 oracle@oel7n01 (AS0D) GoldenGate 75 15068 OGG ACTIVE 98415 oracle@oel7n01 (AS1K) GoldenGate 79 49623 OGG ACTIVE 98411 oracle@oel7n01 (AS1I) GoldenGate 80 5966 OGG ACTIVE 98399 oracle@oel7n01 (AS1C) GoldenGate 81 37079 OGG INACTIVE 98292 replicat@oel7n01 (TNS V1-V3) GoldenGate 82 51587 OGG ACTIVE 98302 oracle@oel7n01 (AS01) GoldenGate 89 24387 OGG ACTIVE 98354 oracle@oel7n01 (AS0R) GoldenGate 90 19180 OGG ACTIVE 98423 oracle@oel7n01 (AS1O) GoldenGate 92 26431 OGG ACTIVE 98338 oracle@oel7n01 (AS0J) GoldenGate 97 59593 OGG ACTIVE 98381 oracle@oel7n01 (AS13) GoldenGate 98 16971 OGG ACTIVE 98334 oracle@oel7n01 (AS0H) GoldenGate 106 12729 OGG ACTIVE 98350 oracle@oel7n01 (AS0P) GoldenGate 107 51683 OGG ACTIVE 98403 oracle@oel7n01 (AS1E) GoldenGate 108 42579 OGG ACTIVE 98342 oracle@oel7n01 (AS0L) GoldenGate 109 28372 OGG ACTIVE 98367 oracle@oel7n01 (AS0X) GoldenGate 110 31792 OGG ACTIVE 98371 oracle@oel7n01 (AS0Z) GoldenGate 111 6275 OGG ACTIVE 98346 oracle@oel7n01 (AS0N) GoldenGate 113 61901 OGG ACTIVE 98363 oracle@oel7n01 (AS0V) GoldenGate 117 31043 OGG ACTIVE 98318 oracle@oel7n01 (AS09) GoldenGate 118 6120 OGG ACTIVE 98376 oracle@oel7n01 (AS11) GoldenGate 119 6833 OGG ACTIVE 98407 oracle@oel7n01 (AS1G) GoldenGate 127 23373 OGG ACTIVE 98393 oracle@oel7n01 (AS19) GoldenGate 130 54514 OGG ACTIVE 98344 oracle@oel7n01 (AS0M) GoldenGate 132 33696 OGG ACTIVE 98322 oracle@oel7n01 (AS0B) GoldenGate 133 21055 OGG ACTIVE 98310 oracle@oel7n01 (AS05) GoldenGate 134 33203 OGG ACTIVE 98330 oracle@oel7n01 (AS0F) GoldenGate 137 4098 OGG ACTIVE 98359 oracle@oel7n01 (AS0T) GoldenGate 140 59860 OGG ACTIVE 98395 oracle@oel7n01 (AS1A) GoldenGate 143 34952 OGG ACTIVE 98314 oracle@oel7n01 (AS07) GoldenGate 145 31872 OGG ACTIVE 98419 oracle@oel7n01 (AS1M) GoldenGate 147 36669 OGG ACTIVE 98385 oracle@oel7n01 (AS15) GoldenGate 325 2362 OGG ACTIVE 98320 oracle@oel7n01 (AS0A) GoldenGate 335 48762 OGG ACTIVE 98356 oracle@oel7n01 (AS0S) GoldenGate 339 40741 OGG ACTIVE 98409 oracle@oel7n01 (AS1H) GoldenGate 340 21316 OGG ACTIVE 98336 oracle@oel7n01 (AS0I) GoldenGate 345 6306 OGG ACTIVE 98324 oracle@oel7n01 (AS0C) GoldenGate 351 57661 OGG ACTIVE 98378 oracle@oel7n01 (AS12) GoldenGate 353 35293 OGG ACTIVE 98425 oracle@oel7n01 (AS1P) GoldenGate 356 42342 OGG ACTIVE 98391 oracle@oel7n01 (AS18) GoldenGate 357 39149 OGG ACTIVE 98328 oracle@oel7n01 (AS0E) GoldenGate 359 61910 OGG ACTIVE 98383 oracle@oel7n01 (AS14) GoldenGate 361 4149 OGG ACTIVE 98387 oracle@oel7n01 (AS16) GoldenGate 362 10139 OGG ACTIVE 98312 oracle@oel7n01 (AS06) GoldenGate 365 45163 OGG ACTIVE 98405 oracle@oel7n01 (AS1F) GoldenGate 366 12196 OGG ACTIVE 98352 oracle@oel7n01 (AS0Q) GoldenGate 367 64783 OGG ACTIVE 98332 oracle@oel7n01 (AS0G) GoldenGate 368 61680 OGG ACTIVE 98373 oracle@oel7n01 (AS10) GoldenGate 370 26936 OGG ACTIVE 98304 oracle@oel7n01 (AS02) GoldenGate 371 6628 OGG ACTIVE 98308 oracle@oel7n01 (AS04) GoldenGate 372 32173 OGG ACTIVE 98401 oracle@oel7n01 (AS1D) GoldenGate 373 60017 OGG ACTIVE 98413 oracle@oel7n01 (AS1J) GoldenGate 377 62967 OGG ACTIVE 98369 oracle@oel7n01 (AS0Y) GoldenGate 378 51894 OGG ACTIVE 98421 oracle@oel7n01 (AS1N) GoldenGate 387 44287 OGG ACTIVE 98348 oracle@oel7n01 (AS0O) GoldenGate 388 59532 OGG ACTIVE 98361 oracle@oel7n01 (AS0U) GoldenGate 390 26175 OGG ACTIVE 98417 oracle@oel7n01 (AS1L) GoldenGate 391 25455 OGG ACTIVE 98397 oracle@oel7n01 (AS1B) GoldenGate 395 44595 OGG ACTIVE 98365 oracle@oel7n01 (AS0W) GoldenGate 399 51558 OGG ACTIVE 98340 oracle@oel7n01 (AS0K) GoldenGate 401 9020 OGG ACTIVE 98316 oracle@oel7n01 (AS08) GoldenGate 62 rows selected.
复制
让我们再看看当parallelism 60时的alert日志。
2022-08-01T20:29:36.693012+08:00 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: PARALLELISM Value: 60 2022-08-01T20:29:36.693057+08:00 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: ALLOW_DUPLICATE_ROWS Value: Y 2022-08-01T20:29:36.693094+08:00 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: PRESERVE_ENCRYPTION Value: N 2022-08-01T20:29:36.693131+08:00 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: EAGER_SIZE Value: 15100 2022-08-01T20:29:36.693167+08:00 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: MAX_PARALLELISM Value: 60 2022-08-01T20:29:36.693203+08:00 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: CDGRANULARITY Value: ROW 2022-08-01T20:29:36.693239+08:00 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: SUPPRESSTRIGGERS Value: Y 2022-08-01T20:29:36.693319+08:00 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: _GG_ERROR_HANDLING_MODE Value: ASYNC 2022-08-01T20:29:37.606057+08:00 PDB(3): PDB(3):GoldenGate Apply Reader for OGG$REPDEMO started AS01 with pid=116 OS id=98302 2022-08-01T20:29:37.635184+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS04 with pid=119 OS id=98308 2022-08-01T20:29:37.644626+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1K with pid=176 OS id=98415 2022-08-01T20:29:37.645558+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1H with pid=173 OS id=98409 2022-08-01T20:29:37.656564+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1O with pid=180 OS id=98423 2022-08-01T20:29:37.661634+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS14 with pid=159 OS id=98383 2022-08-01T20:29:37.662670+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS18 with pid=163 OS id=98391 2022-08-01T20:29:37.663559+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS17 with pid=162 OS id=98389 2022-08-01T20:29:37.664570+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1C with pid=168 OS id=98399 2022-08-01T20:29:37.666097+08:00 PDB(3): 2022-08-01T20:29:37.666123+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS12 with pid=157 OS id=98378 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS03 with pid=118 OS id=98306 2022-08-01T20:29:37.667595+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS19 with pid=164 OS id=98393 2022-08-01T20:29:37.667751+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1F with pid=171 OS id=98405 2022-08-01T20:29:37.669808+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1A with pid=166 OS id=98395 2022-08-01T20:29:38.394844+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0R with pid=144 OS id=98354 2022-08-01T20:29:38.403212+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0K with pid=135 OS id=98340 2022-08-01T20:29:38.403923+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1J with pid=175 OS id=98413 2022-08-01T20:29:38.410785+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1G with pid=172 OS id=98407 2022-08-01T20:29:38.435094+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1I with pid=174 OS id=98411 2022-08-01T20:29:38.442955+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0U with pid=149 OS id=98361 2022-08-01T20:29:38.466664+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1E with pid=170 OS id=98403 2022-08-01T20:29:38.499265+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0P with pid=142 OS id=98350 2022-08-01T20:29:38.523536+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1M with pid=178 OS id=98419 2022-08-01T20:29:38.531347+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0T with pid=148 OS id=98359 2022-08-01T20:29:38.531876+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS09 with pid=124 OS id=98318 2022-08-01T20:29:38.532630+08:00 PDB(3): 2022-08-01T20:29:38.532647+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0F with pid=130 OS id=98330 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0W with pid=151 OS id=98365 2022-08-01T20:29:38.538329+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1N with pid=179 OS id=98421 2022-08-01T20:29:38.539420+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0X with pid=152 OS id=98367 2022-08-01T20:29:38.539654+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS13 with pid=158 OS id=98381 2022-08-01T20:29:38.555818+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS07 with pid=122 OS id=98314 2022-08-01T20:29:38.562832+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0V with pid=150 OS id=98363 2022-08-01T20:29:38.563063+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0Y with pid=153 OS id=98369 2022-08-01T20:29:38.563817+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS02 with pid=117 OS id=98304 2022-08-01T20:29:38.563934+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0G with pid=131 OS id=98332 2022-08-01T20:29:38.570649+08:00 PDB(3): 2022-08-01T20:29:38.570664+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0H with pid=132 OS id=98334 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS16 with pid=161 OS id=98387 2022-08-01T20:29:38.571085+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0Q with pid=143 OS id=98352 2022-08-01T20:29:38.571314+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0C with pid=127 OS id=98324 2022-08-01T20:29:38.586553+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0L with pid=136 OS id=98342 2022-08-01T20:29:38.594974+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS10 with pid=155 OS id=98373 2022-08-01T20:29:38.595835+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1L with pid=177 OS id=98417 2022-08-01T20:29:38.603409+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1P with pid=181 OS id=98425 2022-08-01T20:29:38.603529+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0I with pid=133 OS id=98336 2022-08-01T20:29:38.610469+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0S with pid=145 OS id=98356 2022-08-01T20:29:38.610656+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0B with pid=126 OS id=98322 2022-08-01T20:29:38.611359+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0M with pid=138 OS id=98344 2022-08-01T20:29:38.611809+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0D with pid=128 OS id=98326 2022-08-01T20:29:38.612591+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS15 with pid=160 OS id=98385 2022-08-01T20:29:38.618852+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0O with pid=141 OS id=98348 2022-08-01T20:29:38.626410+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1B with pid=167 OS id=98397 2022-08-01T20:29:38.634627+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0N with pid=140 OS id=98346 2022-08-01T20:29:38.635926+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1D with pid=169 OS id=98401 2022-08-01T20:29:38.636476+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS05 with pid=120 OS id=98310 2022-08-01T20:29:38.636800+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0E with pid=129 OS id=98328 2022-08-01T20:29:38.642494+08:00 PDB(3): 2022-08-01T20:29:38.642497+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS11 with pid=156 OS id=98376 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS08 with pid=123 OS id=98316 2022-08-01T20:29:38.643000+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0J with pid=134 OS id=98338 2022-08-01T20:29:38.643337+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS06 with pid=121 OS id=98312 2022-08-01T20:29:38.650516+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0Z with pid=154 OS id=98371 2022-08-01T20:29:38.651334+08:00 PDB(3): PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0A with pid=125 OS id=98320 2022-08-01T20:29:40.566126+08:00 PDB(3):Setting statement cache size for apply OGG$REPDEMO to 130. Current open_cursors value is 300. 2022-08-01T20:29:40.886798+08:00 PDB(3):APPLY OGG$REPDEMO: Source Database: NULL PDB(3):APPLY OGG$REPDEMO: Source Root Name: NULL PDB(3):APPLY OGG$REPDEMO: Applied Message Number: 4790762 PDB(3):APPLY OGG$REPDEMO: Message Create Time: 19:32, 2022/07/26 2022-08-01T20:29:48.340690+08:00 PDB(3):Propagation Receiver (CCA) for GoldenGate replicat and Apply OGG$REPDEMO with pid=110, OS id=98297, objnum=0 , LCRID ver=2 started. 2022-08-01T20:31:14.422416+08:00 ropagation Receiver (CCA) for GoldenGate replicat and Apply OGG$REPDEMO with pid=110, OS id=98297, objnum=0 , LCRID ver=2 started
复制
我们可以制作一个表格记录整理上述信息。
测试数据 | 实际耗时 | 产生的session数量 | |
---|---|---|---|
parallelism 1 | 150w | 00:01:21 | 3 |
parallelism 6 | 150w | 00:01:44 | 8 |
parallelism 24 | 150w | 00:01:50 | 26 |
parallelism 60 | 150w | 00:02:08 | 62 |
由此我们可以发现parallelism x参数可能是个假参数。不管他并行度设置的多大,可能干活的进程还是一个,就是多人看一人干活的现象。并且它不只是个假并行参数,而且他还会加大Oracle数据库的负载量!!随着parallelism x参数增大的同时会影响数据库session数量同量增大。且OGG用户产生的session数量为x+2。一个主进程+一个守护进程+x个子进程。
且在parallelism 60时,等待一段时间后,发现OGG目标库复制进程abended了。
GGSCI (oel7n01) 100> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED EXTPDB 481:41:00 00:27:25 EXTRACT RUNNING PUMPPDB 00:00:00 00:00:02 REPLICAT ABENDED REPDEMO 00:00:00 13:34:07
复制
查看日志发现,
2022-08-02 09:53:35 ERROR OGG-00664 OCI Error ORA (status = 4036-ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMI
复制
报了PGA内存超出限制的错误。
所以当parallelism x参数过大的时候,产生的session过大导致数据库负载过大而造成OGG复制进程abended。
而不论该参数值为多少,v$ px_process与v$px_session视图都不变。都为
SQL> select * from v$px_process; SERV STATUS PID SPID SID SERIAL# IS_GV CON_ID ---- --------- ---------- ------------------------ ---------- ---------- ----- ---------- P000 AVAILABLE 60 5515 FALSE 0 P001 AVAILABLE 61 5518 FALSE 0 P002 AVAILABLE 103 6554 FALSE 0 P003 AVAILABLE 104 6563 FALSE 0
复制
SQL> select * from v$px_session; no rows selected
复制
所以说OGG集成复制中复制进程的并行,并不是我们平时理解的Oracle临时的并行执行,更能说明这其实是个假并行!!!
4.结论及建议
由上述测试可知,如果OGG进程出现异常,我们可以通过OGG进程与数据库session对应关系来找到相对应的session。然后再通过ash视图来定位故障问题再找到相对应的解决方案。而在OGG集成复制的架构中,并行度参数parallelism x其实是一个假并行。所以后续OGG才会推出复制进程的协调模式和并行复制模式来优化集成模式对并行方面的不足。且parallelism x还与session数量有直接且相对应的关系。parallelism x增大,session也会随之增大。parallelism x设置过大会造成产生的session过大而数据库负载过大以至于数据库查询速度变慢甚至OGG复制进程会abended的情况。所以这个参数我们以后就不用去追求x数量多大啦,反正都是多人看一人干活的现象,并没有真正的多人一起干活。而且还会增大session。如果要解决复制进程集成模式不并行的弊端,可以改用OGG在12.1版本推出的协调模式或者12.3版本最新推出的并行复制模式来处理。
5.附件
https://docs.oracle.com/en/middleware/goldengate/core/12.3.0.1/ggodb/optional-parameters-integrated-modes.html#GUID-5AADB73B-6A60-4391-B297-36B14D19577A