暂无图片
暂无图片
3
暂无图片
暂无图片
暂无图片

Oracle OGG进程与数据库session对应关系测试

原创 龙女Shyvana 云和恩墨 2022-09-23
1508

1、报告概述

本篇文档对OracleOGG进程与数据库session对应关系做了较为详细的测试。

首先我们要了解什么是进程?什么是session。

进程是什么?进程就是一个执行中程序的实例,即一个正在执行的程序。我更喜欢将进程理解为分配系统资源的一个单位。那什么是session呢?将session和connection放在一起讲就更好理解了。session顾名思义,会话。意思是用户和服务器建立连接时的逻辑会话。而connection是用户与服务器的物理通信链路。简单来说connection为连接两地的桥梁,而session是通过桥梁给两地运送物资的卡车。

那做这个测试的目的是解决在OGG复制进程出现异常的时候,不知如何定位问题的情况。

我们知道定位问题的最终目标是要找到造成该问题的SQL或者event事务。那我们又知道通过该session信息,可以得到ash视图。而ash视图可以查询一段时间内某进程session都在执行什么SQL。

由此我们知道,如果OGG进程出现异常,如果弄清楚OracleOGG进程与数据库session对应关系,即可以快速通过ash视图定位到故障问题。

该文档不仅仅讲解了OracleOGG进程与数据库session之间的对应关系,还对OGG复制进程参数parallelism x做了一个补充学习。了解了这个参数的含义,作用,使用方法和一些可能造成的副作用。

2、环境准备

源库:oracle 11g rac

目标库:oracle 19c rac

ogg版本:ogg 19.1.0.0.4

3、问题测试

3.1.OGG进程与操作系统进程,与数据库session的管理关系


首先在OGG执行info 复制进程名称查询该复制进程的Process ID,

 GGSCI (oel7n01) 107> info REPDEMO
 
 REPLICAT REPDEMO Last Started 2022-08-02 10:20 Status RUNNING
 INTEGRATED
 Checkpoint Lag 00:00:00 (updated 00:00:03 ago)
 Process ID 58335
 Log Read Checkpoint File /ogg/dirdat/lt000000033
 2022-08-02 09:53:27.208488 RBA 1566
复制

得到了该复制进程的Process ID后,在Oracle数据库内根据v$ session视图查询该复制进程具体信息,即得到OracleOGG进程与数据库session之间的对应关系。(v$session中的process字段代表操作系统客户端进程 ID。)

SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where process='58335';
   
   SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE
   ------   -------- ---------- -------- -------------- ------------------------------ ----------
   315     24494    OGG   INACTIVE       58335     replicat@oel7n01 (TNS V1-V3) GoldenGate
复制

看完这个方法,那如果在操作系统ps -ef|grep 复制进程名查询呢,看看会得到什么结果?

   [oracle@oel7n01 ~]$ ps -ef|grep REPDEMO
    
    oracle 58335 13402 0 10:20 ? 00:00:01 /ogg/replicat PARAMFILE /ogg/dirprm/repdemo.prm REPORTFILE /ogg/dirrpt/REPDEMO.rpt PROCESSID REPDEMO
    oracle 90003 85042 0 10:46 pts/3 00:00:00 grep --color=auto REPDEMO
复制

我们可以知道ps -ef|grep复制进程名查询到的的进程对应的是OGG复制进程

那,该进程可不可以在V$PROCESS中查询到呢?

答案是否定的。因为V$PROCESS是显示有关当前活动进程的信息,但由上述表述可知该进程状态STATUS为INACTIVE,所以并不能在V$PROCESS中查询到。

那除了这个方法外,还有没有办法能够得到OGG进程与数据库session对应关系呢?

答案是有的,可以通过v$session中username字段值等于OGG来查询。

SQL>  select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG';

     SID    SERIAL# USERNAME   STATUS   PROCESS    PROGRAM              MODULE
---------- ---------- ---------- -------- ---------- -------------------- --------------------
      69       6024 OGG        ACTIVE   58350      oracle@oel7n01 (AS02 GoldenGate
                                                   )

      75      50126 OGG        ACTIVE   58348      oracle@oel7n01 (AS01 GoldenGate
                                                   )

     106       4139 OGG        ACTIVE   58352      oracle@oel7n01 (AS03 GoldenGate
                                                   )

     109      64175 OGG        ACTIVE   58360      oracle@oel7n01 (AS07 GoldenGate
                                                   )

     356      39224 OGG        INACTIVE 58335      replicat@oel7n01 (TN GoldenGate
                                                   S V1-V3)

     361      19510 OGG        ACTIVE   58354      oracle@oel7n01 (AS04 GoldenGate
                                                   )

     392      32134 OGG        ACTIVE   58358      oracle@oel7n01 (AS06 GoldenGate
                                                   )

     393      39260 OGG        ACTIVE   58356      oracle@oel7n01 (AS05 GoldenGate
                                                   )

8 rows selected.
复制

3.2.OGG性能分析,当前session和历史session mode的性能分析


我们已经在3.1学习了解了OracleOGG进程与数据库session之间的对应关系,那现在我们来模拟一下目标库OGG复制进程Lag at Chkpt过高的情况,看能不能成功定位到故障问题。

首先制造同步数据量150w条,模拟OGG复制进程Lag at Chkpt过高现象:

源库创建表t_parallel:

SQL> create table t_parallel(a int,b int);
复制

然后插入数据,模拟处理数据量过大的情况:

    SQL> insert into t_parallel select level,level from dual connect by level<=5e5;
    SQL> commit;
    SQL> insert into t_parallel select level+500000,level+500000 from dual connect by level<5e5;
    SQL> commit;
    SQL> insert into t_parallel select level+1000000,level+1000000 from dual connect by level<=5e5;
    SQL> commit;	
复制

我们去目标库OGG查询下进程信息,发现复制进程Lag at Chkpt确实异于正常值

   GGSCI (oel7n01) 89> !
   
   info all
   Program Status Group Lag at Chkpt Time Since Chkpt
   MANAGER RUNNING
   EXTRACT STOPPED EXTPDB 312:21:17 08:04:51
   EXTRACT STOPPED PUMPPDB 00:00:00 08:04:43
   REPLICAT RUNNING REPDEMO 00:00:38 00:00:06
复制

根据3.1我们知道该如何通过OGG复制进程定位数据库session。

首先查询复制进程信息,找到复制进程的Process ID

    GGSCI (oel7n01) 97> info REPDEMO
    
    REPLICAT REPDEMO Last Started 2022-07-26 16:35 Status RUNNING
    INTEGRATED
    Checkpoint Lag 00:01:12 (updated 00:00:32 ago)
    Process ID 104409
    Log Read Checkpoint File /ogg/dirdat/lt000000029
    2022-07-26 16:37:33.498043 RBA 476690848
复制

然后在数据库内定位session

SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where process='104409';
    
    SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE
    ---------- ---------- ---------- -------- ------------------------ -------------------- ---------------
    315 5513 OGG INACTIVE 104409 replicat@oel7n01 (TN GoldenGateS V1-V3)
复制

再通过ash视图,查询1天内该OGG进程session都在执行什么SQL。

SQL> select sql_id,event,BLOCKING_SESSION,CURRENT_OBJ#,count(*) from v$active_session_history where SAMPLE_TIME>sysdate-1 and SESSION_ID=315 and SESSION_SERIAL#=5513 group by sql_id,event,BLOCKING_SESSION,CURRENT_OBJ# order by 5,4;

SQL_ID        EVENT                                                            BLOCKING_SESSION CURRENT_OBJ#   COUNT(*)
------------- ---------------------------------------------------------------- ---------------- ------------ ----------
              REPL Capture/Apply: miscellaneous                                                          774          1
                                                                                                          -1          2
                                                                                                         766          6
361x11wcsfp5u                                                                                            766          6
361x11wcsfp5u REPL Capture/Apply: memory                                                                 766         29
              REPL Capture/Apply: memory                                                                 766         36
              REPL Capture/Apply: memory                                                                  -1         37

7 rows selected.
复制

以此来定位OGG复制进程Lag at Chkpt延迟过高原因是在等待内存资源分配。

3.3.OGG并行参数调整,对应数据库的session ,process的影响,以及并行参数的特征

那最后,我们来讨论下OGG并行参数度参数parallelism x是干啥的呢?改变这个参数,到底真的对并行度的改变有没有直接关系呢?

要了解这个参数我们要先了解OGG在12c版本中复制进程的新特性:集成复制。

9e1baf5c2bd008cabfbf5c869570321

OGG经典复制架构

读Oracle GoldenGate的踪迹。
执行数据筛选、映射和转换。
构造表示源数据库DML或DDL事务的SQL语句(按提交的顺序)。
通过Oracle调用接口(OCI)将SQL应用于目标。

bf9b46504005b3a1b0547eabe7ca3a2

OGG集成复制架构

阅读Oracle GoldenGate的踪迹。

执行数据筛选、映射和转换。

构造表示源数据库DML事务(按提交的顺序)的逻辑更改记录(LCR)。DDL由Replicat直接应用。

通过轻量级流媒体接口连接到目标数据库中的后台进程(称为数据库入站服务器)。

将LCRs传输到入站服务器,后者将数据应用到目标数据库。

由上述两个架构图可知集成复制和经典复制的差别在于经典复制是将Trail文件解析成SQL语句之后再对SQL语句进行解析,而集成复制是将Trail文件解析成LCRs。

但这并不在我们今天的讨论范畴。我们今天要讨论的是集成复制的并行度参数parallelism x。也就是下面这张图。

IMG_256

显而易见,由图可知集成复制参数parallelism x能够增加目标库OGG复制进程并行度。也就是将进程分为主进程和子进程,主进程操作子进程,子进程进行工作。

但真实的情况到底是不是像OGG讲的这样呢?还是说多个子进程看着主进程工作呢?

我们看看OGG12.3版本最新推出的复制进程并行复制模式架构。我们可以发现,这好像跟并行度参数parallelism x很相似???为什么复制进程parallelism x并行度参数能解决的并行问题,还需要另外推出新的模式来解决呢?

OGG并行复制架构

我们先来聊聊,parallelism x到底是什么?

PARALLELISM: Sets a minimum number of apply servers that can be used under normal conditions. Setting PARALLELISM to 1 disables apply parallelism, and transactions are applied with a single apply server process. The default for Oracle GoldenGate is 4. For Oracle Standard Edition, this must be set to 1.

集成复制参数PARALLELISM:设置在正常情况下可以使用的应用服务器的最小数量。设置PARALLELISM为 1 将禁用应用并行性,事务通过单个应用服务器进程应用。Oracle GoldenGate 的默认值为4. 对于 Oracle 标准版,这必须设置为1。

那么这个参数有没有上限呢,如果有的话是多少呢?

查阅官方文档,我们可以知道

MAX_PARALLELISM: Limits the number of apply servers that can be used when the load is heavy. This number is reduced again when the workload subsides. The automatic tuning of the number of apply servers is effective only if PARALLELISM is greater than 1 and MAX_PARALLELISM is greater than PARALLELISM. If PARALLELISM is equal to MAX_PARALLELISM, the number of apply servers remains constant during the workload. The default for Oracle GoldenGate is 50.

MAX_PARALLELISM:限制负载较重时可以使用的应用服务器的数量。当工作量减少时,这个数字会再次减少。应用服务器数量的自动调优仅在PARALLELISM大于1且MAX_PARALLELISM大于时有效PARALLELISM。如果PARALLELISM等于MAX_PARALLELISM,则应用服务器的数量在工作负载期间保持不变。Oracle GoldenGate的默认值为50。

所以我们可以通过调整MAX_PARALLELISM参数的值来控制PARALLELISM x的上限。

那了解完parallelism x参数是什么及其工作原理后,我们就要开始测试他到底有何作用。究竟能不能有真正并行的效果。

首先我们先设置上限值为60,即MAX_PARALLELISM 60.然后我们将parallelism参数分别设置为1、6、24、60,重复3.2的操作,源库插入150w条数据,查看目标库复制进程延迟时间。

  1. parallelism 1:
GGSCI (oel7n01) 246> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     STOPPED     EXTPDB      312:21:17     10:16:46    
EXTRACT     STOPPED     PUMPPDB     00:00:00      10:16:38    
REPLICAT    RUNNING     REPDEMO     00:01:21      00:00:06  
复制
  1. parallelism 6:
GGSCI (oel7n01) 378> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     STOPPED     EXTPDB      312:21:17     10:21:40    
EXTRACT     STOPPED     PUMPPDB     00:00:00      10:21:32    
REPLICAT    RUNNING     REPDEMO     00:01:44      00:00:01   
复制
  1. parallelism 24:
GGSCI (oel7n01) 542> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     STOPPED     EXTPDB      312:21:17     10:25:51    
EXTRACT     STOPPED     PUMPPDB     00:00:00      10:25:43    
REPLICAT    RUNNING     REPDEMO     00:01:50      00:00:10   
复制
  1. parallelism 60:
GGSCI (oel7n01) 931> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     STOPPED     EXTPDB      312:21:17     10:55:50    
EXTRACT     STOPPED     PUMPPDB     00:00:00      10:55:42    
REPLICAT    RUNNING     REPDEMO     00:02:08      00:00:10
复制

为什么会出现以上情况,不应该Lag at Chkpt时间会随着parallelism参数增大而随之减少嘛?为什么不但没有减少,反而还增加了呢?

我们查看下OGG并行参数调整,对应数据库的session ,process的影响。在parallelism为1,6,24,60时对应数据库的session ,process情况是如何呢?

  1. parallelism 1:
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG';
    
    SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE
    
    ---------- ---------- ---------- -------- ------------------------ ------------------------------ ----------
    315 26453 OGG INACTIVE 35511 replicat@oel7n01 (TNS V1-V3) GoldenGate
    318 545 OGG ACTIVE 35523 oracle@oel7n01 (AS02) GoldenGate
    322 42855 OGG ACTIVE 35521 oracle@oel7n01 (AS01) GoldenGate

复制
  1. parallelism 6:
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG';
  
  SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE
  ---------- ---------- ---------- -------- ------------------------ ------------------------------ ----------
  79 56748 OGG ACTIVE 94165 oracle@oel7n01 (AS06) GoldenGate
  95 29469 OGG ACTIVE 94156 oracle@oel7n01 (AS02) GoldenGate
  106 51330 OGG ACTIVE 94160 oracle@oel7n01 (AS04) GoldenGate
  313 29309 OGG ACTIVE 94158 oracle@oel7n01 (AS03) GoldenGate
  315 38455 OGG INACTIVE 94143 replicat@oel7n01 (TNS V1-V3) GoldenGate
  318 6775 OGG ACTIVE 94163 oracle@oel7n01 (AS05) GoldenGate
  322 26403 OGG ACTIVE 94154 oracle@oel7n01 (AS01) GoldenGate
  343 54221 OGG ACTIVE 94167 oracle@oel7n01 (AS07) GoldenGate
  
  8 rows selected.
复制
  1. parallelism 24:
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG';
 
 SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE
 
 ---------- ---------- --------------- -------- ---------- ------------------------------ ---------------
 74 57388 OGG ACTIVE 55488 oracle@oel7n01 (AS08) GoldenGate
 80 41229 OGG ACTIVE 55502 oracle@oel7n01 (AS0A) GoldenGate
 82 24225 OGG ACTIVE 55510 oracle@oel7n01 (AS0E) GoldenGate
 95 21231 OGG ACTIVE 55514 oracle@oel7n01 (AS0G) GoldenGate
 122 10319 OGG ACTIVE 55478 oracle@oel7n01 (AS03) GoldenGate
 123 25208 OGG ACTIVE 55522 oracle@oel7n01 (AS0K) GoldenGate
 127 54472 OGG ACTIVE 55526 oracle@oel7n01 (AS0M) GoldenGate
 136 1374 OGG ACTIVE 55484 oracle@oel7n01 (AS06) GoldenGate
 139 249 OGG ACTIVE 55480 oracle@oel7n01 (AS04) GoldenGate
 143 6124 OGG ACTIVE 55530 oracle@oel7n01 (AS0O) GoldenGate
 151 58685 OGG ACTIVE 55506 oracle@oel7n01 (AS0C) GoldenGate
 153 57419 OGG ACTIVE 55518 oracle@oel7n01 (AS0I) GoldenGate
 336 6182 OGG INACTIVE 55463 replicat@oel7n01 (TNS V1-V3) GoldenGate
 344 62207 OGG ACTIVE 55474 oracle@oel7n01 (AS01) GoldenGate
 348 55359 OGG ACTIVE 55524 oracle@oel7n01 (AS0L) GoldenGate
 375 20671 OGG ACTIVE 55516 oracle@oel7n01 (AS0H) GoldenGate
 380 51920 OGG ACTIVE 55492 oracle@oel7n01 (AS09) GoldenGate
 388 51902 OGG ACTIVE 55476 oracle@oel7n01 (AS02) GoldenGate
 391 44261 OGG ACTIVE 55512 oracle@oel7n01 (AS0F) GoldenGate
 392 28049 OGG ACTIVE 55508 oracle@oel7n01 (AS0D) GoldenGate
 394 35863 OGG ACTIVE 55520 oracle@oel7n01 (AS0J) GoldenGate
 396 42997 OGG ACTIVE 55486 oracle@oel7n01 (AS07) GoldenGate
 400 20070 OGG ACTIVE 55532 oracle@oel7n01 (AS0P) GoldenGate
 401 43042 OGG ACTIVE 55528 oracle@oel7n01 (AS0N) GoldenGate
 411 63191 OGG ACTIVE 55504 oracle@oel7n01 (AS0B) GoldenGate
 416 35222 OGG ACTIVE 55482 oracle@oel7n01 (AS05) GoldenGate
 
 26 rows selected.
复制
  1. parallelism 60:
SQL> select SID,SERIAL#,USERNAME,STATUS,PROCESS,PROGRAM,MODULE from v$session where USERNAME= 'OGG';
 
 SID SERIAL# USERNAME STATUS PROCESS PROGRAM MODULE
 
 ---------- ---------- ---------- -------- ---------- ---------------------------------------- ---------------
 69 49323 OGG ACTIVE 98389 oracle@oel7n01 (AS17) GoldenGate
 70 13168 OGG ACTIVE 98306 oracle@oel7n01 (AS03) GoldenGate
 72 64405 OGG ACTIVE 98326 oracle@oel7n01 (AS0D) GoldenGate
 75 15068 OGG ACTIVE 98415 oracle@oel7n01 (AS1K) GoldenGate
 79 49623 OGG ACTIVE 98411 oracle@oel7n01 (AS1I) GoldenGate
 80 5966 OGG ACTIVE 98399 oracle@oel7n01 (AS1C) GoldenGate
 81 37079 OGG INACTIVE 98292 replicat@oel7n01 (TNS V1-V3) GoldenGate
 82 51587 OGG ACTIVE 98302 oracle@oel7n01 (AS01) GoldenGate
 89 24387 OGG ACTIVE 98354 oracle@oel7n01 (AS0R) GoldenGate
 90 19180 OGG ACTIVE 98423 oracle@oel7n01 (AS1O) GoldenGate
 92 26431 OGG ACTIVE 98338 oracle@oel7n01 (AS0J) GoldenGate
 97 59593 OGG ACTIVE 98381 oracle@oel7n01 (AS13) GoldenGate
 98 16971 OGG ACTIVE 98334 oracle@oel7n01 (AS0H) GoldenGate
 106 12729 OGG ACTIVE 98350 oracle@oel7n01 (AS0P) GoldenGate
 107 51683 OGG ACTIVE 98403 oracle@oel7n01 (AS1E) GoldenGate
 108 42579 OGG ACTIVE 98342 oracle@oel7n01 (AS0L) GoldenGate
 109 28372 OGG ACTIVE 98367 oracle@oel7n01 (AS0X) GoldenGate
 110 31792 OGG ACTIVE 98371 oracle@oel7n01 (AS0Z) GoldenGate
 111 6275 OGG ACTIVE 98346 oracle@oel7n01 (AS0N) GoldenGate
 113 61901 OGG ACTIVE 98363 oracle@oel7n01 (AS0V) GoldenGate
 117 31043 OGG ACTIVE 98318 oracle@oel7n01 (AS09) GoldenGate
 118 6120 OGG ACTIVE 98376 oracle@oel7n01 (AS11) GoldenGate
 119 6833 OGG ACTIVE 98407 oracle@oel7n01 (AS1G) GoldenGate
 127 23373 OGG ACTIVE 98393 oracle@oel7n01 (AS19) GoldenGate
 130 54514 OGG ACTIVE 98344 oracle@oel7n01 (AS0M) GoldenGate
 132 33696 OGG ACTIVE 98322 oracle@oel7n01 (AS0B) GoldenGate
 133 21055 OGG ACTIVE 98310 oracle@oel7n01 (AS05) GoldenGate
 134 33203 OGG ACTIVE 98330 oracle@oel7n01 (AS0F) GoldenGate
 137 4098 OGG ACTIVE 98359 oracle@oel7n01 (AS0T) GoldenGate
 140 59860 OGG ACTIVE 98395 oracle@oel7n01 (AS1A) GoldenGate
 143 34952 OGG ACTIVE 98314 oracle@oel7n01 (AS07) GoldenGate
 145 31872 OGG ACTIVE 98419 oracle@oel7n01 (AS1M) GoldenGate
 147 36669 OGG ACTIVE 98385 oracle@oel7n01 (AS15) GoldenGate
 325 2362 OGG ACTIVE 98320 oracle@oel7n01 (AS0A) GoldenGate
 335 48762 OGG ACTIVE 98356 oracle@oel7n01 (AS0S) GoldenGate
 339 40741 OGG ACTIVE 98409 oracle@oel7n01 (AS1H) GoldenGate
 340 21316 OGG ACTIVE 98336 oracle@oel7n01 (AS0I) GoldenGate
 345 6306 OGG ACTIVE 98324 oracle@oel7n01 (AS0C) GoldenGate
 351 57661 OGG ACTIVE 98378 oracle@oel7n01 (AS12) GoldenGate
 353 35293 OGG ACTIVE 98425 oracle@oel7n01 (AS1P) GoldenGate
 356 42342 OGG ACTIVE 98391 oracle@oel7n01 (AS18) GoldenGate
 357 39149 OGG ACTIVE 98328 oracle@oel7n01 (AS0E) GoldenGate
 359 61910 OGG ACTIVE 98383 oracle@oel7n01 (AS14) GoldenGate
 361 4149 OGG ACTIVE 98387 oracle@oel7n01 (AS16) GoldenGate
 362 10139 OGG ACTIVE 98312 oracle@oel7n01 (AS06) GoldenGate
 365 45163 OGG ACTIVE 98405 oracle@oel7n01 (AS1F) GoldenGate
 366 12196 OGG ACTIVE 98352 oracle@oel7n01 (AS0Q) GoldenGate
 367 64783 OGG ACTIVE 98332 oracle@oel7n01 (AS0G) GoldenGate
 368 61680 OGG ACTIVE 98373 oracle@oel7n01 (AS10) GoldenGate
 370 26936 OGG ACTIVE 98304 oracle@oel7n01 (AS02) GoldenGate
 371 6628 OGG ACTIVE 98308 oracle@oel7n01 (AS04) GoldenGate
 372 32173 OGG ACTIVE 98401 oracle@oel7n01 (AS1D) GoldenGate
 373 60017 OGG ACTIVE 98413 oracle@oel7n01 (AS1J) GoldenGate
 377 62967 OGG ACTIVE 98369 oracle@oel7n01 (AS0Y) GoldenGate
 378 51894 OGG ACTIVE 98421 oracle@oel7n01 (AS1N) GoldenGate
 387 44287 OGG ACTIVE 98348 oracle@oel7n01 (AS0O) GoldenGate
 388 59532 OGG ACTIVE 98361 oracle@oel7n01 (AS0U) GoldenGate
 390 26175 OGG ACTIVE 98417 oracle@oel7n01 (AS1L) GoldenGate
 391 25455 OGG ACTIVE 98397 oracle@oel7n01 (AS1B) GoldenGate
 395 44595 OGG ACTIVE 98365 oracle@oel7n01 (AS0W) GoldenGate
 399 51558 OGG ACTIVE 98340 oracle@oel7n01 (AS0K) GoldenGate
 401 9020 OGG ACTIVE 98316 oracle@oel7n01 (AS08) GoldenGate
 
 62 rows selected.
复制

让我们再看看当parallelism 60时的alert日志。

 2022-08-01T20:29:36.693012+08:00
 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: PARALLELISM Value: 60
 2022-08-01T20:29:36.693057+08:00
 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: ALLOW_DUPLICATE_ROWS Value: Y
 2022-08-01T20:29:36.693094+08:00
 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: PRESERVE_ENCRYPTION Value: N
 2022-08-01T20:29:36.693131+08:00
 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: EAGER_SIZE Value: 15100
 2022-08-01T20:29:36.693167+08:00
 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: MAX_PARALLELISM Value: 60
 2022-08-01T20:29:36.693203+08:00
 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: CDGRANULARITY Value: ROW
 2022-08-01T20:29:36.693239+08:00
 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: SUPPRESSTRIGGERS Value: Y
 2022-08-01T20:29:36.693319+08:00
 PDB(3):APPLY OGG$REPDEMO: Parameter Set by User: _GG_ERROR_HANDLING_MODE Value: ASYNC
 2022-08-01T20:29:37.606057+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Reader for OGG$REPDEMO started AS01 with pid=116 OS id=98302
 2022-08-01T20:29:37.635184+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS04 with pid=119 OS id=98308
 2022-08-01T20:29:37.644626+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1K with pid=176 OS id=98415
 2022-08-01T20:29:37.645558+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1H with pid=173 OS id=98409
 2022-08-01T20:29:37.656564+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1O with pid=180 OS id=98423
 2022-08-01T20:29:37.661634+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS14 with pid=159 OS id=98383
 2022-08-01T20:29:37.662670+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS18 with pid=163 OS id=98391
 2022-08-01T20:29:37.663559+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS17 with pid=162 OS id=98389
 2022-08-01T20:29:37.664570+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1C with pid=168 OS id=98399
 2022-08-01T20:29:37.666097+08:00
 PDB(3):
 2022-08-01T20:29:37.666123+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS12 with pid=157 OS id=98378
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS03 with pid=118 OS id=98306
 2022-08-01T20:29:37.667595+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS19 with pid=164 OS id=98393
 2022-08-01T20:29:37.667751+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1F with pid=171 OS id=98405
 2022-08-01T20:29:37.669808+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1A with pid=166 OS id=98395
 2022-08-01T20:29:38.394844+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0R with pid=144 OS id=98354
 2022-08-01T20:29:38.403212+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0K with pid=135 OS id=98340
 2022-08-01T20:29:38.403923+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1J with pid=175 OS id=98413
 2022-08-01T20:29:38.410785+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1G with pid=172 OS id=98407
 2022-08-01T20:29:38.435094+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1I with pid=174 OS id=98411
 2022-08-01T20:29:38.442955+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0U with pid=149 OS id=98361
 2022-08-01T20:29:38.466664+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1E with pid=170 OS id=98403
 2022-08-01T20:29:38.499265+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0P with pid=142 OS id=98350
 2022-08-01T20:29:38.523536+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1M with pid=178 OS id=98419
 2022-08-01T20:29:38.531347+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0T with pid=148 OS id=98359
 2022-08-01T20:29:38.531876+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS09 with pid=124 OS id=98318
 2022-08-01T20:29:38.532630+08:00
 PDB(3):
 2022-08-01T20:29:38.532647+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0F with pid=130 OS id=98330
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0W with pid=151 OS id=98365
 2022-08-01T20:29:38.538329+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1N with pid=179 OS id=98421
 2022-08-01T20:29:38.539420+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0X with pid=152 OS id=98367
 2022-08-01T20:29:38.539654+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS13 with pid=158 OS id=98381
 2022-08-01T20:29:38.555818+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS07 with pid=122 OS id=98314
 2022-08-01T20:29:38.562832+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0V with pid=150 OS id=98363
 2022-08-01T20:29:38.563063+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0Y with pid=153 OS id=98369
 2022-08-01T20:29:38.563817+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS02 with pid=117 OS id=98304
 2022-08-01T20:29:38.563934+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0G with pid=131 OS id=98332
 2022-08-01T20:29:38.570649+08:00
 PDB(3):
 2022-08-01T20:29:38.570664+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0H with pid=132 OS id=98334
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS16 with pid=161 OS id=98387
 2022-08-01T20:29:38.571085+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0Q with pid=143 OS id=98352
 2022-08-01T20:29:38.571314+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0C with pid=127 OS id=98324
 2022-08-01T20:29:38.586553+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0L with pid=136 OS id=98342
 2022-08-01T20:29:38.594974+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS10 with pid=155 OS id=98373
 2022-08-01T20:29:38.595835+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1L with pid=177 OS id=98417
 2022-08-01T20:29:38.603409+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1P with pid=181 OS id=98425
 2022-08-01T20:29:38.603529+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0I with pid=133 OS id=98336
 2022-08-01T20:29:38.610469+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0S with pid=145 OS id=98356
 2022-08-01T20:29:38.610656+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0B with pid=126 OS id=98322
 2022-08-01T20:29:38.611359+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0M with pid=138 OS id=98344
 2022-08-01T20:29:38.611809+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0D with pid=128 OS id=98326
 2022-08-01T20:29:38.612591+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS15 with pid=160 OS id=98385
 2022-08-01T20:29:38.618852+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0O with pid=141 OS id=98348
 2022-08-01T20:29:38.626410+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1B with pid=167 OS id=98397
 2022-08-01T20:29:38.634627+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0N with pid=140 OS id=98346
 2022-08-01T20:29:38.635926+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS1D with pid=169 OS id=98401
 2022-08-01T20:29:38.636476+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS05 with pid=120 OS id=98310
 2022-08-01T20:29:38.636800+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0E with pid=129 OS id=98328
 2022-08-01T20:29:38.642494+08:00
 PDB(3):
 2022-08-01T20:29:38.642497+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS11 with pid=156 OS id=98376
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS08 with pid=123 OS id=98316
 2022-08-01T20:29:38.643000+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0J with pid=134 OS id=98338
 2022-08-01T20:29:38.643337+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS06 with pid=121 OS id=98312
 2022-08-01T20:29:38.650516+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0Z with pid=154 OS id=98371
 2022-08-01T20:29:38.651334+08:00
 PDB(3):
 PDB(3):GoldenGate Apply Server for OGG$REPDEMO started AS0A with pid=125 OS id=98320
 2022-08-01T20:29:40.566126+08:00
 PDB(3):Setting statement cache size for apply OGG$REPDEMO to 130. Current open_cursors value is 300.
 2022-08-01T20:29:40.886798+08:00
 PDB(3):APPLY OGG$REPDEMO: Source Database: NULL
 PDB(3):APPLY OGG$REPDEMO: Source Root Name: NULL
 PDB(3):APPLY OGG$REPDEMO: Applied Message Number: 4790762
 PDB(3):APPLY OGG$REPDEMO: Message Create Time: 19:32, 2022/07/26
 2022-08-01T20:29:48.340690+08:00
 PDB(3):Propagation Receiver (CCA) for GoldenGate replicat and Apply OGG$REPDEMO with pid=110, OS id=98297, objnum=0 , LCRID ver=2 started.
 2022-08-01T20:31:14.422416+08:00
 ropagation Receiver (CCA) for GoldenGate replicat and Apply OGG$REPDEMO with pid=110, OS id=98297, objnum=0 , LCRID ver=2 started
复制

我们可以制作一个表格记录整理上述信息。

测试数据 实际耗时 产生的session数量
parallelism 1 150w 00:01:21 3
parallelism 6 150w 00:01:44 8
parallelism 24 150w 00:01:50 26
parallelism 60 150w 00:02:08 62

由此我们可以发现parallelism x参数可能是个假参数。不管他并行度设置的多大,可能干活的进程还是一个,就是多人看一人干活的现象。并且它不只是个假并行参数,而且他还会加大Oracle数据库的负载量!!随着parallelism x参数增大的同时会影响数据库session数量同量增大。且OGG用户产生的session数量为x+2。一个主进程+一个守护进程+x个子进程。

且在parallelism 60时,等待一段时间后,发现OGG目标库复制进程abended了。

 GGSCI (oel7n01) 100> info all
     
 Program Status Group Lag at Chkpt Time Since Chkpt
 MANAGER RUNNING
 EXTRACT STOPPED EXTPDB 481:41:00 00:27:25
 EXTRACT RUNNING PUMPPDB 00:00:00 00:00:02
 REPLICAT ABENDED REPDEMO 00:00:00 13:34:07
复制

查看日志发现,

 2022-08-02 09:53:35 ERROR OGG-00664 OCI Error ORA (status = 4036-ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMI
复制

报了PGA内存超出限制的错误。

所以当parallelism x参数过大的时候,产生的session过大导致数据库负载过大而造成OGG复制进程abended。

而不论该参数值为多少,v$ px_process与v$px_session视图都不变。都为

SQL> select * from v$px_process;
 
 SERV STATUS PID SPID SID SERIAL# IS_GV CON_ID
 ---- --------- ---------- ------------------------ ---------- ---------- ----- ----------
 P000 AVAILABLE 60 5515 FALSE 0
 
 P001 AVAILABLE 61 5518 FALSE 0
 
 P002 AVAILABLE 103 6554 FALSE 0
 
 P003 AVAILABLE 104 6563 FALSE 0
复制
SQL> select * from v$px_session;

  
no rows selected
复制

所以说OGG集成复制中复制进程的并行,并不是我们平时理解的Oracle临时的并行执行,更能说明这其实是个假并行!!!

4.结论及建议

由上述测试可知,如果OGG进程出现异常,我们可以通过OGG进程与数据库session对应关系来找到相对应的session。然后再通过ash视图来定位故障问题再找到相对应的解决方案。而在OGG集成复制的架构中,并行度参数parallelism x其实是一个假并行。所以后续OGG才会推出复制进程的协调模式和并行复制模式来优化集成模式对并行方面的不足。且parallelism x还与session数量有直接且相对应的关系。parallelism x增大,session也会随之增大。parallelism x设置过大会造成产生的session过大而数据库负载过大以至于数据库查询速度变慢甚至OGG复制进程会abended的情况。所以这个参数我们以后就不用去追求x数量多大啦,反正都是多人看一人干活的现象,并没有真正的多人一起干活。而且还会增大session。如果要解决复制进程集成模式不并行的弊端,可以改用OGG在12.1版本推出的协调模式或者12.3版本最新推出的并行复制模式来处理。

5.附件

https://docs.oracle.com/en/middleware/goldengate/core/12.3.0.1/ggodb/optional-parameters-integrated-modes.html#GUID-5AADB73B-6A60-4391-B297-36B14D19577A

最后修改时间:2022-09-23 16:57:35
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论