暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

记录一个19C关于克隆PDB的BUG,执行克隆PDB的命令一直完成不了

原创 张玉龙 2021-06-10
1390

1. 环境介绍:

克隆PDB的源端和目标端的操作系统和数据库版本完全相同,都是RHEL 7.5操作系统上的Oracle 19.8 2节点RAC,但是字符集不相同,源端ZHS16GBK,目标端是AL32UTF8。

2. 操作命令:

源端执行:

$ sqlplus / as sysdba SQL> alter session set container=pdb; SQL> grant create pluggable database to system;
复制

目标端执行:

$ sqlplus / as sysdba SQL> create public database link to_pdb_link connect to system identified by oracle using '192.168.56.10:1521/pdb'; SQL> select cdb from v$database@to_pdb_link; SQL> grant create pluggable database to system container=all; SQL> create pluggable database pdb from pdb@to_pdb_link; ---此命令一直等待
复制

3. 问题现象:

目标端数据库alert日志:

2021-06-04T13:21:48.412342+08:00 ARC3 (PID:130647): Archived Log entry 16377 added for T-1.S-8206 ID 0x9827fe4 LAD:1 2021-06-04T14:16:18.320927+08:00 TABLE AUDSYS.AUD$UNIFIED: ADDED INTERVAL PARTITION SYS_P4228 (84) VALUES LESS THAN (TIMESTAMP' 2021-07-01 00:00:00') 2021-06-04T14:16:44.944205+08:00 Session (915,48524): EXTERNAL SCN ALERT: Advanced SCN by 14476 minutes worth to 0x00000764053e2878, by outbound distributed transaction logon with returned scn Session (915,48524): EXTERNAL SCN SOURCE: Outbound connection to DBID: bf70154a GLOBAL_DB_NAME: PDB Session (915,48524): EXTERNAL SCN SOURCE: DBlink Name: TO_PDB_LINK, Connect String: (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb))(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.56.10)(PORT=1521))), Remote Machine: rac1 2021-06-04T14:19:04.810749+08:00 ************************************************************** Undo Create of Pluggable Database PDB with pdb id - 9. ************************************************************** 2021-06-04T14:24:32.383227+08:00 Active Session History (ASH) performed an emergency flush. This may mean that ASH is undersized. If emergency flushes are a recurring issue, you may consider increasing ASH size by setting the value of _ASH_SIZE to a sufficiently large value. Currently, ASH size is 266338304 bytes. Both ASH size and the total number of emergency flushes since instance startup can be monitored by running the following query: select total_size,awr_flush_emergency_count from v$ash_info; 2021-06-04T14:49:45.423345+08:00 PDB(10):Endian type of dictionary set to little 2021-06-04T14:49:56.695027+08:00 **************************************************************** Pluggable Database PDB with pdb id - 10 is created as UNUSABLE. If any errors are encountered before the pdb is marked as NEW, then the pdb must be dropped local undo-1, localundoscn-0x0000000000000118 **************************************************************** 2021-06-04T14:50:16.329817+08:00 Applying media recovery for pdb-4099 from SCN 8126166752258 to SCN 8126171048558 Remote log information: count-2 thr-2,seq-15750,logfile-+ARCH/ORCL/pdb_archivelog/2021_06_04/thread_2_seq_15750.383.1074350997,los-8126163298692,nxs-18446744073709551615,maxblks-4095154 thr-1,seq-10459,logfile-+ARCH/ORCL/pdb_archivelog/2021_06_04/thread_1_seq_10459.911.1074351011,los-8126158194517,nxs-18446744073709551615,maxblks-2206663 queued attach DA request 0xf067c0a60 for pdb 10, ospid 58020 2021-06-04T14:50:16.332846+08:00 Increasing priority of 26 RS Domain Action Reconfiguration started (domid 10, new da inc 1, cluster inc 8) Instance 1 is attaching to domain 10 * allocate domain 10, valid ? 1 Global Resource Directory partially frozen for domain action Domain Action Reconfiguration complete (total time 0.0 secs) Decreasing priority of 26 RS 2021-06-04T14:50:16.338743+08:00 PDB(10):Media Recovery Start 2021-06-04T14:50:16.344552+08:00 PDB(10):Serial Media Recovery started PDB(10):max_pdb is 10 2021-06-04T14:50:17.228856+08:00 PDB(10):Media Recovery Log +ARCH/ORCL/pdb_archivelog/2021_06_04/thread_1_seq_10459.911.1074351011 2021-06-04T14:50:17.288262+08:00 PDB(10):Media Recovery Log +ARCH/ORCL/pdb_archivelog/2021_06_04/thread_2_seq_15750.383.1074350997 2021-06-04T14:50:17.371908+08:00 ALTER SYSTEM SET remote_listener=' rac-scan:1521' SCOPE=MEMORY SID='orcl1'; 2021-06-04T14:50:17.372908+08:00 ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='orcl1'; 2021-06-04T14:50:24.670032+08:00 PDB(10):Incomplete Recovery applied until change 8126171048558 time 06/04/2021 14:49:57 2021-06-04T14:50:24.672416+08:00 PDB(10):Media Recovery Complete (orcl1) queued detach DA request 0xf067c09f8 for pdb 10, ospid 58020 2021-06-04T14:50:24.903122+08:00 Increasing priority of 26 RS Domain Action Reconfiguration started (domid 10, new da inc 2, cluster inc 8) Instance 1 is detaching from domain 10 (lazy abort? 0) Global Resource Directory partially frozen for domain action * domain detach - domain 10 valid ? 1 Non-local Process blocks cleaned out Set master node info Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted freeing rdom 10 freeing the fusion rht of pdb 10 freeing the pdb enqueue rht Domain Action Reconfiguration complete (total time 0.2 secs) Decreasing priority of 26 RS PDB(10):Autotune of undo retention is turned on. 2021-06-04T14:50:25.326031+08:00 PDB(10):This instance was first to open pluggable database PDB (container=10) 2021-06-04T14:50:26.019523+08:00 PDB(10):queued attach DA request 0xf067c0990 for pdb 10, ospid 58020 2021-06-04T14:50:26.020971+08:00 Increasing priority of 26 RS Domain Action Reconfiguration started (domid 10, new da inc 3, cluster inc 8) Instance 1 is attaching to domain 10 * allocate domain 10, valid ? 1 Global Resource Directory partially frozen for domain action Domain Action Reconfiguration complete (total time 0.0 secs) Decreasing priority of 26 RS 2021-06-04T14:55:26.579987+08:00 PDB(10):>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=409 PDB(10):System State dumped to trace file /u02/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_p05k_133040.trc
复制

目标端数据库中的活动会话

USERNAME SID EVENT MACHINE MODULE STATUS LAST_CALL_ET SQL_ID WAI_SECINW ROW_WAIT_OBJ# SQLTEXT BS CH# OSUSER HEX ----------- ---------- -------------------- ---------- -------------------- -------- ------------ --------------- ---------- ------------- ------------------------------ ---------- ---- ---------- --------- C##DBAAS_SYS 1106 GV$: slave acquisiti zcloud zoramon_collector ACTIVE 1 0tzum39pa4n2g 0:0 0 SELECT file_id, file_name, con : 0 zcloud 100013d SQM 5236 On CPU / runqueue sql-sqm mz3_v2_agent ACTIVE 108 g31hyhz63hb9k -1:0 -1 SELECT sysdate AS ZONE_LAST_SN : 159 root 10000a0 SYS 6197 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:20 0 : oracle SYS 4322 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:20 0 : oracle SYS 2068 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:20 0 : oracle SYS 5716 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:20 0 : oracle SYS 7586 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:20 0 : oracle SYS 4661 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 4707 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 4708 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 4757 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 4949 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 4997 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 5090 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 5190 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 5234 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 5620 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 5714 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 5764 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 5862 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 5957 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 6101 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 6102 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 6342 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 6388 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle SYS 6435 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:21 0 : oracle ... ... SYS 2546 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 2741 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 2887 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 2932 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 3027 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 3893 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 7587 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 4805 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 4853 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 4998 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 5042 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 5284 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 5526 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 5812 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 6049 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 6051 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 6148 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 6675 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 6724 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 6917 cursor: pin S wait o rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:23 0 : oracle SYS 4278 row cache lock rac1 sqlplus ACTIVE 5266 8vyjutx6hg3wh 0:4944 0 : 0 oracle SYS 915 PX Deq: Txn Recovery rac1 sqlplus ACTIVE 7059 83qguqsfzt4vu 0:5267 15 : 0 oracle 1000000 323 rows selected.
复制

4. 请求Oracle SR分析:

Verified the issue in the log file as noted below: LOG FILE ----------------------- Filename =orcl1_p05k_133040.trc See the following: Trace file /u02/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_p05k_133040.trc Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.8.0.0.0 Build label: RDBMS_19.8.0.0.0DBRU_LINUX.X64_200702 ORACLE_HOME: /u02/app/oracle/product/19.0.0/db_1 System name: Linux Node name: rac1 Release: 3.10.0-862.el7.x86_64 Version: #1 SMP Wed Mar 21 18:14:51 EDT 2018 Machine: x86_64 Instance name: orcl1 Redo thread mounted by this instance: 1 Oracle process number: 409 Unix process pid: 133040, image: oracle@rac1 (P05K) *** 2021-06-04T14:55:26.580236+08:00 (PARTNER(10)) *** SESSION ID:(4278.49172) 2021-06-04T14:55:26.580297+08:00 *** CLIENT ID:() 2021-06-04T14:55:26.580305+08:00 *** SERVICE NAME:(SYS$USERS) 2021-06-04T14:55:26.580310+08:00 *** MODULE NAME:(sqlplus@rac1 (TNS V1-V3)) 2021-06-04T14:55:26.580315+08:00 *** ACTION NAME:() 2021-06-04T14:55:26.580320+08:00 *** CLIENT DRIVER:() 2021-06-04T14:55:26.580324+08:00 *** CONTAINER ID:(10) 2021-06-04T14:55:26.580329+08:00 >>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! <<< row cache enqueue: session: 0xf023c08b8, mode: N, request: S =================================================== SYSTEM STATE (level=2, with short stacks) Dumping the Process Summary 196: P003 ospid 131182 sid 1734 ser 17149, waiting for 'cursor: pin S wait on X' 202: P000 ospid 131176 sid 2020 ser 51651, waiting for 'cursor: pin S wait on X' ...... 409: P05K ospid 133040 sid 4278 ser 49172, waiting for 'row cache lock' Cmd: UPDATE ...... 499: USER ospid 58020 sid 915 ser 48524, waiting for 'PX Deq: Txn Recovery Reply' Cmd: CREATE PLUGGABLE DATABASE 500: P080 ospid 133477 sid 964 ser 53585, waiting for 'cursor: pin S wait on X'
复制
PROCESS 499: ---------------------------------------- SO: 0xefff0f108, type: process (2), map: 0xf41b51cd8 state: LIVE (0x4532), flags: 0x1 owner: (nil), proc: 0xefff0f108 link: 0xefff0f128[0xefff0f128, 0xefff0f128] child list count: 22, link: 0xefff0f178[0xeff24fbc8, 0x8bfc96838] conid: 1, conuid: 1, SGA version=(1,0), pg: 0 SOC: 0xf41b51cd8, type: process (2), map: 0xefff0f108 state: LIVE (0x99fc), flags: INIT (0x1) (process) Oracle pid:499, ser:17182, calls cur/top: 0x8c946e3b8/0x8c946e3b8 flags : (0x0) - icon_uid:0 logon_pdbid=0 flags2: (0x0), flags3: (0x510) call error: 0, sess error: 0, txn error 0 intr queue: empty (post info) last post received: 0 0 530 last post received-location: kxfp.h LINE:6110 ID:kxfprienq: slave last process to post me: 0xf21abd2a0 1 0 last post sent: 0 0 530 last post sent-location: kxfp.h LINE:6110 ID:kxfprienq: slave last process posted by me: 0xf01ab6e08 1 0 waiter on post event: 0 (latch info) hold_bits=0x0 ud_influx=0x23f48e Process Group: DEFAULT, pseudo proc: 0xf41eaf7c0 O/S info: user: oracle, term: UNKNOWN, ospid: 58020 OSD pid info: Short stack dump: *** 2021-06-04T14:55:43.882219+08:00 ((1)) (session) sid: 915 ser: 48524 trans: 0xdd2ed7210, creator: 0xf41b51cd8 flags: (0x8100041) USR/- flags2: (0x80048009) -/DDLT2/INC flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/- DID: 0001-01F3-0005ED410001-01F3-0005ED3E, short-term DID: txn branch: (nil) con_id/con_uid/con_name: 10/3609791335/PARTNER con_logonuid: 1 con_logonid: 1 con_scuid: 1 con_scid: 1 edition#: 0 user#/name: 0/SYS oct: 226, prv: 0, sql: 0x7ffff26b07c0, psql: (nil) stats: 0x58c4c9e40, PX stats: 0x12ac1d44 service name: SYS$USERS client details: O/S info: user: oracle, term: pts/0, ospid: 58019 machine: rac1 program: sqlplus@rac1 (TNS V1-V3) application name: sqlplus@rac1 (TNS V1-V3), hash value=2145945074 Current Wait Stack: 0: waiting for 'PX Deq: Txn Recovery Reply' <<<<<<<<<<< sleeptime/senderid=0x64, passes=0x171, =0x0 wait_id=38339 seq_num=56891 snap_id=1 wait times: snap=5 min 17 sec, exc=5 min 17 sec, total=5 min 17 sec wait times: max=infinite, heur=5 min 17 sec wait counts: calls=369 os=369 in_wait=1 iflags=0x15a2 Wait State: fixed_waits=0 flags=0x22 boundary=(nil)/-1 instance id=1 server id=65535 (_58020) numa#=0 flags= ISQC <<<<<<<<<<<<<<<<<<<<<< PX coordinator process. instance id=1 server id=319 (P08V_134181) numa#=0 flags= INIT HUB instance id=2 server id=12 (P00C_78270) numa#=0 flags= INIT HUB instance id=1 server id=318 (P08U_134176) numa#=0 flags= INIT HUB instance id=1 server id=165 (P04L_132967) numa#=0 flags= INIT HUB instance id=1 server id=317 (P08T_134172) numa#=0 flags= INIT HUB instance id=1 server id=13 (P00D_131534) numa#=0 flags= INIT HUB ...... instance id=1 server id=200 (P05K_133040) numa#=0 flags= INIT HUB ...... Cmd: CREATE PLUGGABLE DATABASE
复制
PROCESS 196: P003 <<<<<<<<<<<<<<<<<<<<<<<<<<< PROCESS 499 的一个 slave ---------------------------------------- SO: 0xefff07018, type: process (2), map: 0xea1a76018 state: LIVE (0x4532), flags: 0x1 owner: (nil), proc: 0xefff07018 link: 0xefff07038[0xefff07038, 0xefff07038] child list count: 21, link: 0xefff07088[0xefffd2ce8, 0x8bfc62b88] conid: 1, conuid: 1, SGA version=(1,0), pg: 0 SOC: 0xea1a76018, type: process (2), map: 0xefff07018 state: LIVE (0x99fc), flags: INIT (0x1) (process) Oracle pid:196, ser:2, calls cur/top: 0x401a1d260/0x962d32440 flags : (0x0) - icon_uid:0 logon_pdbid=0 flags2: (0x0), flags3: (0x10) call error: 0, sess error: 0, txn error 0 intr queue: empty (post info) last post received: 0 0 391 last post received-location: kgl.h LINE:10257 ID:kgllldl: post after freeing latch last process to post me: 0xf61aa3b48 1 0 last post sent: 0 0 391 last post sent-location: kgl.h LINE:10257 ID:kgllldl: post after freeing latch last process posted by me: 0xea1a98d78 1 0 waiter on post event: 0 (latch info) hold_bits=0x0 ud_influx=0xc8ec2ae Process Group: DEFAULT, pseudo proc: 0xf41eaf7c0 O/S info: user: oracle, term: UNKNOWN, ospid: 131182 (session) sid: 1734 ser: 17149 trans: 0xe78513d78, creator: 0xea1a76018 flags: (0x41) USR/- flags2: (0x40009) -/-/INC flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/- DID: 0001-00C4-002D76990000-0000-00000000, short-term DID: txn branch: (nil) con_id/con_uid/con_name: 10/3609791335/PDB con_logonuid: 1 con_logonid: 1 con_scuid: 1 con_scid: 1 edition#: 0 user#/name: 0/SYS oct: 0, prv: 0, sql: 0x715055370, psql: 0x894e93860 stats: 0x58cc65440, PX stats: 0x12ac1d44 service name: SYS$USERS client details: O/S info: user: oracle, term: pts/0, ospid: 131182 machine: rac1 program: oracle@rac1 (P003) application name: sqlplus@rac1 (TNS V1-V3), hash value=2145945074 Current Wait Stack: 0: waiting for 'cursor: pin S wait on X' idn=0x4d078f90, value=0x10b600000000, where=0x500000000 <<<<<<<<<<<<<<<<<<<<<<< wait_id=13 seq_num=15 snap_id=1 wait times: snap=12.775400 sec, exc=12.775400 sec, total=12.775400 sec wait times: max=infinite, heur=5 min 0 sec wait counts: calls=0 os=0 in_wait=1 iflags=0x5a2 $ mutex.sh 0x10b600000000 Mutex Holder Session: 4278 <<<<<<<<<<<< Holder 为 sid: 4278
复制
PROCESS 409: P05K ---------------------------------------- SO: 0xefff0cde8, type: process (2), map: 0xf21aae6d0 state: LIVE (0x4532), flags: 0x1 owner: (nil), proc: 0xefff0cde8 link: 0xefff0ce08[0xefff0ce08, 0xefff0ce08] child list count: 14, link: 0xefff0ce58[0xefd080670, 0x8bfd4a438] conid: 1, conuid: 1, SGA version=(1,0), pg: 0 SOC: 0xf21aae6d0, type: process (2), map: 0xefff0cde8 state: LIVE (0x99fc), flags: INIT (0x1) (process) Oracle pid:409, ser:1, calls cur/top: 0x96386a378/0x883ecce90 flags : (0x0) - icon_uid:0 logon_pdbid=0 flags2: (0x1000), flags3: (0x10) call error: 0, sess error: 0, txn error 0 intr queue: empty (post info) last post received: 0 0 390 last post received-location: kgl.h LINE:10254 ID:kgllldl: in loop last process to post me: 0xf01a9d6b8 1 0 last post sent: 0 0 391 last post sent-location: kgl.h LINE:10257 ID:kgllldl: post after freeing latch last process posted by me: 0xf01a854d8 1 0 waiter on post event: 0 (latch info) hold_bits=0x0 ud_influx=0x9a8 (osp latch info) hold_bits=0x0 ud_influx=0x0 Process Group: DEFAULT, pseudo proc: 0xf41eaf7c0 O/S info: user: oracle, term: UNKNOWN, ospid: 133040 OSD pid info: *** 2021-06-04T14:55:38.385867+08:00 (PDB(10)) Short stack dump: ksedsts <- ksdxfstk <- ksdxdocmdmultex <- ksdxdocmdmult <- ksudmp_proc <- ksudss_main <- ksudss_opt <- kqrget <- kqrLockPo <- kqrpre1 <- ktuGetRowCache1 <- ktuGetUsegDba <- ktuGetCommitTimesMain_Int <- ktuGetCommitTimesMain <- ktuGetCommitTimes <- ktbgcl1 <- ktrgcm <- ktrget2 <- kdsgrp <- qetlbr <- qertbFetchByRowID <- opifch2 <- opiall0 <- opikpr <- opiodr <- rpidrus <- skgmstack <- rpidru <- rpiswu2 <- kprball <- kqdobr_new <- kqrReadFromDB <- kqrpre1 <- kqlhdlod <- kqlCallback <- kqllod <- kglLoadOnLock <- kgllkal <- kglLock <- kglget <- kglgob <- qcdlgbo <- qcdlgob <- qcsfgob <- qcsprfro <- qcsprfro_tree <- qcsprfro_tree <- qcspafq <- qcspqbDescendents <- qcspqb <- kkmdrv <- opiSem <- opiprs <- kksParseChildCursor <- rpiswu2 <- kksLoadChild <- kxsGetRuntimeLock <- kksfbc <- kkspsc0 <- kksParseCursor <- opiosq0 <- opiall0 <- opikpr <- opiodr <- rpidrus <- skgmstack <- rpidru <- rpiswu2 <- kprball <- kqdGetBundledCursor <- kqldcar <- kglslod <- kqlsublod <- kqlCallback <- kqllod_new <- kqlCallback <- kqllod <- kglobld <- kglobpn <- kglpim <- kglpin <- kglgob <- kkdcloc <- kkdcacr <- kkdcacc <- kkmacr <- opiprsdml <- opiSem <- opiprs <- kksParseChildCursor <- rpiswu2 <- kksLoadChild <- kxsGetRuntimeLock <- kksfbc <- kkspsc0 <- kksParseCursor <- opiosq0 <- opiodr <- rpidrus <- skgmstack <- rpidru <- rpiswu2 <- rpidrv <- rpisplu_internal <- ktuscu <- kqrcmt <- ktcCommitTxn_new <- ktcCommitTxn <- kturfptrSlaveWork <- kturfptrSlaveMain <- kxfprdp_int <- opirip <- opidrv <- sou2o <- opimai_real <- ssthrdmain <- main SO: 0xeffe23b18, type: session (4), map: 0xf023c08b8 state: LIVE (0x4532), flags: 0x1 owner: 0xefff0cde8, proc: 0xefff0cde8 link: 0xeffe23b38[0x8bfd4ab38, 0x8bfd4abb8] child list count: 16, link: 0xeffe23b88[0x8a9b78068, 0xe7f975670] conid: 10, conuid: 3609791335, SGA version=(1,0), pg: 0 SOC: 0xf023c08b8, type: session (4), map: 0xeffe23b18 state: LIVE (0x99fc), flags: INIT (0x1) (session) sid: 4278 ser: 49172 trans: 0xe786dffb8, creator: 0xf21aae6d0 <<<<<<<<<<<<< Mutex Holder Session: 4278 flags: (0x41) USR/- flags2: (0x9) -/-/INC flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/- DID: 0001-0199-0000016A0001-0199-0000016B, short-term DID: txn branch: (nil) con_id/con_uid/con_name: 10/3609791335/PDB con_logonuid: 1 con_logonid: 1 con_scuid: 1 con_scid: 1 edition#: 0 user#/name: 0/SYS oct: 6, prv: 0, sql: 0x715055370, psql: 0x894e93860 stats: 0x871ebcc58, PX stats: 0x12ac1d44 service name: SYS$USERS client details: O/S info: user: oracle, term: pts/0, ospid: 133040 machine: rac1 program: oracle@rac1 (P05K) application name: sqlplus@rac1 (TNS V1-V3), hash value=2145945074 Current Wait Stack: 0: waiting for 'row cache lock' cache id=0x3, mode=0x0, request=0x3 <<<<<<<<<<<<<<< wait_id=131 seq_num=550 snap_id=201 wait times: snap=0.134069 sec, exc=5 min 1 sec, total=5 min 12 sec wait times: max=infinite, heur=5 min 12 sec wait counts: calls=100 os=100 in_wait=1 iflags=0x15a2 Wait State: fixed_waits=0 flags=0x23 boundary=(nil)/-1 SO: 0x8d53d1c40, type: row cache enqueues (111), map: 0x6f8dfa278 state: LIVE (0x4532), flags: 0x0 owner: 0x8bdab6828, proc: 0xefff0cde8 link: 0x8d53d1c60[0xefcf81eb8, 0x8bdab6898] conid: 10, conuid: 3609791335, SGA version=(1,0), pg: 0 SOC: 0x6f8dfa278, type: row cache enqueues (111), map: 0x8d53d1c40 state: LIVE (0x99fc), flags: INIT (0x1) row cache enqueue: count=1 session=0xf023c08b8 object=0x37e2b08a0, request=S <<<<<<<< request S mode flag=03 WAI/TRC/-/-/-/-/-/- savepoint=0x3279 row cache parent object: addr=0x37e2b08a0 cid=3(dc_rollback_segments) conid=10 conuid=3609791335 <<<<<<<< addr=0x37e2b08a0 hash=41d8150b typ=21 transaction=0xdd2feaed8 flags=0000012a inc=1, pdbinc=1 version=1 mtx version=1234 own=0x37e2b0970[0x48bfe8720,0x48bfe8720] wat=0x37e2b0980[0x6f8dfa2f8,0x6f8dfa2f8] mode=X req=N status=VALID/UPDATE/-/-/IO/-/-/-/- KGH pinned data= 0000002d 00000005 0000022e 000002b8 535f0015 4d535359 5f353455 35373533 LibraryHandle: Address=0x715055370 Hash=4d078f90 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD ObjectName: Name=update /*+ rule */ undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
复制
PROCESS 248: P018 ---------------------------------------- SO: 0xefff07e28, type: process (2), map: 0xf21a8f9c0 state: LIVE (0x4532), flags: 0x1 owner: (nil), proc: 0xefff07e28 link: 0xefff07e48[0xefff07e48, 0xefff07e48] child list count: 18, link: 0xefff07e98[0xeffe22d28, 0x8bfca8118] conid: 1, conuid: 1, SGA version=(1,0), pg: 0 SOC: 0xf21a8f9c0, type: process (2), map: 0xefff07e28 state: LIVE (0x99fc), flags: INIT (0x1) (process) Oracle pid:248, ser:1, calls cur/top: 0x481b4f658/0x8c8a985d0 flags : (0x0) - icon_uid:0 logon_pdbid=0 flags2: (0x0), flags3: (0x10) call error: 0, sess error: 0, txn error 0 intr queue: empty (post info) last post received: 0 0 391 last post received-location: kgl.h LINE:10257 ID:kgllldl: post after freeing latch last process to post me: 0xf01a7a958 1 0 last post sent: 0 0 391 last post sent-location: kgl.h LINE:10257 ID:kgllldl: post after freeing latch last process posted by me: 0xea1a98d78 1 0 waiter on post event: 0 (latch info) hold_bits=0x0 ud_influx=0x4e97 Process Group: DEFAULT, pseudo proc: 0xf41eaf7c0 O/S info: user: oracle, term: UNKNOWN, ospid: 131828 (session) sid: 4226 ser: 59669 trans: 0xdd2feaed8, creator: 0xf21a8f9c0 flags: (0x41) USR/- flags2: (0x40009) -/-/INC flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/- DID: 0001-00F8-0000045B0000-0000-00000000, short-term DID: txn branch: (nil) con_id/con_uid/con_name: 10/3609791335/PDB con_logonuid: 1 con_logonid: 1 con_scuid: 1 con_scid: 1 edition#: 0 user#/name: 0/SYS oct: 0, prv: 0, sql: 0x715055370, psql: 0x894e93860 stats: 0x87d629c70, PX stats: 0x12ac1d44 service name: SYS$USERS client details: O/S info: user: oracle, term: pts/0, ospid: 131828 machine: rac1 program: oracle@rac1 (P018) application name: sqlplus@rac1 (TNS V1-V3), hash value=2145945074 Current Wait Stack: 0: waiting for 'cursor: pin S wait on X' idn=0x4d078f90, value=0x10b600000000, where=0x500000000 <<<<<<<<<<<<<<<<<< wait_id=16 seq_num=18 snap_id=1 wait times: snap=15.151057 sec, exc=15.151057 sec, total=15.151057 sec wait times: max=infinite, heur=5 min 2 sec wait counts: calls=0 os=0 in_wait=1 iflags=0x5a2 Wait State: row cache enqueue: count=1 session=0xf424660c8 object=0x37e2b08a0, mode=X <<<<<<<< 以 X 模式持有 row cache 0x37e2b08a0 flag=00 -/-/-/-/-/-/-/- savepoint=0x321a row cache parent object: addr=0x37e2b08a0 cid=3(dc_rollback_segments) conid=10 conuid=3609791335 hash=41d8150b typ=21 transaction=0xdd2feaed8 flags=0000012a inc=1, pdbinc=1 version=1 mtx version=1230 own=0x37e2b0970[0x48bfe8720,0x48bfe8720] wat=0x37e2b0980[0x6f8dfa2f8,0x6f8dfa2f8] mode=X req=N status=VALID/UPDATE/-/-/IO/-/-/-/- KGH pinned data= 0000002d 00000005 0000022e 000002b8 535f0015 4d535359 5f353455 35373533 $ mutex.sh 0x10b600000000 Mutex Holder Session: 4278 <<<<<<<<<<<< Holder 为 sid: 4278 Summary ======== 'row cache lock' & 'cursor: pin S wait on X' deadlock between PX slave processes P05K(ospid: 133040) and P018(ospid: 131828).
复制

该问题的原因是:

Bug 31747989 - tracking bug for dependent fixes of first-pass transaction recovery ( Doc ID 31747989.8 )

The following justifies how the issue is related to this specific customer:

  1. 从 system state dump 可见 ‘row cache lock’ & ‘cursor: pin S wait on X’ deadlock
    ‘row cache lock’ & ‘cursor: pin S wait on X’ deadlock between PX slave processes P05K(ospid: 133040) and P018(ospid: 131828).
  2. 该问题与 Bug 32257492 提到的问题匹配, Bug 32257492 被关闭成 Bug 30159581 的 duplicate bug,而 Bug 30159581 最终被 Bug 31747989 取代。
  3. 注意该现象与 Bug 30931981 也非常相似,不过本 SR 的问题并没有发现 kklmsle call stack,因此并不是相同问题,但是仍然推荐该补丁,因为它们的现象非常类似。
    This is explained in the following bug:
    Bug 30931981 - Open Reset Logs Hangs With ‘row cache lock’ and ‘cursor: pin s wait for x’ Waits ( Doc ID 30931981.8 )
    Bug 32257492 : DEADLOCK BETWEEN PARALLEL SLAVES ON ROW CACHE LOCK AND CURSOR PIN S WAIT ON X EVEN AFTER APPLYING 30931981

同时打上 Patch 31747989 + Patch 30931981,以修复所有类似问题。

最后修改时间:2021-06-10 21:48:38
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论