1. The affected versions largely match: the current environment is 11.2.0.3 with no patches installed:
--------------------------------------------------------------------------------
Installed Top-level Products (1):
Oracle Database 11g 11.2.0.3.0
There are 1 products installed in this Oracle Home.
There are no Interim patches installed in this Oracle Home.
Rac system comprising of multiple nodes
Local node = n1smsdb1
Remote node = n1smsdb2
--------------------------------------------------------------------------------
2. cursor_sharing is set to EXACT
3. The optimizer parameters match:
show parameter optimizer;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_capture_sql_plan_baselines boolean     FALSE
optimizer_dynamic_sampling           integer     2
4. In addition, the stack trace in /picclife/app/oracle/diag/rdbms/dbsms/dbsms1/incident/incdir_1200623/dbsms1_m000_12514_i1200623.trc matches the incident trace file:
STACK TRACE:
------------
skdstdst <- ksedst1 <- ksedst <- dbkedDefDump <- ksedmp
<- ssexhd <- sighandler <- kglic0 <- kksIterCursorStat <- kewrrtsq_rank_topsq
<- kewrbtsq_build_tops <- kewrftsq_flush_tops <- kewrft_flush_table <- kewrftec_flush_tabl <- e_ehdlcx
<- kewrfat_flush_all_t <- ables <- kewrfsr_flush_snaps <- hot_r <- kewrrfs_remote_flus
<- h_slave <- kebm_slave_main <- ksvrdp <- opirip <- opidrv
<- sou2o <- opimai_real <- ssthrdmain <- main <- libc_start_main
<- start
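Comparing a captured stack with a bug's call stack can be done mechanically. The following is a minimal Python sketch and not part of the original analysis: it splits a short stack on `<-` and checks whether a chosen run of frames appears contiguously. The helper names `parse_frames` and `contains_signature` are hypothetical, and the three frames used as the signature are an assumption chosen for illustration from the trace above.

```python
def parse_frames(stack_text):
    """Split a short stack of the form "a <- b <- c" into a list of frame names."""
    # Collapse line wraps and repeated whitespace before splitting on the arrow.
    flat = " ".join(stack_text.split())
    return [frame.strip() for frame in flat.split("<-") if frame.strip()]


def contains_signature(frames, signature):
    """Return True if `signature` occurs as a contiguous run inside `frames`."""
    n = len(signature)
    return any(frames[i:i + n] == signature for i in range(len(frames) - n + 1))


# Excerpt of the stack shown above, including one of its line wraps.
STACK = """
<- ssexhd <- sighandler <- kglic0 <- kksIterCursorStat <-
kewrrtsq_rank_topsq
<- kewrbtsq_build_tops <- kewrftsq_flush_tops <- kewrft_flush_table
"""

# Assumed signature: the frames directly below the signal handler.
SIGNATURE = ["kglic0", "kksIterCursorStat", "kewrrtsq_rank_topsq"]

frames = parse_frames(STACK)
print(contains_signature(frames, SIGNATURE))
```

Requiring a contiguous run is stricter than checking that each frame merely appears somewhere in the stack, which reduces accidental matches against unrelated traces.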
The problem had already started by 10:30, affecting the automatic AWR snapshots scheduled by the MMON process.
Instance 1 has no AWR snapshots starting from 10:30:
Instance     DB Name        Snap Id Snap Started       Level
------------ ------------ --------- ------------------ -----
dbsms1       SMS             111561 23 Feb 2020 00:00      1
                             111581 23 Feb 2020 10:00      1
                             111588 23 Feb 2020 13:30      1
                             111589 23 Feb 2020 14:00      1
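A gap like the one in this listing can be found automatically from the snapshot start times. The sketch below is illustrative only: the function name `find_gaps` is hypothetical, and the 30-minute snapshot interval is an assumption inferred from the two consecutive snapshots 111588 and 111589 (13:30 and 14:00). Only the rows from 10:00 onward are used.

```python
from datetime import datetime, timedelta


def find_gaps(started, interval):
    """Return (previous, next, missing_count) for every gap wider than `interval`."""
    gaps = []
    for prev, nxt in zip(started, started[1:]):
        delta = nxt - prev
        if delta > interval:
            # Number of snapshots that should have fallen strictly between the two.
            missing = delta // interval - 1
            gaps.append((prev, nxt, missing))
    return gaps


# %b parses English month abbreviations under the default C/English locale.
FMT = "%d %b %Y %H:%M"
rows = ["23 Feb 2020 10:00", "23 Feb 2020 13:30", "23 Feb 2020 14:00"]
started = [datetime.strptime(row, FMT) for row in rows]

# Assumed interval: 30 minutes.
for prev, nxt, missing in find_gaps(started, timedelta(minutes=30)):
    print(f"gap {prev:%H:%M} -> {nxt:%H:%M}, about {missing} snapshots missing")
```

Under that assumption the sketch reports six missing snapshots between 10:00 and 13:30, which agrees with the jump in snap IDs from 111581 to 111588.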
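This bug causes application sessions to block the LCK0 process (in a RAC environment, the LCK process mainly handles library cache and row cache requests) from acquiring the shared pool latch. Application sessions also pile up in large numbers, and killing them has no effect until the instance is restarted.
A blocking episode had already occurred once at 11:54; it cleared after the application session was killed by the LMHB process: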
Sun Feb 23 11:54:57 2020
LCK0 (ospid: 5113) waits for latch 'shared pool' for 93 secs.
Errors in file
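To find every such latch wait in an alert log, the message can be matched with a regular expression. This is a minimal sketch: the pattern is derived only from the single LCK0 line shown above, so it is an assumption that other occurrences follow the same shape, and the function name `parse_latch_wait` is hypothetical.

```python
import re

# Pattern built from the one alert-log line quoted above; other message
# variants are not covered.
LATCH_WAIT = re.compile(
    r"^(?P<proc>\w+) \(ospid: (?P<ospid>\d+)\) waits for latch "
    r"'(?P<latch>[^']+)' for (?P<secs>\d+) secs\.$"
)


def parse_latch_wait(line):
    """Return the fields of a latch-wait message as a dict, or None if no match."""
    match = LATCH_WAIT.match(line.strip())
    if match is None:
        return None
    return {
        "proc": match["proc"],
        "ospid": int(match["ospid"]),
        "latch": match["latch"],
        "secs": int(match["secs"]),
    }


line = "LCK0 (ospid: 5113) waits for latch 'shared pool' for 93 secs."
print(parse_latch_wait(line))
```

Running the function over each line of the alert log and keeping the non-`None` results gives a list of latch waits that can be lined up against the AWR gap.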