暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

模拟ORACLE STEAL UNDO EXTENT机制产生的等待enq: CR - block range reuse ckpt

原创 范计杰 2021-12-01
2569

巡检发现某生产库 TOP 1等待事件为enq: CR - block range reuse ckpt,经分析定位为UNDO STEAL导致,这里主要测试重现ORACLE STEAL UNDO EXTENT及enq: CR - block range reuse ckpt等待事件产生的过程。


模拟过程


1、创建表空


SQL> create undo tablespace undo2 datafile '/oracle/app/oracle/oradata/NOCDB19C/undo2.dbf' size 20m;

Tablespace created.

SQL> alter system set undo_tablespace=undo2;

System altered.

SQL> select RETENTION from dba_tablespaces where TABLESPACE_NAME='UNDO2';

RETENTION
-----------
NOGUARANTEE

SQL> @p undo_retention

NAME VALUE
---------------------------------------- ----------------------------------------
undo_retention 900

2、登录数据库创建测试表,查找对应SERVER SPID,GDB设置断点



create table dbmt.tobj as select * from dba_objects;


在ORACLE UNDO STEAL相关函数上设置断点
(gdb) break ktusm_stealext
Breakpoint 2 at 0x1c48910
(gdb) c


3、更新dbmt.tobj表,产生UNDO占用


SQL> update dbmt.tobj set object_name=object_name;

72634 rows updated.
SQL> @trans

       SID    SERIAL# USERNAME   TADDR            SES_ADDR          USED_UBLK  USED_UREC    0xFLAG STATUS                        START_DATE             XIDUSN    XIDSLOT     XIDSQN XID              PRV_XID          PTX_XID
---------- ---------- --------------------------- ---------------- ---------- ---------- --------- ----------------------------- ------------------ ---------- ---------- ---------- ---------------- ---------------- ----------------
         1      49117 SYS        0000000073733698 000000007856C6F0       1026      76885       E03 ACTIVE                        30-NOV-21                  11         15          5 0B000F0005000000 0000000000000000 0000000000000000
commit;


 SQL> @df

TABLESPACE_NAME                   TotalMB     UsedMB     FreeMB % Used Ext Used
------------------------------ ---------- ---------- ---------- ------ --- ----------------------
ILMTSDEMO_LOWCOST                      50          9         41    18% YES |####                |
ILM_SOURCE                             12          1         11     9% YES |##                  |
SYSAUX                                700        661         39    95% YES |################### |
SYSTEM                                980        971          9   100% YES |####################|
TEMP                                  111        111          0   100% YES |####################|
UNDO2                                  20         12          8    60% NO  |############        | <<<<
UNDOTBS1                              645        300        345    47% YES |##########          |
USERS                                  43         40          3    94% YES |################### |
 col segment_name for a30
select b.us#,segment_name,round(bytes/1024/1024,2)mb,b.STATUS$  from dba_segments a,undo$ b where tablespace_name='UNDO2' and a.segment_name=b.name;SQL> SQL>

       US# SEGMENT_NAME                           MB    STATUS$
---------- ------------------------------ ---------- ----------
        11 _SYSSMU11_218934645$                    9          3 <<<<
        12 _SYSSMU12_1212566812$                 .13          3
        13 _SYSSMU13_1508681309$                 .13          3
        14 _SYSSMU14_1297497020$                 .13          3
        15 _SYSSMU15_3396771108$                 .13          3
        16 _SYSSMU16_1123823253$                 .13          3
        17 _SYSSMU17_1247462769$                 .13          3
        18 _SYSSMU18_74029735$                   .13          3
        19 _SYSSMU19_3285312421$                 .13          3
        20 _SYSSMU20_2211160347$                 .13          3

10 rows selected.


4、再次更新dbmt.tobj表,一次少更新几行,确定事务未分配到刚刚使用的UNDO SEGMENT上


SQL> update dbmt.tobj set object_name=object_name where object_id<100;

98 rows updated.


SQL>
 SQL> @trans

       SID    SERIAL# USERNAME     TADDR            SES_ADDR          USED_UBLK  USED_UREC    0xFLAG STATUS                        START_DATE             XIDUSN    XIDSLOT     XIDSQN XID              PRV_XID          PTX_XID
---------- ---------- ------------ ---------------- ---------------- ---------- ---------- --------- ----------------------------- ------------------ ---------- ---------- ---------- ---------------- ---------------- ----------------
         1      49117 SYS          0000000073733698 000000007856C6F0          1          6       E03 ACTIVE                        30-NOV-21                  19         16          5 1300100005000000 0000000000000000 0000000000000000


XIDUSN 19

 col segment_name for a30

select b.us#,segment_name,round(bytes/1024/1024,2)mb,b.STATUS$  from dba_segments a,undo$ b where tablespace_name='UNDO2' and a.segment_name=b.name;
SQL>
       US# SEGMENT_NAME                           MB    STATUS$
---------- ------------------------------ ---------- ----------
        11 _SYSSMU11_218934645$                    9          3
        12 _SYSSMU12_1212566812$                 .13          3
        13 _SYSSMU13_1508681309$                 .13          3
        14 _SYSSMU14_1297497020$                 .13          3
        15 _SYSSMU15_3396771108$                 .13          3
        16 _SYSSMU16_1123823253$                 .13          3
        17 _SYSSMU17_1247462769$                 .13          3
        18 _SYSSMU18_74029735$                   .13          3
        19 _SYSSMU19_3285312421$                 .13          3 <<<<
        20 _SYSSMU20_2211160347$                 .13          3

10 rows selected.


5、GDB 继续调试,观察UNDO SEGMENT会产生STEAL,但未观察到活动会话中的等待事务



  SQL> update dbmt.tobj set object_name=object_name where object_id<100000;



SQL> @df

TABLESPACE_NAME                   TotalMB     UsedMB     FreeMB % Used Ext Used
------------------------------ ---------- ---------- ---------- ------ --- ----------------------
ILMTSDEMO_LOWCOST                      50          9         41    18% YES |####                |
ILM_SOURCE                             12          1         11     9% YES |##                  |
SYSAUX                                700        661         39    95% YES |################### |
SYSTEM                                980        971          9   100% YES |####################|
TEMP                                  111        111          0   100% YES |####################|
UNDO2                                  20         20          0   100% NO  |####################|
UNDOTBS1                              645        300        345    47% YES |##########          |
USERS                                  43         40          3    94% YES |################### |

8 rows selected.




       US# SEGMENT_NAME                           MB    STATUS$
---------- ------------------------------ ---------- ----------
        11 _SYSSMU11_218934645$                    9          3
        12 _SYSSMU12_1212566812$                 .13          3
        13 _SYSSMU13_1508681309$                 .13          3
        14 _SYSSMU14_1297497020$                 .13          3
        15 _SYSSMU15_3396771108$                 .13          3
        16 _SYSSMU16_1123823253$                 .13          3
        17 _SYSSMU17_1247462769$                 .13          3
        18 _SYSSMU18_74029735$                   .13          3
        19 _SYSSMU19_3285312421$                   9          3
        20 _SYSSMU20_2211160347$                 .13          3

10 rows selected.


USERNAME       SID EVENT                MACHINE    MODULE               STATUS   LAST_ET     SQL_DT SQL_ID             PLOBJ WAI_SECINW  ROW_OBJ SQLTEXT                        BS         OSUSER
----------- ------ -------------------- ---------- -------------------- -------- ------- ---------- --------------- -------- ---------- -------- ------------------------------ ---------- ----------
SYS              1 On CPU / runqueue    rac1       SQL*Plus             ACTIVE       128        128 7gn43xth7dhkz            3855:166         -1  update dbmt.tobj set object_n :          oracle



Breakpoint 2, 0x0000000001c48910 in ktusm_stealext ()
(gdb) bt
#0  0x0000000001c48910 in ktusm_stealext ()
#1  0x0000000001c461cf in ktusmasp1r ()
#2  0x0000000001c4fc27 in ktrsexecExecuteInExcepHdlr ()
#3  0x0000000001c4ece1 in ktrsexec ()
#4  0x0000000001bf38db in ktuaspResumableInPdb ()
#5  0x0000000001c45505 in ktusmasp ()
#6  0x00000000126e5bbf in ktuchg2 ()
#7  0x00000000127177e7 in ktbchg2 ()
#8  0x00000000039d4ad1 in kdu_array_flush_retry ()
#9  0x00000000039d407c in kdu_array_flush1 ()
#10 0x0000000012b60510 in qerupUpdRow ()
#11 0x0000000012b5f023 in qerupRopRowsets ()
#12 0x0000000003b3897f in kdstf110110100001000km ()
#13 0x0000000012ac19f7 in kdsttgr ()
#14 0x0000000012b43054 in qertbFetch ()
#15 0x0000000012b5f378 in qerupFetch ()
#16 0x0000000012a053c8 in updaul ()
#17 0x0000000012a041a5 in updThreePhaseExe ()
#18 0x0000000012a0176b in updexe ()
#19 0x00000000129267fd in opiexe ()
#20 0x0000000012991a52 in kpoal8 ()
#21 0x000000001291fe22 in opiodr ()
#22 0x0000000012c849e7 in ttcpip ()
#23 0x000000000293e9c9 in opitsk ()
#24 0x00000000029432c8 in opiino ()
#25 0x000000001291fe22 in opiodr ()
#26 0x000000000293a746 in opidrv ()
#27 0x0000000003507375 in sou2o ()
#28 0x0000000000dbe406 in opimai_real ()
#29 0x0000000003513421 in ssthrdmain ()
#30 0x0000000000dbe230 in main ()
(gdb)


6、GDB 继续调试,观察UNDO SEGMENT会产生STEAL


(gdb) n
Single stepping until exit from function ktusm_stealext,
which has no line number information.
0x0000000001c461cf in ktusmasp1r ()

(gdb) n
Single stepping until exit from function ktusmasp1r,
which has no line number information.

Breakpoint 1, 0x0000000001e77a90 in kcbrbr_int ()

(gdb) n
Single stepping until exit from function kcbrbr_int,
which has no line number information.
0x0000000001e77a6f in kcbrbr ()
(gdb) n
Single stepping until exit from function kcbrbr,
which has no line number information.
0x000000000202f51d in kteopdelete1 ()
(gdb) n
Single stepping until exit from function kteopdelete1,
which has no line number information.
0x000000000208e469 in ktsxr_delete ()

(gdb) n
Single stepping until exit from function ktsxr_delete,
which has no line number information.
0x0000000001c146b6 in ktusp_delextent ()
(gdb) n
Single stepping until exit from function ktusp_delextent,
which has no line number information.
0x0000000001c12daa in ktuShrinkUndoSeg ()
(gdb) n
Single stepping until exit from function ktuShrinkUndoSeg,
which has no line number information.
0x0000000001c48ec4 in ktusmrsn ()
(gdb) n
Single stepping until exit from function ktusmrsn,
which has no line number information.
0x0000000001c470c0 in ktusmasp1r ()


(gdb) bt
#0  0x0000000001c470c0 in ktusmasp1r ()
#1  0x0000000001c4fc27 in ktrsexecExecuteInExcepHdlr ()
#2  0x0000000001c4ece1 in ktrsexec ()
#3  0x0000000001bf38db in ktuaspResumableInPdb ()
#4  0x0000000001c45505 in ktusmasp ()
#5  0x00000000126e5bbf in ktuchg2 ()
#6  0x00000000127177e7 in ktbchg2 ()
#7  0x00000000039d4ad1 in kdu_array_flush_retry ()
#8  0x00000000039d407c in kdu_array_flush1 ()
#9  0x0000000012b60510 in qerupUpdRow ()
#10 0x0000000012b5f023 in qerupRopRowsets ()
#11 0x0000000003b3897f in kdstf110110100001000km ()
#12 0x0000000012ac19f7 in kdsttgr ()
#13 0x0000000012b43054 in qertbFetch ()
#14 0x0000000012b5f378 in qerupFetch ()
#15 0x0000000012a053c8 in updaul ()
#16 0x0000000012a041a5 in updThreePhaseExe ()
#17 0x0000000012a0176b in updexe ()
#18 0x00000000129267fd in opiexe ()
#19 0x0000000012991a52 in kpoal8 ()
#20 0x000000001291fe22 in opiodr ()
#21 0x0000000012c849e7 in ttcpip ()
#22 0x000000000293e9c9 in opitsk ()
#23 0x00000000029432c8 in opiino ()
#24 0x000000001291fe22 in opiodr ()
#25 0x000000000293a746 in opidrv ()
#26 0x0000000003507375 in sou2o ()
#27 0x0000000000dbe406 in opimai_real ()
#28 0x0000000003513421 in ssthrdmain ()
#29 0x0000000000dbe230 in main ()



 

       US# SEGMENT_NAME                           MB    STATUS$
---------- ------------------------------ ---------- ----------
        11 _SYSSMU11_218934645$                 8.88          3 <<<<被偷窃,从9M缩小到8.88
        12 _SYSSMU12_1212566812$                 .13          3
        13 _SYSSMU13_1508681309$                 .13          3
        14 _SYSSMU14_1297497020$                 .13          3
        15 _SYSSMU15_3396771108$                 .13          3
        16 _SYSSMU16_1123823253$                 .13          3
        17 _SYSSMU17_1247462769$                 .13          3
        18 _SYSSMU18_74029735$                   .13          3
        19 _SYSSMU19_3285312421$                9.13          3<<<<活动事务使用,增大
        20 _SYSSMU20_2211160347$                 .13          3

10 rows selected.

设置10046 重复上面的过程,可以观察到enq: CR - block range reuse ckpt


前面通过GDB加断点测试,捕捉到了ktusm_stealext函数调用,但观察不到等待enq: CR - block range reuse ckpt,猜想这个等待是毫秒级的,如果加断点的位置并不是加wait event之后,解除之前应该是看不到的,通过10026再观察。

先尝试STEAL,如果不成功,才能使用FREE EXTENT
通过trace 观察到了enq: CR - block range reuse ckpt
nocdb19c_ora_2279.trc

行 558: WAIT #139934742178240: nam='enq: CR - block range reuse ckpt' ela= 1446 name|mode=1129447430 2=65570 0=1 obj#=-1 tim=1126482933
行 581: WAIT #139934742178240: nam='enq: CR - block range reuse ckpt' ela= 597 name|mode=1129447430 2=65570 0=1 obj#=-1 tim=1126486780
行 604: WAIT #139934742178240: nam='enq: CR - block range reuse ckpt' ela= 1652 name|mode=1129447430 2=65570 0=1 obj#=-1 tim=1126491195

并且在等待enq: CR - block range reuse ckpt之前的等待事件为reliable message
WAIT #139934742178240: nam='reliable message' ela= 301 channel context=1989491688 channel handle=1989402640 broadcast message=1991183576 obj#=-1 tim=1126481460
WAIT #139934742178240: nam='enq: CR - block range reuse ckpt' ela= 1446 name|mode=1129447430 2=65570 0=1 obj#=-1 tim=1126482933

等待是发生成update执行阶段,并非解析SQL阶段
PARSING IN CURSOR #139934742178240 len=65 dep=0 uid=0 oct=6 lid=0 tim=1126068945 hv=149629389 ad='69cba1c0' sqlid='cucft6n4fqafd'
update dbmt.tobj2 set object_name=object_name where object_id>100

等待在以下递归SQL之后,看起来就是递归UNDO相关操作。
PARSING IN CURSOR #139934736487328 len=85 dep=1 uid=0 oct=3 lid=0 tim=1126297347 hv=3573688918 ad='69475af8' sqlid='86708bvah4akq'
select name from undo$ where file#=:1 and block#=:2 and ts#=:3 and status$ != 1





最后修改时间:2021-12-01 16:49:42
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论