暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

其他2

原创 yczloveyy 2023-08-29
127

锁 单个SESSION锁等待队列超过20个 #应用#

一、数据库状态检查的步骤/命令(通过本步骤可以确定数据库当前的状态是否OK)

检查连接情况,确保能正常登录、有应用连接、连接数使用率正常

col PROCESS_LIMIT for a20;
select process_current,value process_limit,round(process_current / value, 2) * 100 || '%' as process_pct
from (select count(*) as process_current from v$process) a, v$parameter where name = 'processes';
复制


二、请描述告警出现的可能原因

锁等待,可能是应用逻辑或者语句效率问题


三、预计多久不处理就可以出现UIOC(P1)

持续等待则影响业务语句正常执行,需15分钟内处理正常


四、告警检查的步骤(通过本步骤确定告警是否是误告)

--单个会话的锁等待队列长度

select /*+rule*/'lock_num:'||nvl(max(count(aa.holding_session)),0) lock_num  from dba_waiters aa group by aa.holding_session;
复制

五、需要保存现场的步骤/命令

create table dbmgr.ash_20220000 as select tt.* from gv$active_session_history tt;
create table dbmgr.lock_20220000 as select *from gv$lock;
create table dbmgr.locked_object_20220000 as select *from GV$LOCKED_OBJECT;
复制

六、需要反馈上级以及运营的信息

将堵塞的情况反馈运营/DA,同时确认业务的影响情况,运营确认是否kill holder会话


七、建议处理的流程,步骤和对应命令

1、检查堵塞情况,找到holder,发给运营评估能否kill

a.
set line 1000 pagesize 999
col EVENT for a40
select 
s.username,
 decode(lc.block, 0, 'waiter: ', 'holder: ') || lc.sid sid,
 decode(lc.lmode,
        0,
        'none',
        1,
        'null',
        2,
        'row share',
        3,
        'row excl.',
        4,
        'share',
        5,
        's/row excl.',
        6,
        'exclusive',
        lmode,
        ltrim(to_char(lc.lmode, '990'))) lmode,
 decode(lc.request,
        0,
        'none',
        1,
        'null',
        2,
        'row share',
        3,
        'row excl.',
        4,
        'share',
        5,
        's/row excl.',
        6,
        'exclusive',
        request,
        ltrim(to_char(lc.request, '990'))) request,
 lc.type,
decode( s.sql_id,null,s.PREV_SQL_ID,s.sql_id) sql_id,
s.EVENT,s.BLOCKING_SESSION_STATUS,s.BLOCKING_SESSION,s.BLOCKING_INSTANCE,
 'alter system kill session ' || '''' || S.SID || ',' || S.SERIAL# || ',@' ||
       s.INST_ID || ''''||' immediate;'  kill_command
  from gv$lock lc, gv$session s
 where (lc.id1, lc.id2, lc.type) in
       (select t.id1, t.id2, t.type from gv$lock t where t.request > 0)
   and lc.sid = s.sid
   and lc.inst_id = s.inst_id
 order by lc.id1, lc.request;
复制


b.
with l as(
select /*+materialize*/ decode(request,0,'Holder:','    Waiter:') role, l.inst_id,l.sid, id1,id2,lmode,request,l.type from gv$lock l
where (id1,id2,l.type) in (select id1,id2,type from gv$lock where request>0 ) 
order by id1,request)
select l.*, s.SQL_ID, s.PREV_SQL_ID from l, gv$session s where l.sid = s.sid ;

with l as (select/*+materialize*/ sid from gv$lock where block = 1)
select 'alter system kill session '''||s.sid||','||serial#||',@'||inst_id||''' immediate;' as sql from gv$session s, l where s.sid = l.sid 
order by s.sid,s.serial#;
复制

2、反馈运营会话详情,评估能否kill

--查看sid对应会话详情

set linesize 150 pagesize 999
col EVENT for a30
col USERNAME for a10
col MACHINE for a10
col OSUSER for a10
col PROGRAM for a10
select inst_id,sid,serial#,sql_id,event,blocking_instance,blocking_session,blocking_session_status,final_blocking_instance,final_blocking_session,final_blocking_session_status,username,status,LAST_CALL_ET,machine,osuser,program,sql_hash_value,to_char(logon_time,'yyyy-mm-dd hh24:mi:ss'),prev_sql_id from gv$session where sid=&sid;
复制

--查看sql_id对应具体语句sql_text

set linesize 150 pagesize 999
select a.sql_id,a.child_number,a.sql_text,a.parsing_schema_name,a.first_load_time,a.last_load_time,a.last_active_time,a.plan_hash_value,a.sql_plan_baseline from gv$sql a where sql_id='&sql_id' order by last_active_time desc;
复制


3、杀会话(注意核对会话信息)

执行步骤1生成的kill session语句

八、验证问题已经恢复的步骤和命令

--单个会话的锁等待队列长度,恢复为0则无堵塞

select /*+rule*/'lock_num:'||nvl(max(count(aa.holding_session)),0) lock_num  from dba_waiters aa group by aa.holding_session;
复制

LOCK_NUM

-------------------------------------------------

lock_num:0


oracle行锁等待处理

1、接到告警,有几个SQL大量行锁等待

2、查询top event sql

set linesize 1000
 select inst_id,sql_id,EVENT,sum(DELTA_TIME),count(1)
    from gv$active_session_history ash
    where ash.SAMPLE_TIME >= sysdate - 1/24
    group by inst_id,sql_id,EVENT
    order by count(1) desc,sum(DELTA_TIME) desc
    OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY;
复制

3、查SQL文本,确定被锁的表为 beneficiary_info

select

distinct sql_id,sql_text from gv$sqlarea t where t.sql_id in

('3cp53mvc1wvwj','brwrfn81as7nk');


4、查表锁情况及阻塞源

select sess.inst_id,sess.sid,sess.serial#, lo.oracle_username,lo.os_user_name, ao.object_name,lo.locked_mode,
sess.prev_sql_id,sess.sql_id,sess.status,sess.LAST_CALL_ET,sess.LOGON_TIME,sess.event,
sess.ROW_WAIT_OBJ#,sess.BLOCKING_SESSION,sess.FINAL_BLOCKING_SESSION,sess.BLOCKING_INSTANCE,
(select s.sql_text from gv$sql s where s.sql_id=sess.sql_id and rownum=1) sql_text,
(select s.sql_text from gv$sql s where s.sql_id=sess.prev_sql_id and rownum=1) sql_text1
from gv$locked_object lo,dba_objects ao,gv$session sess 
where ao.object_id=lo.object_id 
and lo.session_id=sess.sid
and lo.inst_id = sess.inst_id
and ao.object_name=upper('beneficiary_info');
复制

发现有大量阻塞,且没有明确的阻塞源

5、sys登录数据库做hang分析

SYS@lucd01> oradebug setmypid
Statement processed.
SYS@lucd01>  oradebug -g all hanganalyze 3
Hang Analysis in /paic/app/oracle/rdbms/diag/rdbms/lucd0/lucd01/trace/lucd01_diag_409518.trc
SYS@lucd01> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.5.0.0.0
复制

6、快速截取hang分析关键信息

cnsz083184:lucd01 > grep -n "HANG ANALYSIS:" /paic/app/oracle/rdbms/diag/rdbms/lucd0/lucd01/trace/lucd01_diag_409518.trc

2445:HANG ANALYSIS:

cnsz083184:lucd01 > vi /paic/app/oracle/rdbms/diag/rdbms/lucd0/lucd01/trace/lucd01_diag_409518.trc

vi后esc :2440 跳转到相应行

从HANG ANALYSIS:开始截取,截取到主要阻塞,重点关注 Chains

most likely to have caused the hang 内容

*** 2022-07-28T14:39:38.300811+08:00
===============================================================================
HANG ANALYSIS:
  instances (db_name.oracle_sid): lucd0.lucd01, lucd0.lucd02, lucd0.lucd03
  oradebug_node_dump_level: 3
  analysis initiated by oradebug
  os thread scheduling delay history: (sampling every 1.000000 secs)
    0.000000 secs at [ 14:39:37 ]
      NOTE: scheduling delay has not been sampled for 0.630678 secs
    0.000000 secs from [ 14:39:33 - 14:39:38 ], 5 sec avg
    0.000000 secs from [ 14:38:38 - 14:39:38 ], 1 min avg
    0.000000 secs from [ 14:34:38 - 14:39:38 ], 5 min avg
  vktm time drift history
===============================================================================

Chains most likely to have caused the hang:
 [a] Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
     Chain 1 Signature Hash: 0x38c48850
 [b] Chain 2 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
     Chain 2 Signature Hash: 0x38c48850
 [c] Chain 3 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
     Chain 3 Signature Hash: 0x38c48850

===============================================================================
Non-intersecting chains:

-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (lucd0.lucd01)
                   os id: 377755
              process id: 1783, oracle@cnsz083184
              session id: 1615
        session serial #: 14965
             module name: 0 (JDBC Thin Client85 (TNS V1-V3)3)SH046036 (TNS V1)
    }
    is waiting for 'enq: TX - row lock contention' with wait info:
    {
                      p1: 'name|mode'=0x54580006
                      p2: 'usn<<16 | slot'=0x15d000b
                      p3: 'sequence'=0x2f6fd8
                     px1: 'pdb uid'=0x0
                     px2: 'master hint'=0x2
            time in wait: 1741 min 15 sec
           timeout after: never
                 wait id: 7894
                blocking: 0 sessions
          current sql_id: 911516574
             current sql: update beneficiary_info
                set


                        applicant_relation = :1 ,
                        applicant_relation_desc = :2

                where polno = :3
                  and ben_type = :4
                  and benef_seq = :5
                  and benef_sts = 1
             short stack: ksedsts()+783<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-semtimedop()+10<-skgpwwait()+187<-ksliwat()+2179<-kslwaitctx()+192<-kjusuc()+4698<-ksipgetctxia()+1968<-ksqcmi()+2757<-ksqgtlctx()+6620<-ksqgelctx()+838<-ktuGetTxForXid()+324<-ktcwit1()+367<-kdddgb()+5478<-kdusru()+458<-kauupd()+356<-updrow()+1699<-qerupUpdRow()+725<-qerupFetch()+644<-updaul()+1416<-updThreePhaseExe()+340<-updexe()+443<-opiexe()+11815<-kpoal8()+2387<-opiodr()+1202<-ttcpip()+1222<-opitsk()+1903<-opiino()+936<-opiodr()+1202<-
            wait history:
              * time between current wait and wait #1: 0.000043 sec
              1.       event: 'gc current block busy'
                 time waited: 0.000165 sec
                     wait id: 7893             p1: ''=0x8ee
                                               p2: ''=0xe40d6
                                               p3: ''=0x2000001
              * time between wait #1 and #2: 0.000182 sec
              2.       event: 'gc cr block 2-way'
                 time waited: 0.000265 sec
                     wait id: 7892             p1: ''=0x8ee
                                               p2: ''=0xe40d6
                                               p3: ''=0x1
              * time between wait #2 and #3: 0.000314 sec
              3.       event: 'SQL*Net message from client'
                 time waited: 9.914156 sec
                     wait id: 7891             p1: 'driver id'=0x54435000
                                               p2: '#bytes'=0x1
    }
    and is blocked by
 => Oracle session identified by:
    {
                instance: 1 (lucd0.lucd01)
                   os id: 61727
              process id: 1165, oracle@cnsz083184
              session id: 7528
        session serial #: 29673
             module name: 0 (JDBC Thin Client85 (TNS V1-V3)))dg-79b9f9b7dd-sg)
    }
    which is waiting for 'SQL*Net message from client' with wait info:
    {
                      p1: 'driver id'=0x54435000
                      p2: '#bytes'=0x1
            time in wait: 2.929703 sec
           timeout after: never
                 wait id: 151340
                blocking: 16 sessions
          current sql_id: 0
             current sql: <none>
             short stack: ksedsts()+783<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-read()+14<-snttread()+16<-nttfprd()+354<-nsbasic_brc()+399<-nioqrc()+438<-opikndf2()+999<-opitsk()+913<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main()+245
            wait history:
              * time between current wait and wait #1: 0.000034 sec
              1.       event: 'SQL*Net message to client'
                 time waited: 0.000002 sec
                     wait id: 151339           p1: 'driver id'=0x54435000
                                               p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.000006 sec
              2.       event: 'SQL*Net break/reset to client'
                 time waited: 0.001725 sec
                     wait id: 151338           p1: 'driver id'=0x54435000
                                               p2: 'break?'=0x0
              * time between wait #2 and #3: 0.000001 sec
              3.       event: 'SQL*Net break/reset to client'
                 time waited: 0.000004 sec
                     wait id: 151337           p1: 'driver id'=0x54435000
                                               p2: 'break?'=0x1
    }

Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
Chain 2:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (lucd0.lucd01)
                   os id: 423707
              process id: 3271, oracle@cnsz083184
              session id: 7120
        session serial #: 1978
             module name: 0 (JDBC Thin Client(TNS V1-V3)V3)3)dg-79b9f9b7dd-rz)
    }
    is waiting for 'enq: TX - row lock contention' with wait info:
    {
                      p1: 'name|mode'=0x54580006
                      p2: 'usn<<16 | slot'=0x600013
                      p3: 'sequence'=0x1fad46
                     px1: 'pdb uid'=0x0
                     px2: 'master hint'=0x2
            time in wait: 1046 min 56 sec
           timeout after: never
                 wait id: 13477
                blocking: 0 sessions
          current sql_id: 3153260194
             current sql: update beneficiary_info i
        set i.sex            = :1 ,
        i.country            = :2 ,
        i.id_type            = :3 ,
        i.idno_validity_date = :4 ,
        i.prof_code          = :5 ,
        i.ben_addr           = :6 ,
        i.ben_tel            = :7
        where i.polno=:8  and i.clientno=
             short stack: ksedsts()+783<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-semtimedop()+10<-skgpwwait()+187<-ksliwat()+2179<-kslwaitctx()+192<-kjusuc()+4698<-ksipgetctxia()+1968<-ksqcmi()+2757<-ksqgtlctx()+6620<-ksqgelctx()+838<-ktuGetTxForXid()+324<-ktcwit1()+367<-kdddgb()+5478<-kdusru()+458<-kauupd()+356<-updrow()+1699<-qerupUpdRow()+725<-qerupFetch()+644<-updaul()+1416<-updThreePhaseExe()+340<-updexe()+443<-opiexe()+11815<-kpoal8()+2387<-opiodr()+1202<-ttcpip()+1222<-opitsk()+1903<-opiino()+936<-opiodr()+1202<-
            wait history:
              * time between current wait and wait #1: 0.000331 sec
              1.       event: 'SQL*Net message from client'
                 time waited: 0.002083 sec
                     wait id: 13476            p1: 'driver id'=0x54435000
                                               p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.000157 sec
              2.       event: 'SQL*Net message to client'
                 time waited: 0.000002 sec
                     wait id: 13475            p1: 'driver id'=0x54435000
                                               p2: '#bytes'=0x1
              * time between wait #2 and #3: 0.000196 sec
              3.       event: 'SQL*Net message from client'
                 time waited: 8.129646 sec
                     wait id: 13474            p1: 'driver id'=0x54435000
                                               p2: '#bytes'=0x1
    }
    and is blocked by
 => Oracle session identified by:
    {
                instance: 1 (lucd0.lucd01)
                   os id: 117934
              process id: 827, oracle@cnsz083184
              session id: 8504
        session serial #: 40824
             module name: 0 (JDBC Thin Client84 (TNS V1-V3)3)g-79b9f9b7dd-hhp)
    }
    which is waiting for 'SQL*Net message from client' with wait info:
    {
                      p1: 'driver id'=0x54435000
                      p2: '#bytes'=0x1
            time in wait: 6.853693 sec
           timeout after: never
                 wait id: 172510
                blocking: 2 sessions
          current sql_id: 0
             current sql: <none>
             short stack: ksedsts()+783<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-read()+14<-snttread()+16<-nttfprd()+354<-nsbasic_brc()+399<-nioqrc()+438<-opikndf2()+999<-opitsk()+913<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main()+245
            wait history:
              * time between current wait and wait #1: 0.000007 sec
              1.       event: 'SQL*Net message to client'
                 time waited: 0.000001 sec
                     wait id: 172509           p1: 'driver id'=0x54435000
                                               p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.000106 sec
              2.       event: 'db file sequential read'
                 time waited: 0.000512 sec
                     wait id: 172508           p1: 'file#'=0x5d1
                                               p2: 'block#'=0x219c5b
                                               p3: 'blocks'=0x1
              * time between wait #2 and #3: 0.000662 sec
              3.       event: 'SQL*Net message from client'
                 time waited: 0.265024 sec
                     wait id: 172507           p1: 'driver id'=0x54435000
                                               p2: '#bytes'=0x1
    }

Chain 2 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 2 Signature Hash: 0x38c48850
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
Chain 3:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (lucd0.lucd01)
                   os id: 338848
              process id: 2047, oracle@cnsz083184
              session id: 7675
        session serial #: 39507
             module name: 0 (JDBC Thin Client85 (TNS V1-V3)3)dg-79b9f9b7dd-dk)
    }
    is waiting for 'enq: TX - row lock contention' with wait info:
    {
                      p1: 'name|mode'=0x54580006
                      p2: 'usn<<16 | slot'=0xcd0011
                      p3: 'sequence'=0x2e8cea
                     px1: 'pdb uid'=0x0
                     px2: 'master hint'=0x5
            time in wait: 140 min 13 sec
           timeout after: never
                 wait id: 45070
                blocking: 0 sessions
          current sql_id: 3153260194
             current sql: update beneficiary_info i
        set i.sex            = :1 ,
        i.country            = :2 ,
        i.id_type            = :3 ,
        i.idno_validity_date = :4 ,
        i.prof_code          = :5 ,
        i.ben_addr           = :6 ,
        i.ben_tel            = :7
        where i.polno=:8  and i.clientno=
             short stack: ksedsts()+783<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-semtimedop()+10<-skgpwwait()+187<-ksliwat()+2179<-kslwaitctx()+192<-kjusuc()+4698<-ksipgetctxia()+1968<-ksqcmi()+2757<-ksqgtlctx()+6620<-ksqgelctx()+838<-ktuGetTxForXid()+324<-ktcwit1()+367<-kdddgb()+5478<-kdusru()+458<-kauupd()+356<-updrow()+1699<-qerupUpdRow()+725<-qerupFetch()+644<-updaul()+1416<-updThreePhaseExe()+340<-updexe()+443<-opiexe()+11815<-kpoal8()+2387<-opiodr()+1202<-ttcpip()+1222<-opitsk()+1903<-opiino()+936<-opiodr()+1202<-
            wait history:
              * time between current wait and wait #1: 0.000044 sec
              1.       event: 'gc current block 2-way'
                 time waited: 0.000102 sec
                     wait id: 45069            p1: ''=0x5
                                               p2: ''=0x1b0
                                               p3: ''=0x1a9
              * time between wait #1 and #2: 0.000041 sec
              2.       event: 'Disk file operations I/O'
                 time waited: 0.000007 sec
                     wait id: 45068            p1: 'FileOperation'=0x2
                                               p2: 'fileno'=0x5
                                               p3: 'filetype'=0x2
              * time between wait #2 and #3: 0.000191 sec
              3.       event: 'gc current block busy'
                 time waited: 0.000161 sec
                     wait id: 45067            p1: ''=0x8cc
                                               p2: ''=0x34607c
                                               p3: ''=0x2000001
    }
    and is blocked by
 => Oracle session identified by:
    {
                instance: 3 (lucd0.lucd03)
                   os id: 240730
              process id: 3294, oracle@cnsz083186
              session id: 8722
        session serial #: 48990
             module name: 0 (JDBC Thin Clientconnection-prd-dg-79b9f9b7dd-sp8)
    }
    which is waiting for 'SQL*Net message from client' with wait info:
    {
                      p1: 'driver id'=0x54435000
                      p2: '#bytes'=0x1
            time in wait: 0.763401 sec
           timeout after: never
                 wait id: 121252
                blocking: 7 sessions
          current sql_id: 0
             current sql: <none>
             short stack: ksedsts()+783<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+200<-__sighandler()<-read()+14<-snttread()+16<-nttfprd()+354<-nsbasic_brc()+399<-nioqrc()+438<-opikndf2()+999<-opitsk()+913<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main()+245
            wait history:
              * time between current wait and wait #1: 0.000006 sec
              1.       event: 'SQL*Net message to client'
                 time waited: 0.000001 sec
                     wait id: 121251           p1: 'driver id'=0x54435000
                                               p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.000086 sec
              2.       event: 'db file sequential read'
                 time waited: 0.000711 sec
                     wait id: 121250           p1: 'file#'=0x5e6
                                               p2: 'block#'=0x1b1c5d
                                               p3: 'blocks'=0x1
              * time between wait #2 and #3: 0.000007 sec
              3.       event: 'gc current grant busy'
                 time waited: 0.000147 sec
                     wait id: 121249           p1: ''=0x5e6
                                               p2: ''=0x1b1c5d
                                               p3: ''=0x2000001
    }

Chain 3 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 3 Signature Hash: 0x38c48850

复制

7、分析阻塞源

从上面的信息中找到 is blocked by,提取里面的session信息,就是阻塞源

同时结合锁表情况,发现阻塞源是这4个INACTIVE的session


8、联系运营kill掉阻塞源

alter system kill session '&sid,&serial#,@&inst_id';

再次查询锁表情况,看效果

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论