暂无图片
暂无图片
暂无图片
暂无图片
2
暂无图片

\'导致的insert hang

背景

mysql定期归档到oracle12c任务,这个月突然卡住了,mysql 批量生成的insert 脚本执行时hang住了。脚本如下:

#!/bin/bash
$ORACLE_HOME/bin/sql usern/pass >> imp.log <<eof
set define off
show user;
@imp.sql
commit;
set time on;
exit;
eof
复制

问题分析

1、等待事件趋势
第一次排查hang后通过event入手

TO_CHAR(SAMPL SESSION_ID SESSION_SERIAL# EVENT                                                            STATUS  BLOCKING_SESSION   COUNT(*)
------------- ---------- --------------- ---------------------------------------------------------------- ------- ---------------- ----------
2021-08-12 18       2003            9370 latch: shared pool                                               WAITING             1429          1
2021-08-12 18       2003            9370                                                                  ON CPU                         1912
2021-08-12 19       2003            9370                                                                  ON CPU                         1833
2021-08-12 20       2003            9370 latch: shared pool                                               WAITING             1706          1
2021-08-12 20       2003            9370                                                                  ON CPU                         1912
2021-08-12 21       2003            9370                                                                  ON CPU                         1880
2021-08-12 22       2003            9370 latch: shared pool                                               WAITING              866          2
2021-08-12 22       2003            9370                                                                  ON CPU                         1900
2021-08-12 23       2003            9370                                                                  ON CPU                         1917
2021-08-13 00       2003            9370 SQL*Net break/reset to client                                    WAITING                           1
2021-08-13 00       2003            9370                                                                  ON CPU                         1923

TO_CHAR(SAMPL SESSION_ID SESSION_SERIAL# EVENT                                                            STATUS  BLOCKING_SESSION   COUNT(*)
------------- ---------- --------------- ---------------------------------------------------------------- ------- ---------------- ----------
2021-08-13 01       2003            9370                                                                  ON CPU                         1864
2021-08-13 02       2003            9370                                                                  ON CPU                         1844
2021-08-13 03       2003            9370                                                                  ON CPU                         1897
2021-08-13 04       2003            9370 latch: shared pool                                               WAITING             1431          1
2021-08-13 04       2003            9370                                                                  ON CPU                         1841
2021-08-13 05       2003            9370                                                                  ON CPU                         1910
2021-08-13 06       2003            9370                                                                  ON CPU                          881

最后的blcking session 1431,12c⾃动捕获报告特性.但频率不是很高,时间也对应不上,凌晨6点卡住,最后block是凌晨5点
TO_CHAR(SAMPL SESSION_ID SESSION_SERIAL# EVENT                                                            STATUS  BLOCKING_SESSION MODULE                 COUNT(*)
------------- ---------- --------------- ---------------------------------------------------------------- ------- ---------------- -------------------- ----------
2021-08-12 15       1431            3223                                                                  ON CPU                   MMON_SLAVE                   11
2021-08-12 15       1431           31349                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-12 16       1431           55435                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-12 16       1431           62320                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-12 16       1431           64971                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-12 17       1431           49301                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-12 17       1431           54968                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-12 23       1431           23088                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-12 23       1431           38171                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-13 02       1431           36114                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-13 02       1431           39168                                                                  ON CPU                   MMON_SLAVE                    1

TO_CHAR(SAMPL SESSION_ID SESSION_SERIAL# EVENT                                                            STATUS  BLOCKING_SESSION MODULE                 COUNT(*)
------------- ---------- --------------- ---------------------------------------------------------------- ------- ---------------- -------------------- ----------
2021-08-13 04       1431           14682                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-13 04       1431           20053                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-13 05       1431           62271                                                                  ON CPU                   MMON_SLAVE                    1
2021-08-13 09       1431           47165                                                                  ON CPU                   MMON_SLAVE                    1

复制

2、锁信息


EVENT                          SQL_ID               MODULE                         STATUS   STATE               BLOCKING_SESSION LOCKWAIT          WAIT_TIME
------------------------------ -------------------- ------------------------------ -------- ------------------- ---------------- ---------------- ----------
SQL*Net message from client                         java@xxxx (TNS V1-V3)          INACTIVE WAITING   

从这看很奇怪,会话的等待为SQL*Net message from client,从lock看也没有blocking。

       SID TYPE         HOLD         REQUEST             ID1        ID2      CTIME BLOCK_OTHERS
---------- ------------ ------------ ------------ ---------- ---------- ---------- ----------------
         4 AE           Share        None                133          0    6964804 Not Blocking
        12 AE           Share        None                133          0      57237 Not Blocking
       864 AE           Share        None                133          0      71450 Not Blocking
       866 AE           Share        None                133          0       2083 Not Blocking
      1139 Redo Thread  Exclusive    None                  1          0    6964808 Not Blocking
      1141 AE           Share        None                133          0      64226 Not Blocking
      1423 XR           Null         None                  4          0    6964813 Not Blocking
      1423 RD           Null         None                  1          0    6964813 Not Blocking
      1423 Control File Row-S (SS)   None                  0          0    6964813 Not Blocking
      1423 RS           Row-S (SS)   None                 25          1    6964808 Not Blocking
      1429 AE           Share        None                133          0      74523 Not Blocking
      1706 KT           Share        None              18974          0      32728 Not Blocking
      1706 KD           Exclusive    None                  0          0    6964805 Not Blocking
      1717 AE           Share        None                133          0          0 Not Blocking
      1991 Temp Segment Row-X (SX)   None                  3          1    6964805 Not Blocking
      2003 AE           Share        None                133          0      82454 Not Blocking
      2003 DML          Row-X (SX)   None             328259          0      82451 Not Blocking
      2003 Transaction  Exclusive    None             327704     292546      82451 Not Blocking
      
复制

3、分析10046 trace ,截取部分内容如下:

SO: 0xb1118e88, type: 109, owner: 0xee4b9740, flag: INIT/-/-/-/0x00 if: 0x3 c: 0x3
       proc=0xed7414c8, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:9855, pg=0, conuid=0

      LibraryObjectLock:  Address=0xb1118e88 Handle=0x89c36e20 Mode=N 
        CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0 
        Context=0x7f04ca789660 
        User=0xee4b9740 Session=0xee4b9740 ReferenceCount=1 
        Flags=[0000] SavepointNum=0 Time=08/13/2021 06:28:25 
      LibraryHandle:  Address=0x89c36e20 Hash=0 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD 
        Name:  Namespace=SQL AREA(00) Type=CURSOR(00) ContainerId=0 
        Statistics:  InvalidationCount=0 ExecutionCount=1 LoadCount=1 ActiveLocks=1 TotalLockCount=1 TotalPinCount=2 
        Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 Version=0 BucketInUse=0 HandleInUse=0 HandleReferenceCount=0 
        Concurrency:  DependencyMutex=0x89c36ed0(0, 0, 0, 0) Mutex=0x7b9e2598(0, 73, 0, 0) 
        Flags=RON/PIN/PN0/EXP/CHD/[10012111] Flags2=[0000] 
        WaitersLists:  
          Lock=0x89c36eb0[0x89c36eb0,0x89c36eb0] 
          Pin=0x89c36e90[0x89c36e90,0x89c36e90] 
          LoadLock=0x89c36f08[0x89c36f08,0x89c36f08] 
        LibraryObject:  Address=0x8bbc4a28 HeapMask=0000-0001-0001-0000 Flags=EXS[0000] Flags2=[0000] PublicFlags=[0000] 
        NamespaceDump:  
          Child Cursor:  Heap0=0x8bbc4af8 Heap6=0xabc80400 Heap0 Load Time=08-13-2021 06:28:25 Heap6 Load Time=08-13-2021 06:28:25 
      ----------------------------------------
      SO: 0xc3005ff8, type: 109, owner: 0xee4b9740, flag: INIT/-/-/-/0x00 if: 0x3 c: 0x3
       proc=0xed7414c8, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:9855, pg=0, conuid=0

      LibraryObjectLock:  Address=0xc3005ff8 Handle=0x7b9e2448 Mode=N 
        CanBeBrokenCount=1 Incarnation=1 ExecutionCount=1 
        
        User=0xee4b9740 Session=0xee4b9740 ReferenceCount=1 
        Flags=CNB/[0001] SavepointNum=1a1607ca Time=08/13/2021 06:28:25 
      LibraryHandle:  Address=0x7b9e2448 Hash=51d22eff LockMode=N PinMode=0 LoadLockMode=0 Status=VALD 
        ObjectName:  Name=INSERT INTO table—name VALUES (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) 
10046的trace中并没有发现异常,在对应insert hang的时间点发现最后执行的insert 语句也是正常的。
          FullHashValue=b27cdaeb5a2603c11dfc564551d22eff Namespace=SQL AREA(00) Type=CURSOR(00) ContainerId=0 ContainerUid=0 Identifier=1372729087 OwnerIdn=80 
        Statistics:  InvalidationCount=0 ExecutionCount=1 LoadCount=2 ActiveLocks=1 TotalLockCount=1 TotalPinCount=1 
        Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=1 Version=0 BucketInUse=0 HandleInUse=0 HandleReferenceCount=0 
        Concurrency:  DependencyMutex=0x7b9e24f8(0, 7, 0, 0) Mutex=0x7b9e2598(0, 73, 0, 0) 
        Flags=RON/PIN/TIM/PN0/DBN/[10012841] Flags2=[0000] 
        WaitersLists:  
          Lock=0x7b9e24d8[0x7b9e24d8,0x7b9e24d8] 
          Pin=0x7b9e24b8[0x7b9e24b8,0x7b9e24b8] 
          LoadLock=0x7b9e2530[0x7b9e2530,0x7b9e2530] 
        Timestamp:  Current=08-13-2021 06:28:25 
        HandleReference:  Address=0x7b9e26f0 Handle=(nil) Flags=[00] 
        LibraryObject:  Address=0xb3bae630 HeapMask=0000-0001-0001-0000 Flags=EXS[0000] Flags2=[0000] PublicFlags=[0000] 
          ChildTable:  size='16' 
            Child:  id='0' Table=0xb3baf4b0 Reference=0xb3baefa0 Handle=0x89c36e20 
        NamespaceDump:  
          Parent Cursor:  sql_id=1vz2q8p8x4brz parent=0xb3bae700 maxchild=1 plk=y ppn=n prsfcnt=0 obscnt=0 
复制

处理思路

好像没有头绪了,操作系统信息,alert日志等也没有异常。多次测试都在凌晨4 5点发生异常,怀疑是不是这个时间点受oracle job问题。做了如下操作尝试:

禁用sql tuning 与space advisor
exrc dbms_auto_task_admin.disable('auto space advisor',NULL,NULL);
exec dbms_auto_task_admin.disable('sql tuning advisor',NULL,NULL);
并锁住导入表,避免收集统计信息
EXEC DBMS_STATS.delete_table_stats('<owner name>','<table name>'); 
EXEC DBMS_STATS.LOCK_TABLE_STATS ('owner name', 'table name'); 
复制

再次尝试仍是相同问题。并且测试换个时间段在白天也复现了。继续返回trace文件分析。上述trace中记录到的最后一条insert 语句是正常的,分析批量脚本。

grep '001'  imp.sql -C 3 截取部分数据,001是trace中记录到的最后insert。
INSERT INTO tf_b_sms_receive_202009 VALUES ('001');
001的下一行002 乱码与\' \0 
INSERT INTO tf_b_sms_receive_202009 VALUES ('002','�\0�Ip�x(\'\0\0\0\0\0.��\0v�����C����6ўz��\0��]b\".۰F	N�О��^#q�','�\0�Ip�x(\'\0\0\0\0\0.��\0v�����C����6ўz��\0��]b\".۰F	N�О��^#q�');
复制

直接删除这条语句再试,成功导入。

模拟分析

删除后虽然跑成功了,但还要细看是什么原因导致的,如何避免,几百万行导入不可能从头排查。

$ORACLE_HOME/bin/sql t1/t1 > /home/oracle/test/imp.log <<eof
set define off
show user;
alter session set max_dump_file_size = unlimited;
alter session set events '10046 trace name context forever, level 12';
@/home/oracle/test/hang.sql
commit;
set time on;
exit; 
eof
复制

测试环境执行脚本导入并生成10046 trace分析,结果和上述trace相同,仍然到问题行就hang住,且trace中没有记录解析此sql任何信息。
使用sql developer 图形化工具单独执行报错ORA-01756。

'�\0�Ip�x(\'\0\0\0\0\0.��\0v�����C����6ўz��\0��]b\".۰F	N�О��^#q�'
复制

尝试将’替换为\。再次执行通过。这其实也是调用$ORACLE_HOME/bin/sql的原因。使用sqlplus是直接报错跳过的,但sql支持更长的insert字符,并且其他的符号兼容相对较好,各有利弊吧。
预防其实很简单,导入前批量替换 \’。如担心误伤,可以通过grep -n 将带\'的sql先抓出来,然后sed -i 删除指定行。

最后修改时间:2021-08-24 15:50:49
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

weizhao.zhang (anbob)
暂无图片
3年前
评论
暂无图片 0
我还是看到了 \0
3年前
暂无图片 点赞
1
茂材
暂无图片
3年前
回复
暂无图片 0
是有\0,不过只去掉\后的单引号\0报错跳过去了
3年前
暂无图片 点赞
回复