暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Index partition unusable and wait ‘kpodplck wait before retrying ORA-54’ event during direct sqlldr

原创 Anbob 2015-06-14
894

最近有人遇到用SQL*Loader 向数据库加载数据时发现很慢,并且反映有几个分区表上的local索引总是递归的变成invalid or unusable,rebuild该分区失效索引,另一个索引分区又失效。
对于失效索引开始启用了个DDL trigger,并没有发现这段时间有前台进程DDL,还有一种可能在sqlldr 加载数据时使用了直接路径时(DIRECT=TRUE) 也会导致索引失效,要来了sqlldr 的脚本如下:
sqlldr WEEJAR/WEEJAR_321@anbob_n2 control=../tmp/15861558temp.CTL log=../tmp/15861558temp.LOG data=../tmp/15861558temp.DATA parallel=flase direct=true rows=1000000
复制

Note:
这里sqlldr确实使用了direct path load, 当sqlldr使用了直接路径加载时,对于表上的索引是使用了一种独立的排序旧的index和新增的索引值合并成新的index segment,而且对于索引维护默认是推迟到加载完所有的数据后自动维护的, 对于大表或大的索引这过程将会花费较长的时间,加载时查询index partition是invalid状态,自动维护索引rebuild前提是有足够可用的temp 表空间,否则sqlldr 的数据仍会导入,但是index会被剩下,停留在invalid or unusable状态。因为sqlldr 中没有skip_index_maintenance参数,告知了自动维护的原理并没必要跟着rebuild index, 最后等load完所有数据查询确认了无无效索引。
下面分析load慢的现象
SQL> select username,machine,program,event ,status from v$session where event like 'kpodplck wait before%';
USERNAME MACHINE PROGRAM EVENT STATUS
---------- ---------- -------------- ------------------------------------ --------
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
... retrying ORA-54
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
WEEJAR anbob2 sqlldr@anbob2 ( kpodplck wait before retrying ORA-54 ACTIVE
137 rows selected.
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for HPUX: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
kdtzb1:/oracle> uname -a
HP-UX kdtzb1 B.11.31 U ia64 2547054003 unlimited-user license
复制

Note:
这个event比较少见从名字上kpodplck Oracle Direct Path lock, ora-54 资源占用的到比较常见。 搜索了一下这个事件在MOS中找到了一个bug,但是在11.2.0.1中已修复,现在是在10.2.0.4和另一套11.2.0.3的库同时运行相同的load,现象都一样也是“kpodplck wait before retrying ORA-54”
Bug 10079079 : KPODPLCK WAIT BEFORE RETRYING ORA-54 REPORTED IN SQL LOADER DIRECT PATH LOAD

当时做了systemstate dump,给O记提了个sr
PROCESS 86:
----------------------------------------
SO: c0000020f653e9f0, type: 2, owner: 0000000000000000, flag: INIT/-/-/0x00
(process) Oracle pid=86, calls cur/top: c000002049698f20/c000002049698f20, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: c0000020f861a060
O/S info: user: oracle, term: UNKNOWN, ospid: 17894
OSD pid info: Unix process pid: 17894, image: oracle@kdtzb1
Short stack dump:
ksdxfstk()+48<-ksdxcb()+5776<-sspuser()+640<-<-_pw_wait()+48<-pw_wait()+128<-sskgpwwait()+384<-skgpwwait()+208<-ksliwat()+1728<-kslwaitns_timed()+112<-kskthbwt()+400<-kslwait()+640<-$cold_kpod
plck()+2976<-kpodpp()+2384<-opiodr()+2128<-ttcpip()+3088<-opitsk()+2336<-opiino()+1840<-opiodr()+2128<-opidrv()+1088<-sou2o()+336<-opimai_real()+224<-main()+368<-main_opd_entry()+80
Dump of memory from 0xC0000020F352D590 to 0xC0000020F352D798
C0000020F352D590 00000005 00000000 C0000020 4B6FFCE8 [........... Ko..]
C0000020F352D5A0 00000010 000313A7 C0000020 49698F20 [........... Ii. ]
C0000020F352D5B0 00000003 000313A7 C0000020 F5BC5FF0 [........... .._.]
C0000020F352D5C0 00000013 0003129B C0000020 F58CAB00 [........... ....]
C0000020F352D5D0 0000000B 000313A7 C0000020 F579C008 [........... .y..]
C0000020F352D5E0 00000004 0003129B 00000000 00000000 [................]
C0000020F352D5F0 00000000 00000000 00000000 00000000 [................]
Repeat 25 times
C0000020F352D790 00000000 00000000 [........]
----------------------------------------
SO: c0000020f579c008, type: 4, owner: c0000020f653e9f0, flag: INIT/-/-/0x00
(session) sid: 2093 trans: 0000000000000000, creator: c0000020f653e9f0, flag: (41) USR/- BSY/-/-/-/-/-
DID: 0001-0056-0008B621, short-term DID: 0001-0056-0008B622
txn branch: 0000000000000000
oct: 0, prv: 0, sql: 0000000000000000, psql: c000001feee0b850, user: 208/WEEJAR
service name: SYS$USERS
O/S info: user: WEEJAR, term: , ospid: 23397538, machine: anbob2
program: sqlldr@anbob2 (TNS V1-V3)
application name: SQL Loader Direct Path Load, hash value=1090021382
action name: ../tmp/21823588temp.CTL, hash value=2698844265
waiting for 'kpodplck wait before retrying ORA-54' blocking sess=0x0000000000000000 seq=31 wait_time=0 seconds since wait started=16
=0, =0, =0
Dumping Session Wait History
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=1292889
=0, =0, =0
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=986043
=0, =0, =0
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=985409
=0, =0, =0
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=984200
=0, =0, =0
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=987391
=0, =0, =0
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=982784
=0, =0, =0
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=988524
=0, =0, =0
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=985431
=0, =0, =0
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=985767
=0, =0, =0
for 'kpodplck wait before retrying ORA-54' count=1 wait_time=983840
=0, =0, =0

复制

SR 的恢复,该现象不是bug, 但是当使用direct=true增加parallel=true 对资源锁转换的方式来避免或减少该事件。
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论