暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Oracle如何排查历史问题

原创 nirvana 2024-12-19
60

收集ASH信息

导出数据库出现问题时间段的ASH信息

SQL> conn user/passwd
SQL> create table t_ash as select * from dba_hist_active_sess_history where sample_time between to_timestamp ('2020-06-12 09:00:00', 'yyyy-mm-dd hh24:mi:ss') and to_timestamp ('2020-06-12 16:00:00', 'yyyy-mm-dd hh24:mi:ss');

exp user/passwd file=t_ash.dmp tables=t_ash
复制

导入到其他数据库进行分析

imp test/test file=t_ash.dmp full=y
复制

分析ash信息

1. Active sessions count in each sample.

select /*+ parallel(t,8) */
dbid,
instance_number,
sample_id,
sample_time,
count(*) session_count,
sum(pga_allocated) pga_allocated,
sum(temp_space_allocated) temp_space_allocated
from t_ash t
group by dbid, instance_number, sample_id, sample_time
order by dbid, instance_number, sample_time;

INSTANCE_NUMBER SAMPLE_ID SAMPLE_TIME SESSION_COUNT PGA_ALLOCATED
1 129739192 2020-03-21 11:12:17.212 1 1866752
1 129739202 2020-03-21 11:12:27.222 1 2587648
1 129739232 2020-03-21 11:12:57.258 1 2177024
1 129739242 2020-03-21 11:13:07.273 1 7354368
1 129739252 2020-03-21 11:13:17.598 9 74721280
1 129739262 2020-03-21 11:13:29.268 1 5257216
1 129739272 2020-03-21 11:13:39.519 4 47238144
1 129739282 2020-03-21 11:13:49.535 2 8491008 <<<<<< OSWatcher 此时中断采样,active session 数未见明显增长
1 129739292 2020-03-21 11:13:59.544 1 931840
1 129739322 2020-03-21 11:14:29.586 1 3487744
1 129739332 2020-03-21 11:14:39.601 2 7172096
1 129739342 2020-03-21 11:14:49.615 2 5485568
1 129739362 2020-03-21 11:15:09.637 1 2504704
1 129739372 2020-03-21 11:15:21.882 1 3832832
1 129739392 2020-03-21 11:15:41.907 1 7747584
1 129739402 2020-03-21 11:15:51.923 1 3046400
1 129739412 2020-03-21 11:16:01.938 1 2111488
1 129739422 2020-03-21 11:16:11.946 1 2766848
1 129739452 2020-03-21 11:16:42.657 1 3553280
1 129739462 2020-03-21 11:16:52.671 3 9152512
1 129739472 2020-03-21 11:17:04.964 23 93088768
1 129739492 2020-03-21 11:17:24.991 3 45756416
1 129739502 2020-03-21 11:17:35.006 2 46117888
1 129739512 2020-03-21 11:17:45.015 2 47756288
1 129739522 2020-03-21 11:17:55.219 3 48705536
1 129739532 2020-03-21 11:18:06.403 6 65180672
1 129739542 2020-03-21 11:18:16.497 4 50008064
1 129739552 2020-03-21 11:18:26.599 17 192386048
1 129739562 2020-03-21 11:18:36.904 2 44103680
1 129739572 2020-03-21 11:18:46.919 5 64069632
1 129739582 2020-03-21 11:18:56.926 14 106577920 <<<<<此时active session数才开始增长
1 129739592 2020-03-21 11:19:06.941 21 151456768
1 129739602 2020-03-21 11:19:16.961 17 108587008
1 129739612 2020-03-21 11:19:28.455 19 115138560
1 129739622 2020-03-21 11:19:38.625 18 110781440
1 129739632 2020-03-21 11:19:48.641 20 125948928
1 129739642 2020-03-21 11:19:58.649 20 125948928
1 129739652 2020-03-21 11:20:08.912 21 130223104
1 129739662 2020-03-21 11:20:18.940 30 239180800
1 129739672 2020-03-21 11:20:29.352 26 152922112
1 129739682 2020-03-21 11:20:39.510 28 166582272
1 129739692 2020-03-21 11:20:49.525 27 177275904
1 129739702 2020-03-21 11:20:59.533 27 177275904
1 129739712 2020-03-21 11:21:09.548 26 176392192
1 129739722 2020-03-21 11:21:19.568 25 145126400
1 129739732 2020-03-21 11:21:29.577 27 150039552
1 129739742 2020-03-21 11:21:39.593 31 191760384
1 129739752 2020-03-21 11:21:49.609 28 159761408
1 129739762 2020-03-21 11:22:00.008 32 199402496
1 129739772 2020-03-21 11:22:11.593 31 254276608
1 129739782 2020-03-21 11:22:21.639 33 178135040
1 129739792 2020-03-21 11:22:31.655 30 189888512
1 129739802 2020-03-21 11:22:41.665 35 206495744
1 129739812 2020-03-21 11:22:51.680 30 166278144
1 129739822 2020-03-21 11:23:01.693 29 163707904
1 129739832 2020-03-21 11:23:11.702 28 160613376
1 129739842 2020-03-21 11:23:21.719 29 164494336
1 129739852 2020-03-21 11:23:31.734 30 166623232
1 129739862 2020-03-21 11:23:41.744 30 166933504
1 129739872 2020-03-21 11:23:51.758 30 166933504
1 129739882 2020-03-21 11:24:01.774 30 166933504
1 129739892 2020-03-21 11:24:11.782 30 166933504
1 129739902 2020-03-21 11:24:21.800 31 169914368
1 129739912 2020-03-21 11:24:31.815 32 175744000
1 129739922 2020-03-21 11:24:41.826 31 194538496
1 129739932 2020-03-21 11:24:51.841 31 170486784
1 129739942 2020-03-21 11:25:01.854 31 170814464
1 129739952 2020-03-21 11:25:12.498 33 181608448
1 129739962 2020-03-21 11:25:22.517 35 211980288
1 129739972 2020-03-21 11:25:32.533 35 210686976
1 129739982 2020-03-21 11:25:43.104 36 190478336
1 129739992 2020-03-21 11:25:53.121 19 111382528
1 129740002 2020-03-21 11:26:09.671 7 56924160 <<<<<< OSWatcher 再次开始采样
1 129740012 2020-03-21 11:26:19.680 3 46280704
1 129740022 2020-03-21 11:26:29.697 3 45061120
1 129740032 2020-03-21 11:26:39.713 2 41613312
1 129740062 2020-03-21 11:27:09.752 2 18216960
1 129740082 2020-03-21 11:27:29.777 1 8075264
1 129740102 2020-03-21 11:27:49.800 1 4339712
复制

2. Top n waits in each sample.

select
t.dbid,
t.sample_id,
t.sample_time,
t.instance_number,
t.event,
t.session_state,
t.c session_count
from
(
select
t.*,
rank() over(partition by dbid, instance_number, sample_time order by c desc) r
from
(
select /*+ parallel(t,8) */
t.*,
count(*) over(partition by dbid, instance_number, sample_time, event) c,
row_number() over(partition by dbid, instance_number, sample_time, event order by 1) r1
from t_ash t
/*where sample_time > to_timestamp('2020-06-12 10:00:00', 'yyyy-mm-dd hh24:mi:ss')
and sample_time < to_timestamp('2020-06-12 16:00:00', 'yyyy-mm-dd hh24:mi:ss')*/
) t
where r1 = 1
) t
where r < 3
order by dbid, instance_number, sample_time, r;



SAMPLE_TIME INSTANCE_NUMBER EVENT SESSION_STATE SESSION_COUNT
2020-03-21 11:13:49.535 1 ON CPU 1 <<<<< 问题发生前并没有异常等待,只有极少量的session on cpu
2020-03-21 11:13:59.544 1 ON CPU 1
2020-03-21 11:14:29.586 1 ON CPU 1
2020-03-21 11:14:39.601 1 ON CPU 2
2020-03-21 11:14:49.615 1 ON CPU 2
2020-03-21 11:15:09.637 1 ON CPU 1
2020-03-21 11:15:21.882 1 ON CPU 1
2020-03-21 11:15:41.907 1 ON CPU 1
2020-03-21 11:15:51.923 1 ON CPU 1
2020-03-21 11:16:01.938 1 ON CPU 1
2020-03-21 11:16:11.946 1 ON CPU 1
2020-03-21 11:16:42.657 1 ON CPU 1
2020-03-21 11:16:52.671 1 ON CPU 3
2020-03-21 11:17:04.964 1 ON CPU 23
2020-03-21 11:17:24.991 1 ON CPU 2
2020-03-21 11:17:24.991 1 Streams AQ: qmn coordinator waiting for slave to start WAITING 1
2020-03-21 11:17:35.006 1 ON CPU 2
2020-03-21 11:17:45.015 1 ON CPU 2
2020-03-21 11:17:55.219 1 ON CPU 3
2020-03-21 11:18:06.403 1 ON CPU 6
2020-03-21 11:18:16.497 1 ON CPU 4
2020-03-21 11:18:26.599 1 ON CPU 16
2020-03-21 11:18:26.599 1 control file sequential read WAITING 1
2020-03-21 11:18:36.904 1 ON CPU 2
2020-03-21 11:18:46.919 1 ON CPU 5
2020-03-21 11:18:56.926 1 latch: shared pool WAITING 7 <<<<<< 在11:18 才开始latch mutex异常等待,不过此时OS早已经hang了,因此他们是问题结果而不是原因。
2020-03-21 11:18:56.926 1 latch: row cache objects WAITING 3
2020-03-21 11:18:56.926 1 ON CPU 3
2020-03-21 11:19:06.941 1 latch: shared pool WAITING 11
2020-03-21 11:19:06.941 1 latch: row cache objects WAITING 5
2020-03-21 11:19:16.961 1 latch: shared pool WAITING 10
2020-03-21 11:19:16.961 1 latch: row cache objects WAITING 4
2020-03-21 11:19:28.455 1 latch: shared pool WAITING 10
2020-03-21 11:19:28.455 1 latch: row cache objects WAITING 4
2020-03-21 11:19:38.625 1 latch: shared pool WAITING 10
2020-03-21 11:19:38.625 1 latch: row cache objects WAITING 4
2020-03-21 11:19:48.641 1 latch: shared pool WAITING 10
2020-03-21 11:19:48.641 1 latch: row cache objects WAITING 4
2020-03-21 11:19:58.649 1 latch: shared pool WAITING 10
2020-03-21 11:19:58.649 1 latch: row cache objects WAITING 4
2020-03-21 11:20:08.912 1 latch: shared pool WAITING 10
2020-03-21 11:20:08.912 1 library cache: mutex X WAITING 4
2020-03-21 11:20:08.912 1 latch: row cache objects WAITING 4
2020-03-21 11:20:18.940 1 latch: shared pool WAITING 10
2020-03-21 11:20:18.940 1 ON CPU 7
2020-03-21 11:20:29.352 1 latch: shared pool WAITING 11
2020-03-21 11:20:29.352 1 library cache lock WAITING 5
2020-03-21 11:20:39.510 1 latch: shared pool WAITING 11
2020-03-21 11:20:39.510 1 library cache lock WAITING 5
2020-03-21 11:20:49.525 1 latch: shared pool WAITING 11
2020-03-21 11:20:49.525 1 library cache: mutex X WAITING 5
2020-03-21 11:20:59.533 1 latch: shared pool WAITING 11
复制

3. Wait chains in the ASH dump.

select /*+ parallel(t,8) */
level lv,
connect_by_isleaf isleaf,
connect_by_iscycle iscycle,
t.dbid,
t.sample_id,
t.sample_time,
t.instance_number,
t.session_id,
t.session_serial#,
--t.program,
--t.action,
t.sql_id,
--t.sql_opname,
t.event,
--t.p1, t.p2 ,t.p3,
--t.seq#,
t.session_state,
t.blocking_inst_id,
t.blocking_session,
t.blocking_session_serial#
from t_ash t
/*where sample_time > to_timestamp('2020-06-12 09:00:00', 'yyyy-mm-dd hh24:mi:ss')
and sample_time < to_timestamp('2020-06-12 16:00:00', 'yyyy-mm-dd hh24:mi:ss')*/
start with blocking_session is not null
connect by nocycle
prior dbid = dbid
and prior blocking_inst_id = instance_number
and prior blocking_session = session_id
and prior blocking_session_serial# = session_serial#
/*and prior sample_time = sample_time*/
and ((prior sample_time) - sample_time between interval '-3' second and interval '3' second)
order siblings by dbid, sample_time;

SAMPLE_TIME INSTANCE_NUMBER SESSION_ID SESSION_SERIAL# SQL_ID EVENT SESSION_STATE BLOCKING_INST_ID BLOCKING_SESSION BLOCKING_SESSION_SERIAL#
2020-03-21 11:19:28.455 1 864 49189 library cache: mutex X WAITING 1 2768 23333
2020-03-21 11:19:28.455 1 2768 23333 latch: shared pool WAITING

2020-03-21 11:20:08.912 1 198 35091 2gu0kq7f40vxk library cache lock WAITING 1 1437 35339
2020-03-21 11:20:08.912 1 1437 35339 2gu0kq7f40vxk latch: shared pool WAITING
复制

通过上面查询已经找出引发问题时所执行sql id,后续操作优化对应的SQL语句。

最后修改时间:2024-12-20 08:42:11
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论