前言
前一段时间巡检发现一套数据库归档量近几日发生了爆增,虽然最终定位并非数据库自身原因,这里也总结下分析步骤和思路。
分析步骤
1、确认日志切换频率,明显30号和1号日志切换比之前增长一倍。
select to_char(first_time,'YYYY-MON-DD') "Date", to_char(first_time,'DY') day, to_char(sum(decode(to_char(first_time,'HH24'),'00',1,0)),'999') "00", to_char(sum(decode(to_char(first_time,'HH24'),'01',1,0)),'999') "01", to_char(sum(decode(to_char(first_time,'HH24'),'02',1,0)),'999') "02", to_char(sum(decode(to_char(first_time,'HH24'),'03',1,0)),'999') "03", to_char(sum(decode(to_char(first_time,'HH24'),'04',1,0)),'999') "04", to_char(sum(decode(to_char(first_time,'HH24'),'05',1,0)),'999') "05", to_char(sum(decode(to_char(first_time,'HH24'),'06',1,0)),'999') "06", to_char(sum(decode(to_char(first_time,'HH24'),'07',1,0)),'999') "07", to_char(sum(decode(to_char(first_time,'HH24'),'08',1,0)),'999') "08", to_char(sum(decode(to_char(first_time,'HH24'),'09',1,0)),'999') "09", to_char(sum(decode(to_char(first_time,'HH24'),'10',1,0)),'999') "10", to_char(sum(decode(to_char(first_time,'HH24'),'11',1,0)),'999') "11", to_char(sum(decode(to_char(first_time,'HH24'),'12',1,0)),'999') "12", to_char(sum(decode(to_char(first_time,'HH24'),'13',1,0)),'999') "13", to_char(sum(decode(to_char(first_time,'HH24'),'14',1,0)),'999') "14", to_char(sum(decode(to_char(first_time,'HH24'),'15',1,0)),'999') "15", to_char(sum(decode(to_char(first_time,'HH24'),'16',1,0)),'999') "16", to_char(sum(decode(to_char(first_time,'HH24'),'17',1,0)),'999') "17", to_char(sum(decode(to_char(first_time,'HH24'),'18',1,0)),'999') "18", to_char(sum(decode(to_char(first_time,'HH24'),'19',1,0)),'999') "19", to_char(sum(decode(to_char(first_time,'HH24'),'20',1,0)),'999') "20", to_char(sum(decode(to_char(first_time,'HH24'),'21',1,0)),'999') "21", to_char(sum(decode(to_char(first_time,'HH24'),'22',1,0)),'999') "22", to_char(sum(decode(to_char(first_time,'HH24'),'23',1,0)),'999') "23" , count(*) Total from v$log_history group by to_char(first_time,'YYYY-MON-DD'), to_char(first_time,'DY') order by to_date(to_char(first_time,'YYYY-MON-DD'),'YYYY-MON-DD');
Date DAY 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 TOTAL
-------------- ------ ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----------
2022-NOV-27 SUN 71 33 52 36 53 51 68 45 56 58 51 56 74 56 49 63 51 50 78 54 54 47 33 59 1298
2022-NOV-28 MON 59 32 54 44 52 39 65 113 117 96 65 148 110 79 62 67 97 56 84 47 55 58 76 56 1731
2022-NOV-29 TUE 60 32 54 36 60 42 70 47 60 59 44 56 63 51 49 51 48 51 83 112 122 138 136 164 1688
2022-NOV-30 WED 124 83 97 102 121 116 142 84 119 126 100 110 142 87 116 124 97 137 165 102 117 126 68 137 2742
2022-DEC-01 THU 133 82 90 106 114 124 150 123 95 131 102 114 130 104 118 181 69 0 0 0 0 0 0 0 1966
2、查看每日的日志量,日志切换频繁不见得日志量就大,还是要确认下日志量的情况。从查询结果看,最近两天单日归档量快到4T了。日常也就2T左右。
select dest_id,to_char(first_time,'yyyy-mm-dd') TIME,count(*),sum(blocks*block_size)/1024/1024/1024 GB
from v$archived_log =================》即使是rac也无需使用gv$视图。
where dest_id=1
and
first_time>to_date('20221126 00:00','yyyymmdd hh24:mi')
and
first_time<to_date('20221201 18:59','yyyymmdd hh24:mi')
group by dest_id,to_char(first_time,'yyyy-mm-dd')
order by 1,2;
DEST_ID TIME COUNT(*) GB
---------- ---------- ---------- ----------
1 2022-11-26 1308 1949.53655
1 2022-11-27 1298 1890.3195
1 2022-11-28 1731 2526.82131
1 2022-11-29 1688 2589.85447
1 2022-11-30 2742 3995.80157
1 2022-12-01 2405 3509.1859
3、确认下活动会话是否有增长,如果活动会话有增长的话,可能是由于交易量上升导致的归档量变大。根据查询结果,除去两个异常时段有会话数激增,其他时段并没有发现活跃会话有明显的增长。
11-29号:
select to_char(sample_time,'yyyy/mm/dd hh24'),count(*) from dba_hist_active_sess_history where
sample_time>to_date('20221129 06:00','yyyymmdd hh24:mi')
and sample_time<to_date('20221129 12:00','yyyymmdd hh24:mi')
group by to_char(sample_time,'yyyy/mm/dd hh24') order by 1;
TO_CHAR(SAMPL COUNT(*)
------------- ----------
2022/11/29 06 4869
2022/11/29 07 4518
2022/11/29 08 8142
2022/11/29 09 108017 ============》当天出现数据库异常,导致会话堵塞。
2022/11/29 10 85392
2022/11/29 11 3078
11-30号:
select to_char(sample_time,'yyyy/mm/dd hh24'),count(*) from dba_hist_active_sess_history where
sample_time>to_date('20221130 06:00','yyyymmdd hh24:mi')
and sample_time<to_date('20221130 12:00','yyyymmdd hh24:mi')
group by to_char(sample_time,'yyyy/mm/dd hh24') order by 1;
TO_CHAR(SAMPL COUNT(*)
------------- ----------
2022/11/30 06 4392
2022/11/30 07 10890 ==================》当天由于归档目录满了,导致会话堵塞。
2022/11/30 08 4963
2022/11/30 09 4722
2022/11/30 10 3549
2022/11/30 11 4070
12-1号
select to_char(sample_time,'yyyy/mm/dd hh24'),count(*) from dba_hist_active_sess_history where
sample_time>to_date('20221201 06:00','yyyymmdd hh24:mi')
and sample_time<to_date('20221201 12:00','yyyymmdd hh24:mi')
group by to_char(sample_time,'yyyy/mm/dd hh24') order by 1;
TO_CHAR(SAMPL COUNT(*)
------------- ----------
2022/12/01 06 4481
2022/12/01 07 4146
2022/12/01 08 5200
2022/12/01 09 4671
2022/12/01 10 4283
2022/12/01 11 3394
4、参考mos文档How to identify the causes of High Redo Generation (Doc ID 2265722.1),选取日志生成量比较大的06-07点,查看block changed的对象情况。
对比了三天的block changed,发现12月1号和11月30号的TOP4都是一样的,而在11月29号中并没有这几个对象,这几个对象从名字看应该是一个表和三个索引。
12月1号早晨6点-7点:
SELECT to_char(begin_interval_time,'YY-MM-DD HH24') snap_time, dhso.object_name, sum(db_block_changes_delta) BLOCK_CHANGED FROM dba_hist_seg_stat dhss, dba_hist_seg_stat_obj dhso, dba_hist_snapshot dhs WHERE dhs.snap_id = dhss.snap_id AND dhs.instance_number = dhss.instance_number AND dhss.obj# = dhso.obj# AND dhss.dataobj# = dhso.dataobj# AND begin_interval_time BETWEEN to_date('22-12-01 06:00','YY-MM-DD HH24:MI')
AND to_date('22-12-01 07:00','YY-MM-DD HH24:MI') GROUP BY to_char(begin_interval_time,'YY-MM-DD HH24'), dhso.object_name HAVING sum(db_block_changes_delta) > 0 ORDER BY sum(db_block_changes_delta) desc ;
SNAP_TIME OBJECT_NAME BLOCK_CHANGED
----------- ------------------------------ -------------
22-12-01 06 IX_GTMP_WIND_RECORD_LOG 106597968
22-12-01 06 IX2_GTMP_WIND_RECORD_LOG 68668736
22-12-01 06 IX1_GTMP_WIND_RECORD_LOG 48441232
22-12-01 06 GTMP_WIND_RECORD_LOG 46247936
22-12-01 06 GZ45_VN_BM_BAL 21052960
22-12-01 06 ** TRANSIENT: 23179648 18150128
22-12-01 06 PK_O32_TTRADESTOCK 16278256
22-12-01 06 PK_GZ45_VN_DAYF_CONFIRM 7566992
11月30号早晨6点到7点:
SELECT to_char(begin_interval_time,'YY-MM-DD HH24') snap_time, dhso.object_name, sum(db_block_changes_delta) BLOCK_CHANGED FROM dba_hist_seg_stat dhss, dba_hist_seg_stat_obj dhso, dba_hist_snapshot dhs WHERE dhs.snap_id = dhss.snap_id AND dhs.instance_number = dhss.instance_number AND dhss.obj# = dhso.obj# AND dhss.dataobj# = dhso.dataobj# AND begin_interval_time BETWEEN to_date('22-11-30 06:00','YY-MM-DD HH24:MI')
AND to_date('22-11-30 07:00','YY-MM-DD HH24:MI') GROUP BY to_char(begin_interval_time,'YY-MM-DD HH24'), dhso.object_name HAVING sum(db_block_changes_delta) > 0 ORDER BY sum(db_block_changes_delta) desc ;
SNAP_TIME OBJECT_NAME BLOCK_CHANGED
----------- ------------------------------ -------------
22-11-30 06 IX_GTMP_WIND_RECORD_LOG 59098880
22-11-30 06 IX2_GTMP_WIND_RECORD_LOG 44366832
22-11-30 06 IX1_GTMP_WIND_RECORD_LOG 39393808
22-11-30 06 GTMP_WIND_RECORD_LOG 24130704
22-11-30 06 GZ45_VN_BM_BAL 21051680
22-11-30 06 PK_O32_TTRADESTOCK 16285792
22-11-30 06 GZ45_T_R_FR_ASTSTAT 6459584
22-11-30 06 IX_GZ45_VN_BM_BAL 5837776
11月29号早晨6点-7点:
SELECT to_char(begin_interval_time,'YY-MM-DD HH24') snap_time, dhso.object_name, sum(db_block_changes_delta) BLOCK_CHANGED FROM dba_hist_seg_stat dhss, dba_hist_seg_stat_obj dhso, dba_hist_snapshot dhs WHERE dhs.snap_id = dhss.snap_id AND dhs.instance_number = dhss.instance_number AND dhss.obj# = dhso.obj# AND dhss.dataobj# = dhso.dataobj# AND begin_interval_time BETWEEN to_date('22-11-29 06:00','YY-MM-DD HH24:MI')
AND to_date('22-11-29 07:00','YY-MM-DD HH24:MI') GROUP BY to_char(begin_interval_time,'YY-MM-DD HH24'), dhso.object_name HAVING sum(db_block_changes_delta) > 0 ORDER BY sum(db_block_changes_delta) desc ;
SNAP_TIME OBJECT_NAME BLOCK_CHANGED
----------- ------------------------------ -------------
22-11-29 06 GZ45_VN_BM_BAL 21660112
22-11-29 06 PK_O32_TTRADESTOCK 16292176
22-11-29 06 IDX_ODS_WIND_CBONDAGENCY 11994048
22-11-29 06 PK_ODS_WIND_CBONDAGENCY 9990224
22-11-29 06 O32_TTRADESTOCK 9007728
22-11-29 06 ODS_WIND_CBONDAGENCY 7276320
22-11-29 06 GZ45_T_R_FR_ASTSTAT 6671232
22-11-29 06 IX_GZ45_VN_BM_BAL 5930496
5、根据以上对象可以找到执行的相应的SQL语句。
SELECT to_char(begin_interval_time,'YYYY_MM_DD HH24') WHEN, dbms_lob.substr(sql_text,4000,1) SQL, dhss.instance_number INST_ID, dhss.sql_id, executions_delta exec_delta, rows_processed_delta rows_proc_delta
FROM dba_hist_sqlstat dhss, dba_hist_snapshot dhs, dba_hist_sqltext dhst
WHERE upper(dhst.sql_text) LIKE '%GTMP_WIND_RECORD_LOG%'
AND ltrim(upper(dhst.sql_text)) NOT LIKE 'SELECT%'
AND dhss.snap_id=dhs.snap_id
AND dhss.instance_number=dhs.instance_number
AND dhss.sql_id=dhst.sql_id
AND begin_interval_time
BETWEEN to_date('22-12-01 06:00','YY-MM-DD HH24:MI')
AND to_date('22-12-01 07:00','YY-MM-DD HH24:MI');
INST_ID SQL_ID EXEC_DELTA ROWS_PROC_DELTA
---------- ------------- ---------- ---------------
2022_12_01 06
INSERT INTO ETL.GTMP_WIND_RECORD_LOG (ID, REC_ID, OPMODE, OPDATE, TABLE_NAME) SE
LECT ID, REC_ID, OPMODE, OPDATE, TABLENAME FROM WINDDF.RECORD_LOG@WINDNEWDB A WH
ERE A.TABLENAME=:B1
2 61z1sgnsccx6m 184 96347952
INST_ID SQL_ID EXEC_DELTA ROWS_PROC_DELTA
---------- ------------- ---------- ---------------
DELETE FROM ETL.GTMP_WIND_RECORD_LOG A WHERE A.ID>:B2 OR A.ID<:B1
2 d7k797d3827cg 221 96610616
6、其实上面的结论从awr报告中也能的出来,但是awr报告只能显示top5,没办法看到后面的。当我想从awr报告里证明上述结论的时候,我发现这个处于TOP的对象竟然是TEMP表。按我以前的理解,对临时表的操作是不会产生redo的。从度娘上查了下,临时表本身的操作虽然不会产生redo,但是对临时表操作需要产生undo,产生undo自然会产生redo,只是比普通表产生的redo要少很多。
总结
日常碰到的日志量突增大部分都是交易量上涨、业务逻辑调整等导致的,但是如果你直接去问开发人员,他可能会说啥也没变,这个时候你就会挺懵逼(这是本人亲身经历哈)。如果你拿着你查到的东西去找他,他可能就不会这么硬气了。所以啊,虽然可能不是数据库的问题,但作为万能的DBA,你也的会查啊。