暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

带有日志传送到备用数据库的 “检查点未完成” 消息,Oracle 19 (标准版)

ASKTOM 2020-11-03
317

问题描述

你好,

我发现,在我们的Oracle数据库19c标准版第2版19.0.0.0.0的警报日志中,每个日志开关都有 “检查点未完成” 消息。
我们正在使用日志传送到备用数据库,因此我们每2分钟强制日志切换开关archive_lag _target参数 (请参见下面的PFILE)。
我们的检查站似乎很长 (约4分钟)。似乎我们所有的检查点并不是一直都完整的,独立于数据库的负载。
但另一方面,我们的流程通常可以使用DML语句,而无需在事件 “检查点未完成” 上等待大量时间。

这就是为什么我恳请你解释一下:

1.在这种情况下,“检查点不完整” 的消息是什么意思?
2.为什么我们的DML语句在不完整检查点事件期间不挂起?
3.使用log_checkpoint_timeout参数每N秒设置增量检查点是否有意义?

提前谢谢,



===============================================================================
alert.log:
===============================================================================
Beginning log switch checkpoint up to RBA [0x100a.2.10], SCN: 39289565
2020-09-25T14:50:37.557027+02:00
Thread 1 advanced to log sequence 4106 (LGWR switch),  current SCN: 39289565
  Current log# 2 seq# 4106 mem# 0: /data/oradata/PROD/redo2_0.log
2020-09-25T14:50:37.709741+02:00
ARC1 (PID:12141): Archived Log entry 22746 added for T-1.S-4105 ID 0xf432ed53 LAD:1
2020-09-25T14:52:40.171692+02:00
Thread 1 cannot allocate new log, sequence 4107
Checkpoint not complete
  Current log# 2 seq# 4106 mem# 0: /data/oradata/PROD/redo2_0.log
2020-09-25T14:52:40.469412+02:00
Completed checkpoint up to RBA [0x1009.2.10], SCN: 39288123
2020-09-25T14:52:43.269670+02:00
Beginning log switch checkpoint up to RBA [0x100b.2.10], SCN: 39290953
2020-09-25T14:52:43.270757+02:00
Thread 1 advanced to log sequence 4107 (LGWR switch),  current SCN: 39290953
  Current log# 3 seq# 4107 mem# 0: /data/oradata/PROD/redo3_0.log
2020-09-25T14:52:43.281539+02:00
ARC0 (PID:12135): Archived Log entry 22747 added for T-1.S-4106 ID 0xf432ed53 LAD:1
2020-09-25T14:54:42.798805+02:00
Thread 1 cannot allocate new log, sequence 4108
Checkpoint not complete
  Current log# 3 seq# 4107 mem# 0: /data/oradata/PROD/redo3_0.log
2020-09-25T14:54:43.377453+02:00
Completed checkpoint up to RBA [0x100a.2.10], SCN: 39289565
2020-09-25T14:54:45.873354+02:00
Beginning log switch checkpoint up to RBA [0x100c.2.10], SCN: 39291331

===============================================================================
Statspack:
===============================================================================
                                                             Avg          %Total
                                          %Tim Total Wait   wait    Waits   Call
Event                               Waits  out   Time (s)   (ms)     /txn   Time
---------------------------- ------------ ---- ---------- ------ -------- ------
log file sync                       2,363    0        143     60      1.4    3.2
direct path read                    2,266    0         69     30      1.3    1.5
log buffer space                      123    0         42    344      0.1     .9
Failed Logon Delay                     32  100         33   1036      0.0     .7
log file switch (checkpoint            11    0          9    783      0.0     .2
direct path sync                       14    0          7    529      0.0     .2
log file switch completion             25    0          7    262      0.0     .1
direct path write                   1,360    0          6      5      0.8     .1

===============================================================================
 PFILE:
 ===============================================================================

PROD.__data_transfer_cache_size=0
PROD.__db_cache_size=4177526784
PROD.__java_pool_size=0
PROD.__large_pool_size=16777216
PROD.__oracle_base='/opt/oracle'#ORACLE_BASE set from environment
PROD.__pga_aggregate_target=2147483648
PROD.__sga_target=6006243328
PROD.__shared_io_pool_size=134217728
PROD.__shared_pool_size=1627389952
PROD.__streams_pool_size=33554432
PROD.__unified_pga_pool_size=0
*.archive_lag_target=120
*.audit_file_dest='/data/oradata/PROD/adump'
*.audit_trail='NONE'
*.compatible='12.1.0.2'
*.control_files='/data/oradata/PROD/control01.ctl','/data/oradata/PROD/flash_recovery_area/control02.ctl'#Restore Controlfile
*.core_dump_dest='/data/oradata/PROD/cdump'
*.db_block_size=8192
*.db_domain='corp.de'
*.db_file_multiblock_read_count=128# 128*8K=1M
*.db_name='PROD'
*.db_recovery_file_dest='/data/oradata/PROD/flash_recovery_area'
*.db_recovery_file_dest_size=4G
*.db_unique_name='PROD'
*.diagnostic_dest='/data/oradata/PROD'
*.job_queue_processes=10
*.log_archive_dest_1='LOCATION=/data/oradata/PROD/archive'
*.log_archive_dest_state_1='ENABLE'
*.log_archive_format='%t_%s_%r.dbf'
*.log_archive_max_processes=1
*.log_archive_min_succeed_dest=1
*.log_checkpoints_to_alert=TRUE
*.open_cursors=300
*.optimizer_adaptive_plans=FALSE
*.optimizer_adaptive_statistics=FALSE
*.pga_aggregate_target=2G
*.processes=600
*.recyclebin='off'
*.remote_login_passwordfile='EXCLUSIVE'
*.sec_protocol_error_further_action='(drop,10)'
*.sga_max_size=5992435712
*.sga_target=5992435712
*.shared_servers=0
*.undo_management='AUTO'
*.undo_retention=10800
*.undo_tablespace='UNDOTBS1'


===============================================================================
Redo:
===============================================================================

SQL> SELECT * FROM V$LOG;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE    MEMBERS ARCHIVED  STATUS                                           FIRST_CHANGE# FIRST_TIME      NEXT_CHANGE# NEXT_TIME           CON_ID
---------- ---------- ---------- ---------- ---------- ---------- --------- ------------------------------------------------ ------------- --------------- ------------ --------------- ----------
         1          1       4126  104857600        512          1 YES       ACTIVE                                                39299332 25-SEP-20           39300899 25-SEP-20                0
         2          1       4127  104857600        512          1 NO        CURRENT                                               39300899 25-SEP-20         9.2954E+18                          0
         3          1       4125  104857600        512          1 YES       ACTIVE                                                39298149 25-SEP-20           39299332 25-SEP-20                0
   
SQL> SELECT * FROM V$LOGFILE;

    GROUP# STATUS                TYPE                  MEMBER                                             IS_RECOVE     CON_ID
---------- --------------------- --------------------- -------------------------------------------------- --------- ----------
         1                       ONLINE                /data/oradata/PROD/redo1_0.log                   NO                 0
         2                       ONLINE                /data/oradata/PROD/redo2_0.log                   NO                 0
         3                       ONLINE                /data/oradata/PROD/redo3_0.log                   NO                 0
复制

专家解答

DML * 可能是挂起的,但可能只是很短的时间。例如:

2020-09-25T14:54:42.798805+02:00
Thread 1 cannot allocate new log, sequence 4108

2020-09-25T14:54:43.377453+02:00
Completed checkpoint up to RBA [0x100a.2.10], SCN: 39289565
复制


所以在不到半秒钟的时间里,你就 “重新做生意” 了。不太可能有人注意到这个现象。

2分钟的切换非常频繁 (但是我可以从 “本地数据保护” 的角度看出动机)。所以我要看几件事:

a) 更多的日志组。当我们进入第1组并在4分钟内 (由于存档滞后目标) 浏览2,3时,您的检查点问题将出现。因此,我将查看 (例如) 8到10个组,这意味着一个组将在大约20分钟内重复使用,使其与 “典型” 重做日志组场景更加一致。

b) 如果您仍然看到消息,请查看 “fast_start_mttr_target” 以给增量检查点一个推送。
文章转载自ASKTOM,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论