Oracle Database Outage Caused by a Scheduled Task

2021-07-02T02:59:57.758551+08:00

Archived Log entry 2000 added for T-1.S-288759 ID 0x5a3a0712 LAD:1

2021-07-02T02:59:58.223276+08:00

Media Recovery Waiting for thread 3 sequence 192176 (in transit)

2021-07-02T02:59:58.224105+08:00

Recovery of Online Redo Log: Thread 3 Group 26 Seq 192176 Reading mem 0

Mem# 0: /u01/oradata/onlinelogstb3_redo26.log

2021-07-02T03:00:24.666560+08:00

Primary database is in MAXIMUM PERFORMANCE mode

RFS[183]: Assigned to RFS process (PID:8409)

RFS[183]: Selected log 26 for T-3.S-192176 dbid 1513741333 branch 985960599

2021-07-02T03:01:02.160225+08:00

Primary database is in MAXIMUM PERFORMANCE mode

RFS[184]: Assigned to RFS process (PID:8431)

RFS[184]: Selected log 20 for T-2.S-225130 dbid 1513741333 branch 985960599

2021-07-02T03:01:26.494282+08:00

Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_ofsd_27322_27324.trc:

ORA-27157: OS post/wait facility removed

ORA-27300: OS system dependent operation:semop failed with status: 43

ORA-27301: OS failure message: Identifier removed

ORA-27302: failure occurred at: sskgpwwait1

2021-07-02T03:01:26.496352+08:00

Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_ckpt_27342.trc:

ORA-27157: OS post/wait facility removed

ORA-27300: OS system dependent operation:semop failed with status: 43

ORA-27301: OS failure message: Identifier removed

ORA-27302: failure occurred at: sskgpwwait1

2021-07-02T03:01:26.508970+08:00

Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_lgwr_27340.trc:

ORA-27157: OS post/wait facility removed

ORA-27300: OS system dependent operation:semop failed with status: 43

ORA-27301: OS failure message: Identifier removed

ORA-27302: failure occurred at: sskgpwwait1

2021-07-02T03:01:26.515350+08:00

Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_smon_27346.trc:

ORA-27157: OS post/wait facility removed

ORA-27300: OS system dependent operation:semop failed with status: 43

ORA-27301: OS failure message: Identifier removed

ORA-27302: failure occurred at: sskgpwwait1

2021-07-02T03:01:26.527731+08:00
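The ORA-27157/27300/27301/27302 errors arrive as a group, and the most useful detail is the status code on the ORA-27300 line. As a minimal sketch, the code can be pulled out of an alert log or trace file as follows. The function name ora27300_status is hypothetical, and the pattern assumes the numeric status is the last item on the line, as it is in the log above. On Linux, errno 43 is EIDRM, which matches the "Identifier removed" text in ORA-27301.

```shell
# Hypothetical helper: print the status code of every ORA-27300 line read from stdin.
# Assumption: the numeric status is the last item on the line, as in the alert log above.
ora27300_status() {
    sed -n 's/^ORA-27300:.*: *\([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Example (the file name is illustrative only):
#   ora27300_status < alert_orcl.log | sort | uniq -c
```

A status of 43 on every affected background process points at a removed IPC object rather than at a problem inside the database itself.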

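The time of the failure lines up with the scheduled run of the script that deletes archived logs to free disk space.

Check the scheduled tasks: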

$ crontab -l

30 3,17 * * * sh /home/oracle/delete_archivelog.sh

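Going through the script again turned up nothing wrong: deleting the previous day's archived logs does not bring a database down. The root cause is that the script runs as the oracle user, and the failure occurs when the script finishes and the oracle user's session ends. With RemoveIPC=yes, systemd-logind removes the System V and POSIX IPC objects owned by a user once that user's last session ends. That includes the semaphores the Oracle background processes are waiting on, which is why semop fails with status 43 (EIDRM, "Identifier removed").

Solution: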
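One way to observe this is to count the semaphore sets owned by oracle before and after an oracle session (such as the cron job) ends. The sketch below assumes the util-linux `ipcs -s` column layout of key, semid, owner, perms, nsems; the helper name count_sems is hypothetical.

```shell
# Hypothetical helper: count System V semaphore sets owned by a given user.
# Reads `ipcs -s` output on stdin; assumes columns: key semid owner perms nsems.
count_sems() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { n++ } END { print n + 0 }'
}

# Example on a live host:
#   ipcs -s | count_sems oracle
```

If the count drops to zero while the instance is supposed to be running, the semaphores were removed from underneath it.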

vi /etc/systemd/logind.conf

# This file is part of systemd.

#

# systemd is free software; you can redistribute it and/or modify it

# under the terms of the GNU Lesser General Public License as published by

# the Free Software Foundation; either version 2.1 of the License, or

# (at your option) any later version.

#

# Entries in this file show the compile time defaults.

# You can change settings by editing this file.

# Defaults can be restored by simply deleting this file.

#

# See logind.conf(5) for details.

[Login]

#NAutoVTs=6

#ReserveVT=6

#KillUserProcesses=no

#KillOnlyUsers=

#KillExcludeUsers=root

#InhibitDelayMaxSec=5

#HandlePowerKey=poweroff

#HandleSuspendKey=suspend

#HandleHibernateKey=hibernate

#HandleLidSwitch=suspend

#HandleLidSwitchDocked=ignore

#PowerKeyIgnoreInhibited=no

#SuspendKeyIgnoreInhibited=no

#HibernateKeyIgnoreInhibited=no

#LidSwitchIgnoreInhibited=yes

#IdleAction=ignore

#IdleActionSec=30min

#RuntimeDirectorySize=10%

RemoveIPC=yes

1. In /etc/systemd/logind.conf, change RemoveIPC to no

RemoveIPC=no
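The same edit can be scripted instead of made by hand in vi. This is a minimal sketch, not a vetted tool: the function name set_removeipc_no is hypothetical, and appending the setting at the end of the file assumes [Login] is the only section, as in the listing above.

```shell
# Hypothetical helper: force RemoveIPC=no in a logind.conf-style file.
# Rewrites an existing RemoveIPC line (commented out or not); appends one if none exists.
# Assumption: [Login] is the only section, so a line appended at the end still belongs to it.
set_removeipc_no() {
    conf="$1"
    if grep -Eq '^[[:space:]]*#?[[:space:]]*RemoveIPC=' "$conf"; then
        # Keep a backup copy ("$conf.bak") and rewrite the line in place.
        sed -E -i.bak 's/^[[:space:]]*#?[[:space:]]*RemoveIPC=.*/RemoveIPC=no/' "$conf"
    else
        printf 'RemoveIPC=no\n' >> "$conf"
    fi
}

# Example, run as root:
#   set_removeipc_no /etc/systemd/logind.conf
```

Running it a second time leaves the file unchanged, so it is safe to include in a host build script.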

2. Restart the systemd-logind service

systemctl restart systemd-logind

## Rebooting the host also works (either one is enough; the next step is usually not needed)

systemctl daemon-reload

Everything is back to normal!
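Before relying on the change, it is worth confirming what the file now says. The sketch below prints the last uncommented RemoveIPC value in a config file; the helper name get_removeipc is hypothetical. It only reads the one file it is given, so a drop-in under /etc/systemd/logind.conf.d/ that overrides the value would not be seen.

```shell
# Hypothetical helper: print the last uncommented RemoveIPC value in a config file,
# or "unset" when the file does not set it explicitly.
get_removeipc() {
    awk -F= '/^[ \t]*RemoveIPC=/ { v = $2 } END { print (v == "" ? "unset" : v) }' "$1"
}

# Example:
#   get_removeipc /etc/systemd/logind.conf
```

The value to expect after the fix is no. A result of unset means the file leaves the choice to the compiled-in default, which is not what the fix intends.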

3. Start the database back up normally and restore standby mode
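The article does not list the commands used for this step. As a hedged sketch, the helper below (hypothetical name standby_restart_sql) prints the SQL*Plus statements typically used to bring a physical standby back. It assumes the instance is fully down and that the standby normally runs mounted with managed recovery in the background; a standby that is normally open read-only would need an additional ALTER DATABASE OPEN READ ONLY before recovery is started.

```shell
# Hypothetical helper: print the SQL*Plus statements for restarting a physical standby.
# Assumptions: the instance is fully down, and the standby normally runs in MOUNT state
# with managed recovery in the background.
standby_restart_sql() {
    cat <<'EOF'
STARTUP MOUNT;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
EOF
}

# Example, run as the oracle user with ORACLE_SID set:
#   standby_restart_sql | sqlplus -S / as sysdba
```

Printing the statements first makes it easy to review them before piping them into sqlplus.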

Last modified: 2021-07-08 08:45:59
Reposted from 数据库工作笔记 Sharing.