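Introduction
Author: 杨漆
Sixteen years of relational database administration, from Oracle 9i, 10g, 11g, and 12c through MySQL 5.5, 5.6, 5.7, and 8.0 to TiDB, with 3 OCP and 2 OCM certifications along the way. The operations road has not been smooth: I have fallen into plenty of pits and pulled many all-nighters. I am sharing my cleaned-up work notes here in the hope that they spare you some detours and some late nights.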
2021-07-02T02:59:57.758551+08:00
Archived Log entry 2000 added for T-1.S-288759 ID 0x5a3a0712 LAD:1
2021-07-02T02:59:58.223276+08:00
Media Recovery Waiting for thread 3 sequence 192176 (in transit)
2021-07-02T02:59:58.224105+08:00
Recovery of Online Redo Log: Thread 3 Group 26 Seq 192176 Reading mem 0
Mem# 0: /u01/oradata/onlinelogstb3_redo26.log
2021-07-02T03:00:24.666560+08:00
Primary database is in MAXIMUM PERFORMANCE mode
RFS[183]: Assigned to RFS process (PID:8409)
RFS[183]: Selected log 26 for T-3.S-192176 dbid 1513741333 branch 985960599
2021-07-02T03:01:02.160225+08:00
Primary database is in MAXIMUM PERFORMANCE mode
RFS[184]: Assigned to RFS process (PID:8431)
RFS[184]: Selected log 20 for T-2.S-225130 dbid 1513741333 branch 985960599
2021-07-02T03:01:26.494282+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_ofsd_27322_27324.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2021-07-02T03:01:26.496352+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_ckpt_27342.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2021-07-02T03:01:26.508970+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_lgwr_27340.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2021-07-02T03:01:26.515350+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_smon_27346.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2021-07-02T03:01:26.527731+08:00
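To list every background process hit by the same failure, the alert log can be scanned for the ORA-27300 semop lines and the trace file named just before each one. The following is a minimal sketch, assuming the alert log layout shown above; the function name `semop_failures` and the path in the usage comment are illustrative and not taken from the original environment.

```shell
#!/bin/sh
# Hypothetical helper: print the trace file of each "Errors in file" block
# that is followed by an ORA-27300 semop line ending in status 43.
# Reads the files given as arguments, or stdin when none are given.
semop_failures() {
    awk '
        # Remember the most recent trace file name, minus the trailing colon.
        /^Errors in file / { trc = $4; sub(/:$/, "", trc) }
        # A semop failure with errno 43 belongs to that trace file.
        /^ORA-27300:/ && /semop/ && /: 43$/ {
            if (trc != "") print trc
            trc = ""
        }
    ' "$@"
}

# Usage (path is an assumption, adjust to the local diag directory):
#   semop_failures /path/to/alert_orcl.log
```

Matching on `semop` and the trailing `: 43` rather than on the message text keeps the filter independent of the language the alert log was written in.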
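The time of the failure lines up with the scheduled run of this script, which purges archived logs to free disk space.
Check the scheduled job: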
$ crontab -l
30 3,17 * * * sh /home/oracle/delete_archivelog.sh
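Reviewing the script again turned up no problem: deleting the previous day's archived logs does not bring a database down. The root cause is that the script runs as the oracle user: when it finishes and the oracle user's session ends, the failure occurs. With RemoveIPC=yes, systemd-logind removes the System V IPC objects (shared memory and semaphores) owned by a non-system user once that user's last session ends, which matches the semop failure with status 43 (EIDRM, "Identifier removed") in the alert log.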
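The errors in the alert log say that a semaphore identifier was removed while the background processes were waiting on it. One way to observe this is to count the System V semaphore arrays owned by oracle before and after an oracle session ends. The following is a minimal sketch, assuming the usual Linux `ipcs -s` column layout (key, semid, owner, perms, nsems); the function name `count_sem_arrays` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count semaphore arrays owned by a user,
# given `ipcs -s` output on stdin.
count_sem_arrays() {
    awk -v owner="$1" '
        # Data rows have a numeric semid in column 2; header rows do not.
        $2 ~ /^[0-9]+$/ && $3 == owner { n++ }
        END { print n + 0 }
    '
}

# Usage on a live host:
#   ipcs -s | count_sem_arrays oracle
```

If the count falls to 0 while the instance is supposed to be running, the semaphores were removed from outside the database.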
Solution:
vi /etc/systemd/logind.conf
# This file is part of systemd.
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
# See logind.conf(5) for details.
[Login]
#NAutoVTs=6
#ReserveVT=6
#KillUserProcesses=no
#KillOnlyUsers=
#KillExcludeUsers=root
#InhibitDelayMaxSec=5
#HandlePowerKey=poweroff
#HandleSuspendKey=suspend
#HandleHibernateKey=hibernate
#HandleLidSwitch=suspend
#HandleLidSwitchDocked=ignore
#PowerKeyIgnoreInhibited=no
#SuspendKeyIgnoreInhibited=no
#HibernateKeyIgnoreInhibited=no
#LidSwitchIgnoreInhibited=yes
#IdleAction=ignore
#IdleActionSec=30min
#RuntimeDirectorySize=10%
RemoveIPC=yes
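The setting that matters in the listing above is the last uncommented `RemoveIPC` line. The following is a minimal sketch of a check for it, assuming a single logind.conf-style file; the function name `remove_ipc_setting` is hypothetical, and the sketch does not look at drop-in directories such as /etc/systemd/logind.conf.d/, which can override the main file.

```shell
#!/bin/sh
# Hypothetical helper: print the last uncommented RemoveIPC value in a
# logind.conf-style file, or "unset" when no such line exists.
remove_ipc_setting() {
    awk -F= '
        # Commented-out lines start with "#" and therefore do not match.
        /^[[:space:]]*RemoveIPC[[:space:]]*=/ { v = $2; gsub(/[[:space:]]/, "", v) }
        END { print (v == "" ? "unset" : v) }
    ' "$1"
}

# Usage:
#   remove_ipc_setting /etc/systemd/logind.conf
```

A result of "unset" means the compile-time default applies, which differs between distributions, so setting the value explicitly is the safer choice on a database host.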
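1. In /etc/systemd/logind.conf, change RemoveIPC to no
RemoveIPC=no
2. Restart the systemd-logind service
systemctl restart systemd-logind
Rebooting the host also works (either one is enough; the next step is usually unnecessary)
systemctl daemon-reload
Everything is back to normal!
3. Bring the database back up normally and restore it to standby mode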
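The configuration change in the first two steps can also be scripted instead of edited by hand in vi. The following is a minimal sketch, assuming GNU or BSD `sed` and a file whose only section is `[Login]`; the function name `set_remove_ipc_no` is hypothetical, and it leaves a `.bak` copy of the original file next to it.

```shell
#!/bin/sh
# Hypothetical helper: force RemoveIPC=no in a logind.conf-style file.
# Rewrites any existing RemoveIPC line (commented out or not),
# or appends one when the file has none. Keeps a backup in "$conf.bak".
set_remove_ipc_no() {
    conf=$1
    if grep -Eq '^[[:space:]]*#?[[:space:]]*RemoveIPC[[:space:]]*=' "$conf"; then
        sed -i.bak -E 's/^[[:space:]]*#?[[:space:]]*RemoveIPC[[:space:]]*=.*/RemoveIPC=no/' "$conf"
    else
        cp "$conf" "$conf.bak"
        printf 'RemoveIPC=no\n' >> "$conf"
    fi
}

# Usage as root, followed by the service restart from step 2:
#   set_remove_ipc_no /etc/systemd/logind.conf
#   systemctl restart systemd-logind
```

Appending is only safe because logind.conf has a single section; in a multi-section file the new line would have to go under the right header.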
