
A Database Crash Triggered by a Scheduled Job

Original · John2020 · 2021-07-07

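Introduction

Author: 杨漆
16 years of relational database administration, from Oracle 9i, 10g, 11g, and 12c, through MySQL 5.5, 5.6, 5.7, and 8.0, to TiDB; holder of 3 OCP and 2 OCM certifications. The operations road has not been smooth: I have fallen into plenty of pits and pulled many all-nighters. I am sharing my cleaned-up work notes in the hope that they help you avoid detours and lose less sleep.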

2021-07-02T02:59:57.758551+08:00
Archived Log entry 2000 added for T-1.S-288759 ID 0x5a3a0712 LAD:1
2021-07-02T02:59:58.223276+08:00
Media Recovery Waiting for thread 3 sequence 192176 (in transit)
2021-07-02T02:59:58.224105+08:00
Recovery of Online Redo Log: Thread 3 Group 26 Seq 192176 Reading mem 0
Mem# 0: /u01/oradata/onlinelogstb3_redo26.log
2021-07-02T03:00:24.666560+08:00
Primary database is in MAXIMUM PERFORMANCE mode
RFS[183]: Assigned to RFS process (PID:8409)
RFS[183]: Selected log 26 for T-3.S-192176 dbid 1513741333 branch 985960599
2021-07-02T03:01:02.160225+08:00
Primary database is in MAXIMUM PERFORMANCE mode
RFS[184]: Assigned to RFS process (PID:8431)
RFS[184]: Selected log 20 for T-2.S-225130 dbid 1513741333 branch 985960599
2021-07-02T03:01:26.494282+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_ofsd_27322_27324.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2021-07-02T03:01:26.496352+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_ckpt_27342.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2021-07-02T03:01:26.508970+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_lgwr_27340.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2021-07-02T03:01:26.515350+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcldg3/orcl/trace/orcl_smon_27346.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2021-07-02T03:01:26.527731+08:00
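When the alert log is long, it can help to pull out only the semaphore failures together with the timestamp that precedes each one. The following is a minimal sketch, not an Oracle tool: `semop_errors` is a hypothetical helper name, and it assumes the alert log layout shown in the excerpt above (an ISO-style timestamp line followed by the message lines).

```shell
# Hypothetical helper: print "<timestamp><TAB><ORA-27300 line>" for every
# ORA-27300 entry in an alert log.
# Assumption: each entry is preceded by a timestamp line such as
# 2021-07-02T03:01:26.496352+08:00, as in the excerpt above.
semop_errors() {
    awk '
        # Remember the most recent timestamp line.
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ { ts = $0; next }
        # Report each ORA-27300 line with the timestamp seen before it.
        /^ORA-27300/ { print ts "\t" $0 }
    ' "$1"
}

# Usage (the path is a placeholder; substitute your own alert log):
#   semop_errors /path/to/alert_orcl.log
```

Several background processes reporting the same error within the same second, as in the excerpt, suggests a resource shared by the whole instance disappeared rather than a single process misbehaving.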
The time of the failure lines up with a scheduled run of the script that purges archived logs to free disk space.
Check the scheduled jobs:
$ crontab -l
30 3,17 * * * sh /home/oracle/delete_archivelog.sh
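Going through the script again turned up nothing wrong: deleting the previous day's archived logs cannot bring the database down. The root cause is that the script runs as the oracle user, and the failure appears when the script finishes and the oracle user's session exits. With RemoveIPC=yes in logind.conf, systemd-logind removes the System V and POSIX IPC objects (shared memory and semaphores) owned by a user once that user has fully logged out, which matches the semop "Identifier removed" errors in the alert log.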
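One way to check this explanation is to count the semaphore arrays owned by the oracle user before and after the cron job's session ends. The sketch below is an illustration only: `count_sem_arrays` is a hypothetical helper, and it assumes the Linux `ipcs -s` column layout (key, semid, owner, perms, nsems).

```shell
# Hypothetical helper: count semaphore arrays owned by a given user.
# Reads `ipcs -s` output on stdin; $1 is the owner name.
# Assumption: data rows start with a hex key (0x...) and the owner is
# the third column, as in the util-linux ipcs output.
count_sem_arrays() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { n++ } END { print n + 0 }'
}

# Usage:
#   ipcs -s | count_sem_arrays oracle
```

If the count for oracle drops to 0 while the instance processes are still running, the semaphores were removed from underneath the instance.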
Solution:
vi /etc/systemd/logind.conf

#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.
#
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
#
# See logind.conf(5) for details.

[Login]
#NAutoVTs=6
#ReserveVT=6
#KillUserProcesses=no
#KillOnlyUsers=
#KillExcludeUsers=root
#InhibitDelayMaxSec=5
#HandlePowerKey=poweroff
#HandleSuspendKey=suspend
#HandleHibernateKey=hibernate
#HandleLidSwitch=suspend
#HandleLidSwitchDocked=ignore
#PowerKeyIgnoreInhibited=no
#SuspendKeyIgnoreInhibited=no
#HibernateKeyIgnoreInhibited=no
#LidSwitchIgnoreInhibited=yes
#IdleAction=ignore
#IdleActionSec=30min
#RuntimeDirectorySize=10%
RemoveIPC=yes
1. In /etc/systemd/logind.conf, change RemoveIPC to no:
RemoveIPC=no
2. Restart the systemd-logind service:
systemctl restart systemd-logind

Rebooting the host also works (either one is enough; the next step is usually not needed):

systemctl daemon-reload
Everything is back to normal!
3. Bring the database back up normally and restore Standby mode.
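Steps 1 and 2 can also be done without opening an editor. The following is a rough sketch under stated assumptions: `remove_ipc_value` and `set_remove_ipc_no` are hypothetical helper names, the file contains only the `[Login]` section (as in the listing above), and a backup copy is written next to the file as `<file>.bak`.

```shell
# Hypothetical helper: print the last uncommented RemoveIPC value in a
# logind.conf-style file, or "unset" if the key is absent or commented out.
remove_ipc_value() {
    awk -F= '/^[ \t]*RemoveIPC=/ { v = $2 } END { print (v == "" ? "unset" : v) }' "$1"
}

# Hypothetical helper: force RemoveIPC=no in the given file.
# Keeps a backup as <file>.bak. If no RemoveIPC line exists (commented
# or not), the setting is appended; this assumes [Login] is the only
# section in the file, so the appended line still belongs to it.
set_remove_ipc_no() {
    conf="$1"
    cp -p "$conf" "$conf.bak"
    if grep -Eq '^[#[:space:]]*RemoveIPC=' "$conf"; then
        sed -E 's/^[#[:space:]]*RemoveIPC=.*/RemoveIPC=no/' "$conf.bak" > "$conf"
    else
        printf 'RemoveIPC=no\n' >> "$conf"
    fi
}

# Usage (as root):
#   set_remove_ipc_no /etc/systemd/logind.conf
#   remove_ipc_value /etc/systemd/logind.conf
#   systemctl restart systemd-logind
```

Checking the value with `remove_ipc_value` before restarting systemd-logind gives a quick confirmation that the edit landed on an uncommented line.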

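[Copyright notice] This article is original content by a 墨天轮 user. Reposts must credit the source (墨天轮), the article link, the author, and other basic information; otherwise the author and 墨天轮 reserve the right to pursue liability. If you find content on 墨天轮 that appears to be plagiarized or infringing, please report it by email to contact@modb.pro with supporting evidence; once verified, 墨天轮 will remove the content immediately.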

Comments

郎啊狼:
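The scheduled job is not to blame. This is caused by a new feature in Linux 7.2: once a user has completely logged out of the OS, all of that user's IPC objects are removed. See MOS document 2081410.1.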