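Oracle Troubleshooting: ORA-00445: background process "J000" did not start after 120 seconds
Background:
A customer reported that the database had gone down and asked for help finding the cause.
1> Check the alert log: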
Mon Dec 30 08:56:01 2019
WARNING: inbound connection timed out (ORA-3136)
Mon Dec 30 08:56:04 2019
Errors in file /u01/app/oracle/diag/rdbms/ecology/ecology/trace/ecology_cjq0_25270.trc (incident=300282):
ORA-00445: background process "J001" did not start after 30 seconds
Incident details in: /u01/app/oracle/diag/rdbms/ecology/ecology/incident/incdir_300282/ecology_cjq0_25270_i300282.trc
Mon Dec 30 08:56:05 2019

View the trace file:
/u01/app/oracle/diag/rdbms/ecology/ecology/trace/ecology_cjq0_25270.trc

*** 2019-12-31 08:49:26.444
Process diagnostic dump for J000, OS id=23742
-------------------------------------------------------------------------------
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 08:49:21 ]
NOTE: scheduling delay has not been sampled for 5.062184 secs
0.000000 secs from [ 08:49:21 - 08:49:26 ], 5 sec avg
0.000000 secs from [ 08:49:21 - 08:49:26 ], 1 min avg
*** 2019-12-31 08:49:28.330
0.000000 secs from [ 08:45:08 - 08:49:28 ], 5 min avg
*** 2019-12-31 08:49:43.789
loadavg : 153.96 132.74 76.11
Memory (Avail / Total) = 289.81M / 64411.24M
Swap (Avail / Total) = 35820.70M / 64767.98M
skgpgcmdout: read() for cmd /bin/ps -elf | /bin/egrep 'PID | 23742' | /bin/grep -v grep timed out after 13.740 seconds
*** 2019-12-31 08:49:56.451
Stack:
skgpgcmdout: read() for cmd /usr/bin/gdb --batch -quiet -x /tmp/stackTcHuSK /proc/23742/exe 23742 < /dev/null 2>&1 timed out after 12.660 seconds
-------------------------------------------------------------------------------
Process diagnostic dump actual duration=30.000000 sec
(max dump time=30.000000 sec)
*** 2019-12-31 08:49:56.451
Waited for process J000 to initialize for 120 seconds
*** 2019-12-31 08:49:56.451
Process diagnostic dump for J000, OS id=23742
-------------------------------------------------------------------------------
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 08:49:21 ]
NOTE: scheduling delay has not been sampled for 35.069379 secs
0.000000 secs from [ 08:49:21 - 08:49:56 ], 5 sec avg
0.000000 secs from [ 08:49:21 - 08:49:56 ], 1 min avg
0.000000 secs from [ 08:45:08 - 08:49:56 ], 5 min avg
*** 2019-12-31 08:50:12.312
loadavg : 154.88 134.93 78.63
Memory (Avail / Total) = 288.15M / 64411.24M
Swap (Avail / Total) = 35665.90M / 64767.98M
skgpgcmdout: read() for cmd /bin/ps -elf | /bin/egrep 'PID | 23742' | /bin/grep -v grep timed out after 15.000 seconds
*** 2019-12-31 08:50:26.454
Stack:
skgpgcmdout: read() for cmd /usr/bin/gdb --batch -quiet -x /tmp/stackd1W3Ol /proc/23742/exe 23742 < /dev/null 2>&1 timed out after 14.140 seconds
-------------------------------------------------------------------------------
Process diagnostic dump actual duration=30.000000 sec
(max dump time=30.000000 sec)
*** 2019-12-31 08:50:26.454
*** 2019-12-31 08:52:17.853
Killing process (ospid 23742): (reason=KSOREQ_WAIT_CANCELLED error=0)
... and the process is still alive after kill!
*** 2019-12-31 08:53:07.555
Incident 713 created, dump file: /u01/app/database/diag/rdbms/feilioa/feilioa_1/incident/incdir_713/feilioa_1_cjq0_1370_i713.trc
ORA-00445: background process "J000" did not start after 120 seconds
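
The trace reports a 1-minute load average above 150 and less than 300 MB of available memory out of roughly 64 GB, and even the diagnostic `ps` and `gdb` commands timed out, so the host itself looks starved for resources. As a minimal sketch for pulling these figures out of a trace file, the Python below could be used. The function name `parse_dump_metrics` and both regular expressions are illustrative assumptions modeled only on the line formats seen in this trace; this is not an Oracle-provided tool, and other versions may format the lines differently.

```python
import re

# Patterns modeled on the lines seen in the trace above (assumption:
# other Oracle versions may print these lines in a different format).
LOADAVG_RE = re.compile(r"loadavg\s*:\s*([\d.]+)\s+([\d.]+)\s+([\d.]+)")
MEMORY_RE = re.compile(
    r"Memory \(Avail\s*/?\s*Total\)\s*=\s*([\d.]+)M\s*/?\s*([\d.]+)M"
)


def parse_dump_metrics(text):
    """Return one dict per loadavg line, paired with the Memory line after it."""
    samples = []
    for line in text.splitlines():
        m = LOADAVG_RE.search(line)
        if m:
            # Start a new sample for every loadavg line.
            samples.append({"loadavg": tuple(float(x) for x in m.groups())})
            continue
        m = MEMORY_RE.search(line)
        if m and samples and "mem_avail_mb" not in samples[-1]:
            avail, total = (float(x) for x in m.groups())
            samples[-1]["mem_avail_mb"] = avail
            samples[-1]["mem_total_mb"] = total
            samples[-1]["mem_avail_pct"] = round(100.0 * avail / total, 2)
    return samples


if __name__ == "__main__":
    # Two lines copied from the trace in this article.
    sample = (
        "loadavg : 153.96 132.74 76.11\n"
        "Memory (Avail / Total) = 289.81M / 64411.24M\n"
    )
    for s in parse_dump_metrics(sample):
        print(s)
```

Running this over every `*_cjq0_*.trc` file from the incident window gives a quick timeline of how load and available memory evolved before the job slave failed to start.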
[ID 1379200.1] describes this error as follows:
What does this message mean ?
The message indicates that we failed to spawn a new process at the Operating System level to serve the request. There are various causes for this issue.
This typically occurs when there is a shortage or misconfiguration in Operating System Resources, and thereby the problem should be investigated from an OS perspective. However there are a few causes related to the Oracle Database as well.
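
Since the note points to the operating system first, a quick look at the host's current load and free memory is a natural follow-up. The sketch below assumes a Linux host and reads `/proc/loadavg` and `/proc/meminfo`. The helper names and the thresholds (`load_per_cpu`, `min_avail_pct`) are hypothetical values chosen for illustration, not Oracle recommendations, and `MemAvailable` is absent on older kernels, in which case the code falls back to `MemFree`.

```python
import os


def parse_loadavg(text):
    """Parse /proc/loadavg content into (1 min, 5 min, 15 min) averages."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_meminfo_mb(text):
    """Parse /proc/meminfo content into {name: MB}, keeping only kB entries."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        # Skip unit-less counters such as HugePages_Total.
        if len(fields) == 2 and fields[1] == "kB" and fields[0].isdigit():
            info[key.strip()] = int(fields[0]) / 1024.0  # kB -> MB
    return info


def resource_warnings(load, mem, cpus, load_per_cpu=2.0, min_avail_pct=5.0):
    """Return warning strings; the thresholds are illustrative assumptions."""
    warnings = []
    if load[0] > load_per_cpu * cpus:
        warnings.append(
            "1-min load %.2f exceeds %.1f x %d CPUs" % (load[0], load_per_cpu, cpus)
        )
    avail = mem.get("MemAvailable", mem.get("MemFree"))
    total = mem.get("MemTotal")
    if avail is not None and total:
        pct = 100.0 * avail / total
        if pct < min_avail_pct:
            warnings.append("only %.2f%% of memory available" % pct)
    return warnings


if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as f:
        current_load = parse_loadavg(f.read())
    with open("/proc/meminfo") as f:
        current_mem = parse_meminfo_mb(f.read())
    for w in resource_warnings(current_load, current_mem, os.cpu_count() or 1):
        print("WARNING:", w)
```

The parsing is kept separate from the file reads so the same functions can be run against output saved from a host that is too overloaded to run new diagnostics.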
Reposted from 数据与人.





