Oracle Troubleshooting: ORA-00445: background process "J000" did not start after 120 seconds

数据与人 2020-12-15

Background:

A customer reported that their database had gone down and asked for help finding the cause.

1> Check the alert log:

Mon Dec 30 08:56:01 2019
WARNING: inbound connection timed out (ORA-3136)
Mon Dec 30 08:56:04 2019
Errors in file /u01/app/oracle/diag/rdbms/ecology/ecology/trace/ecology_cjq0_25270.trc (incident=300282):
ORA-00445: background process "J001" did not start after 30 seconds
Incident details in: /u01/app/oracle/diag/rdbms/ecology/ecology/incident/incdir_300282/ecology_cjq0_25270_i300282.trc
Mon Dec 30 08:56:05 2019
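
When an alert log is long, it can help to pull out every ORA-00445 entry together with the trace file named just before it. The following is a minimal sketch, assuming the log lines have the same shape as the excerpt above; the function name `find_ora_00445` and the sample string are illustrative and not part of any Oracle tool.

```python
import re

# Matches the ORA-00445 line as it appears in the alert log excerpt.
ORA_00445 = re.compile(
    r'ORA-00445: background process "(?P<process>[^"]+)" '
    r'did not start after (?P<seconds>\d+) seconds'
)
# Matches the "Errors in file <path> (incident=<n>):" line that precedes it.
ERRORS_IN_FILE = re.compile(
    r'Errors in file (?P<path>\S+)\s+\(incident=(?P<incident>\d+)\)'
)


def find_ora_00445(alert_text):
    """Return one dict per ORA-00445 entry, with the most recent trace file path."""
    hits = []
    last_trace = None
    for line in alert_text.splitlines():
        m = ERRORS_IN_FILE.search(line)
        if m:
            last_trace = m.group("path")
            continue
        m = ORA_00445.search(line)
        if m:
            hits.append({
                "process": m.group("process"),
                "seconds": int(m.group("seconds")),
                "trace_file": last_trace,
            })
    return hits


# Sample lines modelled on the alert log excerpt (hypothetical input string).
sample = """\
Mon Dec 30 08:56:04 2019
Errors in file /u01/app/oracle/diag/rdbms/ecology/ecology/trace/ecology_cjq0_25270.trc (incident=300282):
ORA-00445: background process "J001" did not start after 30 seconds
"""
for hit in find_ora_00445(sample):
    print(hit)
```

Each hit gives the process that failed to start, the timeout that was reported, and the trace file to open next.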




2> Check the trace file:
/u01/app/oracle/diag/rdbms/ecology/ecology/trace/ecology_cjq0_25270.trc
 
 
*** 2019-12-31 08:49:26.444
Process diagnostic dump for J000, OS id=23742
-------------------------------------------------------------------------------
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 08:49:21 ]
NOTE: scheduling delay has not been sampled for 5.062184 secs
0.000000 secs from [ 08:49:21 - 08:49:26 ], 5 sec avg
0.000000 secs from [ 08:49:21 - 08:49:26 ], 1 min avg
 
*** 2019-12-31 08:49:28.330
0.000000 secs from [ 08:45:08 - 08:49:28 ], 5 min avg
 
*** 2019-12-31 08:49:43.789
loadavg : 153.96 132.74 76.11
Memory (Avail / Total) = 289.81M / 64411.24M
Swap (Avail / Total) = 35820.70M / 64767.98M
skgpgcmdout: read() for cmd /bin/ps -elf | /bin/egrep 'PID | 23742' | /bin/grep -v grep timed out after 13.740 seconds
 
*** 2019-12-31 08:49:56.451
Stack:
skgpgcmdout: read() for cmd /usr/bin/gdb --batch -quiet -x /tmp/stackTcHuSK /proc/23742/exe 23742 < /dev/null 2>&1 timed out after 12.660 seconds
 
-------------------------------------------------------------------------------
Process diagnostic dump actual duration=30.000000 sec
(max dump time=30.000000 sec)
 
*** 2019-12-31 08:49:56.451
Waited for process J000 to initialize for 120 seconds
 
*** 2019-12-31 08:49:56.451
Process diagnostic dump for J000, OS id=23742
-------------------------------------------------------------------------------
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 08:49:21 ]
NOTE: scheduling delay has not been sampled for 35.069379 secs
0.000000 secs from [ 08:49:21 - 08:49:56 ], 5 sec avg
0.000000 secs from [ 08:49:21 - 08:49:56 ], 1 min avg
0.000000 secs from [ 08:45:08 - 08:49:56 ], 5 min avg
 
*** 2019-12-31 08:50:12.312
loadavg : 154.88 134.93 78.63
Memory (Avail / Total) = 288.15M / 64411.24M
Swap (Avail / Total) = 35665.90M / 64767.98M
skgpgcmdout: read() for cmd /bin/ps -elf | /bin/egrep 'PID | 23742' | /bin/grep -v grep timed out after 15.000 seconds
 
*** 2019-12-31 08:50:26.454
Stack:
skgpgcmdout: read() for cmd /usr/bin/gdb --batch -quiet -x /tmp/stackd1W3Ol /proc/23742/exe 23742 < /dev/null 2>&1 timed out after 14.140 seconds
 
-------------------------------------------------------------------------------
Process diagnostic dump actual duration=30.000000 sec
(max dump time=30.000000 sec)
 
*** 2019-12-31 08:50:26.454
 
*** 2019-12-31 08:52:17.853
Killing process (ospid 23742): (reason=KSOREQ_WAIT_CANCELLED error=0)
... and the process is still alive after kill!
 
*** 2019-12-31 08:53:07.555
Incident 713 created, dump file: /u01/app/database/diag/rdbms/feilioa/feilioa_1/incident/incdir_713/feilioa_1_cjq0_1370_i713.trc
ORA-00445: background process "J000" did not start after 120 seconds
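
The `loadavg`, `Memory` and `Swap` lines in the dump above are the quickest indicators of the host's state at the time of the failure. The sketch below extracts them from trace text and computes the share of memory still available; it assumes the line formats shown in this trace, and `resource_snapshots` is a hypothetical helper name, not an Oracle utility.

```python
import re

LOADAVG = re.compile(r"loadavg\s*:\s*([\d.]+)\s+([\d.]+)\s+([\d.]+)")
# e.g. "Memory (Avail / Total) = 288.15M / 64411.24M"; slashes are optional
# so that lines damaged during copy and paste still match.
AVAIL_TOTAL = re.compile(
    r"(Memory|Swap) \(Avail\s*/?\s*Total\) = ([\d.]+)M\s*/?\s*([\d.]+)M"
)


def resource_snapshots(trace_text):
    """Yield one dict per loadavg/Memory/Swap group found in a trace dump."""
    current = None
    for line in trace_text.splitlines():
        m = LOADAVG.search(line)
        if m:
            if current:
                yield current
            current = {"loadavg": tuple(float(x) for x in m.groups())}
            continue
        m = AVAIL_TOTAL.search(line)
        if m and current is not None:
            kind = m.group(1).lower()
            avail, total = float(m.group(2)), float(m.group(3))
            current[kind + "_avail_mb"] = avail
            current[kind + "_total_mb"] = total
            current[kind + "_avail_pct"] = 100.0 * avail / total
    if current:
        yield current


# Three lines copied from the second diagnostic dump in the trace.
sample = """\
loadavg : 154.88 134.93 78.63
Memory (Avail / Total) = 288.15M / 64411.24M
Swap (Avail / Total) = 35665.90M / 64767.98M
"""
for snap in resource_snapshots(sample):
    print("1-min load: %.2f" % snap["loadavg"][0])
    print("memory available: %.2f%%" % snap["memory_avail_pct"])
    print("swap in use: %.0f MB" % (snap["swap_total_mb"] - snap["swap_avail_mb"]))
```

For this trace the numbers are stark: under half a percent of physical memory was available, roughly 29 GB of swap was in use, and the one-minute load average was above 150. That is consistent with the `ps` and `gdb` diagnostic commands themselves timing out, and it suggests OS resource pressure as the first thing to investigate, though the trace alone does not prove the root cause.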

Oracle support note [ID 1379200.1] describes this error as follows:

What does this message mean ?

The message indicates that we failed to spawn a new process at the Operating System level to serve the request. There are various causes for this issue.

This typically occurs when there is a shortage or misconfiguration in Operating System Resources, and thereby the problem should be investigated from an OS perspective. However there are a few causes related to the Oracle Database as well.
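
Since the note points to the operating system first, a quick check of memory and load on the database host is a reasonable starting point. The following is a rough sketch for Linux that reads `/proc/meminfo` and the load average; the function names and the 5% memory threshold are assumptions chosen for illustration, not values taken from the note.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def pressure_report(meminfo, load1, cpu_count, min_avail_pct=5.0):
    """Return warning strings; the thresholds are illustrative assumptions."""
    warnings = []
    total = meminfo.get("MemTotal", 0)
    # Older kernels lack MemAvailable, so fall back to MemFree.
    avail = meminfo.get("MemAvailable", meminfo.get("MemFree", 0))
    if total and 100.0 * avail / total < min_avail_pct:
        warnings.append("low memory: %.1f%% available" % (100.0 * avail / total))
    if load1 > cpu_count:
        warnings.append("load average %.2f exceeds %d CPUs" % (load1, cpu_count))
    return warnings


# Only touch the live system when run as a script on a Linux host.
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    for w in pressure_report(info, os.getloadavg()[0], os.cpu_count() or 1):
        print("WARNING:", w)
```

A check like this only reports the state at the moment it runs, so it is most useful when scheduled or run while the problem is occurring.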
