
Troubleshoot ORA-27544,ORA-27300,ORA-27301,ORA-27302,"HPUX-ia64 Error: 23: File table overflow" issue

Original post by Anbob, 2014-12-06
A production database runs as a two-node 10.2.0.5 RAC on HP-UX 11.31. At one point it was unable to establish new connections. Checking the alert log, I found ORA-27544, ORA-27300, ORA-27301, ORA-27302 and "HPUX-ia64 Error: 23: File table overflow" errors. The problem seems to be OS related.
#alert log
Sun Nov 30 15:47:16 EAT 2014
Global Enqueue Services Deadlock detected. More info in file
/opt/oracle/app/admin/xxxdb/bdump/xxxdb1_lmd0_5890.trc.
Sun Nov 30 16:35:55 EAT 2014
Thread 1 advanced to log sequence 61987 (LGWR switch)
Current log# 14 seq# 61987 mem# 0: /dev/vg_anbob13/rvgcrm13_8_039
Sun Nov 30 16:47:05 EAT 2014
Thread 1 advanced to log sequence 61988 (LGWR switch)
Current log# 1 seq# 61988 mem# 0: /dev/vg_anbob01/rvgcrm01_redo01
Sun Nov 30 08:57:51 UTC 2014
Errors in file /opt/oracle/app/admin/xxxdb/udump/xxxdb1_ora_20033.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-27544: Failed to map memory region for export
ORA-27300: OS system dependent operation:socket failed with status: 23
ORA-27301: OS failure message: File table overflow
ORA-27302: failure occurred at: sskgxpcre1
Sun Nov 30 08:57:54 UTC 2014
Errors in file /opt/oracle/app/admin/xxxdb/udump/xxxdb1_ora_20235.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01116: error in opening database file 521
ORA-01110: data file 521: '/dev/vg_anbob04/rvgcrm04_8_173'
ORA-27041: unable to open file
HPUX-ia64 Error: 23: File table overflow
Additional information: 3
ORA-01116: error in opening database file 504
ORA-01110: data file 504: '/dev/vg_anbob03/rvgcrm03_8_021'
ORA-27041: unable to open file
HPUX-ia64 Error: 23: File table overflow
$ vi /var/adm/syslog/syslog.log
Nov 30 16:40:02 anbobdba su: - tty?? dsg-dsg
Nov 30 16:40:02 anbobdba su: - tty?? dsg-dsg
Nov 30 16:40:02 anbobdba above message repeats 2 times
Nov 30 16:45:01 anbobdba telnetd[21220]: getpid: peer died: Error 0
Nov 30 16:50:01 anbobdba telnetd[4104]: getpid: peer died: Error 0
Nov 30 16:50:02 anbobdba su: - tty?? dsg-dsg
Nov 30 16:50:02 anbobdba su: - tty?? dsg-dsg
Nov 30 16:50:02 anbobdba above message repeats 2 times
Nov 30 16:55:01 anbobdba telnetd[15605]: getpid: peer died: Error 0
Nov 30 16:57:51 anbobdba vmunix: file: table is full
Nov 30 16:57:51 anbobdba vmunix: ffiillee:: ttaabbllee iiss ffuullll
Nov 30 16:57:51 anbobdba vmunix:
Nov 30 16:57:51 anbobdba vmunix: ffiillee:: ttaabbllee iiss ffuullll
Nov 30 16:57:51 anbobdba vmunix:
Nov 30 16:57:51 anbobdba vmunix: file: table is full
Nov 30 16:57:53 anbobdba above message repeats 1900 times
Nov 30 16:57:54 anbobdba vmunix: file: table is full
Nov 30 16:57:54 anbobdba vmunix: file: table is full
Nov 30 16:57:54 anbobdba above message repeats 4984 times
Nov 30 16:57:54 anbobdba vmunix: file: table is full
Nov 30 16:57:54 anbobdba vmunix: ffiillee:: ttaabbllee iiss ffuullll
Nov 30 16:57:54 anbobdba vmunix:
Nov 30 16:57:54 anbobdba vmunix: file: table is full
Nov 30 16:57:54 anbobdba above message repeats 1174 times
Nov 30 16:57:54 anbobdba vmunix: file: table is full
Nov 30 16:57:55 anbobdba vmunix: file: table is full
oracle#ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) 2097152
stack(kbytes) 204800
memory(kbytes) unlimited
coredump(blocks) 4194303
nofiles(descriptors) 4096
[oracle@anbobdba:/opt/oracle/app/product/10.2.0/db_1/network/log]#/usr/sbin/kcweb -F
Kernel Configuration->Tunables (All)
------------------------------------------------------------------------------------------------------------------------------------------
Tunable Tuning Current Next Boot Default Usage Module
Capability Value Value Value
=================================================================================================================================SCROLL /\\
fcd_disable_mgmt_lun Dynamic 0 0 0 - fcd
fclp_ifc_disable_mgmt_lun Dynamic 0 0 0 - fclp
filecache_max Auto 15656274165 Automatic 97851711488 99.4% fs_bufcache
filecache_min Auto 9785167872 - 9785167872 - fs_bufcache
fr_rulecache Dynamic 0 0 0 - ipf
fr_statemax Dynamic 800000 800000 800000 - ipf
fr_tcpidletimeout Dynamic 86400 86400 86400 - ipf
fs_async Static 0 0 0 - fs
fs_symlinks Dynamic 20 20 20 - fs
ftable_hash_locks Static 64 64 64 - fs_filedscrp
gvid_no_claim_dev Dynamic 0 0 0 - gvid_core
hires_timeout_enable Dynamic 0 0 0 - pm_callout
hp_hfs_mtra_enabled Static 1 1 1 - ufs
intr_strobe_ics_pct Dynamic 80 80 80 - svc
io_ports_hash_locks Static 64 64 64 - io
ipf_icmp6_passthru Dynamic 0 0 0 - ipf
ipl_buffer_sz Dynamic 8192 8192 8192 - ipf
ipl_logall Dynamic 0 0 0 - ipf
ipl_suppress Dynamic 1 1 1 - ipf
ipmi_watchdog_action Dynamic 0 0 0 - ipmi
kmem_aggressive_caching Dynamic 0 0 0 - vm_kmem
ksi_alloc_max Dynamic 33600 33600 33600 - pm_sig
ksi_send_max Static 32 32 32 - pm_sig
lcpu_attr Auto 0 0 0 - pm_sched
lotsfree_pct Dynamic 0 0 0 - vm
max_acct_file_size Dynamic 2560000 2560000 2560000 - pm_acct
max_async_ports Dynamic 4096 4096 4096 - asyncdsk
max_mem_window Dynamic 0 0 0 - vm
max_thread_proc Dynamic 2048 2048 256 12.5% pm_proc
maxdsiz Dynamic 2147483648 2147483648 1073741824 66.2% vm
maxdsiz_64bit Dynamic 17179869184 17179869184 4294967296 1.8% vm
maxfiles Static 4096 10240 2048 - fs
maxfiles_lim Dynamic 10240 10240 4096 16.4% fs
maxrsessiz Static 8388608 8388608 8388608 - vm
maxrsessiz_64bit Static 8388608 8388608 8388608 - vm
maxssiz Dynamic 209715200 209715200 8388608 0.5% vm
maxssiz_64bit Dynamic 1073741824 1073741824 268435456 0.1% vm
maxtsiz Dynamic 100663296 100663296 100663296 35.6% vm
maxtsiz_64bit Dynamic 1073741824 1073741824 1073741824 20.3% vm
maxuprc Dynamic 20000 20000 256 4.2% pm_proc
mca_recovery_on Auto 1 - 1 - shutdown
mpas_readonly_text Dynamic 0 0 0 - vm
mprotect_reduce_protid_on Dynamic 0 0 0 - vm
msgmbs Dynamic 8 8 8 - pm_usync
msgmnb Dynamic 16384 16384 16384 - pm_usync
msgmni Dynamic 4096 4096 512 0.1% pm_usync
msgtql Dynamic 4096 4096 1024 0.0% pm_usync
ncdnode Static 150 150 150 - cdfs

$glance H f
Glance C.04.70.001 09:44:59 anbobdba ia64 Current Avg High
------------------------------------------------------------------------------------------------------------------------------------------
Cpu Util S SN NRU U | 95% 93% 95%
Disk Util F F |100% 100% 100%
Mem Util S SU U | 78% 78% 78%
Networkil U UR R | 68% 68% 68%
------------------------------------------------------------------------------------------------------------------------------------------
SYSTEM TABLES REPORT Users= 9
System Table Available Used Utilization High(%)
--------------------------------------------------------------------------------
Proc Table (nproc) 42975 2993 7 7
File Table (nfile) 650480 597066 92 92
Shared Mem Table (shmmni) 512 12 2 2
Message Table (msgmni) 4096 6 0 0
Semaphore Table (semmni) 4096 33 1 1
File Locks (nflocks) 36000 3356 9 9
Pseudo Terminals (npty) 60 0 0 0
Buffer Headers (nbuf) na 2560 na na

or
#kctune -v nfile

Tip:
nfile: maximum number of files that can be open at the same time across all processes on the system.
maxfiles_lim: hard limit on the number of files a single process can have open.
nproc: maximum number of processes that can run concurrently on the system.
Note: increasing nfile increases kernel memory usage. The value should typically be 10-25% above the number of open files seen at peak load. The per-process limit on open files is the kernel parameter maxfiles, which is capped by the hard limit maxfiles_lim. The default for maxfiles is 2048. For more detail see "man nfile" (likewise maxfiles_lim and nproc).
On 32-bit systems, each nfile entry allocates 56 bytes.
On 64-bit systems, each nfile entry allocates 88 bytes.
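To get a feel for the memory cost mentioned in the note, multiply the table size by the per-entry size. This is only a rough sketch in POSIX shell: it uses the nfile value 650480 from the Glance report and the 88-byte figure quoted above, and the function name nfile_table_bytes is made up for illustration.

```shell
#!/bin/sh
# Estimate the memory taken by the file table: entries * bytes per entry.
# The per-entry sizes (56 and 88 bytes) are the figures quoted in the tip above.
nfile_table_bytes() {
    entries=$1
    bytes_per_entry=$2
    echo $(( entries * bytes_per_entry ))
}

# 650480 is the nfile value shown as "Available" in the Glance report.
bytes=$(nfile_table_bytes 650480 88)
echo "64-bit: $bytes bytes (about $(( bytes / 1048576 )) MiB)"
```

So even a file table of this size costs only tens of megabytes, which is small next to the memory on a box like this one.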
Also, following the Oracle documentation, I configured the kernel parameters as shown below.
Oracle Recommended Kernel Parameter settings for HP Itanium v3 11.31
http://docs.oracle.com/cd/E14004_01/books/PerformTun/PerformTunOS12.html#wp1307268
Modify the HP-UX kernel parameters to values like those shown below (suggested guidelines). Use the HP-UX System Administration Manager (SAM) tool to make these changes.
nproc 4096 - 4096
ksi_alloc_max 32768 - (NPROC*8)
max_thread_proc 4096 - 4096
maxdsiz 0x90000000 - 0X90000000
maxdsiz_64bit 2147483648 - 2147483648
maxfiles 4000 - 4000
maxssiz 401604608 - 401604608
maxssiz_64bit 1073741824 - 1073741824
maxtsiz 0x40000000 - 0X40000000
msgmap 4098 - (NPROC+2)
msgmni 4096 - (NPROC)
msgtql 4096 - (NPROC)
ncsize 35840 - (8*NPROC+2048+VX_NCSIZE)
nfile 67584 - (16*NPROC+2048)
ninode 34816 - (8*NPROC+2048)
nkthread 7184 - (((NPROC*7)/4)+16)
nproc 4096 - 4096
nsysmap 8192 - ((NPROC)>800?2*(NPROC):800)
nsysmap64 8192 - ((NPROC)>800?2*(NPROC):800)
semmni 1024 - 1024
semmns 16384 - ((NPROC*2)*2)
semmnu 2048 - 2048
semume 256 - 256
shmmax 0x40000000 Y 0X40000000
shmmni 1024 - 1024
shmseg 1024 Y 1024
vps_ceiling 64 - 64
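Several of the values above are derived from NPROC, so they have to be recalculated whenever nproc changes. The sketch below recomputes a few of them from the formulas in the table. It is an illustration, not an Oracle or HP tool; the function name derive is hypothetical, and it only prints values without changing anything on the system.

```shell
#!/bin/sh
# Recompute some NPROC-dependent tunables using the formulas from the table above.
# NPROC defaults to 4096, the value used in the table.
NPROC=${NPROC:-4096}

derive() {
    nproc=$1
    echo "ksi_alloc_max $(( nproc * 8 ))"
    echo "msgmap $(( nproc + 2 ))"
    echo "nfile $(( 16 * nproc + 2048 ))"
    echo "ninode $(( 8 * nproc + 2048 ))"
    echo "nkthread $(( (nproc * 7) / 4 + 16 ))"
    echo "semmns $(( (nproc * 2) * 2 ))"
}

derive "$NPROC"
```

With NPROC=4096 this reproduces the figures in the table, for example nfile 67584 and nkthread 7184.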

By the way, if you monitor file table usage, another approach is to set the warning threshold to oracle.processes * oracle.datafiles + 2048 and the limit to nproc * oracle.datafiles.
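The two formulas can be evaluated with shell arithmetic. This is a sketch under assumptions: processes (3000) and nproc (42975) come from the outputs in this post, but the datafile count of 600 is a made-up placeholder (the alert log only shows that file number 521 exists), and the function names are hypothetical. The pct_used helper simply rounds used/available the way the Glance report does.

```shell
#!/bin/sh
# warning threshold = processes * datafiles + 2048
warn_threshold() { echo $(( $1 * $2 + 2048 )); }
# limit = nproc * datafiles
hard_limit() { echo $(( $1 * $2 )); }
# Rounded percentage of the file table in use: used / available.
pct_used() { echo $(( ($1 * 100 + $2 / 2) / $2 )); }

processes=3000   # from "show parameter process"
nproc=42975      # from the Glance report
datafiles=600    # ASSUMPTION: placeholder, use your real datafile count

echo "warning threshold: $(warn_threshold "$processes" "$datafiles")"
echo "limit: $(hard_limit "$nproc" "$datafiles")"
echo "file table in use: $(pct_used 597066 650480)%"
```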
NAME                                                                       VALUE UNIT
---------------------------------------------------------------- --------------- ------------
aggregate PGA target parameter 42949672960 bytes
aggregate PGA auto target 36763075584 bytes
global memory bound 1073741824 bytes
total PGA inuse 2100608000 bytes
total PGA allocated 2831215616 bytes
maximum PGA allocated 10462026752 bytes
total freeable PGA memory 400359424 bytes
process count 831
max processes count 2999
PGA memory freed back to OS 15560814297088 bytes
total PGA used for auto workareas 0 bytes
maximum PGA used for auto workareas 4534104064 bytes
total PGA used for manual workareas 0 bytes
maximum PGA used for manual workareas 17006592 bytes
over allocation count 0
bytes processed 42472610850816 bytes
extra bytes read/written 494029733888 bytes
cache hit percentage 98.85 percent
recompute count (total) 2282871
SQL> show parameter process
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 6
gcs_server_processes integer 12
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 3000

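Use the "lsof" command to find out what is using the file descriptors on the system.
# Column 2 of "lsof -g" output is the PID; NR > 1 skips the header line.
lsof -g | awk 'NR > 1 {print $2}' | sort -u > /tmp/lsof_sort.txt
lsof -g | awk 'NR > 1 {print $2}' > /tmp/lsof.txt
# Print each PID followed by the number of lsof entries it owns.
for var in `cat /tmp/lsof_sort.txt`
do
echo "$var ---- $(grep -cx "$var" /tmp/lsof.txt)"
done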

This lists every process together with the number of files it has open. Pick the processes with the most open files and check what they are.
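The same per-process count can also be produced in a single pass with sort and uniq, which avoids running grep once per PID. A minimal sketch, assuming the PID is in column 2 of the lsof output as in the loop above; the function name count_open_files_by_pid is made up for illustration.

```shell
#!/bin/sh
# Read lsof-style output on stdin, skip the header line, count lines per PID
# (column 2) and print "count PID" with the largest counts first.
count_open_files_by_pid() {
    awk 'NR > 1 { print $2 }' | sort | uniq -c | sort -rn
}

# Typical use (needs lsof and enough privilege to see all processes):
#   lsof -g | count_open_files_by_pid | head -20
```

The top few lines are usually enough to show which processes hold most of the file table.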
or
Use the scripts provided by HP engineers:
glance -adviser_only -syntax /tmp/proc_num_files -iterations 1

or check with "sar -v".
Another useful command is "fuser".
Summary:
Error 23: File table overflow. The system's table of open files is full, and temporarily no more open() calls can be accepted.
Increase the value of the kernel parameter "maxusers", as it influences the default value of "nfile". If this does not solve the problem, you can increase "nfile" independently.
For kernel parameters in general, see:
http://docs.hp.com/en/939/KCParms/KCparams.OverviewAll.html
To modify kernel parameters (from the HP docs):
As root:
#kctune nfile=xxxx
or
Enter the SAM command to start the System Administration Manager (SAM) program.
Double-click the Kernel Configuration icon.
Double-click the Configurable Parameters icon.
Double-click the parameter that you want to change and type the new value in the Formula/Value field. Click OK.
Repeat these steps for all of the kernel configuration parameters that you want to change.
When you are finished setting all of the kernel configuration parameters, select Action --> Process New Kernel from the action menu bar.
The HP-UX operating system automatically restarts after you change the values for the kernel configuration parameters.
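To pick a value for the kctune command, one option is to follow the 10-25% headroom guideline from the tip earlier. The sketch below is only an illustration: it treats the 597066 "Used" figure from the Glance report as the peak, which may not be the true peak on your system, and the function name suggest_nfile is hypothetical. It prints the command for review instead of running it.

```shell
#!/bin/sh
# Suggest an nfile value: observed peak plus a headroom percentage.
suggest_nfile() {
    peak=$1
    headroom_pct=$2
    echo $(( peak * (100 + headroom_pct) / 100 ))
}

peak=597066   # "Used" in the Glance report; ASSUMPTION: taken as the peak
new_nfile=$(suggest_nfile "$peak" 25)

# Print the command only; kctune itself must be run as root on HP-UX.
echo "kctune nfile=$new_nfile"
```

Check the suggested number against available memory before applying it.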