__libc_start_main() call main() 000000002 7FFF0B5E1188
+244 000000001 000000000
0098A0FE0 7FFF0B5DF888
_start()+36 call __libc_start_main() 000A07368 000000002
7FFF0B5E1178 000000000
0098A0FE0 000000002
--------------------- Binary Stack Dump ---------------------
Looking further back in the alert log, I found that ORA-07445 had also been reported:
Tue Jan 17 08:42:12 2012
Archived Log entry 7472 added for thread 1 sequence 8444 ID 0x263e89b dest 1:
Tue Jan 17 09:00:14 2012
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x8] [PC:0xB0997A, ksmdscan_internal()+82] [flags: 0x0, count: 1]
Errors in file /oracle/app/diag/rdbms/skate01/skate01/trace/skate01_ora_25574.trc (incident=264155):
ora-07445: exception encountered: core dump [ksmdscan_internal()+82] [SIGSEGV] [ADDR:0x8] [PC:0xB0997A] [Address not mapped to object] []
Incident details in: /oracle/app/diag/rdbms/skate01/skate01/incident/incdir_264155/skate01_ora_25574_i264155.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue Jan 17 09:00:21 2012
Dumping diagnostic data in directory=[cdmp_20120117090021], requested by (instance=1, osid=25574), summary=[incident=264155].
Tue Jan 17 09:00:22 2012
Sweep [inc][264155]: completed
Sweep [inc2][264155]: completed
Tue Jan 17 09:06:08 2012
Media Recovery Waiting for thread 1 sequence 8446
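I then checked Oracle document "ID 1070812.1" and found that the problem is related to my having enabled hugepages.
When vm.drop_caches is set to a value greater than 0 and hugepages are enabled at the same time, the two conflict: drop_caches tries to release memory, while hugepages hold on to it.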
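The condition described above (hugepages configured and vm.drop_caches greater than 0) can be checked with a small script. This is only a sketch: the function name check_conflict is made up for illustration, and it assumes the standard Linux files /proc/meminfo and /proc/sys/vm/drop_caches. It reports the two settings and does not prove that the SGA itself is backed by hugepages.

```shell
#!/bin/sh
# check_conflict [MEMINFO] [DROPFILE]  (hypothetical helper, not from the Oracle note)
# Returns 1 when hugepages are configured AND vm.drop_caches is greater than 0.
# Both paths are parameters so the check can also be run against saved copies.
check_conflict() {
    meminfo=${1:-/proc/meminfo}
    dropfile=${2:-/proc/sys/vm/drop_caches}

    # HugePages_Total is the number of hugepages in the kernel's pool.
    hp_total=$(awk '/^HugePages_Total:/ {print $2}' "$meminfo")
    drop=$(cat "$dropfile")

    if [ "${hp_total:-0}" -gt 0 ] && [ "${drop:-0}" -gt 0 ]; then
        echo "RISK: HugePages_Total=$hp_total and vm.drop_caches=$drop"
        return 1
    fi
    echo "OK: HugePages_Total=${hp_total:-0}, vm.drop_caches=${drop:-0}"
    return 0
}
```

Calling check_conflict with no arguments reads the live system values. Passing two file names checks copies collected from another host.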
Reference: http://blog.csdn.net/wyzxg/article/details/7279986
Solution
1. If hugepages are enabled, set vm.drop_caches=0
[root@localhost ~]# more /proc/sys/vm/drop_caches
3
[root@localhost ~]# sysctl -a | grep drop_caches
vm.drop_caches = 3
[root@localhost ~]# vi /etc/sysctl.conf
##skate add
vm.drop_caches=0
Make the change take effect immediately
[root@localhost ~]# sysctl -p
Verify that it has taken effect
[root@localhost ~]# sysctl -a | grep drop_caches
vm.drop_caches = 0
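The manual vi edit can also be done non-interactively, which helps when several hosts need the same change. The following is a minimal sketch, not part of the Oracle note: the helper name set_sysctl_conf is hypothetical, and it assumes a sysctl.conf-style file with one key=value assignment per line. Whether the kernel accepts the value 0 for vm.drop_caches depends on the kernel version, so check the result with sysctl as shown in the transcript.

```shell
#!/bin/sh
# set_sysctl_conf KEY VALUE [FILE]  (hypothetical helper)
# Ensures FILE ends up with exactly one uncommented "KEY=VALUE" line.
set_sysctl_conf() {
    key=$1
    value=$2
    conf=${3:-/etc/sysctl.conf}

    # Escape dots so the key is matched literally by grep.
    pattern="^[[:space:]]*$(printf '%s' "$key" | sed 's/\./\\./g')[[:space:]]*="

    tmp=$(mktemp) || return 1
    # Drop any existing uncommented assignment of the key, then append the new one.
    grep -v "$pattern" "$conf" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

For example, running set_sysctl_conf vm.drop_caches 0 as root and then sysctl -p corresponds to the manual steps in the transcript. Running it twice leaves a single entry in the file.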
Or
2. Upgrade the Linux kernel to version 2.6.18-194.0.0.0.4.EL5
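To see whether a host already runs a kernel at or above the version named in option 2, the running version can be compared against it. This sketch assumes GNU sort with the -V (version sort) option. The helper name kernel_at_least is made up for illustration, and version sort only approximates how RPM orders package versions, so treat a borderline result with caution.

```shell
#!/bin/sh
# kernel_at_least CURRENT REQUIRED  (hypothetical helper)
# Succeeds when CURRENT sorts at or after REQUIRED under GNU "sort -V".
kernel_at_least() {
    current=$1
    required=$2
    lowest=$(printf '%s\n%s\n' "$current" "$required" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

FIXED="2.6.18-194.0.0.0.4.EL5"   # version named in the text above
if kernel_at_least "$(uname -r)" "$FIXED"; then
    echo "kernel $(uname -r) is at or above $FIXED"
else
    echo "kernel $(uname -r) is older than $FIXED"
fi
```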
The official document is attached below:
ORA-600 [KGHLKREM1] On Linux Using Parameter drop_cache On hugepages Configuration [ID 1070812.1]
Oracle Server - Enterprise Edition - Version: 10.2.0.1 and later [Release: 10.2 and later ]
Generic Linux
You are running an Oracle Database, single-instance or RAC. You have the SGA backed by hugepages.
You are getting the error
and the SGA heap Dump of memory around the offending addr (in this particular example: 0x6bc00020)
it's showing zeroed out :
1. On your system you are running with vm.drop_caches=1 (or 3), drop_cache have been set to a value greater than zero , or you are executing
2. You have setup the Hugepages
This is a Linux Kernel issue.
Using the Linux kernel "drop_cache" parameter while hugepages are configured can cause memory corruption.
Per internal Bug 9461825, executing vm.drop_caches corrupts Oracle Database SGA hugepages;
it is fixed in Linux Kernel version 2.6.18-194.0.0.0.4.EL5
1. As a workaround when hugepages are set avoid any vm.drop_cache settings.
OR
2. Upgrade to Linux Kernel version 2.6.18-194.0.0.0.4.EL5
----------end-----------