The hangcheck-timer Module for I/O Fencing in Oracle 9i/10g/11gR1 on Linux (Part 2)

2014-11-24 18:47:27
80 hangcheck_reboot=1


2. 10g/11gR1: Assuming the default setting of "CSS misscount" is either 30 or 60 seconds:


hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1


-- 10g/11gR1: if "CSS misscount" is set to 30 or 60 seconds, use hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1



You must always ensure that the Cluster misscount setting is greater than the sum of the settings for hangcheck_tick + hangcheck_margin.


-- Note: the cluster misscount value must be set greater than the sum of hangcheck_tick + hangcheck_margin.
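
The rule above is simple arithmetic, so it can be checked before the module is loaded. The following is a minimal sketch only: misscount_ok is a hypothetical helper name, not part of Oracle Clusterware or the kernel module, and the three values are passed in by hand (on a running cluster the misscount value could presumably be read with crsctl get css misscount).

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not an Oracle tool):
# succeeds only when misscount > hangcheck_tick + hangcheck_margin.
misscount_ok() {
    tick=$1
    margin=$2
    misscount=$3
    [ "$misscount" -gt $((tick + margin)) ]
}

# The 10g/11gR1 values from this note, checked against a misscount of 30 seconds.
if misscount_ok 1 10 30; then
    echo "OK: misscount exceeds hangcheck_tick + hangcheck_margin"
else
    echo "WARNING: misscount is too small for these hangcheck settings"
fi
```

With tick=1 and margin=10 the sum is 11 seconds, so both default misscount values (30 and 60) satisfy the rule.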


When running Oracle Clusterware on Linux, hangcheck-timer should always be configured on each RAC cluster node, as the functionality of this module is required to provide I/O fencing to ensure no stray writes will occur from an evicted node in a RAC cluster. To verify that the hangcheck-timer module is running on a node, execute the following as the root or oracle user:


-- For Clusterware on Linux, the hangcheck-timer module must be configured on every node. Run the following command as root to verify that hangcheck-timer is running:


# /sbin/lsmod | grep hangcheck


hangcheck-timer 2672 0


If the hangcheck-timer module is loaded (running) you will see output similar to the above. When hangcheck-timer is not loaded, no output is generated and the command prompt is returned to the user.
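
Because the unloaded case produces no output at all, a script that checks several nodes may want an explicit answer either way. A minimal sketch follows; hangcheck_loaded is a hypothetical helper name, and it only assumes that the module line in the lsmod output begins with "hangcheck", as in the sample above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration): reads lsmod-style
# output on stdin and succeeds if a line starting with "hangcheck" is present.
hangcheck_loaded() {
    grep -q '^hangcheck'
}

# Report the state explicitly instead of relying on empty output.
if /sbin/lsmod 2>/dev/null | hangcheck_loaded; then
    echo "hangcheck-timer is loaded"
else
    echo "hangcheck-timer is NOT loaded"
fi
```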


In an Oracle Enterprise Linux, Red Hat 4/5, or SUSE 9/10 environment the hangcheck-timer module is loaded using the modprobe command:


-- Use the following command to load hangcheck-timer:


# modprobe hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1


To ensure the module is loaded at boot time, you should also place the same command in the appropriate local startup script (e.g. /etc/rc.d/rc.local or /etc/init.d/boot.local). In earlier releases, hangcheck-timer was loaded using insmod in place of modprobe. Consult your release-specific documentation to determine which initialization method is required.


-- To make sure the hangcheck-timer module is loaded at system startup, add the command to /etc/rc.d/rc.local or /etc/init.d/boot.local.
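
One way to add the command to the startup script without duplicating it on repeated runs is sketched below. This is an illustration under assumptions, not a procedure from the Oracle note: add_hangcheck_to_boot_script is a hypothetical helper name, and the target file is passed as an argument so the function can be tried on a scratch file first.

```shell
#!/bin/sh
# The modprobe line recommended for 10g/11gR1 in this note.
MODPROBE_LINE='modprobe hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1'

# Hypothetical helper (an assumption for illustration): append the modprobe
# line to the given boot script only if no such line is present yet.
add_hangcheck_to_boot_script() {
    boot_script=$1
    if grep -q 'modprobe hangcheck-timer' "$boot_script" 2>/dev/null; then
        echo "already present in $boot_script"
    else
        echo "$MODPROBE_LINE" >> "$boot_script"
        echo "added to $boot_script"
    fi
}

# Example, to be run as root:
#   add_hangcheck_to_boot_script /etc/rc.d/rc.local
```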


Hangcheck-timer will write messages to the system messages log when a failure is detected and a node restart is initiated by the module:


-- When hangcheck-timer detects a system hang, it writes an entry to the system log and reboots the system.


(1) When hangcheck-timer reboots the node, it may leave a "Hangcheck: hangcheck is restarting the machine" message in /var/log/messages.


-- hangcheck-timer's startup messages are all recorded in the system log, /var/log/messages; on a reboot it writes "Hangcheck: hangcheck is restarting the machine" to /var/log/messages.


(2) If you see the following message in /var/log/messages: "Hangcheck: hangcheck value past margin!" this means a reboot was required but was not performed, because hangcheck_reboot was not set to 1. If this message is seen, you must reload the hangcheck module as described earlier in this note, with the hangcheck_reboot value set to 1.


-- If you see the message "Hangcheck: hangcheck value past margin!" in /var/log/messages, the system needed a reboot but did not reboot, because the hangcheck_reboot parameter was not set to 1.
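
The two messages in (1) and (2) can be counted in a log file to see at a glance which case a node has hit. The sketch below matches only the message text quoted in this note; hangcheck_log_summary is a hypothetical helper name, and the log path is an argument because the location of the system log may differ between distributions.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration): count the two
# hangcheck messages quoted in this note in the given log file.
hangcheck_log_summary() {
    log_file=$1
    # grep -c prints 0 when nothing matches, so both counters are always set.
    restarts=$(grep -c 'hangcheck is restarting the machine' "$log_file")
    past_margin=$(grep -c 'hangcheck value past margin' "$log_file")
    echo "restarts=$restarts past_margin=$past_margin"
}

# Example, to be run as a user that can read the system log.
if [ -r /var/log/messages ]; then
    hangcheck_log_summary /var/log/messages
fi
```

A non-zero past_margin count is the case described in (2): the module saw a hang but did not reboot the node, so it should be reloaded with hangcheck_reboot=1.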


Note:


Bug:6125546, which can prevent hangcheck-timer from rebooting in RHEL4 (fixed in 2.6.9.56 or RHEL4.6)