I resolved the problem. It turned out to be a bug in the ethmonitor agent.<div><br></div><div>In ethmonitor:</div><div><br></div><div><div># get the link status on $NIC</div><div># asks ip about running (up) interfaces, returns the number of matching interface names that are up</div>
<div>get_link_status () {</div><div>&nbsp;&nbsp;&nbsp;&nbsp;$IP2UTIL -o link show up dev "$NIC" | grep -c "$NIC"</div><div>}</div></div><div><br></div><div>The command "ip -o link show up dev eth0" only detects whether the interface is administratively down; it cannot detect that the link itself is down.</div>
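<div><br></div><div>As a rough sketch of the distinction described above (it is not part of ethmonitor or IPaddr2): in the flag list printed by "ip -o link show", UP only means the interface is administratively up, while LOWER_UP means the link has carrier. The helper name classify_link is hypothetical and only serves as an illustration.</div><div><br></div>

```shell
#!/bin/sh
# Classify one line of `ip -o link show dev NIC` output by its flag list
# (the third field, e.g. the list containing BROADCAST,MULTICAST,UP,LOWER_UP).
# Prints "link-up", "no-carrier" (administratively up, no link) or "admin-down".
# classify_link is a hypothetical helper, not a function of any resource agent.
classify_link() {
    # Take the flag field and turn every delimiter into a comma: ",FLAG1,FLAG2,"
    flags=$(printf '%s\n' "$1" | awk '{ print $3 }' | tr -c 'A-Z0-9_\n-' ',')
    case "$flags" in
        *,LOWER_UP,*) echo "link-up" ;;      # carrier present
        *,UP,*)       echo "no-carrier" ;;   # up, but e.g. cable unplugged
        *)            echo "admin-down" ;;   # interface not brought up
    esac
}

# Possible use on a real host (assumes the ip command is installed):
#   classify_link "$(ip -o link show dev eth0)"
```

<div>Only the first case corresponds to a usable link. The second case is the one that a check based on "link show up" alone still counts as up.</div>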
<div>So I guess the developer only tested with a command such as "ifdown eth0" or "ifdown bond0",</div><div>and did not consider the case where the cable is unplugged.</div><div><br></div><div>In the end, I decided to add the check to IPaddr2 and stop using the ethmonitor agent.</div>
<div><br></div><div>I changed the monitor function of the ocf:heartbeat:IPaddr2 agent:</div><div><br></div><div><div>ip_monitor() {</div><div>&nbsp;&nbsp;&nbsp;&nbsp;# TODO: Implement more elaborate monitoring like checking for</div><div>&nbsp;&nbsp;&nbsp;&nbsp;# interface health maybe via a daemon like FailSafe etc...</div><div><br></div><div>&nbsp;&nbsp;&nbsp;&nbsp;# Added: count "state UP" in the link output; any result other than 1 is reported as a failure</div><div>&nbsp;&nbsp;&nbsp;&nbsp;t=$(ip link show "$NIC" | grep -c "state UP")</div><div>&nbsp;&nbsp;&nbsp;&nbsp;test "$t" -ne 1 && return $OCF_ERR_PERM</div><div><br></div><div>&nbsp;&nbsp;&nbsp;&nbsp;# (remainder of ip_monitor not shown)</div></div><div><br></div><div>With this change, if either the NIC link or the interface goes down, the resource is moved to the other node.</div><div><br></div><div>
However, you also need to add a meta attribute to the ocf:heartbeat:IPaddr2 primitive, something like this:</div><div><br></div><div><div>node sles11264-node1</div><div>node sles11264-node2</div><div>primitive p_apache lsb:apache2 \</div><div> op monitor interval="15" timeout="30"</div>
<div>primitive p_vip ocf:heartbeat:IPaddr2 \</div><div> params ip="192.168.203.250" nic="eth0" iflabel="0" \</div><div> op monitor interval="10" timeout="20" \</div>
<div> meta failure-timeout="5"</div><div>group g_apache p_vip p_apache \</div><div> meta target-role="Started"</div><div>property $id="cib-bootstrap-options" \</div><div> dc-version="1.1.6-b988976485d15cb702c9307df55512d323831a5e" \</div>
<div> cluster-infrastructure="openais" \</div><div> expected-quorum-votes="2" \</div><div> stonith-enabled="no" \</div><div> no-quorum-policy="ignore" \</div>
<div> last-lrm-refresh="1340872994"</div></div><div><br></div><div>Be careful with the value of meta failure-timeout="5". If it is too small, the other node will not have enough time to take over, so work out how long a takeover takes and set a larger value.</div>
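<div><br></div><div>As a minimal sketch of raising the value on a running cluster, assuming the crm shell is available on the node: the meta attribute can be set on the p_vip resource directly. The value 60 is only a placeholder for illustration, not a recommendation; choose it from the takeover time you measure in your own tests.</div><div><br></div>

```shell
# Placeholder value: 60 seconds is an assumption for illustration only.
crm resource meta p_vip set failure-timeout 60

# Show the value now stored for the resource.
crm resource meta p_vip show failure-timeout
```

<div>Editing the primitive with "crm configure edit" and changing meta failure-timeout there should have the same effect.</div>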
<div><br></div><div>My English is not very good; I hope you can understand.</div><div><br></div><div>If you read Chinese, you can see my blog: <a href="http://linux.52zhe.info/read.php/275.htm">http://linux.52zhe.info/read.php/275.htm</a></div>
<div><br></div><div><br></div><div><br></div><div><br><br><div class="gmail_quote">On Fri, Jun 29, 2012 at 1:01 PM, kook <span dir="ltr"><<a href="mailto:kookliu@gmail.com" target="_blank">kookliu@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This is a test. I don't know how to reply to this thread.<div class="HOEnZb"><div class="h5"><br><br><div class="gmail_quote">
On Mon, Jun 25, 2012 at 4:00 PM, kook <span dir="ltr"><<a href="mailto:kookliu@gmail.com" target="_blank">kookliu@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><pre><font face="arial, helvetica, sans-serif">Dear Fiorenza:</font></pre><pre><font face="arial, helvetica, sans-serif"> I have the same problem as you. I checked the newest ethmonitor RA (ClusterLabs-resource-agents-v3.9.2-0-ge261943.tar). It is the same as the one in my SLES 11 SP2. </font></pre>
<pre><font face="arial, helvetica, sans-serif">Failed actions:</font></pre><pre><font face="arial, helvetica, sans-serif"> p_ethmonitor:1_monitor_15000 (node=sles11264-node1, call=1591, rc=-2, status=Timed Out): unknown exec error
</font></pre><div><font face="arial, helvetica, sans-serif"> So, can you tell me how you solved this problem? Thanks.</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">liujia</font></div>
<pre><br></pre><pre><br></pre><pre>On 21/03/2012 09:06, Florian Haas wrote:
><i> On Tue, Mar 20, 2012 at 4:18 PM, Fiorenza Meini<<a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">fmeini at esseweb.eu</a>> wrote:
</i>>><i> Hi there,
</i>>><i> has anybody configured successfully the RA specified in the object of the
</i>>><i> message?
</i>>><i>
</i>>><i> I got this error: if_eth0_monitor_0 (node=fw1, call=2297, rc=-2,
</i>>><i> status=Timed Out): unknown exec error
</i>><i>
</i>><i> Your ethmonitor RA missed its 50-second timeout on the probe (that is,
</i>><i> the initial monitor operation). You should be seeing "Monitoring of
</i>><i> if_eth0 failed, X retries left" warnings in your logs. Grepping your
</i>><i> syslog for "ethmonitor" will probably turn up some useful results.
</i>><i>
</i>><i> Cheers,
</i>><i> Florian
</i>><i>
</i>
Thank you, I solved the problem.
Regards
--
Fiorenza Meini
Spazio Web S.r.l.
V. Dante Alighieri, 10 - 13900 Biella
Tel.: 015.2431982 - 015.9526066
Fax: 015.2522600
Reg. Imprese, CF e P.I.: 02414430021
Iscr. REA: BI - 188936
Iscr. CCIAA: Biella - 188936
Cap. Soc.: 30.000,00 Euro i.v.
</pre><br>----------------------------<br>Side A or B<br>
</blockquote></div><br><br clear="all"><div><br></div></div></div><span class="HOEnZb"><font color="#888888">-- <br>----------------------------<br>I have a dream. Hehe....<br>
</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br>----------------------------<br>I have a dream. Hehe....<br>
</div>