<div dir="ltr"><div>Hello Lars:<br></div><div>I'm still not entirely clear about this monitor interval setting. What I observed is that Pacemaker always schedules the failed resource quickly (in less than 2 seconds) when I cut off the network (via DROP INPUT, or by freezing the kernel). <br>

</div><div>It also schedules the failed resource within 5 seconds when I put the online node into standby state.<br></div><div class="gmail_extra">Am I making a wrong assumption?<br></div><div class="gmail_extra">

Thanks.<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Sep 13, 2013 at 2:45 PM, Lars Marowsky-Bree <span dir="ltr"><<a href="mailto:lmb@suse.com" target="_blank">lmb@suse.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 2013-09-13T12:20:54, Xiaomin Zhang <<a href="mailto:zhangxiaomin@gmail.com">zhangxiaomin@gmail.com</a>> wrote:<br>


<br>

> Hi, Gurus:<br>

> Here's a question about service Monitor Interval: considering this value is<br>

> configured as '15' seconds, does this mean corosync/pacemaker will take<br>

> average 15 seconds to schedule failed resource on a ready node?<br>

<br>

</div></div>It'll take at most about 15 seconds to schedule a monitoring<br>

operation that can detect the error.<br>

<br>

If the monitor operation returns within <1s with a failure, that'll mean<br>

the recovery will begin real quick.<br>

<br>

If the monitor operation *doesn't* return but hits its "timeout" (and is<br>

aborted by the lrmd), then the recovery will be delayed by that much. So<br>

for an ``interval=15 timeout=30'', it could take up to 45s before<br>

recovery is scheduled.<br>
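<br>
The arithmetic above can be sketched as follows. This is only a rough model, not Pacemaker code: the function name and parameters are made up for illustration, and it assumes the failure happens right after a monitor run, so a full interval passes before the next monitor starts.<br>

```python
def worst_case_detection_delay(interval_s, monitor_runtime_s, timeout_s):
    """Rough upper bound, in seconds, from a failure to its detection.

    Assumption (illustrative model only): the failure occurs just after a
    monitor run, so up to interval_s passes before the next monitor starts.
    That monitor then either returns on its own after monitor_runtime_s or
    is aborted once timeout_s has elapsed, whichever comes first.
    """
    return interval_s + min(monitor_runtime_s, timeout_s)


# Monitor fails fast (0.5s): recovery can begin soon after the interval.
print(worst_case_detection_delay(15, 0.5, 30))

# Monitor hangs and is aborted at its timeout: interval plus timeout.
print(worst_case_detection_delay(15, float("inf"), 30))
```

With interval=15 timeout=30, the second call gives the 45s figure mentioned above.<br>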

<br>

Note however this on timeouts:<br>

<a href="http://advogato.org/person/lmb/diary/108.html" target="_blank">http://advogato.org/person/lmb/diary/108.html</a> Just making them shorter<br>

isn't necessarily always beneficial, either.<br>

<br>

<br>

Best,<br>

    Lars<br>

<br>

--<br>

Architect Storage/HA<br>

SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)<br>

"Experience is the name everyone gives to their mistakes." -- Oscar Wilde<br>

<br>

<br>


</blockquote></div><br></div></div>