<div dir="ltr"><div>Hello Lars:<br></div><div>I'm still not entirely clear about this monitor interval setting. What I observed is that Pacemaker always schedules the failed resource quickly (in less than 2 seconds) when I cut off the network (via DROP INPUT) or freeze the kernel. <br>
</div><div>It also schedules the failed resource in no more than 5 seconds when I put the online node into standby state.<br></div><div class="gmail_extra">Am I making a wrong assumption?<br></div><div class="gmail_extra">
Thanks.<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Sep 13, 2013 at 2:45 PM, Lars Marowsky-Bree <span dir="ltr"><<a href="mailto:lmb@suse.com" target="_blank">lmb@suse.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 2013-09-13T12:20:54, Xiaomin Zhang <<a href="mailto:zhangxiaomin@gmail.com">zhangxiaomin@gmail.com</a>> wrote:<br>
<br>
> Hi, Gurus:<br>
> Here's a question about the service monitor interval: if this value is<br>
> configured as '15' seconds, does this mean corosync/pacemaker will take<br>
> an average of 15 seconds to schedule a failed resource on a ready node?<br>
<br>
</div></div>It'll take at most about 15 seconds to schedule a monitoring<br>
operation that can detect the error.<br>
<br>
If the monitor operation returns within <1s with a failure, that'll mean<br>
the recovery will begin real quick.<br>
<br>
If the monitor operation *doesn't* return but hits its "timeout" (and is<br>
aborted by the lrmd), then the recovery will be delayed by that much. So<br>
for an ``interval=15 timeout=30'', it could take up to 45s before<br>
recovery is scheduled.<br>
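<br>
The arithmetic above can be sketched as a small calculation. This is only an<br>
illustration, not Pacemaker code: the function name and the assumed 1s runtime<br>
for a monitor that fails fast are hypothetical, chosen to match the figures in<br>
this message.<br>

```python
def worst_case_detection_delay(interval_s, timeout_s, monitor_hangs,
                               fast_runtime_s=1.0):
    """Upper bound, in seconds, between a failure and recovery being scheduled.

    Hypothetical helper for illustration; fast_runtime_s is an assumed
    runtime for a monitor operation that returns promptly with a failure.
    """
    # Worst case, the failure happens right after a monitor run has passed,
    # so a full interval elapses before the next monitor is scheduled.
    wait_for_next_monitor = interval_s

    # A hung monitor is only aborted once its timeout expires; a monitor
    # that fails fast adds only its own (short) runtime.
    monitor_duration = timeout_s if monitor_hangs else fast_runtime_s

    return wait_for_next_monitor + monitor_duration


# interval=15 timeout=30, monitor hangs until it is aborted
print(worst_case_detection_delay(15, 30, monitor_hangs=True))
# interval=15 timeout=30, monitor returns a failure within about 1s
print(worst_case_detection_delay(15, 30, monitor_hangs=False))
```

These are upper bounds under those assumptions; a failure that occurs just<br>
before a monitor run is detected much sooner.<br>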
<br>
Note however this on timeouts:<br>
<a href="http://advogato.org/person/lmb/diary/108.html" target="_blank">http://advogato.org/person/lmb/diary/108.html</a> Just making them shorter<br>
isn't necessarily always beneficial, either.<br>
<br>
<br>
Best,<br>
Lars<br>
<br>
--<br>
Architect Storage/HA<br>
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)<br>
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde<br>
<br>
<br>
_______________________________________________<br>
Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>
<a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>
<br>
Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>
Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
Bugs: <a href="http://bugs.clusterlabs.org" target="_blank">http://bugs.clusterlabs.org</a><br>
</blockquote></div><br></div></div>