[ClusterLabs] cluster-recheck-interval and failure-timeout

Wed Mar 31 09:48:15 EDT 2021

On Wed, 2021-03-31 at 14:32 +0200, Antony Stone wrote:
> Hi.
> 
> I'm trying to understand what looks to me like incorrect behaviour
> between cluster-recheck-interval and failure-timeout, under pacemaker
> 2.0.1.
> 
> I have three machines in a corosync (3.0.1 if it matters) cluster,
> managing 12 resources in a single group.
> 
> I'm following documentation from:
> 
> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-cluster-options.html
> 
> and
> 
> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-resource-options.html
> 
> I have set a cluster property:
> 
> 	cluster-recheck-interval=60s
> 
> I have set a resource property:
> 
> 	failure-timeout=180
> 
> The docs say failure-timeout is "How many seconds to wait before
> acting as if the failure had not occurred, and potentially allowing
> the resource back to the node on which it failed."
> 
> I think this should mean that if the resource fails and gets
> restarted, the fact that it failed will be "forgotten" after 180
> seconds (or maybe a little longer, depending on exactly when the next
> cluster recheck is done).
> 
> However what I'm seeing is that if the resource fails and gets
> restarted, and this then happens an hour later, it's still counted as
> two failures.  If it

That is exactly correct.

> fails and gets restarted another hour after that, it's recorded as
> three failures and (because I have "migration-threshold=3") it gets
> moved to another node (and therefore all the other resources in the
> group are moved as well).
> 
> So, what am I misunderstanding about "failure-timeout", and what
> configuration setting do I need to use to tell pacemaker that
> "provided the resource hasn't failed within the past X seconds, forget
> the fact that it failed more than X seconds ago"?

Unfortunately, there is no way to do that. The failure-timeout option
expires *all* of a resource's failures at once, and only when the *most
recent* one is that old. It's a bit counter-intuitive, but Pacemaker
currently remembers only a resource's most recent failure and its total
failure count, and changing that would be a big project.
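
To make the difference concrete, here is a rough Python sketch of the two
behaviours. It is only an illustration of the description above, not
Pacemaker code: the function names are made up, and it assumes expiry is
evaluated instantly (it ignores cluster-recheck-interval entirely).

```python
def pacemaker_style_count(failure_times, failure_timeout):
    """Fail count when only (count, most recent failure time) is remembered.

    Hypothetical model: all failures expire together, and only once the
    most recent one is at least failure_timeout seconds old.
    """
    count = 0
    last = None
    for t in sorted(failure_times):
        # If the previous (most recent) failure had aged out before this
        # one arrived, the whole count was cleared.
        if last is not None and t - last >= failure_timeout:
            count = 0
        count += 1
        last = t
    return count


def sliding_window_count(failure_times, failure_timeout):
    """Fail count if each failure expired individually (the behaviour asked for)."""
    times = sorted(failure_times)
    if not times:
        return 0
    now = times[-1]
    # Count only failures newer than failure_timeout at the last failure.
    return sum(1 for t in times if now - t < failure_timeout)


if __name__ == "__main__":
    failures = [0, 100, 200]  # seconds; each gap is shorter than the timeout
    print(pacemaker_style_count(failures, 180))
    print(sliding_window_count(failures, 180))
```

With failures at 0, 100 and 200 seconds and failure-timeout=180, the first
model never sees a 180-second quiet period, so nothing expires and
migration-threshold=3 would be reached, while a per-failure window would
already have dropped the failure at 0 seconds.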

> Thanks,
> 
> 
> Antony.
> 
-- 
Ken Gaillot <kgaillot at redhat.com>