[ClusterLabs] Antw: [EXT] Re: cluster-recheck-interval and failure-timeout

Tue Apr 6 03:15:46 EDT 2021

>>> Ken Gaillot <kgaillot at redhat.com> wrote on 31.03.2021 at 15:48 in message
<7dfc7c46442db17d9645854081f1269261518f84.camel at redhat.com>:
> On Wed, 2021‑03‑31 at 14:32 +0200, Antony Stone wrote:
>> Hi.
>> 
>> I'm trying to understand what looks to me like incorrect behaviour
>> between 
>> cluster‑recheck‑interval and failure‑timeout, under pacemaker 2.0.1
>> 
>> I have three machines in a corosync (3.0.1 if it matters) cluster,
>> managing 12 
>> resources in a single group.
>> 
>> I'm following documentation from:
>> 
>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-cluster-options.html
>> 
>> and
>> 
>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-resource-options.html
>> 
>> I have set a cluster property:
>> 
>> 	cluster-recheck-interval=60s
>> 
>> I have set a resource property:
>> 
>> 	failure-timeout=180
>> 
>> The docs say failure‑timeout is "How many seconds to wait before
>> acting as if 
>> the failure had not occurred, and potentially allowing the resource
>> back to 
>> the node on which it failed."
>> 
>> I think this should mean that if the resource fails and gets
>> restarted, the 
>> fact that it failed will be "forgotten" after 180 seconds (or maybe a
>> little 
>> longer, depending on exactly when the next cluster recheck is done).
>> 
>> However what I'm seeing is that if the resource fails and gets
>> restarted, and 
>> this then happens an hour later, it's still counted as two
>> failures.  If it 
> 
> That is exactly correct.
> 
>> fails and gets restarted another hour after that, it's recorded as
>> three 
>> failures and (because I have "migration‑threshold=3") it gets moved
>> to another 
>> node (and therefore all the other resources in group are moved as
>> well).
>> 
>> So, what am I misunderstanding about "failure‑timeout", and what
>> configuration 
>> setting do I need to use to tell pacemaker that "provided the
>> resource hasn't 
>> failed within the past X seconds, forget the fact that it failed more
>> than X 
>> seconds ago"?
> 
> Unfortunately, there is no way. failure‑timeout expires *all* failures
> once the *most recent* is that old. It's a bit counter‑intuitive but
> currently, Pacemaker only remembers a resource's most recent failure
> and the total count of failures, and changing that would be a big
> project.
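
To make the behaviour described above concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not Pacemaker's implementation: the class `LastFailureTracker` and its methods are hypothetical names, times are plain seconds, and the exact boundary comparison is a guess. It keeps only a total count and the time of the most recent failure, so the count can drop to zero only once the most recent failure is older than failure-timeout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LastFailureTracker:
    """Toy model: remembers only the fail count and the latest failure time."""
    failure_timeout: float               # seconds, like failure-timeout=180
    fail_count: int = 0
    last_failure: Optional[float] = None

    def expire(self, now: float) -> None:
        # All failures are forgotten together, and only when the most
        # recent one is at least failure_timeout seconds old.
        if (self.last_failure is not None
                and now - self.last_failure >= self.failure_timeout):
            self.fail_count = 0
            self.last_failure = None

    def record_failure(self, now: float) -> None:
        # Expire first, so a failure arriving after the timeout starts fresh.
        self.expire(now)
        self.fail_count += 1
        self.last_failure = now

    def count_at(self, now: float) -> int:
        self.expire(now)
        return self.fail_count


tracker = LastFailureTracker(failure_timeout=180)
for t in (0, 100, 200):                  # each failure is within 180 s of the previous
    tracker.record_failure(t)

count_early = tracker.count_at(250)      # the t=0 failure is 250 s old, yet still counted
count_late = tracker.count_at(381)       # latest failure is 181 s old: everything expires
print(count_early, count_late)
```

In this model a resource that keeps failing more often than once per failure-timeout never loses any of its accumulated count, however old the first failure is.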

Hi!

Sorry, I don't get it: if you have a timestamp for each failure, what's so hard
about putting all the failures that are older than failure-timeout on a list and
then removing those from the fail count?
I mean: that would be what everyone expects.
What is implemented instead is like FIFO scheduling: as long as there is a new
entry at the head of the queue, the jobs at the tail will never be executed.
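
The per-failure expiry proposed here could be sketched roughly as follows in Python. This is only an illustration of the idea, not anything Pacemaker provides: `PerFailureTracker` is a hypothetical name, times are plain seconds, and the sketch assumes that a timestamp is stored for every failure, which according to the quoted reply Pacemaker currently does not do.

```python
from collections import deque


class PerFailureTracker:
    """Toy model: keeps one timestamp per failure and expires each on its own."""

    def __init__(self, failure_timeout: float) -> None:
        self.failure_timeout = failure_timeout
        self.failures = deque()          # failure times in seconds, oldest first

    def record_failure(self, now: float) -> None:
        # Assumes failures are recorded in non-decreasing time order.
        self.failures.append(now)

    def count_at(self, now: float) -> int:
        # Drop every failure that is at least failure_timeout seconds old.
        while self.failures and now - self.failures[0] >= self.failure_timeout:
            self.failures.popleft()
        return len(self.failures)


window = PerFailureTracker(failure_timeout=180)
for t in (0, 100, 200):
    window.record_failure(t)

remaining_250 = window.count_at(250)     # t=0 has aged out; t=100 and t=200 remain
remaining_300 = window.count_at(300)     # t=100 is now 200 s old; only t=200 remains
print(remaining_250, remaining_300)
```

The cost of this approach is the extra state: one stored timestamp per failure instead of a single counter plus the time of the last failure.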

Regards,
Ulrich

> 
> 
>> Thanks,
>> 
>> 
>> Antony.
>> 
> ‑‑ 
> Ken Gaillot <kgaillot at redhat.com>