[ClusterLabs] Trying to understand dampening (ping)

martin doc db1280 at hotmail.com
Sun Oct 17 15:25:25 EDT 2021


Some other notes... I really wish there was better documentation for the individual resources. On the ClusterLabs website, I cannot find a page that describes "ping" in any detail.

There have been some suggestions about using the same host more than once. I suspect that only really works if you disable fping (but I haven't tried).

The description for timeout is "how long, in seconds, to wait before declaring a ping lost". That kind of sounds like it means that each ping is allowed to take "<timeout> seconds", but in the fping case it really means "the total time to wait, in seconds, before declaring the ping monitor has failed." I suppose it depends on how you interpret "a ping": does it mean one instance of the ping command or one ICMP echo?

From the script, the timeout value allowed per ping is actually "timeout * 1000 / attempts", i.e. a value in milliseconds. That's for fping. If fping isn't used, it's "timeout" seconds per instance of ping being run.
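To make the arithmetic concrete, here is a minimal sketch of the two cases as I read them from the script. The function names are made up for illustration and are not part of the resource agent, and the integer division is an assumption based on the script being shell.

```python
def fping_per_ping_timeout_ms(timeout_s: int, attempts: int) -> int:
    """Per-ping timeout in milliseconds when fping is used.

    Mirrors the expression "timeout * 1000 / attempts" quoted above.
    Integer division is assumed, since the agent is a shell script.
    """
    if attempts <= 0:
        raise ValueError("attempts must be a positive integer")
    return timeout_s * 1000 // attempts


def ping_per_instance_timeout_s(timeout_s: int) -> int:
    """Timeout in seconds for each instance of ping when fping is not used."""
    return timeout_s


if __name__ == "__main__":
    timeout, attempts = 5, 5
    print("fping, per ping (ms):", fping_per_ping_timeout_ms(timeout, attempts))
    print("ping, per instance (s):", ping_per_instance_timeout_s(timeout))
```

With timeout=5 and attempts=5, each fping probe gets one fifth of the configured timeout, whereas each run of ping gets the full timeout.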

As an example, using timeout=5,attempts=5 with fping results in fping returning after a maximum of 6 seconds, whereas with ping, it can take 9-10 seconds to return. To get equivalent behaviour to ping with fping, there should be a "-i 1000" added to its command line. This behaviour difference is very significant because a disruption to the network for 1 second can make fping report a failure when ping wouldn't. Unless you dig into the source code and can comprehend the differences, you have no basis for choosing one over the other.

The ping resource is very important and needs much better documentation, and perhaps should be more than one resource ... if only there wasn't the problem of backwards compatibility.

________________________________
From: Users <users-bounces at clusterlabs.org> on behalf of martin doc <db1280 at hotmail.com>
Sent: Monday, 18 October 2021 5:35 AM
To: users at clusterlabs.org <users at clusterlabs.org>
Subject: Re: [ClusterLabs] Trying to understand dampening (ping)


The use case is to detect if the network path to the default gateway has failed in one of 3 hosts. The use of "ping" covers cable failure, SFP failure, or some other sort of failure that is local to a single host.

In none of the reading I did on the web was there ever a sentence that said "dampen is not active if failure_score is not 0."

Given the incompatibility between the two attributes, should both coexist on the same resource?
