[ClusterLabs] setting up SBD_WATCHDOG_TIMEOUT, stonith-timeout and stonith-watchdog-timeout

Sat Dec 17 23:55:45 CET 2016

On Wed, 14 Dec 2016 14:52:41 +0100
Klaus Wenninger <kwenning at redhat.com> wrote:

> On 12/14/2016 01:26 PM, Jehan-Guillaume de Rorthais wrote:
> > On Thu, 8 Dec 2016 11:47:20 +0100
> > Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote:
> >  
> >> Hello,
> >>
> >> While setting these various parameters, I couldn't find documentation or
> >> details about them. Below are some questions.
> >>
> >> Considering the watchdog module used on a server is set up with a 30s timer
> >> (let's call it the wdt, the "watchdog timer"), how should
> >> "SBD_WATCHDOG_TIMEOUT", "stonith-timeout" and "stonith-watchdog-timeout" be
> >> set?
> >>
> >> Here is my thinking so far:
> >>
> >> "SBD_WATCHDOG_TIMEOUT < wdt". The sbd daemon should reset the timer before
> >> the wdt expires so the server stays alive. Online resources and default
> >> values are usually "SBD_WATCHDOG_TIMEOUT=5s" and "wdt=30s". But what if
> >> sbd fails to reset the timer multiple times (eg. because of excessive
> >> load, swap storm etc)? The server will not reset before
> >> random*SBD_WATCHDOG_TIMEOUT or wdt, right?   
> 
> SBD_WATCHDOG_TIMEOUT (e.g. in /etc/sysconfig/sbd) is already the
> timeout the hardware watchdog is configured to by sbd-daemon.

Oh, OK, I did not realize sbd was actually setting the hardware watchdog
timeout itself based on this variable. After a quick search to make sure I
understand it correctly, I suppose it is done there?
https://github.com/ClusterLabs/sbd/blob/172dcd03eaf26503a10a18501aa1b9f30eed7ee2/src/sbd-common.c#L123

> sbd-daemon is triggering faster - timeout_loop defaults to 1s but
> is configurable.
> 
> SBD_WATCHDOG_TIMEOUT (and maybe the loop timeout as well
> but significantly shorter should be sufficient)
> has to be configured so that failing to trigger within time means
> a failure with high enough certainty or the machine showing
> comparable response-times would anyway violate timing requirements
> of the services running on itself and in the cluster.

OK. So I understand now why 5s is fine as a default value then.
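
To make the timing concrete, here is a minimal Python sketch (not sbd's actual
code) of a watchdog that is re-armed on every successful trigger. The function
name `first_expiry` and the simplified model are assumptions for illustration;
the 5s timeout and 1s loop are the defaults mentioned above.

```python
def first_expiry(trigger_times, watchdog_timeout, horizon):
    """Return the time at which the watchdog would fire, or None.

    Simplified model (an assumption, not sbd's implementation): the
    watchdog is armed at t=0 and every successful trigger pushes the
    deadline to `trigger time + watchdog_timeout`.
    """
    deadline = watchdog_timeout
    for t in sorted(trigger_times):
        if t >= deadline:
            # The trigger came too late: the node was already reset.
            return deadline
        deadline = t + watchdog_timeout
    # No more triggers: the watchdog fires if the deadline is in range.
    return deadline if deadline <= horizon else None


# sbd triggers every 1s for 10s, then stalls (e.g. swap storm).
print(first_expiry(range(1, 11), 5, 60))  # 15

# sbd keeps triggering every 1s for the whole minute: no reset.
print(first_expiry(range(1, 60), 5, 60))  # None
```

In this model, with a 1s loop and a 5s timeout, sbd can miss several
consecutive triggers before the node is reset, and the reset comes at most 5s
after the last successful trigger.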

> Have in mind that sbd-daemon defaults to running realtime-scheduled
> and thus is gonna be more responsive than the usual services
> on the system. Although you of course have to consider that
> the watchers (child-processes of sbd that are observing e.g.
> the block-device(s), corosync, pacemaker_remoted or
> pacemaker node-health) might be significantly less responsive
> due to their communication partners.

I'm not sure I clearly understand yet the mechanism and interactions of sbd
with other daemons. So far, I understood that Pacemaker/stonithd was able to
poke sbd to ask it to trigger a node reset through the wd device. I'm very new
to this area and I still lack documentation to teach myself.

> >> "stonith-watchdog-timeout > SBD_WATCHDOG_TIMEOUT". I'm not quite sure what
> >> is stonith-watchdog-timeout. Is it the maximum time to wait from stonithd
> >> after it asked for a node fencing before it considers the watchdog was
> >> actually triggered and the node reset, even with no confirmation? I
> >> suppose "stonith-watchdog-timeout" is mostly useful to stonithd, right?  
> 
> Yes, the time we can assume a node to be killed by the hardware-watchdog...
> Double the hardware-watchdog-timeout is a good choice.

OK, thank you
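
As a small worked example of that rule of thumb, the following Python sketch
derives a stonith-watchdog-timeout from SBD_WATCHDOG_TIMEOUT and checks a pair
of values. Both function names are hypothetical helpers, not part of sbd or
Pacemaker; values are in seconds.

```python
def suggested_stonith_watchdog_timeout(sbd_watchdog_timeout):
    """Rule of thumb from this thread: double the hardware watchdog timeout."""
    return 2 * sbd_watchdog_timeout


def check_timeouts(sbd_watchdog_timeout, stonith_watchdog_timeout):
    """Return a list of warnings for a pair of timeouts (empty if fine)."""
    warnings = []
    if stonith_watchdog_timeout <= sbd_watchdog_timeout:
        # The cluster would assume the node is dead before the
        # hardware watchdog had a chance to fire.
        warnings.append(
            "stonith-watchdog-timeout must be greater than SBD_WATCHDOG_TIMEOUT"
        )
    elif stonith_watchdog_timeout < suggested_stonith_watchdog_timeout(
        sbd_watchdog_timeout
    ):
        warnings.append(
            "stonith-watchdog-timeout is below twice SBD_WATCHDOG_TIMEOUT"
        )
    return warnings


print(suggested_stonith_watchdog_timeout(5))  # 10
print(check_timeouts(5, 10))                  # []
```

With the usual SBD_WATCHDOG_TIMEOUT=5, this gives a stonith-watchdog-timeout
of 10s.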

> >> "stonith-watchdog-timeout < stonith-timeout". I understand the stonith
> >> action timeout should be at least greater than the wdt so stonithd will
> >> not raise a timeout before the wdt had a chance to expire and reset the
> >> node. Is it right?  
> 
> stonith-timeout is the cluster-wide default to wait for stonith-devices
> to carry out their duty. In the sbd-case without a block-device (sbd used
> for pacemaker to be observed by a hardware-watchdog) it shouldn't
> play a role.

I thought self-fencing through sbd/wd was carried out by stonithd because of such
messages in my PoC log files:

  stonith-ng: notice: unpack_config: Relying on watchdog integration for fencing

That's why I thought "stonith-timeout" might have a role there, as it looks
like a stonith device then...

Out of pure technical interest, some more input or documentation to read about
how it works would be really appreciated.

> When a block-device is being used it guards the
> communication with the fence-agent communicating with the
> block-device.

OK

Thank you for your help!
-- 
Jehan-Guillaume de Rorthais
Dalibo