[ClusterLabs] Antw: Re: setting up SBD_WATCHDOG_TIMEOUT, stonith-timeout and stonith-watchdog-timeout
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Fri Dec 9 02:11:30 EST 2016
>>> emmanuel segura <emi2fast at gmail.com> wrote on 08.12.2016 at 14:37 in
message
<CAE7pJ3CSQyxQvBqLFvsFU=NLp95JWQdZvAP_6cLyBo5rSdCRng at mail.gmail.com>:
> the only thing that I can say is: sbd is a realtime process
Hi!
You are saying it's scheduled with policy SCHED_RR and priority 0? A realtime process is more than its scheduling policy IMHO.
What are you really trying to say?
Regards,
Ulrich
>
> 2016-12-08 11:47 GMT+01:00 Jehan-Guillaume de Rorthais <jgdr at dalibo.com>:
>> Hello,
>>
>> While setting these various parameters, I couldn't find documentation or
>> details about them. Below are some questions.
>>
>> Considering the watchdog module used on a server is set up with a 30s timer
>> (let's call it the wdt, the "watchdog timer"), how should
>> "SBD_WATCHDOG_TIMEOUT", "stonith-timeout" and "stonith-watchdog-timeout" be
>> set?
>>
>> Here is my thinking so far:
>>
>> "SBD_WATCHDOG_TIMEOUT < wdt". The sbd daemon should reset the timer before
>> the wdt expires so the server stays alive. Online resources and default values
>> are usually "SBD_WATCHDOG_TIMEOUT=5s" and "wdt=30s". But what if sbd fails to
>> reset the timer multiple times (e.g. because of excessive load, a swap storm,
>> etc.)? The server will not reset before random*SBD_WATCHDOG_TIMEOUT or wdt,
>> right?
>>
>> "stonith-watchdog-timeout > SBD_WATCHDOG_TIMEOUT". I'm not quite sure what
>> stonith-watchdog-timeout is. Is it the maximum time stonithd waits, after it
>> has asked for a node to be fenced, before it considers that the watchdog was
>> actually triggered and the node reset, even with no confirmation? I suppose
>> "stonith-watchdog-timeout" is mostly useful to stonithd, right?
>>
>> "stonith-watchdog-timeout < stonith-timeout". I understand the stonith action
>> timeout should be at least greater than the wdt so stonithd will not raise a
>> timeout before the wdt has had a chance to expire and reset the node. Is that
>> right?
>>
>> Any other comments?
>>
>> Regards,
>> --
>> Jehan-Guillaume de Rorthais
>> Dalibo
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://lists.clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>
>
>
> --
> .~.
> /V\
> // \\
> /( )\
> ^`~'^