[ClusterLabs] Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
Klaus Wenninger
kwenning at redhat.com
Thu Oct 6 16:03:23 UTC 2016
On 10/05/2016 04:22 PM, renayama19661014 at ybb.ne.jp wrote:
> Hi All,
>
>>> If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
>>
>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd
>> will reboot the node (unless the watchdog fails).
>
> Thank you for the comment.
>
> We are also examining a watchdog for crmd.
> I will comment again once the examination has progressed.
I was thinking of doing a small test implementation, going
a little in the direction Lars Ellenberg had been pointing out.

A couple of thoughts I have had so far:

- add an API (via DBus or libqb - favoring libqb atm) to sbd
  that an application can use to create a watchdog within sbd
- initial parameters would be a name and a timeout
- the first use case would be crmd observation
- later on we could think of removing the pacemaker dependencies
  from sbd by moving the actual implementation of
  pacemaker-watcher, and probably cluster-watcher as well,
  into pacemaker - using the new API
- this of course creates an sbd dependency within pacemaker, so
  it would make sense to offer a simpler, self-contained
  implementation within pacemaker as an alternative.
  Thus it would be favorable to have the dependency
  within a non-compulsory pacemaker rpm, so that
  we can offer an alternative that doesn't use sbd,
  at maybe the cost of being less reliable, or one
  that owns a hardware watchdog by itself for systems
  where this is still unused
  - e.g. via some kind of plugin (Andrew forgive me -
    no pils ;-) )
  - or via an additional daemon
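
To make the first three points a bit more concrete, here is a minimal
sketch of what a named watchdog with a timeout might look like. It is
written in Python for brevity rather than in C on top of libqb, and it
is not existing sbd code: WatchdogRegistry, create(), feed(), expired()
and may_feed_hardware_watchdog() are hypothetical names chosen for
illustration. The assumed idea is that sbd would only keep feeding the
hardware watchdog while every registered client has checked in within
its own timeout, so a SIGSTOPped crmd would eventually cause a reboot.

```python
import time
from typing import Callable, Dict, List, Tuple


class FakeClock:
    """Manually advanced clock so the sketch can be exercised without sleeping."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class WatchdogRegistry:
    """Hypothetical stand-in for the proposed API; not part of sbd."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # name -> (timeout in seconds, time of the last feed)
        self._dogs: Dict[str, Tuple[float, float]] = {}

    def create(self, name: str, timeout: float) -> None:
        """Register a watchdog; creation counts as the first feed."""
        if timeout <= 0:
            raise ValueError("timeout must be positive")
        self._dogs[name] = (timeout, self._clock())

    def feed(self, name: str) -> None:
        """Called periodically by the observed application (e.g. crmd)."""
        timeout, _ = self._dogs[name]
        self._dogs[name] = (timeout, self._clock())

    def expired(self) -> List[str]:
        """Names of all watchdogs that were not fed within their timeout."""
        now = self._clock()
        return sorted(
            name
            for name, (timeout, last_fed) in self._dogs.items()
            if now - last_fed > timeout
        )

    def may_feed_hardware_watchdog(self) -> bool:
        """Assumed policy: feed the hardware watchdog only if nothing expired."""
        return not self.expired()


if __name__ == "__main__":
    clock = FakeClock()
    registry = WatchdogRegistry(clock)
    registry.create("crmd", timeout=10.0)
    clock.advance(11.0)  # simulate a frozen crmd that never calls feed()
    print(registry.expired(), registry.may_feed_hardware_watchdog())
```

In a real implementation the feed() call would arrive over IPC from the
main loop of the observed daemon, and the expiry check would run inside
the existing sbd servant/inquisitor cycle; both are left out here.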
What did you have in mind?
Maybe it makes sense to synchronize...
Regards,
Klaus
>
>
> Best Regards,
> Hideo Yamauchi.
>
>
>
> ----- Original Message -----
>> From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
>> To: users at clusterlabs.org; renayama19661014 at ybb.ne.jp
>> Cc:
>> Date: 2016/10/5, Wed 23:08
>> Subject: Antw: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
>>
>>>>> <renayama19661014 at ybb.ne.jp> wrote on 21.09.2016 at 11:52
>> in message
>> <876439.61305.qm at web200311.mail.ssk.yahoo.co.jp>:
>>> Hi All,
>>>
>>> Was the final conclusion given about this problem?
>>>
>>> If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd
>> will reboot the node (unless the watchdog fails).
>>
>>> We are interested in this problem, too.
>>>
>>> Best Regards,
>>>
>>> Hideo Yamauchi.
>>>
>>>
>>> _______________________________________________
>>> Users mailing list: Users at clusterlabs.org
>>> http://clusterlabs.org/mailman/listinfo/users
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org