[ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

Shermal Fernando shermalfe at millenniumit.com
Thu Sep 8 04:58:15 EDT 2016

Hi Jehan-Guillaume,

Does this mean the watchdog will self-terminate the machine when the crm daemon is frozen?

Shermal Fernando

-----Original Message-----
From: Jehan-Guillaume de Rorthais [mailto:jgdr at dalibo.com] 
Sent: Thursday, September 08, 2016 12:52 PM
To: Digimer
Cc: Cluster Labs - All topics related to open-source clustering welcomed
Subject: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

On Thu, 8 Sep 2016 15:55:50 +0900
Digimer <lists at alteeve.ca> wrote:

> On 08/09/16 03:47 PM, Ulrich Windl wrote:
> >>>> Shermal Fernando <shermalfe at millenniumit.com> schrieb am 
> >>>> 08.09.2016 um
> >>>> 06:41 in
> > Nachricht
> > <8CE6E8D87F896546B9C65ED80D30A4336578CB4A at LG-SPMB-MBX02.lseg.stockex.local>:
> >> The whole cluster will fail if the DC (crm daemon) is frozen due to 
> >> CPU starvation or hanging while trying to perform an IO operation.
> >> Please share some thoughts on this issue.
> > 
> > What is "the whole cluster will fail"? If the DC times out, some 
> > recovery will take place.
> Yup. The starved node should be declared lost by corosync, the 
> remaining nodes reform and if they're still quorate, the hung node 
> should be fenced. Recovery occurs and life goes on.


And fencing might either come from outside, or just from the server itself using watchdog.
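
To make the sequence above concrete, here is a rough sketch of the decision the surviving nodes face once the starved node is declared lost. It is only a toy model under a simple-majority assumption, not corosync or Pacemaker code; the function names and node names are hypothetical.

```python
# Illustrative only: a toy model of the recovery sequence described above,
# not corosync or Pacemaker code. All names here are hypothetical.

def is_quorate(total_nodes: int, reachable_nodes: int) -> bool:
    """Simple majority: strictly more than half of all configured nodes."""
    return reachable_nodes > total_nodes // 2


def plan_recovery(all_nodes: set[str], lost_nodes: set[str]) -> list[str]:
    """Return the actions the surviving partition would take, in order."""
    survivors = all_nodes - lost_nodes
    if not is_quorate(len(all_nodes), len(survivors)):
        # In this toy model a non-quorate partition does nothing but wait.
        return ["wait: partition is not quorate"]
    # Fence every lost node first, then recover on the survivors.
    actions = [f"fence {node}" for node in sorted(lost_nodes)]
    actions.append("recover resources on " + ", ".join(sorted(survivors)))
    return actions


if __name__ == "__main__":
    for step in plan_recovery({"node1", "node2", "node3"}, {"node1"}):
        print(step)
```

With three nodes and one lost, the two survivors hold a majority, so the lost node is fenced before recovery starts. With two nodes and one lost, this simple model has no majority and only waits.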

Jehan-Guillaume (ioguix) de Rorthais

Users mailing list: Users at clusterlabs.org http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
