[ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
Jehan-Guillaume de Rorthais
jgdr at dalibo.com
Thu Sep 8 07:22:14 UTC 2016
On Thu, 8 Sep 2016 15:55:50 +0900
Digimer <lists at alteeve.ca> wrote:
> On 08/09/16 03:47 PM, Ulrich Windl wrote:
> >>>> Shermal Fernando <shermalfe at millenniumit.com> wrote on 08.09.2016 at
> >>>> 06:41 in
> > message
> > <8CE6E8D87F896546B9C65ED80D30A4336578CB4A at LG-SPMB-MBX02.lseg.stockex.local>:
> >> The whole cluster will fail if the DC (crm daemon) is frozen due to CPU
> >> starvation or hanging while trying to perform an I/O operation.
> >> Please share some thoughts on this issue.
> >
> > What is "the whole cluster will fail"? If the DC times out, some recovery
> > will take place.
>
> Yup. The starved node should be declared lost by corosync, the remaining
> nodes reform and if they're still quorate, the hung node should be
> fenced. Recovery occurs and life goes on.
+1
And fencing can come either from outside or from the server itself,
using a watchdog.
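
To make the sequence above concrete, here is a rough sketch of the decision the
surviving nodes face once the hung node is declared lost. It is only an
illustration under a plain "strict majority" assumption; the function names
(is_quorate, recovery_action) are made up for this mail and are not corosync or
Pacemaker APIs, and special cases such as two-node clusters are ignored.

```python
def is_quorate(live_votes: int, total_votes: int) -> bool:
    """Strict majority rule (assumption): more than half of all expected votes."""
    return live_votes > total_votes // 2


def recovery_action(total_nodes: int, lost_nodes: int) -> str:
    """Describe what the surviving partition may do (one vote per node assumed)."""
    survivors = total_nodes - lost_nodes
    if is_quorate(survivors, total_nodes):
        # The survivors still hold a majority: fence the lost node, then recover.
        return "fence lost node(s), then recover resources"
    # No majority: the survivors must not take over on their own.
    return "no quorum: do not take over"


if __name__ == "__main__":
    # Three-node cluster where the DC's node hangs and is declared lost.
    print(recovery_action(total_nodes=3, lost_nodes=1))
```

With three nodes and one lost, the two survivors keep the majority, so the hung
node gets fenced and recovery proceeds.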
--
Jehan-Guillaume (ioguix) de Rorthais
Dalibo