[ClusterLabs] Antw: Re: Antw: Re: Antw: Re: OCFS2 on cLVM with node waiting for fencing timeout

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Fri Oct 14 03:17:11 EDT 2016


>>> Ken Gaillot <kgaillot at redhat.com> wrote on 13.10.2016 at 16:49 in message
<97fdafc7-7efe-41d8-99fa-20abb20506f6 at redhat.com>:
> On 10/13/2016 03:36 AM, Ulrich Windl wrote:
>> That's what I'm talking about: If 1 of 3 nodes is rebooting (or the cluster
>> is split-brain 1:2), the single node CANNOT continue due to lack of quorum,
>> while the remaining two nodes can. Is it still necessary to wait for
>> completion of stonith?
> 
> If the 2 nodes have working communication with the 1 node, then the 1
> node will leave the cluster in an orderly way, and fencing will not be
> involved. In that case, yes, quorum is used to prevent the 1 node from
> starting services until it rejoins the cluster.

The $%&/@ problem was that a root process had a file open on OCFS2, which prevented the clean unmount of the filesystem. I think newer versions of the RA now kill even root processes.
Can anybody explain why root processes were excluded before?
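
For anyone who wants to check what is keeping such a filesystem busy before the unmount is attempted, here is a rough sketch in Python. It only walks the fd symlinks under /proc on Linux, so it will miss processes that merely have their working directory or a memory mapping on the mount, and it must run as root to see other users' processes. The function name and the mount point in the usage note are made up for illustration; this is not how the RA itself does it.

```python
import os


def pids_holding(mount_point, proc_root="/proc"):
    """Return {pid: [paths]} for processes with open files under mount_point."""
    # Trailing separator so that /mnt/ocfs2 does not also match /mnt/ocfs2-old.
    prefix = os.path.join(os.path.abspath(mount_point), "")
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited meanwhile, or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed meanwhile
            if target.startswith(prefix):
                holders.setdefault(int(entry), []).append(target)
    return holders
```

Calling pids_holding("/mnt/ocfs2") (a hypothetical mount point) returns a dictionary keyed by PID; root-owned processes show up like any others, which is the case that bit me.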

> 
> However, if the 2 nodes lose communication with the 1 node, they cannot
> be sure it is functioning well enough to respect quorum. In this case,
> they have to fence it. DLM has to wait for the fencing to succeed to be
> sure the 1 node is not messing with shared resources.
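
To make sure I read the two cases correctly, here is a minimal sketch of the rule in Python. The function names are invented for illustration and are not part of Pacemaker, corosync or DLM; it assumes one vote per node and a strict-majority quorum.

```python
def has_quorum(partition_size: int, total_votes: int) -> bool:
    """Strict majority: more than half of all configured votes."""
    return 2 * partition_size > total_votes


def action_for_majority(peer_left_cleanly: bool) -> str:
    """What the quorate side does about a node that is no longer a member."""
    # A clean leave needs no fencing. A node that simply vanished must be
    # fenced first, because nobody can verify that it respects loss of quorum.
    return "continue" if peer_left_cleanly else "fence, then continue"


total = 3
for size in (1, 2):
    print(f"{size} of {total} nodes: quorum={has_quorum(size, total)}")
print("lost contact:", action_for_majority(peer_left_cleanly=False))
```

So in the 1:2 split the two nodes have quorum and the single node does not, but quorum alone only tells each side what it is allowed to do; it says nothing about whether the other side is actually behaving, hence the wait for fencing.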