[ClusterLabs] Two node cluster and extended distance/site failure

Klaus Wenninger kwenning at redhat.com
Mon Jun 29 04:17:32 EDT 2020


On 6/29/20 10:12 AM, Klaus Wenninger wrote:
> On 6/29/20 9:56 AM, Klaus Wenninger wrote:
>> On 6/24/20 8:09 AM, Andrei Borzenkov wrote:
>>> Two-node clusters are what I almost exclusively deal with. They work
>>> reasonably well in a single location, where failures to perform fencing
>>> are rare and can be mitigated by two different fencing methods. Usually
>>> SBD is reliable enough, as failure of the shared storage also implies
>>> failure of the whole cluster.
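
For readers less familiar with that setup, a two-node cluster with disk-based
SBD as described above is typically configured along these lines (the device
path is only a placeholder):

    # /etc/corosync/corosync.conf (quorum section)
    quorum {
        provider: corosync_votequorum
        two_node: 1        # automatically enables wait_for_all
    }

    # /etc/sysconfig/sbd
    SBD_DEVICE="/dev/disk/by-id/example-shared-lun"   # placeholder shared LUN
    SBD_WATCHDOG_DEV="/dev/watchdog"
    SBD_PACEMAKER="yes"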
>>>
>>> When the two nodes are located on separate sites (not necessarily
>>> Asia/America, two buildings across the street is already enough) we have
>>> the issue of complete site isolation, where normal fencing becomes
>>> impossible along with the missing node (power outage, network outage, etc.).
>>>
>>> The usual recommendation is a third site which functions as a witness. This
>>> works fine up to the failure of that third site itself. Unavailability of
>>> the witness makes normal maintenance of either of the two nodes impossible.
>>> If the witness is not available and (pacemaker on) one of the two nodes
>>> needs to be restarted, the remaining node goes out of quorum or commits
>>> suicide. At most we can statically designate one node as a tiebreaker (and
>>> that is already incompatible with qdevice).
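
For context, the two witness/tiebreaker variants mentioned above would be
configured roughly as follows in corosync.conf (the qnetd hostname is a
placeholder; corosync documents the quorum device as incompatible with
auto_tie_breaker, which is the incompatibility referred to above):

    # Variant A: third-site witness via corosync-qdevice/qnetd
    quorum {
        provider: corosync_votequorum
        device {
            model: net
            votes: 1
            net {
                host: qnetd.example.com   # placeholder witness host
                algorithm: ffsplit
            }
        }
    }

    # Variant B: statically preferred tiebreaker node (no qdevice possible)
    quorum {
        provider: corosync_votequorum
        auto_tie_breaker: 1
        auto_tie_breaker_node: lowest     # or an explicit nodeid
    }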
>>>
>>> I think I can finally formulate what I am missing. The behavior that I
>>> would really want is:
>>>
>>> - if (pacemaker on) one node performs a normal shutdown, the remaining node
>>> continues managing services, independently of witness state or
>>> availability. Usually this is achieved either by two_node or by
>>> no-quorum-policy=ignore (see the configuration sketch further below), but
>>> that absolutely requires successful fencing, so it cannot be used alone.
>>> Such a feature likely mandates WFA, but that is probably unavoidable.
>>>
>>> - if the other node is lost unexpectedly, first try normal fencing between
>>> the two nodes, independently of witness state or availability. If fencing
>>> succeeds, we can continue managing services.
>>>
>>> - if normal fencing fails (due to the other site being isolated), consult
>>> the witness and follow the normal procedure: if the witness is not
>>> available or does not grant us quorum, suicide/go out of quorum; if the
>>> witness is available and grants us quorum, continue managing services.
>>>
>>> Any potential issues with this? If it is possible to implement this with
>>> current tools, I have not found a way.
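
For reference, the mechanisms named in the first item above are existing
corosync/pacemaker options; a minimal sketch (values are only examples):

    # corosync.conf: let the surviving node keep quorum in a 2-node cluster
    quorum {
        provider: corosync_votequorum
        two_node: 1        # implies wait_for_all (WFA)
    }

    # pacemaker alternative: keep managing services even without quorum
    pcs property set no-quorum-policy=ignore

As the post says, neither option alone gives the three-step behavior asked
for, because both still depend on successful fencing.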
>> I see the idea, but I also see a couple of issues:
> My mailer was confused by all these combinations of
> "Antw: Re: Antw:" and didn't compose the mails into a
> thread properly. That is why I missed the further
> discussion, which was definitely still about
> shared-storage and not watchdog fencing.
> I had guessed - from the initial post - that there was
> a shift in the direction of qdevice.
> But maybe the thoughts below are still interesting in
> that light ...
And what I had said about the timing for quorum-loss is of
course just as true for the loss of access to a shared disk,
which is why your "ask the witness as a last resort" step is critical.
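
The timing in question comes from the SBD timeouts written to the shared
disk; a minimal sketch of setting and inspecting them (device path and values
are placeholders, msgwait is commonly about twice the watchdog timeout):

    # initialize the shared SBD device with explicit timeouts
    sbd -d /dev/disk/by-id/example-shared-lun -1 10 -4 20 create
    #   -1 = watchdog timeout (s), -4 = msgwait (s)

    # show the timeouts currently stored on the device
    sbd -d /dev/disk/by-id/example-shared-lun dump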
>>
>> - watchdog-fencing is timing-critical. So when losing quorum
>>   we have to self-fence after a defined time, which means just
>>   trying normal fencing first and then falling back to
>>   watchdog-fencing is not an option. What could be considered
>>   is starting watchdog-fencing right away upon quorum-loss -
>>   remember I said defined, which doesn't necessarily mean
>>   short - while trying other means of fencing in parallel, and
>>   if that succeeds e.g. somehow regaining quorum (additional
>>   voting, or something one would have to think over a little more).
>>
>> - usually we are using quorum to prevent a fence race, which
>>   this approach jeopardizes. Of course we can introduce an
>>   additional wait before normal fencing on the node that
>>   doesn't have quorum to mitigate that effect (the relevant
>>   knobs are sketched after this message).
>>
>> - why I think the current configuration possibilities won't give
>>   you your desired behavior is that we would finally end up with
>>   2 quorum sources.
>>   The only case where I'm aware of a similar thing is
>>   2-node + shared disk, where sbd decides not to go with the
>>   quorum obtained from pacemaker but does node-counting
>>   internally instead.
>>
>> Klaus
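
Two existing knobs touch the points above; the property and option names are
current pacemaker ones, while the values and the stonith resource name
"fence_node1" are only illustrative:

    # how long pacemaker assumes an unseen, quorum-less node needs to
    # self-fence via its watchdog (should exceed the SBD watchdog timeout)
    pcs property set stonith-watchdog-timeout=20s

    # random delay before fencing to reduce the chance of a fence race
    pcs stonith update fence_node1 pcmk_delay_max=15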
>>> And note that this is not actually limited to a two-node cluster - we
>>> have more or less the same issue with any 50-50 split cluster and a
>>> witness on a third site.
>


