[ClusterLabs] Two node cluster and extended distance/site failure
Klaus Wenninger
kwenning at redhat.com
Mon Jun 29 04:12:17 EDT 2020
On 6/29/20 9:56 AM, Klaus Wenninger wrote:
> On 6/24/20 8:09 AM, Andrei Borzenkov wrote:
>> Two-node clusters are what I almost exclusively deal with. They work
>> reasonably well in a single location, where failures to perform fencing
>> are rare and can be mitigated by two different fencing methods. Usually
>> SBD is reliable enough, as failure of the shared storage also implies
>> failure of the whole cluster.
>>
>> When the two nodes are located on separate sites (not necessarily
>> Asia/America - two buildings across the street is already enough), we
>> face the issue of complete site isolation, where normal fencing becomes
>> impossible at the same time as a node goes missing (power outage,
>> network outage, etc.).
>>
>> The usual recommendation is a third site that functions as a witness.
>> This works fine up to the failure of that third site itself.
>> Unavailability of the witness makes normal maintenance of either of the
>> two nodes impossible: if the witness is not available and (pacemaker
>> on) one of the two nodes needs to be restarted, the remaining node goes
>> out of quorum or commits suicide. At most we can statically designate
>> one node as tiebreaker (and even that is incompatible with qdevice).
>>
>> I think I can finally formulate what I am missing. The behavior I
>> would really want is:
>>
>> - if (pacemaker on) one node performs a normal shutdown, the remaining
>> node continues managing services, independently of the witness's state
>> or availability. Usually this is achieved either by two_node or by
>> no-quorum-policy=ignore, but that absolutely requires successful
>> fencing, so it cannot be used alone. Such a feature likely mandates
>> wait_for_all (WFA), but that is probably unavoidable.
>>
>> - if the other node is lost unexpectedly, first try normal fencing
>> between the two nodes, independently of the witness's state or
>> availability. If fencing succeeds, we can continue managing services.
>>
>> - if normal fencing fails (due to isolation of the other site),
>> consult the witness and follow the normal procedure: if the witness is
>> not available or does not grant us quorum, suicide/go out of quorum;
>> if the witness is available and grants us quorum, continue managing
>> services.
>>
>> Are there any potential issues with this? If it is possible to
>> implement with current tools, I have not found how.
> I see the idea but I see a couple of issues:
My mailer was confused by all these combinations of
"Antw: Re: Antw:" and didn't compose the mails into a
thread properly, which is why I missed the further
discussion where it was definitely still about
shared-storage fencing and not watchdog fencing.
I had guessed - from the initial post - that there was
a shift in the direction of qdevice.
But maybe the thoughts below are still interesting in
that light ...
>
>
> - watchdog-fencing is timing-critical. When losing quorum we
> have to suicide after a defined time, so just trying normal
> fencing first and then falling back to watchdog-fencing is not
> an option. What could be considered is starting watchdog
> fencing right away upon quorum loss - remember I said defined,
> which doesn't necessarily mean short - trying other means of
> fencing in parallel, and if that succeeds, somehow regaining
> quorum (additional voting or something one would have to think
> over a little more).
>
> - usually we use quorum to prevent a fence race, which this
> approach jeopardizes. Of course we can introduce an additional
> wait before normal fencing on the node that doesn't have quorum
> to mitigate that effect.
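[Such an additional wait can already be expressed as a delay on the fence device itself; an illustrative example using pcs, with hypothetical IPMI addresses and credentials:]

```
# Give fencing a random 0-10s head start window so the two nodes
# don't shoot each other simultaneously (values illustrative):
pcs stonith create fence-node1 fence_ipmilan \
    ip=10.0.0.1 username=admin password=secret \
    pcmk_host_list=node1 pcmk_delay_max=10
```

[Newer Pacemaker releases additionally offer the priority-fencing-delay cluster property, which delays fencing on the node running fewer or lower-priority resources.]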
>
> - the reason I think current configuration possibilities won't
> give you your desired behavior is that we would finally end up
> with two quorum sources.
> The only case where I'm aware of a similar thing is
> 2-node + shared disk, where sbd decides not to go with the
> quorum obtained from pacemaker but does node-counting
> internally instead.
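[For comparison, the 2-node + shared-disk sbd setup looks roughly as follows; the device path is a placeholder. With a disk configured, sbd exchanges poison-pill messages via slots on the disk rather than relying solely on pacemaker's quorum:]

```
# /etc/sysconfig/sbd - shared-disk setup, illustrative device path
SBD_DEVICE=/dev/disk/by-id/scsi-SHARED-LUN
SBD_WATCHDOG_DEV=/dev/watchdog
SBD_WATCHDOG_TIMEOUT=5

# initialize the message slots on the disk once, from either node:
#   sbd -d /dev/disk/by-id/scsi-SHARED-LUN create
```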
>
> Klaus
>> And note that this is not actually limited to a two-node cluster - we
>> have more or less the same issue with any 50-50 split cluster with a
>> witness on a third site.
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> ClusterLabs home: https://www.clusterlabs.org/
>>
--
Klaus Wenninger
Senior Software Engineer, EMEA ENG Base Operating Systems
Red Hat
kwenning at redhat.com
Red Hat GmbH, http://www.de.redhat.com/, Sitz: Grasbrunn,
Handelsregister: Amtsgericht München, HRB 153243,
Geschäftsführer: Charles Cachera, Laurie Krebs, Michael O'Neill, Thomas Savage