[ClusterLabs] Two node cluster and extended distance/site failure
Klaus Wenninger
kwenning at redhat.com
Mon Jun 29 03:56:13 EDT 2020
On 6/24/20 8:09 AM, Andrei Borzenkov wrote:
> Two-node clusters are what I almost exclusively deal with. They work
> reasonably well in a single location, where failures to perform fencing
> are rare and can be mitigated by two different fencing methods. Usually
> SBD is reliable enough, as failure of the shared storage also implies
> failure of the whole cluster.
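>
> For reference, the kind of single-site setup I mean is roughly the
> following; the device path is just a placeholder:
>
>     # /etc/sysconfig/sbd -- poison-pill fencing via a shared disk
>     SBD_DEVICE=/dev/disk/by-id/shared-lun   # placeholder path
>     SBD_WATCHDOG_DEV=/dev/watchdog
>     SBD_WATCHDOG_TIMEOUT=5
>
>     # plus an sbd-based stonith resource in pacemaker, e.g.:
>     # pcs stonith create fence-sbd fence_sbd devices=/dev/disk/by-id/shared-lun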
>
> When the two nodes are located on separate sites (not necessarily
> Asia/America; two buildings across the street is already enough), we get
> the issue of complete site isolation, where normal fencing becomes
> impossible at the same moment the node goes missing (power outage,
> network outage, etc.).
>
> The usual recommendation is a third site which functions as a witness.
> This works fine up to the failure of that third site itself.
> Unavailability of the witness makes normal maintenance of either of the
> two nodes impossible: if the witness is not available and (pacemaker on)
> one of the two nodes needs to be restarted, the remaining node loses
> quorum or commits suicide. At most we can statically designate one node
> as tiebreaker (and that is already incompatible with qdevice).
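>
> To make that concrete - the witness variant and the static-tiebreaker
> variant I am talking about would look roughly like this in corosync.conf
> (host name and node id are placeholders):
>
>     # variant 1: qdevice witness on a third site
>     quorum {
>         provider: corosync_votequorum
>         device {
>             votes: 1
>             model: net
>             net {
>                 host: qnetd-witness.example.com   # placeholder witness host
>                 algorithm: ffsplit
>             }
>         }
>     }
>
>     # variant 2: statically designated tiebreaker (not usable with qdevice)
>     quorum {
>         provider: corosync_votequorum
>         auto_tie_breaker: 1
>         auto_tie_breaker_node: 1   # node id of the preferred survivor
>     }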
>
> I think I can finally formulate what I am missing. The behavior that I
> would really want is:
>
> - if (pacemaker on) one node performs a normal shutdown, the remaining
> node continues managing services, independently of witness state or
> availability. Usually this is achieved either by two_node or by
> no-quorum-policy=ignore (see the sketch after this list), but both of
> those absolutely require successful fencing, so they cannot be used
> alone. Such a feature likely mandates WFA, but that is probably
> unavoidable.
>
> - if the other node is lost unexpectedly, first try normal fencing
> between the two nodes, independently of witness state or availability.
> If fencing succeeds, we can continue managing services.
>
> - if normal fencing fails (because the other site is isolated), consult
> the witness and follow the normal procedure: if the witness is not
> available or does not grant us quorum, suicide/go out of quorum; if the
> witness is available and grants us quorum, continue managing services.
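>
> For reference, the closest existing knobs I know of are roughly the ones
> below - but as said, both only help together with successful fencing:
>
>     # corosync.conf -- two_node (which implies wait_for_all/WFA)
>     quorum {
>         provider: corosync_votequorum
>         two_node: 1
>         wait_for_all: 1
>     }
>
>     # or, on the pacemaker side, ignore quorum entirely:
>     # pcs property set no-quorum-policy=ignore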
>
> Any potential issues with this? If it is possible to implement with
> current tools, I did not find out how.
I see the idea, but I see a couple of issues:
- Watchdog fencing is timing-critical: when losing quorum we have to
self-fence after a defined time, so just trying normal fencing first
and then falling back to watchdog fencing is no option. What could be
considered is starting watchdog fencing right away upon quorum loss -
remember, I said "defined", which doesn't necessarily mean short - and
trying other means of fencing in parallel; if those succeed, somehow
regain quorum (additional voting or something one would have to think
over a little more). The relevant timeouts are sketched below.
- Usually we use quorum to prevent a fence race, which this approach
jeopardizes. Of course we can introduce an additional delay before
normal fencing on the node that doesn't have quorum to mitigate that
effect.
- The reason I think current configuration possibilities won't give you
your desired behavior is that we would finally end up with two quorum
sources. The only case where I'm aware of a similar thing is 2-node +
shared disk, where sbd decides not to go with the quorum obtained from
pacemaker but does node counting internally instead.
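
To illustrate the timing aspect and the fence-race delay - just a rough
sketch, resource names and values are placeholders:

    # /etc/sysconfig/sbd -- diskless watchdog fencing
    SBD_WATCHDOG_DEV=/dev/watchdog
    SBD_WATCHDOG_TIMEOUT=5
    # the hardware watchdog fires ~5s after sbd stops feeding it;
    # the survivor waits stonith-watchdog-timeout (commonly ~2x that)
    # before it assumes the unseen node has self-fenced:
    # pcs property set stonith-watchdog-timeout=10

    # mitigating the fence race: static delay on one side's fence device, e.g.
    # pcs stonith update fence-node1 pcmk_delay_base=15
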
Klaus
>
> And note that this is not actually limited to a two-node cluster - we
> have more or less the same issue with any 50/50 split cluster and a
> witness on a third site.