[ClusterLabs] Antw: [EXT] Two node cluster and extended distance/site failure

Andrei Borzenkov arvidjaar at gmail.com
Wed Jun 24 04:42:28 EDT 2020


24.06.2020 10:28, Ulrich Windl wrote:
>>
>> The usual recommendation is a third site which functions as a witness. This
>> works fine up to the failure of that third site itself. Unavailability of
>> the witness makes normal maintenance of either of the two nodes impossible.
> 
> That's a problem with pacemaker:
> Assume you have two nodes and shared storage: if one node announces via the
> shared storage that it is going to leave the cluster, there won't be any issues.
> Likewise, if one node crashes (the other node could think it's just a network
> problem), both nodes could try to access the shared storage atomically and
> "leave their mark" there (the way most locking works). The other node will then
> see which node (if any) was fastest to claim the lock, and all
> other nodes would commit suicide (self-fence) or freeze until the network is up
> again.
> 

And how exactly is this related to what I wrote? What you describe is
SBD with a single witness device. None of this works if the witness device is
not available, which is the scenario I would like to handle.
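
(To make sure we are talking about the same mechanism, here is a toy sketch of
that "leave their mark" arbitration. The dict merely simulates an atomic
compare-and-swap on a shared device; it is not pacemaker's or sbd's actual API,
and the device path is made up.)

# Toy simulation: whichever node writes the arbitration slot first wins,
# the other one self-fences.
_slot = {}                                  # simulated arbitration sector

def claim_slot(device: str, node: str) -> str:
    """Atomically claim the slot for `node` if it is empty; return the owner."""
    return _slot.setdefault(device, node)   # setdefault models compare-and-swap

def arbitrate(device: str, me: str) -> str:
    winner = claim_slot(device, me)
    return "keep running" if winner == me else "self-fence"

# Node "a" races node "b" on the same device; exactly one survives.
print(arbitrate("/dev/shared", "a"))        # -> keep running
print(arbitrate("/dev/shared", "b"))        # -> self-fence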

> (I'm sorry if I repeat myself from time to time, but this was how a two-node
> cluster in HP-UX Service Guard worked, and it worked quite well)
> 

How does Service Guard handle loss of the shared storage?

>> If the witness is not available and (pacemaker on) one of the two nodes needs
>> to be restarted, the remaining node goes out of quorum or commits suicide.
>> At most we can statically designate one node as a tiebreaker (and this is
>> already incompatible with qdevice).
> 
> So shared storage actually could play the "witness role".
> 

Of course it does. The question is what to do when it becomes unavailable.

Anyway, I guess that with relatively short distances, SBD with three devices
(one at each location) is the closest approximation. It allows SBD to
retain quorum when any single location is lost, so the cluster can survive an
outage of the witness location, while pacemaker handles the loss/reboot of one
node as long as the two main sites remain available. This does require suitable
storage at each main location though, which is not always available. So a mode
of operation like "ask the witness only as a last resort" could still be useful
in the general case.
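
As a back-of-the-envelope illustration of the three-device case (this is just
the majority rule sketched in Python, with made-up site names, not an actual
sbd configuration):

# A node keeps running as long as it still reaches more than half of the
# configured SBD devices; with one device per location, any single location
# can be lost.
DEVICES = {"site_a", "site_b", "witness"}

def survives(reachable: set) -> bool:
    return len(reachable & DEVICES) > len(DEVICES) // 2

print(survives({"site_a", "site_b"}))    # witness location down  -> True
print(survives({"site_a", "witness"}))   # one main site down     -> True
print(survives({"site_a"}))              # two locations lost     -> False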

Note that the question is not about resolving unexpected node loss
without a witness or fencing. That is clearly impossible. The question is
about allowing a *planned* node restart without making the second node
unavailable.

