[ClusterLabs] shutdown and restart of complete cluster due to power outage with UPS

Thu Jan 24 13:28:36 EST 2019

24.01.2019 18:01, Lentes, Bernd wrote:
> ----- On Jan 23, 2019, at 3:20 PM, Klaus Wenninger kwenning at redhat.com wrote:
>>> I have corosync-2.3.6-9.13.1.x86_64.
>>> Where can I configure this value?
>>
>> speaking of two_node & wait_for_all?
>> That is configured in the quorum-section of corosync.conf:
>>
>> quorum {
>> ...
>>    wait_for_all: 1
>>    two_node: 1
>> ...
>> }
>> As Ken mentioned, two_node already implies wait_for_all.
>> Depending on the high-level tooling you are using, it might
>> take care of that configuration already.
>>
>> Using 'corosync-cmapctl' to display or directly set keys should
>> work as well.
> 
> corosync-cmapctl -b knows two_node and wait_for_all:
> 
> ha-idg-1:~ # corosync-cmapctl -b|grep -iE 'wait|two'
> quorum.two_node (u8) = 1
> runtime.votequorum.two_node (u8) = 1
> runtime.votequorum.wait_for_all_status (u8) = 1
> 
> man 5 votequorum is very helpful.
> It says:
> two_node = 1 sets the quorum to 1.
> wait_for_all = 1 requires both nodes to be up simultaneously, at least for a short time, before the cluster can operate.
> I see this as a disadvantage. What if one node has a hardware problem which can't be fixed in a short time?
> 
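
As a small illustration of the output quoted above: the grep shows no configured quorum.wait_for_all key, yet the runtime wait_for_all_status is 1, which seems consistent with two_node implying wait_for_all. The sketch below only reformats the three lines copied from the message; the function name quorum_flags is hypothetical, and on a live node one would presumably pipe `corosync-cmapctl -b` into it instead of the sample.

```shell
# Sample output copied from the message above.
sample='quorum.two_node (u8) = 1
runtime.votequorum.two_node (u8) = 1
runtime.votequorum.wait_for_all_status (u8) = 1'

# Print "key=value" for every two_node / wait_for_all related key on stdin.
quorum_flags() {
    awk '/two_node|wait_for_all/ { print $1 "=" $NF }'
}

printf '%s\n' "$sample" | quorum_flags
```

A runtime.votequorum.wait_for_all_status of 1 without a matching quorum.wait_for_all line means the setting was not written into corosync.conf explicitly.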

This is the only way to avoid split brain with two_node at the
corosync level. I do not know whether there are other consumers of
corosync besides pacemaker.

If you start the whole stack manually anyway, just remove it before
starting.
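
A minimal sketch of what that could look like in corosync.conf, assuming "it" refers to wait_for_all. Because two_node: 1 enables wait_for_all implicitly, simply leaving the key out is not enough; votequorum(5) notes that it can be overridden by setting it to 0 explicitly. The provider line is an assumption about the rest of the section, which was elided in the quoted configuration.

```
quorum {
    # Assumed; not shown in the quoted configuration.
    provider: corosync_votequorum
    two_node: 1
    # two_node: 1 turns wait_for_all on implicitly; an explicit 0 overrides that.
    wait_for_all: 0
}
```

With this in place a single node can become quorate on its own right after startup, so fencing has to be relied on to prevent split brain.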

Arguably with pacemaker two_node is entirely redundant. You must use
fencing with pacemaker anyway (if you care about data integrity) and
there is no difference between making corosync lie about quorum and
simply ignoring what it has to say about it.

>> You mentioned 'no-quorum-policy = ignore' before.
>> Wasn't clear if you have that set at all times. Have seen
>> howtos suggesting that instead of two_node (probably
>> coming from times when corosync didn't have 'two_node'
>> or when quorum was derived by pacemaker directly).
>> Btw. you probably shouldn't use 'ignore' to prevent the nodes
>> coming up in parallel without seeing each other - as Ken
>> mentioned before.
>> On the other hand startup-fencing - as you've experienced -
>> would prevent that as well.
>> But with 'no-quorum-policy = ignore' a node coming up
>> without connection to the peer would immediately try
>> to fence the peer - which you definitely wouldn't want
>> if that one is working properly.
> Yes, I see that.
> But corosync and pacemaker aren't started automatically in my setup.
> Also, my fencing action is off, not reboot.
> Both are there so I can first check what happened and fix it before starting the fenced node again.
> And my corosync connection is a bonding device with cables going directly to the other server, without a switch.
> 
> Do you recommend switching off ignore?

With two_node this setting is relevant only during initial startup. It
allows pacemaker to proceed (with fencing) even if the other node is not
present. After initial startup is complete, this setting is entirely
irrelevant with two_node, as two_node makes corosync always report
quorum, so there is nothing to ignore.

Personally I would say that if the HA stack is always started manually
under direct administrator control, neither two_node nor wait_for_all
needs to be set. In this case you must set no-quorum-policy=ignore.
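
As a rough sketch of that variant, the quorum section would carry neither option. The provider line is again an assumption about the rest of the configuration.

```
quorum {
    # Assumed; not shown in the quoted configuration.
    provider: corosync_votequorum
    # two_node and wait_for_all are deliberately left unset here.
}
```

Without two_node, a single surviving node in a two-node cluster is reported as not quorate, which is why pacemaker then has to be told to ignore that. no-quorum-policy is a pacemaker cluster property rather than a corosync.conf key; one way to set it should be `crm_attribute --type crm_config --name no-quorum-policy --update ignore`, or the equivalent in whatever high-level tooling is in use.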

> But what if the cluster is running and one node is fenced?
> When I don't have ignore, the resources don't continue to run.
> Is there a hierarchy or a mutual exclusion between two_node and no-quorum-policy?
> I would say that no-quorum-policy=ignore, two_node=1 and wait_for_all=0 would be the best for a
> two-node cluster.
> 

You do not need two_node with no-quorum-policy=ignore.

>> You've probably set up fencing with a random delay or with
>> fixed delays that differ for each target node.
> 
> One agent has a delay of 20 seconds, the other has no delay.
> 
> Bernd