[ClusterLabs] Corosync quorum vs. pacemaker quorum confusion

Wed Dec 6 23:33:21 EST 2017

07.12.2017 00:28, Klaus Wenninger wrote:
> On 12/06/2017 08:03 PM, Ken Gaillot wrote:
>> On Sun, 2017-12-03 at 14:03 +0300, Andrei Borzenkov wrote:
>>> I assumed that with corosync 2.x quorum is maintained by corosync and
>>> pacemaker simply gets yes/no. Apparently this is more complicated.
>> It shouldn't be, but everything in HA-land is complicated :)
>>
>>> Trivial test two node cluster (two_node is intentionally not set to
>>> simulate "normal" behavior).
>>>
...
>>>
>>> Dec 03 13:52:07 [1633] ha1       crmd:   notice: pcmk_quorum_notification: Quorum acquired | membership=240 members=1
>>> Dec 03 13:52:07 [1626] ha1 pacemakerd:   notice: pcmk_quorum_notification: Quorum acquired | membership=240 members=1
>>>
...
>>>
>>> Confused ... is it intentional behavior or a bug?
>> The no-quorum-policy message above shouldn't prevent the cluster from
>> either fencing other nodes or starting resources, once quorum is
>> obtained from corosync. I'm not sure from the information here why that
>> didn't happen.
> 
> This is obviously a cluster with 2 nodes. Is it configured as 2-node
> in corosync as well? If yes the wait-for-all-logic might be confused
> somehow.

No. As I mentioned initially, I explicitly disabled wait_for_all.

ha1:~ # corosync-cmapctl quorum.
quorum.expected_votes (u32) = 2
quorum.provider (str) = corosync_votequorum
ha1:~ # rpm -q
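
For anyone who wants to check this on their own nodes, here is a minimal sketch that parses `corosync-cmapctl quorum.` output into a dictionary and reports whether the two_node and wait_for_all keys show up. The helper names (`parse_cmapctl`, `flag_enabled`) are made up for illustration, and treating a missing key as "not set" is an assumption: it says nothing about corosync's built-in defaults.

```python
import re

# Matches lines such as "quorum.expected_votes (u32) = 2"
CMAP_LINE = re.compile(r"^(?P<key>\S+) \((?P<type>\w+)\) = (?P<value>.*)$")


def parse_cmapctl(text):
    """Return {key: value} from corosync-cmapctl output; values stay strings."""
    settings = {}
    for line in text.splitlines():
        match = CMAP_LINE.match(line.strip())
        if match:
            settings[match.group("key")] = match.group("value")
    return settings


def flag_enabled(settings, key):
    """Assumption: a key that is absent from the output counts as not set."""
    return settings.get(key, "0") == "1"


# The two lines shown in the output above.
sample = """\
quorum.expected_votes (u32) = 2
quorum.provider (str) = corosync_votequorum
"""

settings = parse_cmapctl(sample)
print("expected_votes:", settings["quorum.expected_votes"])
print("two_node set:", flag_enabled(settings, "quorum.two_node"))
print("wait_for_all set:", flag_enabled(settings, "quorum.wait_for_all"))
```

With the output above, neither flag is reported as set, which matches what I described.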

> Which version of the sbd-daemon are you running?

ha1:~ # rpm -q sbd
sbd-1.3.0-3.3.x86_64

although I do not see how exactly it matters in this case, as pacemaker
never tells sbd to do anything.

> As of 1.3.1 (actually a few commits before iirc) in case of 2-node-
> configuration in corosync sbd wouldn't rely on quorum coming from
> corosync but rather count the nodes seen by itself. Just in case you
> see nodes suicide on loss of the disk while still being signaled
> quorate from corosync either due to the 2-node-config or some fake
> config as you did...
> 

I had suicide due to no-quorum-policy=suicide.

> Regards,
> Klaus
> 
>>
>> I'd first check the Pacemaker logs for "Quorum acquired" and "Quorum
>> lost" messages. These indicate when Pacemaker received notifications
>> from corosync.

As shown in original post I did have "Quorum acquired" messages.

>> Assuming those were received properly, the DC should
>> then recalculate what needs to be done, and the logs at that point
>> should not have any of the messages about not having quorum.
>

So I redid this using the default no-quorum-policy=stop and one more
non-stonith resource.

...and just before I hit "send" the cluster recovered. So it appears that
the "Quorum acquired" event does not trigger an immediate re-evaluation of
policy; nothing is recalculated until some timeout expires.

Dec 07 07:05:16 ha1 pacemakerd[1888]:   notice: Quorum acquired

Nothing happens until the next message

Dec 07 07:12:58 ha1 crmd[1894]:   notice: State transition S_IDLE -> S_POLICY_ENGINE
Dec 07 07:12:58 ha1 pengine[1893]:   notice: Watchdog will be used via SBD if fencing is required
Dec 07 07:12:58 ha1 pengine[1893]:  warning: Scheduling Node ha2 for STONITH
Dec 07 07:12:58 ha1 pengine[1893]:   notice:  * Fence (reboot) ha2 'node is unclean'
Dec 07 07:12:58 ha1 pengine[1893]:   notice:  * Start      stonith-sbd   (   ha1 )
Dec 07 07:12:58 ha1 pengine[1893]:   notice:  * Start      rsc_dummy_1   (   ha1 )

This is apparently the 15-minute cluster-recheck-interval timer (give or
take); the transition fires about 15 minutes after pacemakerd started:

Dec 07 06:57:35 [1888] ha1 pacemakerd:     info: crm_log_init:  Changed active directory to /var/lib/pacemaker/cores
...
Dec 07 07:12:58 [1894] ha1       crmd:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped
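
To put numbers on "give or take", here is a small sketch that computes the gaps between the timestamps quoted above. The function names are illustrative only. The year is not in the log lines, so it is passed in, and `%b` parsing assumes an English/C locale. Using the pacemakerd start line as the reference point is also just an assumption; I have not checked when the recheck timer is actually armed.

```python
from datetime import datetime


def parse_stamp(stamp, year=2017):
    """Parse a 'Dec 07 07:05:16' style stamp; the year is supplied by the caller."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def gap_seconds(earlier, later):
    """Whole seconds between two stamps from the same year."""
    return int((parse_stamp(later) - parse_stamp(earlier)).total_seconds())


# Timestamps copied from the log lines in this message.
pacemakerd_started = "Dec 07 06:57:35"
quorum_acquired = "Dec 07 07:05:16"
policy_engine_run = "Dec 07 07:12:58"

since_start = gap_seconds(pacemakerd_started, policy_engine_run)
since_quorum = gap_seconds(quorum_acquired, policy_engine_run)

# divmod(seconds, 60) gives (minutes, seconds)
print("start -> policy engine:", divmod(since_start, 60))    # (15, 23)
print("quorum -> policy engine:", divmod(since_quorum, 60))  # (7, 42)
```

So the policy engine ran 15 minutes 23 seconds after pacemakerd started, but only 7 minutes 42 seconds after "Quorum acquired", which is why it looks tied to the timer rather than to the quorum event.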

OK, at least we know why it happens. Whether this is intentional
behavior is another question :)