[ClusterLabs] Corosync quorum vs. pacemaker quorum confusion

Klaus Wenninger kwenning at redhat.com
Mon Dec 11 06:57:58 EST 2017


On 12/07/2017 05:33 AM, Andrei Borzenkov wrote:
> 07.12.2017 00:28, Klaus Wenninger wrote:
>> On 12/06/2017 08:03 PM, Ken Gaillot wrote:
>>> On Sun, 2017-12-03 at 14:03 +0300, Andrei Borzenkov wrote:
>>>> I assumed that with corosync 2.x quorum is maintained by corosync and
>>>> pacemaker simply gets yes/no. Apparently this is more complicated.
>>> It shouldn't be, but everything in HA-land is complicated :)
>>>
>>>> Trivial test two node cluster (two_node is intentionally not set to
>>>> simulate "normal" behavior).
>>>>
> ...
>>>> Dec 03 13:52:07 [1633] ha1       crmd:   notice:
>>>> pcmk_quorum_notification:	Quorum acquired | membership=240
>>>> members=1
>>>> Dec 03 13:52:07 [1626] ha1 pacemakerd:   notice:
>>>> pcmk_quorum_notification:	Quorum acquired | membership=240
>>>> members=1
>>>>
> ...
>>>> Confused ... is it intentional behavior or a bug?
>>> The no-quorum-policy message above shouldn't prevent the cluster from
>>> either fencing other nodes or starting resources, once quorum is
>>> obtained from corosync. I'm not sure from the information here why that
>>> didn't happen.
>> This is obviously a cluster with 2 nodes. Is it configured as 2-node
>> in corosync as well? If yes the wait-for-all-logic might be confused
>> somehow.
> No, as I mentioned initially I explicitly disable wait_for_all.
>
> ha1:~ # corosync-cmapctl quorum.
> quorum.expected_votes (u32) = 2
> quorum.provider (str) = corosync_votequorum
> ha1:~ # rpm -q
>
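
For reference, a quorum section matching what you describe
(votequorum, two expected votes, two_node left unset,
wait_for_all explicitly off) would look roughly like the
sketch below - written from memory of votequorum(5), so
please compare it against your actual corosync.conf:

  quorum {
      # let corosync itself track quorum
      provider: corosync_votequorum
      # two votes expected, so quorum needs both nodes
      expected_votes: 2
      # two_node intentionally not set (setting it would
      # imply wait_for_all unless overridden); instead,
      # simulate "normal" behavior: don't wait for all
      # nodes before granting quorum the first time
      wait_for_all: 0
  }
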
>> Which version of the sbd-daemon are you running?
> ha1:~ # rpm -q sbd
> sbd-1.3.0-3.3.x86_64
>
> although I do not see how exactly it matters in this case, as pacemaker
> never tells sbd to do anything.

The fence-agent/fencing-resource isn't the only way
pacemaker and sbd communicate with each other.
sbd's pacemaker watcher connects to the CIB and uses
part of the pengine code (the status handling) to
evaluate its content.
That is, of course, again a question of configuration -
assuming you have the pacemaker watcher active...
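
In case it helps, that watcher is normally switched on in
sbd's sysconfig file. The path and variable names below are
from memory (SUSE-style layout), so please double-check them
against your distribution:

  # /etc/sysconfig/sbd
  SBD_DEVICE="/dev/disk/by-id/..."   # shared disk(s), if any are used
  SBD_PACEMAKER=yes                  # enable the pacemaker/cib watcher
  SBD_WATCHDOG_DEV=/dev/watchdog     # watchdog device (hardware or softdog)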

>
>> As of 1.3.1 (actually a few commits before iirc) in case of 2-node-
>> configuration in corosync sbd wouldn't rely on quorum coming from
>> corosync but rather count the nodes seen by itself. Just in case you
>> see nodes suicide on loss of the disk while still being signaled
>> quorate from corosync either due to the 2-node-config or some fake
>> config as you did...
>>
> I had suicide due to no-quorum-policy=suicide.

I wasn't sure if that was an assumption or something
actually visible in the logs.
I just saw that you had an sbd setup and thus wanted
to point out that there are sbd-triggered suicides
to consider as well.

>
>> Regards,
>> Klaus
>>
>>> I'd first check the Pacemaker logs for "Quorum acquired" and "Quorum
>>> lost" messages. These indicate when Pacemaker received notifications
>>> from corosync.
> As shown in original post I did have "Quorum acquired" messages.
>
>
>> Assuming those were received properly, the DC should
>>> then recalculate what needs to be done, and the logs at that point
>>> should not have any of the messages about not having quorum.
> So I redid this using default no-quorum-policy=stop and one more
> non-stonith resource.
>
> ...and just before I hit "send" cluster recovered. So it appears that
> "Quorum acquired" event does not trigger (immediate) re-evaluation of
> policy until some timeout.
>
> Dec 07 07:05:16 ha1 pacemakerd[1888]:   notice: Quorum acquired
>
> Nothing happens until the next message
>
> Dec 07 07:12:58 ha1 crmd[1894]:   notice: State transition S_IDLE ->
> S_POLICY_ENGINE
> Dec 07 07:12:58 ha1 pengine[1893]:   notice: Watchdog will be used via
> SBD if fencing is required
> Dec 07 07:12:58 ha1 pengine[1893]:  warning: Scheduling Node ha2 for STONITH
> Dec 07 07:12:58 ha1 pengine[1893]:   notice:  * Fence (reboot) ha2 'node
> is unclean'
> Dec 07 07:12:58 ha1 pengine[1893]:   notice:  * Start      stonith-sbd
>   (   ha1 )
> Dec 07 07:12:58 ha1 pengine[1893]:   notice:  * Start      rsc_dummy_1
>   (   ha1 )
>
> This is apparently the 15-minute cluster-recheck-interval timer (give or take):
>
> Dec 07 06:57:35 [1888] ha1 pacemakerd:     info: crm_log_init:  Changed
> active directory to /var/lib/pacemaker/cores
> ...
> Dec 07 07:12:58 [1894] ha1       crmd:   notice: do_state_transition:
> State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC
> cause=C_TIMER_POPPED origin=crm_timer_popped
>
> OK, at least we know why it happens. Whether this is intentional
> behavior is another question :)
>
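
Right - once the DC has gone idle, the next re-evaluation
is driven by cluster-recheck-interval (15min by default),
which matches what you see. Whether the quorum notification
should kick off a transition right away is a fair question,
though. In the meantime you can narrow the window by
lowering the interval - commands below from memory, so
double-check against your tool versions:

  # query the current value (Pacemaker default is 15min)
  crm_attribute --type crm_config --name cluster-recheck-interval --query

  # lower it to e.g. 2 minutes via the low-level tool ...
  crm_attribute --type crm_config --name cluster-recheck-interval --update 2min

  # ... or via crmsh
  crm configure property cluster-recheck-interval=2min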