[ClusterLabs] What is the logic when two nodes are down at the same time and need to be fenced
Ken Gaillot
kgaillot at redhat.com
Tue Nov 8 21:12:47 CET 2016
On 11/07/2016 09:59 AM, Niu Sibo wrote:
> Hi Ken,
>
> Thanks for the clarification. Now I have another real problem that needs
> your advice.
>
> The cluster consists of 5 nodes, and one of the nodes had a 1-second
> network failure, which resulted in one of the VirtualDomain resources
> starting on two nodes at the same time. The cluster property
> no_quorum_policy is set to stop.
>
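For reference, that property can be checked or set with pcs. A minimal sketch, assuming the pcs CLI is in use (exact command output varies by pcs version):

    # show the current value of the cluster property
    pcs property show no-quorum-policy
    # set it explicitly
    pcs property set no-quorum-policy=stop
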
> At 16:13:34, this happened:
> 16:13:34 zs95kj attrd[133000]: notice: crm_update_peer_proc: Node
> zs93KLpcs1[5] - state is now lost (was member)
> 16:13:34 zs95kj corosync[132974]: [CPG ] left_list[0]
> group:pacemakerd\x00, ip:r(0) ip(10.20.93.13) , pid:28721
> 16:13:34 zs95kj crmd[133002]: warning: No match for shutdown action on 5
> 16:13:34 zs95kj attrd[133000]: notice: Removing all zs93KLpcs1
> attributes for attrd_peer_change_cb
> 16:13:34 zs95kj corosync[132974]: [CPG ] left_list_entries:1
> 16:13:34 zs95kj crmd[133002]: notice: Stonith/shutdown of zs93KLpcs1
> not matched
> ...
> 16:13:35 zs95kj attrd[133000]: notice: crm_update_peer_proc: Node
> zs93KLpcs1[5] - state is now member (was (null))
>
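A membership flap this short can sometimes be ridden out by raising corosync's token timeout. A rough corosync.conf sketch, assuming corosync 2.x, where the default token timeout is 1000 ms (the value below is only an example):

    totem {
        version: 2
        # tolerate short network blips before declaring a node lost
        token: 5000
    }
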
> From the DC:
> [root at zs95kj ~]# crm_simulate --xml-file
> /var/lib/pacemaker/pengine/pe-input-3288.bz2 |grep 110187
> zs95kjg110187_res (ocf::heartbeat:VirtualDomain): Started
> zs93KLpcs1 <---------- This is the baseline, where everything works normally
>
> [root at zs95kj ~]# crm_simulate --xml-file
> /var/lib/pacemaker/pengine/pe-input-3289.bz2 |grep 110187
> zs95kjg110187_res (ocf::heartbeat:VirtualDomain): Stopped
> <----------- Here the node zs93KLpcs1 lost its network for 1 second,
> resulting in this state.
This is unexpected. Can you open a bug report at
http://bugs.clusterlabs.org/ and attach the full logs from around this
time, and maybe the above two pe-input files?
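If it helps, crm_report can gather the logs and the pe-input files from all nodes in one archive. A sketch, with the time window and output name as placeholders:

    crm_report -f "2016-09-09 16:00:00" -t "2016-09-09 16:30:00" /tmp/guest-110187-report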
> [root at zs95kj ~]# crm_simulate --xml-file
> /var/lib/pacemaker/pengine/pe-input-3290.bz2 |grep 110187
> zs95kjg110187_res (ocf::heartbeat:VirtualDomain): Stopped
>
> [root at zs95kj ~]# crm_simulate --xml-file
> /var/lib/pacemaker/pengine/pe-input-3291.bz2 |grep 110187
> zs95kjg110187_res (ocf::heartbeat:VirtualDomain): Stopped
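To see which actions the cluster intended for each of those files, rather than just the resulting resource state, crm_simulate can replay them. A sketch, assuming the file is still present on the DC:

    crm_simulate -S -x /var/lib/pacemaker/pengine/pe-input-3289.bz2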
>
>
> From the DC's pengine log, it has:
> 16:05:01 zs95kj pengine[133001]: notice: Calculated Transition 238:
> /var/lib/pacemaker/pengine/pe-input-3288.bz2
> ...
> 16:13:41 zs95kj pengine[133001]: notice: Start
> zs95kjg110187_res#011(zs90kppcs1)
> ...
> 16:13:41 zs95kj pengine[133001]: notice: Calculated Transition 239:
> /var/lib/pacemaker/pengine/pe-input-3289.bz2
>
> From the DC's CRMD log, it has:
> Sep 9 16:05:25 zs95kj crmd[133002]: notice: Transition 238
> (Complete=48, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-3288.bz2): Complete
> ...
> Sep 9 16:13:42 zs95kj crmd[133002]: notice: Initiating action 752:
> start zs95kjg110187_res_start_0 on zs90kppcs1
> ...
> Sep 9 16:13:56 zs95kj crmd[133002]: notice: Transition 241
> (Complete=81, Pending=0, Fired=0, Skipped=172, Incomplete=341,
> Source=/var/lib/pacemaker/pengine/pe-input-3291.bz2): Stopped
>
> Here I do not see any log entries about pe-input-3289.bz2 or pe-input-3290.bz2.
> Why is this?
>
> From the log on zs93KLpcs1, where guest 110187 was running, I do not see
> any message about stopping this resource after it lost its
> connection to the cluster.
>
> Any ideas where to look for possible cause?
>
> On 11/3/2016 1:02 AM, Ken Gaillot wrote:
>> On 11/02/2016 11:17 AM, Niu Sibo wrote:
>>> Hi all,
>>>
>>> I have a general question regarding the fencing logic in Pacemaker.
>>>
>>> I have set up a three-node cluster with Pacemaker 1.1.13 and the cluster
>>> property no_quorum_policy set to ignore. When two nodes lose the NIC that
>>> corosync is running on at the same time, it looks like the two nodes are
>>> getting fenced one by one, even though I have three fence devices defined,
>>> one for each of the nodes.
>>>
>>> What should I be expecting in this case?
>> It's probably coincidence that the fencing happens serially; there is
>> nothing enforcing that for separate fence devices. There are many steps
>> in a fencing request, so they can easily take different times to
>> complete.
>>
>>> I noticed that if the node rejoins the cluster before the cluster starts
>>> the fence actions, some resources will get activated on 2 nodes at the
>>> same time. This is really not good if the resource happens to be a
>>> virtual guest. Thanks for any suggestions.
>> Since you're ignoring quorum, there's nothing stopping the disconnected
>> node from starting all resources on its own. It can even fence the other
>> nodes, unless the downed NIC is used for fencing. From that node's point
>> of view, it's the other two nodes that are lost.
>>
>> Quorum is the only solution I know of to prevent that. Fencing will
>> correct the situation, but it won't prevent it.
>>
>> See the votequorum(5) man page for various options that can affect how
>> quorum is calculated. Also, the very latest version of corosync supports
>> qdevice (a lightweight daemon that runs on a host outside the cluster
>> strictly for the purposes of quorum).
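For illustration only, a corosync.conf quorum section using qdevice might look roughly like this; the host name and algorithm below are placeholders, and votequorum(5) and corosync-qdevice(8) document the real options:

    quorum {
        provider: corosync_votequorum
        # qdevice adds a vote from a daemon running outside the cluster
        device {
            model: net
            net {
                host: qnetd.example.com
                algorithm: ffsplit
            }
        }
    }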