[ClusterLabs] Resources not always stopped when quorum lost

Matt Rideout mrideout at windserve.com
Fri May 29 08:06:13 EDT 2015



On 5/28/15 2:25 PM, Vladislav Bogdanov wrote:
> 28 May 2015 18:39:27 GMT+03:00, Matt Rideout <mrideout at windserve.com> wrote:
>> I'm attempting to upgrade a two-node cluster with no quorum requirement
>> to a three-node cluster with a two-member quorum requirement. Each node
>> is running CentOS 7, Pacemaker 1.1.12-22 and Corosync 2.3.4-4.
>>
>> If a node that's running resources loses quorum, then I want it to stop
>> all of its resources. The goal was partially accomplished by setting
>> the following in corosync.conf:
>>
>> quorum {
>>    provider: corosync_votequorum
>>    two_node: 1
>> }
>>
>> ...and updating Pacemaker's configuration with:
>>
>> pcs property set no-quorum-policy=stop
>>
>> With the above configuration, two failure scenarios work as I would
>> expect:
>>
>> 1. If I power up a single node, it sees that there is no quorum, and
>> refuses to start any resources until it sees a second node come up.
>>
>> 2. If there are two nodes running, and I power down a node that's
>> running resources, the other node sees that it lost quorum, and refuses
>> to start any resources.
>>
>> However, a third failure scenario does not work as I would expect:
>>
>> 3. If there are two nodes running, and I power down a node that's not
>> running resources, the node that is running resources notes in its log
>> that it lost quorum, but does not actually shut down any of its running
>> services.
>>
>> Any ideas on what the problem may be would be greatly appreciated. In
>> case it helps, I included the output of "pcs status", "pcs config show",
>> the contents of "corosync.conf", and the pacemaker and corosync logs
>> from the period during which resources were not stopped.
>>
>> *"pcs status" shows the resources still running after quorum is lost:*
>>
>> Cluster name:
>> Last updated: Thu May 28 10:27:47 2015
>> Last change: Thu May 28 10:03:05 2015
>> Stack: corosync
>> Current DC: node1 (1) - partition WITHOUT quorum
>> Version: 1.1.12-a14efad
>> 3 Nodes configured
>> 12 Resources configured
>>
>>
>> Node node3 (3): OFFLINE (standby)
>> Online: [ node1 ]
>> OFFLINE: [ node2 ]
>>
>> Full list of resources:
>>
>>   Resource Group: primary
>>       virtual_ip_primary    (ocf::heartbeat:IPaddr2):    Started node1
>>       GreenArrowFS    (ocf::heartbeat:Filesystem):    Started node1
>>       GreenArrow    (ocf::drh:greenarrow):    Started node1
>>       virtual_ip_1    (ocf::heartbeat:IPaddr2):    Started node1
>>       virtual_ip_2    (ocf::heartbeat:IPaddr2):    Started node1
>>   Resource Group: secondary
>>       virtual_ip_secondary    (ocf::heartbeat:IPaddr2):    Stopped
>>       GreenArrow-Secondary    (ocf::drh:greenarrow-secondary): Stopped
>>   Clone Set: ping-clone [ping]
>>       Started: [ node1 ]
>>       Stopped: [ node2 node3 ]
>>   Master/Slave Set: GreenArrowDataClone [GreenArrowData]
>>       Masters: [ node1 ]
>>       Stopped: [ node2 node3 ]
>>
>> PCSD Status:
>>    node1: Online
>>    node2: Offline
>>    node3: Offline
>>
>> Daemon Status:
>>    corosync: active/enabled
>>    pacemaker: active/enabled
>>    pcsd: active/enabled
>>
>> *"pcs config show"**shows that the "no-quorum-policy: stop" setting is
>> in place:*
>>
>> Cluster Name:
>> Corosync Nodes:
>>   node1 node2 node3
>> Pacemaker Nodes:
>>   node1 node2 node3
>>
>> Resources:
>>   Group: primary
>> Resource: virtual_ip_primary (class=ocf provider=heartbeat
>> type=IPaddr2)
>>     Attributes: ip=10.10.10.1 cidr_netmask=32
>>     Operations: start interval=0s timeout=20s
>> (virtual_ip_primary-start-timeout-20s)
>>                 stop interval=0s timeout=20s
>> (virtual_ip_primary-stop-timeout-20s)
>>                 monitor interval=30s
>> (virtual_ip_primary-monitor-interval-30s)
>>   Resource: GreenArrowFS (class=ocf provider=heartbeat type=Filesystem)
>>     Attributes: device=/dev/drbd1 directory=/media/drbd1 fstype=xfs
>> options=noatime,discard
>> Operations: start interval=0s timeout=60
>> (GreenArrowFS-start-timeout-60)
>>              stop interval=0s timeout=60 (GreenArrowFS-stop-timeout-60)
>>                 monitor interval=20 timeout=40
>> (GreenArrowFS-monitor-interval-20)
>>    Resource: GreenArrow (class=ocf provider=drh type=greenarrow)
>> Operations: start interval=0s timeout=30 (GreenArrow-start-timeout-30)
>>              stop interval=0s timeout=240 (GreenArrow-stop-timeout-240)
>>                 monitor interval=10 timeout=20
>> (GreenArrow-monitor-interval-10)
>>    Resource: virtual_ip_1 (class=ocf provider=heartbeat type=IPaddr2)
>>     Attributes: ip=64.21.76.51 cidr_netmask=32
>>     Operations: start interval=0s timeout=20s
>> (virtual_ip_1-start-timeout-20s)
>>            stop interval=0s timeout=20s (virtual_ip_1-stop-timeout-20s)
>>                monitor interval=30s (virtual_ip_1-monitor-interval-30s)
>>    Resource: virtual_ip_2 (class=ocf provider=heartbeat type=IPaddr2)
>>     Attributes: ip=64.21.76.63 cidr_netmask=32
>>     Operations: start interval=0s timeout=20s
>> (virtual_ip_2-start-timeout-20s)
>>            stop interval=0s timeout=20s (virtual_ip_2-stop-timeout-20s)
>>                monitor interval=30s (virtual_ip_2-monitor-interval-30s)
>>   Group: secondary
>>    Resource: virtual_ip_secondary (class=ocf provider=heartbeat
>> type=IPaddr2)
>>     Attributes: ip=10.10.10.4 cidr_netmask=32
>>     Operations: start interval=0s timeout=20s
>> (virtual_ip_secondary-start-timeout-20s)
>>                 stop interval=0s timeout=20s
>> (virtual_ip_secondary-stop-timeout-20s)
>>                 monitor interval=30s
>> (virtual_ip_secondary-monitor-interval-30s)
>>    Resource: GreenArrow-Secondary (class=ocf provider=drh
>> type=greenarrow-secondary)
>>     Operations: start interval=0s timeout=30
>> (GreenArrow-Secondary-start-timeout-30)
>>                 stop interval=0s timeout=240
>> (GreenArrow-Secondary-stop-timeout-240)
>>                 monitor interval=10 timeout=20
>> (GreenArrow-Secondary-monitor-interval-10)
>>   Clone: ping-clone
>>    Resource: ping (class=ocf provider=pacemaker type=ping)
>>     Attributes: dampen=30s multiplier=1000 host_list=64.21.76.1
>>     Operations: start interval=0s timeout=60 (ping-start-timeout-60)
>>                 stop interval=0s timeout=20 (ping-stop-timeout-20)
>>               monitor interval=10 timeout=60 (ping-monitor-interval-10)
>>   Master: GreenArrowDataClone
>>    Meta Attrs: master-max=1 master-node-max=1 clone-max=2
>> clone-node-max=1 notify=true
>>    Resource: GreenArrowData (class=ocf provider=linbit type=drbd)
>>     Attributes: drbd_resource=r0
>>     Operations: start interval=0s timeout=240
>> (GreenArrowData-start-timeout-240)
>>                 promote interval=0s timeout=90
>> (GreenArrowData-promote-timeout-90)
>>                 demote interval=0s timeout=90
>> (GreenArrowData-demote-timeout-90)
>>                 stop interval=0s timeout=100
>> (GreenArrowData-stop-timeout-100)
>>              monitor interval=60s (GreenArrowData-monitor-interval-60s)
>>
>> Stonith Devices:
>> Fencing Levels:
>>
>> Location Constraints:
>>    Resource: primary
>> Enabled on: node1 (score:INFINITY) (id:location-primary-node1-INFINITY)
>>      Constraint: location-primary
>>        Rule: score=-INFINITY boolean-op=or (id:location-primary-rule)
>>          Expression: pingd lt 1  (id:location-primary-rule-expr)
>>         Expression: not_defined pingd (id:location-primary-rule-expr-1)
>> Ordering Constraints:
>>   promote GreenArrowDataClone then start GreenArrowFS (kind:Mandatory)
>> (id:order-GreenArrowDataClone-GreenArrowFS-mandatory)
>>    stop GreenArrowFS then demote GreenArrowDataClone (kind:Mandatory)
>> (id:order-GreenArrowFS-GreenArrowDataClone-mandatory)
>> Colocation Constraints:
>>    GreenArrowFS with GreenArrowDataClone (score:INFINITY)
>> (with-rsc-role:Master)
>> (id:colocation-GreenArrowFS-GreenArrowDataClone-INFINITY)
>>    virtual_ip_secondary with GreenArrowDataClone (score:INFINITY)
>> (with-rsc-role:Slave)
>> (id:colocation-virtual_ip_secondary-GreenArrowDataClone-INFINITY)
>>    virtual_ip_primary with GreenArrowDataClone (score:INFINITY)
>> (with-rsc-role:Master)
>> (id:colocation-virtual_ip_primary-GreenArrowDataClone-INFINITY)
>>
>> Cluster Properties:
>>   cluster-infrastructure: corosync
>>   cluster-name: cluster_greenarrow
>>   dc-version: 1.1.12-a14efad
>>   have-watchdog: false
>>   no-quorum-policy: stop
>>   stonith-enabled: false
>> Node Attributes:
>>   node3: standby=on
>>
>> *Here's what was logged*:
>>
>> May 28 10:19:51 node1 pengine[1296]: notice: stage6: Scheduling Node
>> node3 for shutdown
>> May 28 10:19:51 node1 pengine[1296]: notice: process_pe_message:
>> Calculated Transition 7: /var/lib/pacemaker/pengine/pe-input-992.bz2
>> May 28 10:19:51 node1 crmd[1297]: notice: run_graph: Transition 7
>> (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0,
>> Source=/var/lib/pacemaker/pengine/pe-input-992.bz2): Complete
>> May 28 10:19:51 node1 crmd[1297]: notice: do_state_transition: State
>> transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
>> cause=C_FSA_INTERNAL origin=notify_crmd ]
>> May 28 10:19:51 node1 crmd[1297]: notice: peer_update_callback:
>> do_shutdown of node3 (op 64) is complete
>> May 28 10:19:51 node1 attrd[1295]: notice: crm_update_peer_state:
>> attrd_peer_change_cb: Node node3[3] - state is now lost (was member)
>> May 28 10:19:51 node1 attrd[1295]: notice: attrd_peer_remove: Removing
>> all node3 attributes for attrd_peer_change_cb
>> May 28 10:19:51 node1 attrd[1295]: notice: attrd_peer_change_cb: Lost
>> attribute writer node3
>> May 28 10:19:51 node1 corosync[1040]: [TOTEM ] Membership left list
>> contains incorrect address. This is sign of misconfiguration between
>> nodes!
>> May 28 10:19:51 node1 corosync[1040]: [TOTEM ] A new membership
>> (64.21.76.61:25740) was formed. Members left: 3
>> May 28 10:19:51 node1 corosync[1040]: [QUORUM] This node is within the
>> non-primary component and will NOT provide any services.
>> May 28 10:19:51 node1 corosync[1040]: [QUORUM] Members[1]: 1
>> May 28 10:19:51 node1 corosync[1040]: [MAIN  ] Completed service
>> synchronization, ready to provide service.
>> May 28 10:19:51 node1 crmd[1297]: notice: pcmk_quorum_notification:
>> Membership 25740: quorum lost (1)
>> May 28 10:19:51 node1 crmd[1297]: notice: crm_update_peer_state:
>> pcmk_quorum_notification: Node node3[3] - state is now lost (was
>> member)
>> May 28 10:19:51 node1 crmd[1297]: notice: peer_update_callback:
>> do_shutdown of node3 (op 64) is complete
>> May 28 10:19:51 node1 pacemakerd[1254]: notice:
>> pcmk_quorum_notification: Membership 25740: quorum lost (1)
>> May 28 10:19:51 node1 pacemakerd[1254]: notice: crm_update_peer_state:
>> pcmk_quorum_notification: Node node3[3] - state is now lost (was
>> member)
>> May 28 10:19:52 node1 corosync[1040]: [TOTEM ] Automatically recovered
>> ring 1
>>
>> *Here's corosync.conf:*
>>
>> totem {
>>    version: 2
>>    secauth: off
>>    cluster_name: cluster_greenarrow
>>    rrp_mode: passive
>>    transport: udpu
>> }
>>
>> nodelist {
>>    node {
>>      ring0_addr: node1
>>      ring1_addr: 10.10.10.2
>>      nodeid: 1
>>    }
>>    node {
>>      ring0_addr: node2
>>      ring1_addr: 10.10.10.3
>>      nodeid: 2
>>    }
>>    node {
>>      ring0_addr: node3
>>      nodeid: 3
>>    }
>> }
>>
>> quorum {
>>    provider: corosync_votequorum
>>    two_node: 0
>> }
>>
>> logging {
>>    to_syslog: yes
>> }
>>
>> Thanks,
>>
>> Matt
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started:
>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
> Hi,
> You probably need to replace two_node with wait_for_all to achieve what you want. two_node mode implies the latter, but it weakens the quorum requirement from 50%+1 to just 50%, so Pacemaker sees that your cluster is quorate.
> Best,
> Vladislav
>
>
Thank you for the feedback, Vladislav. I suspect that something else is 
going on, though.

two_node is set to 0, so it's already disabled.

I'm not clear on how wait_for_all relates to the problem that I'm 
seeing, since the problem isn't with establishing the initial quorum.
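
For reference, my reading of the suggestion is a quorum section along 
these lines (untested on my end, since I don't see how it would change 
behavior after quorum has already been established):

quorum {
   provider: corosync_votequorum
   wait_for_all: 1
}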

The problem is that if there are two nodes running which have already 
established quorum, and I power down the node that's not running 
resources, the node that is running resources notes in its log that it 
lost quorum, but does not actually shut down any of its running services 
until cluster-recheck-interval (by default, 15 minutes) elapses.
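
As a stopgap, I could presumably shorten that interval so the stop at 
least happens sooner, for example:

pcs property set cluster-recheck-interval=2min

...but that would only mask the delay rather than explain why the 
quorum-loss notification doesn't trigger a new transition on its own.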



