<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    I'm attempting to upgrade a two-node cluster with no quorum
    requirement to a three-node cluster with a two-member quorum
    requirement. Each node is running CentOS 7, Pacemaker 1.1.12-22, and
    Corosync 2.3.4-4.<br>
    <br>
    If a node that's running resources loses quorum, I want it to
    stop all of its resources. I partially accomplished this by
    setting the following in corosync.conf:<br>
    <br>
    quorum {<br>
      provider: corosync_votequorum<br>
      two_node: 1<br>
    }<br>
    <br>
    ...and updating Pacemaker's configuration with:<br>
    <br>
    pcs property set no-quorum-policy=stop<br>
    <br>
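    For reference, I believe these are the right commands to
    double-check that both settings are live (happy to post their
    actual output if that would help):<br>
    <br>
    # runtime quorum state as corosync sees it (votes, quorate flag)<br>
    corosync-quorumtool -s<br>
    <br>
    # the cluster property Pacemaker acts on when quorum is lost<br>
    crm_attribute --type crm_config --name no-quorum-policy --query<br>
    <br>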
    With the above configuration, two failure scenarios work as I would
    expect:<br>
    <br>
    1. If I power up a single node, it sees that there is no quorum, and
    refuses to start any resources until it sees a second node come up.<br>
    <br>
    2. If there are two nodes running, and I power down a node that's
    running resources, the other node sees that it lost quorum, and
    refuses to start any resources.<br>
    <br>
    However, a third failure scenario does not work as I would expect:<br>
    <br>
    3. If there are two nodes running, and I power down a node that's
    not running resources, the node that is running resources notes in
    its log that it lost quorum, but does not actually shut down any of
    its running services.<br>
    <br>
    Any ideas on what the problem may be would be greatly appreciated.
    In case it helps, I've included the output of "pcs status" and "pcs
    config show", the contents of "corosync.conf", and the Pacemaker and
    Corosync logs from the period during which resources were not
    stopped.<br>
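    <br>
    One more data point I can gather on the next test is whether the
    CIB on the surviving node actually records the quorum loss; my
    understanding is that the following should report have-quorum="0"
    once quorum is gone (please correct me if that's not the right
    place to look):<br>
    <br>
    cibadmin --query | grep -o 'have-quorum="[01]"'<br>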
    <br>
    <b>"pcs status" shows the resources still running after quorum is
      lost:</b><br>
    <br>
    Cluster name:<br>
    Last updated: Thu May 28 10:27:47 2015<br>
    Last change: Thu May 28 10:03:05 2015<br>
    Stack: corosync<br>
    Current DC: node1 (1) - partition WITHOUT quorum<br>
    Version: 1.1.12-a14efad<br>
    3 Nodes configured<br>
    12 Resources configured<br>
    <br>
    <br>
    Node node3 (3): OFFLINE (standby)<br>
    Online: [ node1 ]<br>
    OFFLINE: [ node2 ]<br>
    <br>
    Full list of resources:<br>
    <br>
     Resource Group: primary<br>
         virtual_ip_primary    (ocf::heartbeat:IPaddr2):    Started
    node1<br>
         GreenArrowFS    (ocf::heartbeat:Filesystem):    Started node1<br>
         GreenArrow    (ocf::drh:greenarrow):    Started node1<br>
         virtual_ip_1    (ocf::heartbeat:IPaddr2):    Started node1<br>
         virtual_ip_2    (ocf::heartbeat:IPaddr2):    Started node1<br>
     Resource Group: secondary<br>
         virtual_ip_secondary    (ocf::heartbeat:IPaddr2):    Stopped<br>
         GreenArrow-Secondary    (ocf::drh:greenarrow-secondary):   
    Stopped<br>
     Clone Set: ping-clone [ping]<br>
         Started: [ node1 ]<br>
         Stopped: [ node2 node3 ]<br>
     Master/Slave Set: GreenArrowDataClone [GreenArrowData]<br>
         Masters: [ node1 ]<br>
         Stopped: [ node2 node3 ]<br>
    <br>
    PCSD Status:<br>
      node1: Online<br>
      node2: Offline<br>
      node3: Offline<br>
    <br>
    Daemon Status:<br>
      corosync: active/enabled<br>
      pacemaker: active/enabled<br>
      pcsd: active/enabled<br>
    <br>
    <b>"pcs config show"</b><b> shows that the "no-quorum-policy: stop"
      setting is in place:</b><br>
    <br>
    Cluster Name:<br>
    Corosync Nodes:<br>
     node1 node2 node3<br>
    Pacemaker Nodes:<br>
     node1 node2 node3<br>
    <br>
    Resources:<br>
     Group: primary<br>
      Resource: virtual_ip_primary (class=ocf provider=heartbeat
    type=IPaddr2)<br>
       Attributes: ip=10.10.10.1 cidr_netmask=32<br>
       Operations: start interval=0s timeout=20s
    (virtual_ip_primary-start-timeout-20s)<br>
                   stop interval=0s timeout=20s
    (virtual_ip_primary-stop-timeout-20s)<br>
                   monitor interval=30s
    (virtual_ip_primary-monitor-interval-30s)<br>
      Resource: GreenArrowFS (class=ocf provider=heartbeat
    type=Filesystem)<br>
       Attributes: device=/dev/drbd1 directory=/media/drbd1 fstype=xfs
    options=noatime,discard<br>
       Operations: start interval=0s timeout=60
    (GreenArrowFS-start-timeout-60)<br>
                   stop interval=0s timeout=60
    (GreenArrowFS-stop-timeout-60)<br>
                   monitor interval=20 timeout=40
    (GreenArrowFS-monitor-interval-20)<br>
      Resource: GreenArrow (class=ocf provider=drh type=greenarrow)<br>
       Operations: start interval=0s timeout=30
    (GreenArrow-start-timeout-30)<br>
                   stop interval=0s timeout=240
    (GreenArrow-stop-timeout-240)<br>
                   monitor interval=10 timeout=20
    (GreenArrow-monitor-interval-10)<br>
      Resource: virtual_ip_1 (class=ocf provider=heartbeat type=IPaddr2)<br>
       Attributes: ip=64.21.76.51 cidr_netmask=32<br>
       Operations: start interval=0s timeout=20s
    (virtual_ip_1-start-timeout-20s)<br>
                   stop interval=0s timeout=20s
    (virtual_ip_1-stop-timeout-20s)<br>
                   monitor interval=30s
    (virtual_ip_1-monitor-interval-30s)<br>
      Resource: virtual_ip_2 (class=ocf provider=heartbeat type=IPaddr2)<br>
       Attributes: ip=64.21.76.63 cidr_netmask=32<br>
       Operations: start interval=0s timeout=20s
    (virtual_ip_2-start-timeout-20s)<br>
                   stop interval=0s timeout=20s
    (virtual_ip_2-stop-timeout-20s)<br>
                   monitor interval=30s
    (virtual_ip_2-monitor-interval-30s)<br>
     Group: secondary<br>
      Resource: virtual_ip_secondary (class=ocf provider=heartbeat
    type=IPaddr2)<br>
       Attributes: ip=10.10.10.4 cidr_netmask=32<br>
       Operations: start interval=0s timeout=20s
    (virtual_ip_secondary-start-timeout-20s)<br>
                   stop interval=0s timeout=20s
    (virtual_ip_secondary-stop-timeout-20s)<br>
                   monitor interval=30s
    (virtual_ip_secondary-monitor-interval-30s)<br>
      Resource: GreenArrow-Secondary (class=ocf provider=drh
    type=greenarrow-secondary)<br>
       Operations: start interval=0s timeout=30
    (GreenArrow-Secondary-start-timeout-30)<br>
                   stop interval=0s timeout=240
    (GreenArrow-Secondary-stop-timeout-240)<br>
                   monitor interval=10 timeout=20
    (GreenArrow-Secondary-monitor-interval-10)<br>
     Clone: ping-clone<br>
      Resource: ping (class=ocf provider=pacemaker type=ping)<br>
       Attributes: dampen=30s multiplier=1000 host_list=64.21.76.1<br>
       Operations: start interval=0s timeout=60 (ping-start-timeout-60)<br>
                   stop interval=0s timeout=20 (ping-stop-timeout-20)<br>
                   monitor interval=10 timeout=60
    (ping-monitor-interval-10)<br>
     Master: GreenArrowDataClone<br>
      Meta Attrs: master-max=1 master-node-max=1 clone-max=2
    clone-node-max=1 notify=true<br>
      Resource: GreenArrowData (class=ocf provider=linbit type=drbd)<br>
       Attributes: drbd_resource=r0<br>
       Operations: start interval=0s timeout=240
    (GreenArrowData-start-timeout-240)<br>
                   promote interval=0s timeout=90
    (GreenArrowData-promote-timeout-90)<br>
                   demote interval=0s timeout=90
    (GreenArrowData-demote-timeout-90)<br>
                   stop interval=0s timeout=100
    (GreenArrowData-stop-timeout-100)<br>
                   monitor interval=60s
    (GreenArrowData-monitor-interval-60s)<br>
    <br>
    Stonith Devices:<br>
    Fencing Levels:<br>
    <br>
    Location Constraints:<br>
      Resource: primary<br>
        Enabled on: node1 (score:INFINITY)
    (id:location-primary-node1-INFINITY)<br>
        Constraint: location-primary<br>
          Rule: score=-INFINITY boolean-op=or 
    (id:location-primary-rule)<br>
            Expression: pingd lt 1  (id:location-primary-rule-expr)<br>
            Expression: not_defined pingd 
    (id:location-primary-rule-expr-1)<br>
    Ordering Constraints:<br>
      promote GreenArrowDataClone then start GreenArrowFS
    (kind:Mandatory)
    (id:order-GreenArrowDataClone-GreenArrowFS-mandatory)<br>
      stop GreenArrowFS then demote GreenArrowDataClone (kind:Mandatory)
    (id:order-GreenArrowFS-GreenArrowDataClone-mandatory)<br>
    Colocation Constraints:<br>
      GreenArrowFS with GreenArrowDataClone (score:INFINITY)
    (with-rsc-role:Master)
    (id:colocation-GreenArrowFS-GreenArrowDataClone-INFINITY)<br>
      virtual_ip_secondary with GreenArrowDataClone (score:INFINITY)
    (with-rsc-role:Slave)
    (id:colocation-virtual_ip_secondary-GreenArrowDataClone-INFINITY)<br>
      virtual_ip_primary with GreenArrowDataClone (score:INFINITY)
    (with-rsc-role:Master)
    (id:colocation-virtual_ip_primary-GreenArrowDataClone-INFINITY)<br>
    <br>
    Cluster Properties:<br>
     cluster-infrastructure: corosync<br>
     cluster-name: cluster_greenarrow<br>
     dc-version: 1.1.12-a14efad<br>
     have-watchdog: false<br>
     no-quorum-policy: stop<br>
     stonith-enabled: false<br>
    Node Attributes:<br>
     node3: standby=on<br>
    <br>
    <b>Here's what was logged:</b><br>
    <br>
    May 28 10:19:51 node1 pengine[1296]: notice: stage6: Scheduling Node
    node3 for shutdown<br>
    May 28 10:19:51 node1 pengine[1296]: notice: process_pe_message:
    Calculated Transition 7: /var/lib/pacemaker/pengine/pe-input-992.bz2<br>
    May 28 10:19:51 node1 crmd[1297]: notice: run_graph: Transition 7
    (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0,
    Source=/var/lib/pacemaker/pengine/pe-input-992.bz2): Complete<br>
    May 28 10:19:51 node1 crmd[1297]: notice: do_state_transition: State
    transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
    cause=C_FSA_INTERNAL origin=notify_crmd ]<br>
    May 28 10:19:51 node1 crmd[1297]: notice: peer_update_callback:
    do_shutdown of node3 (op 64) is complete<br>
    May 28 10:19:51 node1 attrd[1295]: notice: crm_update_peer_state:
    attrd_peer_change_cb: Node node3[3] - state is now lost (was member)<br>
    May 28 10:19:51 node1 attrd[1295]: notice: attrd_peer_remove:
    Removing all node3 attributes for attrd_peer_change_cb<br>
    May 28 10:19:51 node1 attrd[1295]: notice: attrd_peer_change_cb:
    Lost attribute writer node3<br>
    May 28 10:19:51 node1 corosync[1040]: [TOTEM ] Membership left list
    contains incorrect address. This is sign of misconfiguration between
    nodes!<br>
    May 28 10:19:51 node1 corosync[1040]: [TOTEM ] A new membership
    (64.21.76.61:25740) was formed. Members left: 3<br>
    May 28 10:19:51 node1 corosync[1040]: [QUORUM] This node is within
    the non-primary component and will NOT provide any services.<br>
    May 28 10:19:51 node1 corosync[1040]: [QUORUM] Members[1]: 1<br>
    May 28 10:19:51 node1 corosync[1040]: [MAIN  ] Completed service
    synchronization, ready to provide service.<br>
    May 28 10:19:51 node1 crmd[1297]: notice: pcmk_quorum_notification:
    Membership 25740: quorum lost (1)<br>
    May 28 10:19:51 node1 crmd[1297]: notice: crm_update_peer_state:
    pcmk_quorum_notification: Node node3[3] - state is now lost (was
    member)<br>
    May 28 10:19:51 node1 crmd[1297]: notice: peer_update_callback:
    do_shutdown of node3 (op 64) is complete<br>
    May 28 10:19:51 node1 pacemakerd[1254]: notice:
    pcmk_quorum_notification: Membership 25740: quorum lost (1)<br>
    May 28 10:19:51 node1 pacemakerd[1254]: notice:
    crm_update_peer_state: pcmk_quorum_notification: Node node3[3] -
    state is now lost (was member)<br>
    May 28 10:19:52 node1 corosync[1040]: [TOTEM ] Automatically
    recovered ring 1<br>
    <br>
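    If it would help, I can also replay the policy engine's view of
    this transition from the pe-input file referenced above (assuming
    it's still on disk) and post the results, along the lines of:<br>
    <br>
    crm_simulate --simulate --xml-file /var/lib/pacemaker/pengine/pe-input-992.bz2<br>
    <br>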
    <b>Here's corosync.conf:</b><br>
    <br>
    totem {<br>
      version: 2<br>
      secauth: off<br>
      cluster_name: cluster_greenarrow<br>
      rrp_mode: passive<br>
      transport: udpu<br>
    }<br>
    <br>
    nodelist {<br>
      node {<br>
        ring0_addr: node1<br>
        ring1_addr: 10.10.10.2<br>
        nodeid: 1<br>
      }<br>
      node {<br>
        ring0_addr: node2<br>
        ring1_addr: 10.10.10.3<br>
        nodeid: 2<br>
      }<br>
      node {<br>
        ring0_addr: node3<br>
        nodeid: 3<br>
      }<br>
    }<br>
    <br>
    quorum {<br>
      provider: corosync_votequorum<br>
      two_node: 0<br>
    }<br>
    <br>
    logging {<br>
      to_syslog: yes<br>
    }<br>
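    <br>
    For what it's worth, my expectation with the three-node nodelist
    above is that votequorum derives:<br>
    <br>
    expected_votes = 3 (one vote per nodelist entry)<br>
    quorum = floor(3/2) + 1 = 2<br>
    <br>
    ...so a single surviving node should be inquorate, which matches
    what corosync logs, yet the resources stay running. Please correct
    me if that math or my reading of votequorum is off.<br>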
    <br>
    Thanks,<br>
    <br>
    Matt<br>
  </body>
</html>