[ClusterLabs] Pacemaker quorum behavior

Sat Sep 10 00:14:12 CEST 2016

On 09/09/2016 04:27 AM, Klaus Wenninger wrote:
> On 09/08/2016 07:31 PM, Scott Greenlese wrote:
>>
>> Hi Klaus, thanks for your prompt and thoughtful feedback...
>>
>> Please see my answers nested below (sections entitled, "Scott's
>> Reply"). Thanks!
>>
>> - Scott
>>
>>
>> Scott Greenlese ... IBM Solutions Test, Poughkeepsie, N.Y.
>> INTERNET: swgreenl at us.ibm.com
>> PHONE: 8/293-7301 (845-433-7301) M/S: POK 42HA/P966
>>
>>
>> Inactive hide details for Klaus Wenninger ---09/08/2016 10:59:27
>> AM---On 09/08/2016 03:55 PM, Scott Greenlese wrote: >Klaus Wenninger
>> ---09/08/2016 10:59:27 AM---On 09/08/2016 03:55 PM, Scott Greenlese
>> wrote: >
>>
>> From: Klaus Wenninger <kwenning at redhat.com>
>> To: users at clusterlabs.org
>> Date: 09/08/2016 10:59 AM
>> Subject: Re: [ClusterLabs] Pacemaker quorum behavior
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>> On 09/08/2016 03:55 PM, Scott Greenlese wrote:
>> >
>> > Hi all...
>> >
>> > I have a few very basic questions for the group.
>> >
>> > I have a 5 node (Linux on Z LPARs) pacemaker cluster with 100
>> > VirtualDomain pacemaker-remote nodes
>> > plus 100 "opaque" VirtualDomain resources. The cluster is configured
>> > to be 'symmetric' and I have no
>> > location constraints on the 200 VirtualDomain resources (other than to
>> > prevent the opaque guests
>> > from running on the pacemaker remote node resources). My quorum is set
>> > as:
>> >
>> > quorum {
>> > provider: corosync_votequorum
>> > }
>> >
>> > As an experiment, I powered down one LPAR in the cluster, leaving 4
>> > powered up with the pcsd service up on the 4 survivors
>> > but corosync/pacemaker down (pcs cluster stop --all) on the 4
>> > survivors. I then started pacemaker/corosync on a single cluster
>> >
>>
>> "pcs cluster stop" shuts down pacemaker & corosync on my test-cluster but
>> did you check the status of the individual services?
>>
>> Scott's reply:
>>
>> No, I only assumed that pacemaker was down because I got this back on
>> my pcs status
>> command from each cluster node:
>>
>> [root at zs95kj VD]# date;for host in zs93KLpcs1 zs95KLpcs1 zs95kjpcs1
>> zs93kjpcs1 ; do ssh $host pcs status; done
>> Wed Sep 7 15:49:27 EDT 2016
>> Error: cluster is not currently running on this node
>> Error: cluster is not currently running on this node
>> Error: cluster is not currently running on this node
>> Error: cluster is not currently running on this node

In my experience, this is sufficient to say that pacemaker and corosync
aren't running.

>>
>> What else should I check?  The pcsd.service service was still up,
>> since I didn't not stop that
>> anywhere. Should I have done,  ps -ef |grep -e pacemaker -e corosync
>>  to check the state before
>> assuming it was really down?
>>
>>
> Guess the answer from Poki should guide you well here ...
>>
>>
>> > node (pcs cluster start), and this resulted in the 200 VirtualDomain
>> > resources activating on the single node.
>> > This was not what I was expecting. I assumed that no resources would
>> > activate / start on any cluster nodes
>> > until 3 out of the 5 total cluster nodes had pacemaker/corosync running.

Your expectation is correct; I'm not sure what happened in this case.
There are some obscure corosync options (e.g. last_man_standing,
allow_downscale) that could theoretically lead to this, but I don't get
the impression you're using anything unusual.

>> > After starting pacemaker/corosync on the single host (zs95kjpcs1),
>> > this is what I see :
>> >
>> > [root at zs95kj VD]# date;pcs status |less
>> > Wed Sep 7 15:51:17 EDT 2016
>> > Cluster name: test_cluster_2
>> > Last updated: Wed Sep 7 15:51:18 2016 Last change: Wed Sep 7 15:30:12
>> > 2016 by hacluster via crmd on zs93kjpcs1
>> > Stack: corosync
>> > Current DC: zs95kjpcs1 (version 1.1.13-10.el7_2.ibm.1-44eb2dd) -
>> > partition with quorum
>> > 106 nodes and 304 resources configured
>> >
>> > Node zs93KLpcs1: pending
>> > Node zs93kjpcs1: pending
>> > Node zs95KLpcs1: pending
>> > Online: [ zs95kjpcs1 ]
>> > OFFLINE: [ zs90kppcs1 ]
>> >
>> > .
>> > .
>> > .
>> > PCSD Status:
>> > zs93kjpcs1: Online
>> > zs95kjpcs1: Online
>> > zs95KLpcs1: Online
>> > zs90kppcs1: Offline
>> > zs93KLpcs1: Online

FYI the Online/Offline above refers only to pcsd, which doesn't have any
effect on the cluster itself -- just the ability to run pcs commands.

>> > So, what exactly constitutes an "Online" vs. "Offline" cluster node
>> > w.r.t. quorum calculation? Seems like in my case, it's "pending" on 3
>> > nodes,
>> > so where does that fall? Any why "pending"? What does that mean?

"pending" means that the node has joined the corosync cluster (which
allows it to contribute to quorum), but it has not yet completed the
pacemaker join process (basically a handshake with the DC).

I think the corosync and pacemaker detail logs would be essential to
figuring out what's going on. Check the logs on the "pending" nodes to
see whether corosync somehow started up by this point, and check the
logs on this node to see what the most recent references to the pending
nodes were.

>> > Also, what exactly is the cluster's expected reaction to quorum loss?
>> > Cluster resources will be stopped or something else?
>> >
>> Depends on how you configure it using cluster property no-quorum-policy
>> (default: stop).
>>
>> Scott's reply:
>>
>> This is how the policy is configured:
>>
>> [root at zs95kj VD]# date;pcs config |grep quorum
>> Thu Sep  8 13:18:33 EDT 2016
>>  no-quorum-policy: stop
>>
>> What should I expect with the 'stop' setting?
>>
>>
>> >
>> >
>> > Where can I find this documentation?
>> >
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/
>>
>> Scott's reply:
>>
>> OK, I'll keep looking thru this doc, but I don't easily find the
>> no-quorum-policy explained.
>>
> Well, the index leads you to:
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-cluster-options.html
> where you find an exhaustive description of the option.
> 
> In short:
> you are running the default and that leads to all resources being
> stopped in a partition without quorum
> 
>> Thanks..
>>
>>
>> >
>> >
>> > Thanks!
>> >
>> > Scott Greenlese - IBM Solution Test Team.
>> >
>> >
>> >
>> > Scott Greenlese ... IBM Solutions Test, Poughkeepsie, N.Y.
>> > INTERNET: swgreenl at us.ibm.com
>> > PHONE: 8/293-7301 (845-433-7301) M/S: POK 42HA/P966