[ClusterLabs] Pacemaker quorum behavior

Scott Greenlese swgreenl at us.ibm.com
Thu Sep 8 15:55:28 CEST 2016


Hi all...

I have a few very basic questions for the group.

I have a 5 node (Linux on Z LPARs) pacemaker cluster with 100 VirtualDomain
pacemaker-remote nodes
plus 100 "opaque" VirtualDomain resources. The cluster is configured to be
'symmetric' and I have no
location constraints on the 200 VirtualDomain resources (other than to
prevent the opaque guests
from running on the pacemaker remote node resources).  My quorum is set as:

quorum {
    provider: corosync_votequorum
}

As an experiment, I powered down one LPAR in the cluster, leaving 4 powered
up with the pcsd service up on the 4 survivors
but corosync/pacemaker down (pcs cluster stop --all) on the 4 survivors.
I then started pacemaker/corosync on a single cluster
node (pcs cluster start), and this resulted in the 200 VirtualDomain
resources activating on the single node.
This was not what I was expecting.  I assumed that no resources would
activate / start on any cluster nodes
until 3 out of the 5 total cluster nodes had pacemaker/corosync running.

After starting pacemaker/corosync on the single host (zs95kjpcs1), this is
what I see :

[root at zs95kj VD]# date;pcs status |less
Wed Sep  7 15:51:17 EDT 2016
Cluster name: test_cluster_2
Last updated: Wed Sep  7 15:51:18 2016          Last change: Wed Sep  7
15:30:12 2016 by hacluster via crmd on zs93kjpcs1
Stack: corosync
Current DC: zs95kjpcs1 (version 1.1.13-10.el7_2.ibm.1-44eb2dd) - partition
with quorum
106 nodes and 304 resources configured

Node zs93KLpcs1: pending
Node zs93kjpcs1: pending
Node zs95KLpcs1: pending
Online: [ zs95kjpcs1 ]
OFFLINE: [ zs90kppcs1 ]

.
.
.
PCSD Status:
  zs93kjpcs1: Online
  zs95kjpcs1: Online
  zs95KLpcs1: Online
  zs90kppcs1: Offline
  zs93KLpcs1: Online

So, what exactly constitutes an "Online" vs. "Offline" cluster node w.r.t.
quorum calculation?   Seems like in my case, it's "pending" on 3 nodes,
so where does that fall?   Any why "pending"?  What does that mean?

Also, what exactly is the cluster's expected reaction to quorum loss?
Cluster resources will be stopped or something else?

Where can I find this documentation?

Thanks!

Scott Greenlese -  IBM Solution Test Team.



Scott Greenlese ... IBM Solutions Test,  Poughkeepsie, N.Y.
  INTERNET:  swgreenl at us.ibm.com
  PHONE:  8/293-7301 (845-433-7301)    M/S:  POK 42HA/P966
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://clusterlabs.org/pipermail/users/attachments/20160908/26acee89/attachment.html>


More information about the Users mailing list