Opt-in cluster shows resources stopped where no nodes should be considered

Martin Schlegel martin at nuboreto.org
Fri Mar 4 05:51:45 EST 2016

Hello all

While our cluster seems to be working just fine, I have noticed something in
the crm_mon output that I don't quite understand. It is throwing off my
monitoring a bit, since stopped resources could mean something is wrong. I was
hoping somebody could help me understand what it means. It might have
something to do with the fact that I am using remote nodes, but I cannot wrap
my head around it.

What I am seeing are 3 additional, unexpected lines in the crm_mon -1rR output
listing my "p_pgcPgbouncer_test" resources as stopped, even though, as far as I
can tell, there should not be any more nodes to consider (opt-in cluster, see
location rules). At the same time this is not happening to my p_pgsqln
resources, as shown at the top of the crm_mon output.

The important lines in the crm_mon -1rR output further below are marked with
arrows (->).

Some background on the policy:
We are running an asymmetric / opt-in cluster (property symmetric-cluster=false).

The cluster's main purpose is to manage a replicating master / slave database
with 3+ nodes, running strictly on nodes pg1, pg2 and pg3 as per location rule
l_pgs_resources.

We also have 2 remote nodes, pgalog1 and pgalog2, defined to run the database
connection pooler resources (p_pgcPgbouncer_test), which facilitate rerouting
client connections, as per location rule l_pgc_resources.

crm_mon -1rR output:

Last updated: Fri Mar  4 09:56:02 2016          Last change: Fri Mar  4 09:55:47 2016 by root via cibadmin on pg1
Stack: corosync
Current DC: pg1 (1) (version 1.1.14-70404b0) - partition with quorum
5 nodes and 29 resources configured

Online: [ pg1 (1) pg2 (2) pg3 (3) ]
RemoteOnline: [ pgalog1 pgalog2 ]

Full list of resources:

 Master/Slave Set: ms_pgsqln [p_pgsqln]
     p_pgsqln   (ocf::heartbeat:pgsqln):        Master pg3
     p_pgsqln   (ocf::heartbeat:pgsqln):        Started pg1
     p_pgsqln   (ocf::heartbeat:pgsqln):        Started pg2
-> NO additional lines here <---
     Masters: [ pg3 ]
     Stopped: [ pg1 pg2 ]
 pgalog1        (ocf::pacemaker:remote):        Started pg1
 pgalog2        (ocf::pacemaker:remote):        Started pg3
 Clone Set: cl_pgcPgbouncer [p_pgcPgbouncer_test]
     p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Started pgalog1
     p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Started pgalog2
->   p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Stopped
->   p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Stopped
->   p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Stopped
     Started: [ pgalog1 pgalog2 ]
<! End of Output !>
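
To make the monitoring concern concrete, here is a minimal sketch of a check
that would trip on the three extra lines. It assumes the check simply scans the
per-instance lines of crm_mon -1rR and counts states per resource. The names
INSTANCE_RE, count_states and SAMPLE are hypothetical and chosen for
illustration; this is not the monitoring actually in use here.

```python
import re

# Matches per-instance lines such as
#   "p_pgcPgbouncer_test  (ocf::heartbeat:pgbouncer):  Started pgalog1"
# and tolerates the "->" markers added by hand in this post.
INSTANCE_RE = re.compile(
    r"^\s*(?:->\s*)?(\S+)\s+\((\S+)\):\s+(\S+)(?:\s+(\S+))?\s*$"
)


def count_states(crm_mon_text):
    """Return {resource: {state: count}} for the per-instance lines."""
    counts = {}
    for line in crm_mon_text.splitlines():
        match = INSTANCE_RE.match(line)
        if not match:
            continue  # headers, summary lines such as "Started: [ ... ]"
        name, _agent, state, _node = match.groups()
        per_resource = counts.setdefault(name, {})
        per_resource[state] = per_resource.get(state, 0) + 1
    return counts


# Excerpt of the output shown above (arrow markers removed).
SAMPLE = """\
 Clone Set: cl_pgcPgbouncer [p_pgcPgbouncer_test]
     p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Started pgalog1
     p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Started pgalog2
     p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Stopped
     p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Stopped
     p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Stopped
     Started: [ pgalog1 pgalog2 ]
"""

counts = count_states(SAMPLE)
print(counts)
```

A check built this way cannot tell an instance that failed to start from one
that merely has no node to run on, which is why the extra lines matter.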

Here are the most important parts of the configuration as shown in "crm
configure show":

primitive pgalog1 ocf:pacemaker:remote \
	params server=pgalog1 port=3121 \
	meta target-role=Started
primitive pgalog2 ocf:pacemaker:remote \
	params server=pgalog2 port=3121 \
	meta target-role=Started
location l_pgc_resources { cl_pgcPgbouncer } resource-discovery=exclusive \
	rule #uname eq pgalog1 \
	rule #uname eq pgalog2

location l_pgs_resources { cl_pgsServices1 ms_pgsqln p_pgsBackupjob pgalog1 pgalog2 } resource-discovery=exclusive \
	rule #uname eq pg1 \
	rule #uname eq pg2 \
	rule #uname eq pg3
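
The expectation behind these rules can be written down as a small model. It is
only a sketch of the reading described in this post, namely that with
symmetric-cluster=false a resource is eligible only on the nodes its location
rules name. It does not reproduce how Pacemaker actually computes placement,
and the ALLOWED table and eligible_nodes helper are hypothetical names
introduced for illustration.

```python
# Location rules from the configuration above, reduced to
# "resource -> node names matched by its rules" (a simplification).
ALLOWED = {
    "cl_pgcPgbouncer": {"pgalog1", "pgalog2"},  # l_pgc_resources
    "ms_pgsqln": {"pg1", "pg2", "pg3"},         # l_pgs_resources
}


def eligible_nodes(resource):
    """Nodes a resource may run on under the opt-in reading:
    only nodes explicitly matched by a location rule."""
    return sorted(ALLOWED.get(resource, set()))


for res in ("ms_pgsqln", "cl_pgcPgbouncer"):
    nodes = eligible_nodes(res)
    print(f"{res}: {len(nodes)} eligible -> {nodes}")
```

Under this reading cl_pgcPgbouncer has 2 eligible nodes, yet crm_mon lists 5
instances of p_pgcPgbouncer_test (2 started, 3 stopped), while ms_pgsqln shows
exactly 3 instances for its 3 eligible nodes.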

property cib-bootstrap-options: \
	symmetric-cluster=false \

Martin Schlegel

