[ClusterLabs] Opt-in cluster shows resources stopped where no nodes should be considered

Ken Gaillot kgaillot at redhat.com
Fri Mar 4 10:37:18 EST 2016


On 03/04/2016 04:51 AM, Martin Schlegel wrote:
> Hello all
> 
> While our cluster seems to be working just fine, I have noticed something in the
> crm_mon output that I don't quite understand and that is throwing off my
> monitoring a bit, as stopped resources could mean something is wrong. I was
> hoping somebody could help me understand what it means. It seems this might
> have something to do with the fact that I am using remote nodes, but I cannot
> wrap my head around it.
> 
> What I am seeing are 3 additional, unexpected lines in the crm_mon -1rR output
> listing my "p_pgcPgbouncer_test" resources as stopped, even though, in my mind,
> there should not be any more nodes to be considered (opt-in cluster, see
> location rules). At the same time, this is not happening to my p_pgsqln
> resources, as shown at the top of the crm_mon output.

There are two things to look at here: the crm_mon options, and the
clone-max property.

-r means "show inactive resources", and -R means "show more detail". For
clones, this will show all clone instances individually, even if they
can't currently run anywhere due to a constraint. Don't use those
options if you don't want to see that level of detail.
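
For example, a rough illustration of the difference (options as documented
in the crm_mon man page):

  crm_mon -1      # one-shot summary; inactive resources are not listed
  crm_mon -1rR    # also list inactive resources and individual clone instances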

clone-max defaults to the number of nodes. I'm guessing you let it
default, so Pacemaker will actually prepare 5 clone instances, even
though only 2 of them can run under the current constraints. Setting
clone-max=2 on the clone resource would make the extra instances go away.
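
As a rough sketch (untested, adjust to your actual clone definition), the
clone could end up looking like this in "crm configure show" after adding
the meta attribute, e.g. via "crm configure edit cl_pgcPgbouncer":

  clone cl_pgcPgbouncer p_pgcPgbouncer_test \
          meta clone-max=2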

> The important crm_mon -1rR output lines further below are marked with
> arrows ->   <---.
> 
> 
> Some background on the policy:
> We are running an asymmetric / opt-in cluster (property symmetric-cluster=false).
> 
> 
> The cluster's main purpose is to take care of a replicating master/slave
> database across 3+ nodes, running strictly on nodes pg1, pg2 and pg3 per
> location rule l_pgs_resources.
> 
> We also have 2 remote nodes pgalog1 & pgalog2 defined to control database
> connection pooler resources (p_pgcPgbouncer_test) to facilitate client
> connection rerouting as per location rule l_pgc_resources.
> 
> 
> crm_mon -1rR output:
> 
> Last updated: Fri Mar  4 09:56:02 2016          Last change: Fri Mar  4 09:55:47 2016 by root via cibadmin on pg1
> Stack: corosync
> Current DC: pg1 (1) (version 1.1.14-70404b0) - partition with quorum
> 5 nodes and 29 resources configured
> 
> Online: [ pg1 (1) pg2 (2) pg3 (3) ]
> RemoteOnline: [ pgalog1 pgalog2 ]
> 
> Full list of resources:
> 
>  Master/Slave Set: ms_pgsqln [p_pgsqln]
>      p_pgsqln   (ocf::heartbeat:pgsqln):        Master pg3
>      p_pgsqln   (ocf::heartbeat:pgsqln):        Started pg1
>      p_pgsqln   (ocf::heartbeat:pgsqln):        Started pg2
> -> NO additional lines here <---
>      Masters: [ pg3 ]
>      Stopped: [ pg1 pg2 ]
> [...]
>  pgalog1        (ocf::pacemaker:remote):        Started pg1
>  pgalog2        (ocf::pacemaker:remote):        Started pg3
>  Clone Set: cl_pgcPgbouncer [p_pgcPgbouncer_test]
>      p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Started pgalog1
>      p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Started pgalog2
> ->   p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Stopped    <----
> ->   p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Stopped    <----
> ->   p_pgcPgbouncer_test        (ocf::heartbeat:pgbouncer):     Stopped    <----
>      Started: [ pgalog1 pgalog2 ]
> <! End of Output !>
> 
> 
> Here are the most important parts of the configuration as shown in "crm
> configure show":
> 
> [...]
> primitive pgalog1 ocf:pacemaker:remote \
> 	params server=pgalog1 port=3121 \
> 	meta target-role=Started
> primitive pgalog2 ocf:pacemaker:remote \
> 	params server=pgalog2 port=3121 \
> 	meta target-role=Started
> [...]
> location l_pgc_resources { cl_pgcPgbouncer } resource-discovery=exclusive \
> 	rule #uname eq pgalog1 \
> 	rule #uname eq pgalog2
> 
> location l_pgs_resources { cl_pgsServices1 ms_pgsqln p_pgsBackupjob pgalog1 pgalog2 } resource-discovery=exclusive \
> 	rule #uname eq pg1 \
> 	rule #uname eq pg2 \
> 	rule #uname eq pg3
> 
> [...]
> property cib-bootstrap-options: \
> 	symmetric-cluster=false \
> [...]
> 
> 
> Regards,
> Martin Schlegel




