[Pacemaker] Dependency Loop Errors in Log

Wed Aug 10 00:04:54 EDT 2011

On 09/08/11 02:36, Bobbie Lind wrote:
> I have 6 servers with three sets of 2 failover pairs. So 2 servers for
> one pair, 2 servers for another pair etc.  I am trying to configure this
> under one pacemaker instance.
>
> I changed from using Resource groups because the resources are not
> dependent on each other, just located together.
>
> I have 4 dummy resources that are used to help with colocation.
>
> The following configuration works as designed when I first start up
> pacemaker but when I try and run failover tests that's when things get
> screwy.
>
> Here is the relevant snippet of my configuration that shows the location
> and colocation set up.  As well as what I *think* I am asking it to do.
>
> [...snip...]
>
> ** Ensuring that the resources from one failover node do not start up on
> the other nodes giving -500 points.
> ** failover pairs are MDSgroup, OSS1/OSS3, and OSS2/OSS4
> colocation colocMDSOSS1 -500: anchorOSS1 MDSgroup
> colocation colocMDSOSS2 -500: anchorOSS2 MDSgroup
> colocation colocMDSOSS3 -500: anchorOSS3 MDSgroup
> colocation colocMDSOSS4 -500: anchorOSS4 MDSgroup
> colocation colocOSS1MDS -500: MDSgroup anchorOSS1
> colocation colocOSS2MDS -500: MDSgroup anchorOSS2
> colocation colocOSS3MDS -500: MDSgroup anchorOSS3
> colocation colocOSS4MDS -500: MDSgroup anchorOSS4
> colocation colocOSS2OSS1 -500: anchorOSS1 anchorOSS2
> colocation colocOSS4OSS1 -500: anchorOSS1 anchorOSS4
> colocation colocOSS1OSS2 -500: anchorOSS2 anchorOSS1
> colocation colocOSS3OSS2 -500: anchorOSS2 anchorOSS3
> colocation colocOSS2OSS3 -500: anchorOSS3 anchorOSS2
> colocation colocOSS4OSS3 -500: anchorOSS3 anchorOSS4
> colocation colocOSS1OSS4 -500: anchorOSS4 anchorOSS1
> colocation colocOSS3OSS4 -500: anchorOSS4 anchorOSS3
>
> [...snip...]
>
> One of the issues I am running into is the logs are giving me dependency
> loop errors.  Here is a snippet but it does this for all the
> anchor/dummy resources and the LVM resource (from MDSgroup).
>
> Aug 08 11:05:56 s02ns070 pengine: [32677]: info: rsc_merge_weights:
> anchorOSS1: Breaking dependency loop at MDSgroup
> [...snip...]
>
> I think these dependency loops are what's causing the other quirky
> behavior I have of resources failing to the wrong server.
>
> I'm not sure where the dependency loop is coming from, but I'm sure it
> has something to do with my configuration and score setup.
>
> Any help deciphering this would be greatly appreciated.

You can't have bidirectional colocation, i.e. specify either
"colocation colocMDSOSS1 -500: anchorOSS1 MDSgroup" or "colocation
colocOSS1MDS -500: MDSgroup anchorOSS1", but not both.  The dependency
loop error means Pacemaker is tossing one of these away.  For more
detail, check the Resource Constraints chapter of Pacemaker Explained
(http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/)
or the mailing list archives (this has come up a few times in recent
memory).
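
As a rough, untested sketch using only the names from your snippet, keeping
a single constraint per pair of resources would leave something like the
following.  Which direction survives in each pair is an arbitrary pick on
my part; keep whichever one has the resource you want placed relative to
the other listed first.

```
# One constraint per pair; the reverse-direction duplicates are dropped.
colocation colocMDSOSS1 -500: anchorOSS1 MDSgroup
colocation colocMDSOSS2 -500: anchorOSS2 MDSgroup
colocation colocMDSOSS3 -500: anchorOSS3 MDSgroup
colocation colocMDSOSS4 -500: anchorOSS4 MDSgroup
colocation colocOSS2OSS1 -500: anchorOSS1 anchorOSS2
colocation colocOSS4OSS1 -500: anchorOSS1 anchorOSS4
colocation colocOSS3OSS2 -500: anchorOSS2 anchorOSS3
colocation colocOSS4OSS3 -500: anchorOSS3 anchorOSS4
```

That takes the 16 constraints down to 8, and no resource ends up depending
on itself through a chain of constraints.  Whether the resulting placement
is what you want in your failover tests is something you'd need to check.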

HTH,

Tim
-- 
Tim Serong
Senior Clustering Engineer
SUSE
tserong at suse.com