[Pacemaker] migration-threshold causing unnecessary restart of underlying resources

Thu Aug 12 12:46:17 EDT 2010

Hi,

On Thu, Aug 12, 2010 at 04:12:02AM +0200, Cnut Jansen wrote:
>  Hi,
> 
> I'm once again seeing what is (imho) strange behaviour, or rather
> decision-making, by Pacemaker, and I hope that someone can either
> explain it to me (its intention and/or a possible misconfiguration)
> or confirm it as a possible bug.
> 
> Basically I have a cluster of 2 nodes with cloned DLM-, O2CB-,
> DRBD-, mount-resources, and a MySQL-resource (grouped with an
> IPaddr-resource) running on top of the other ones.
> The MySQL(-group)-resource depends on the mount-resource, which
> depends on both, the DRBD- and the O2CB-resources equally, and the
> O2CB-resource depends on the DLM-resource.
> cloneDlm -> cloneO2cb -\
>                         }-> cloneMountMysql -> mysql / grpMysql( mysql -> ipMysql )
> msDrbdMysql -----------/
> Furthermore for the MySQL(-group)-resource I set meta-attributes
> "migration-threshold=1" and "failure-timeout=90" (later also tried
> settings "3" and "130" for these).
> 
> Now I provoked a failure of mysql using "crm_resource -F -r mysql -H
> <node>", expecting that only mysql, or rather its group (tested
> both configurations; same result), would be stopped (and moved over
> to the other node).
> But actually not only mysql/grpMysql was stopped, but also the
> mount- and even the DRBD-resources were stopped, and upon restarting
> them the DRBD-resource was left as slave (thus the mount of course
> wasn't allowed to restart either) and - back then before I set
> cluster-recheck-interval=2m - didn't seem to even try to promote
> back to master (didn't wait cluster-recheck-interval's default 15m).
> 
> Now through a lot of testing I found out that:
> a) the stops/restarts of the underlying resources happen only when
> the fail counter hits the limit set by migration-threshold; i.e. when set
> to 3, on the first 2 failures only mysql/grpMysql is restarted on the
> same node, and only on the 3rd one are the underlying resources left in a
> mess (while mysql/grpMysql migrates). (Reproducible for DRBD; unsure
> about the DLM/O2CB side, but there's sometimes serious trouble there too
> after provoking a mysql failure; I just couldn't definitively link it yet.)

The migration-threshold shouldn't in any way influence resources
which don't depend on the resource which fails over. Couldn't
reproduce it here with our example RAs.

BTW, what's the point of cloneMountMysql? If it can run only
where drbd is master, then it can run on one node only:

colocation colocMountMysql_drbd inf: cloneMountMysql msDrbdMysql:Master
order orderMountMysql_drbd inf: msDrbdMysql:promote cloneMountMysql:start

Right? At least that's how it behaves here, with the tip of
1.1.2.

> b) upon causing mysql/grpMysql's migration, the score for
> msDrbdMysql:promote changes from 10020 to -inf and stays there for
> the duration of mysql/grpMysql's failure-timeout (confirmed by also
> setting it to 130), before it rises back up to 10000
> c) msDrbdMysql remains slave until the next cluster-recheck after
> its promote-score went back up to 10000
> d) I also have the impression that fail-counters don't get reset
> after their failure-timeout, because when migration-threshold=3 is
> set, those issues occur upon every(!) subsequent induced failure, even
> when I've waited for nearly 5 minutes (with failure-timeout=90)
> without touching the cluster at all

That seems to be a bug though I couldn't reproduce it with a
simple configuration.
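
One way to check whether the fail count really survives past failure-timeout would be to read it directly on the affected node instead of inferring it from the restarts. A minimal sketch, assuming your crm_failcount accepts -G (query), -r (resource) and -U (node name); the helper name show_failcount is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the current fail count of a resource on a node.
# Assumption: crm_failcount supports -G (query), -r <resource>, -U <node>.
show_failcount() {
    rsc=$1
    node=$2
    crm_failcount -G -r "$rsc" -U "$node"
}

# Example (run on a cluster node):
#   show_failcount mysql nde35
```

Comparing the value before the induced failure, right after it, and again once failure-timeout plus one cluster-recheck-interval has passed should show whether the count is actually being reset.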

Thanks,

Dejan

> I experienced this on both test-clusters, a SLES 11 HAE SP1 with
> Pacemaker 1.1.2, and a Debian Squeeze with Pacemaker 1.0.9. When
> migration-threshold for mysql/grpMysql is removed, everything is
> fine (except no migration of course). I can't remember this
> happening with SLES 11 HAE SP0's Pacemaker 1.0.6.
> 
> I'd really appreciate any comment and/or enlightenment about what's
> the deal with this. (-;
> 
> 
> p.s.: Just for fun / testing / proving, I also constrained
> grpLdirector to cloneMountShared... and could perfectly reproduce
> that problem with its then-underlying resources too.
> 
> ================================================================================
> 
> 2) mysql: meta migration-threshold=1 failure-timeout=130 ->
> drbd:promote only becomes possible again, score-wise, after 130 s
> nde34:~ # nd=nde35;cl=1;failcmd="crm_resource -F -r mysql -H $nd" ;
> date ; ptest -sL | grep "drbdMysql:$cl promotion score on $nd" ;
> date ; echo $failcmd; $failcmd ; date ; ptest -sL | grep
> "drbdMysql:$cl promotion score on $nd" ; sleep 85 ; while [ true ];
> do date ; ptest -sL | grep "drbdMysql:$cl promotion score on $nd" ;
> sleep 5; done
> Wed Aug 11 15:33:04 CEST 2010
> drbdMysql:1 promotion score on nde35: 10020
> drbdMysql:1 promotion score on nde35: INFINITY
> drbdMysql:1 promotion score on nde35: INFINITY
> Wed Aug 11 15:33:04 CEST 2010
> crm_resource -F -r mysql -H nde35
> Wed Aug 11 15:33:05 CEST 2010
> drbdMysql:1 promotion score on nde35: -INFINITY
> drbdMysql:1 promotion score on nde35: -INFINITY
> drbdMysql:1 promotion score on nde35: -INFINITY
> drbdMysql:1 promotion score on nde35: -INFINITY
> Wed Aug 11 15:34:31 CEST 2010
> drbdMysql:1 promotion score on nde35: -INFINITY
> drbdMysql:1 promotion score on nde35: -INFINITY
> drbdMysql:1 promotion score on nde35: -INFINITY
> drbdMysql:1 promotion score on nde35: -INFINITY
> [...]
> Wed Aug 11 15:35:11 CEST 2010
> drbdMysql:1 promotion score on nde35: -INFINITY
> drbdMysql:1 promotion score on nde35: -INFINITY
> drbdMysql:1 promotion score on nde35: -INFINITY
> drbdMysql:1 promotion score on nde35: -INFINITY
> Wed Aug 11 15:35:16 CEST 2010
> drbdMysql:1 promotion score on nde35: 10000
> drbdMysql:1 promotion score on nde35: INFINITY
> drbdMysql:1 promotion score on nde35: INFINITY
> ^C
> 
> 
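
The timestamps in the transcript above can be checked against failure-timeout with a little arithmetic. This is only a sketch: add_seconds is a made-up helper, and it assumes two-digit HH:MM:SS fields and a result within the same day.

```shell
#!/bin/sh
# Hypothetical helper: add N seconds to a HH:MM:SS time of day.
add_seconds() {
    h=${1%%:*}; rest=${1#*:}; m=${rest%%:*}; s=${rest#*:}
    # Strip one leading zero so values like "08" are not read as octal.
    h=${h#0}; m=${m#0}; s=${s#0}
    total=$(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} + $2 ))
    printf '%02d:%02d:%02d\n' \
        $(( total / 3600 % 24 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

# Failure injected at 15:33:05 with failure-timeout=130:
add_seconds 15:33:05 130
```

This prints 15:35:15, which falls between the last -INFINITY sample (15:35:11) and the first sample showing 10000 again (15:35:16), consistent with the promotion score recovering when failure-timeout expires.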

> node nde34 \
> node nde35 \
> primitive apache ocf:cj:apache \
> primitive dlm ocf:pacemaker:controld \
> primitive drbdMysql ocf:linbit:drbd \
> primitive drbdOpencms ocf:linbit:drbd \
> primitive drbdShared ocf:linbit:drbd \
> primitive ipLdirector ocf:heartbeat:IPaddr2 \
> primitive ipMysql ocf:heartbeat:IPaddr \
> primitive ldirector ocf:heartbeat:ldirectord \
> primitive mountMysql ocf:heartbeat:Filesystem \
> primitive mountOpencms ocf:heartbeat:Filesystem \
> primitive mountShared ocf:heartbeat:Filesystem \
> primitive mysql ocf:heartbeat:mysql \
> primitive o2cb ocf:ocfs2:o2cb \
> primitive tomcat ocf:cj:tomcat \
> group grpLdirector ldirector ipLdirector \
> group grpMysql mysql ipMysql \
> ms msDrbdMysql drbdMysql \
> ms msDrbdOpencms drbdOpencms \
> ms msDrbdShared drbdShared \
> clone cloneApache apache
> clone cloneDlm dlm \
> clone cloneMountMysql mountMysql \
> clone cloneMountOpencms mountOpencms \
> clone cloneMountShared mountShared \
> clone cloneO2cb o2cb \
> clone cloneTomcat tomcat \
> colocation colocApache inf: cloneApache cloneTomcat
> colocation colocGrpLdirector inf: grpLdirector cloneMountShared
> colocation colocGrpMysql inf: grpMysql cloneMountMysql
> colocation colocMountMysql_drbd inf: cloneMountMysql msDrbdMysql:Master
> colocation colocMountMysql_o2cb inf: cloneMountMysql cloneO2cb
> colocation colocMountOpencms_drbd inf: cloneMountOpencms msDrbdOpencms:Master
> colocation colocMountOpencms_o2cb inf: cloneMountOpencms cloneO2cb
> colocation colocMountShared_drbd inf: cloneMountShared msDrbdShared:Master
> colocation colocMountShared_o2cb inf: cloneMountShared cloneO2cb
> colocation colocO2cb inf: cloneO2cb cloneDlm
> colocation colocTomcat inf: cloneTomcat cloneMountOpencms
> order orderApache inf: cloneTomcat cloneApache
> order orderGrpLdirector inf: cloneMountShared grpLdirector
> order orderGrpMysql inf: cloneMountMysql grpMysql
> order orderMountMysql_drbd inf: msDrbdMysql:promote cloneMountMysql:start
> order orderMountMysql_o2cb inf: cloneO2cb cloneMountMysql
> order orderMountOpencms_drbd inf: msDrbdOpencms:promote cloneMountOpencms:start
> order orderMountOpencms_o2cb inf: cloneO2cb cloneMountOpencms
> order orderMountShared_drbd inf: msDrbdShared:promote cloneMountShared:start
> order orderMountShared_o2cb inf: cloneO2cb cloneMountShared
> order orderO2cb inf: cloneDlm cloneO2cb
> order orderTomcat inf: cloneMountOpencms cloneTomcat
> property $id="cib-bootstrap-options" \
> 	dc-version="1.1.2-2e096a41a5f9e184a1c1537c82c6da1093698eb5" \
> 	cluster-infrastructure="openais" \
> 	expected-quorum-votes="2" \
> 	stonith-enabled="false" \
> 	no-quorum-policy="ignore" \
> 	start-failure-is-fatal="false" \
> 	cluster-recheck-interval="5m" \
> 	shutdown-escalation="5m" \
> 	last-lrm-refresh="1281543643"
> rsc_defaults $id="rsc-options" \
> 	resource-stickiness="5"

> node alpha \
> 	attributes standby="off"
> node beta \
> 	attributes standby="off"
> primitive dlm ocf:pacemaker:controld \
> 	op monitor interval="10" timeout="20" \
> 	op start interval="0" timeout="90" \
> 	op stop interval="0" timeout="100"
> primitive drbdShared ocf:linbit:drbd \
> 	params drbd_resource="shared" \
> 	op monitor interval="10" role="Master" timeout="20" \
> 	op monitor interval="20" role="Slave" timeout="20" \
> 	op start interval="0" timeout="240" \
> 	op stop interval="0" timeout="100" \
> 	op promote interval="0" timeout="90" \
> 	op demote interval="0" timeout="90" \
> 	op notify interval="0" timeout="90"
> primitive ipMysql ocf:heartbeat:IPaddr \
> 	params ip="192.168.135.67" cidr_netmask="255.255.0.0" \
> 	op monitor interval="2" timeout="20" \
> 	op start interval="0" timeout="90"
> primitive mountShared ocf:heartbeat:Filesystem \
> 	params device="/dev/drbd0" directory="/shared" fstype="ocfs2" \
> 	op monitor interval="10" timeout="40" OCF_CHECK_LEVEL="10" \
> 	op start interval="0" timeout="60" \
> 	op stop interval="0" timeout="60"
> primitive mysql ocf:heartbeat:mysql \
> 	params binary="/usr/bin/mysqld_safe" config="/var/lib/mysql/my.cnf" pid="/var/run/mysqld/mysqld.pid" socket="/var/lib/mysql/mysqld.sock" test_table="ha.check" test_user="HAuser" test_passwd="HApass" \
> 	op monitor interval="10" timeout="30" OCF_CHECK_LEVEL="0" \
> 	op start interval="0" timeout="120" \
> 	op stop interval="0" timeout="120"
> primitive o2cb ocf:pacemaker:o2cb \
> 	op monitor interval="10" \
> 	op start interval="0" timeout="90" \
> 	op stop interval="0" timeout="100"
> group grpMysql mysql ipMysql \
> 	meta migration-threshold="3" failure-timeout="30"
> ms msDrbdShared drbdShared \
> 	meta resource-stickiness="100" notify="true" master-max="2"
> clone cloneDlm dlm \
> 	meta globally-unique="false" interleave="true"
> clone cloneMountShared mountShared \
> 	meta interleave="true" globally-unique="false" target-role="Started"
> clone cloneO2cb o2cb \
> 	meta globally-unique="false" interleave="true" target-role="Started"
> colocation colocMountShared_drbd inf: cloneMountShared msDrbdShared:Master
> colocation colocMountShared_o2cb inf: cloneMountShared cloneO2cb
> colocation colocMysql inf: grpMysql cloneMountShared
> colocation colocO2cb inf: cloneO2cb cloneDlm
> order orderMountShared_drbd inf: msDrbdShared:promote cloneMountShared:start
> order orderMountShared_o2cb inf: cloneO2cb cloneMountShared
> order orderMysql inf: cloneMountShared grpMysql
> order orderO2cb inf: cloneDlm cloneO2cb
> property $id="cib-bootstrap-options" \
> 	dc-version="1.0.9-unknown" \
> 	cluster-infrastructure="openais" \
> 	expected-quorum-votes="2" \
> 	stonith-enabled="false" \
> 	start-failure-is-fatal="false" \
> 	last-lrm-refresh="1281577809" \
> 	cluster-recheck-interval="4m" \
> 	shutdown-escalation="5m"
