[Pacemaker] Resource stickiness not working as expected?

Thu Feb 28 15:24:49 EST 2013

----- Original Message -----
> From: "Allen Pomeroy" <a at pomeroy.us>
> To: pacemaker at oss.clusterlabs.org
> Sent: Thursday, February 28, 2013 2:49:40 PM
> Subject: [Pacemaker] Resource stickiness not working as expected?
> 
> Hi guys,
> 
> I have a two node cluster (corosync + pacemaker) on Fedora Core 17. Works
> well to move resources over to the secondary cluster node, but when an
> "unmove" command is issued now the resources fail back to the primary
> cluster node - seemingly ignoring the resource-stickiness settings. What
> have I missed here?  .. I got errors when I tried to use the
> resource-stickiness="100" argument on my ClusterFS, ClusterSrcIP and
> ClusterStatus primitives, although they are ocf.

I see a few issues overall.  I think the errors you got when adding resource-stickiness were because it has to go in the meta section of the primitive (in your config it is attached to the op monitor lines).  Also, a default resource-stickiness set in rsc_defaults applies to all resources unless one explicitly sets it differently, so setting it per primitive is not necessarily needed.
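
As a rough, untested sketch of what "in the meta section" means, here are two of your primitives with resource-stickiness moved out of the op line and into meta. The resource names and parameters are copied from your config; the trailing backslashes are the usual crm shell line continuations and are assumed here. Given the rsc_defaults you already have, the per-primitive setting is optional.

```
primitive ClusterFS ocf:heartbeat:Filesystem \
    params device="/dev/drbd/by-res/clusterData" directory="/cluster" fstype="ext4" \
    meta target-role="Started" resource-stickiness="100"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
    params ip="10.20.1.60" cidr_netmask="32" \
    op monitor interval="10s" \
    meta target-role="Started" resource-stickiness="100"
```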

> 
> crm resource move ClusterStatus node6
> 
> crm resource unmove ClusterStatus
> 
> node $id="1029968044" node5
> node $id="1046745260" node6
> primitive ClusterData ocf:linbit:drbd
>    params drbd_resource="clusterData"
>    op monitor interval="15" role="Master"
>    op monitor interval="30" role="Slave" resource-stickiness="100"
> primitive ClusterFS ocf:heartbeat:Filesystem
>    params device="/dev/drbd/by-res/clusterData" directory="/cluster" fstype="ext4"
>    meta target-role="Started"
> primitive ClusterIP ocf:heartbeat:IPaddr2
>    params ip="10.20.1.60" cidr_netmask="32"
>    op monitor interval="10s" resource-stickiness="100"
>    meta target-role="Started"
> primitive ClusterSrcIP ocf:heartbeat:IPsrcaddr
>    params ipaddress="10.20.1.60" cidr_netmask="24"
>    meta target-role="Started"
> primitive ClusterStatus ocf:pacemaker:ClusterMon
>    params update="10" htmlfile="/cluster/www/index.html"
>    meta target-role="Started"
> primitive WebServer ocf:heartbeat:apache
>    params configfile="/etc/httpd/conf/httpd.conf" statusurl="http://127.0.0.1/server-status"
>    op monitor interval="30s" resource-stickiness="100"
>    meta target-role="Started"
> group BaseGroup ClusterFS ClusterIP ClusterSrcIP WebServer ClusterStatus
> ms ClusterDataClone ClusterData
>    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
> location ClusterData-prefer-node1 ClusterData 50: node5

ClusterData should not be referenced directly once it is used in an ms|clone; refer only to the ms|clone instance (ClusterDataClone) from then on.  For master/slave resources you also need to specify the role in the location statement:
location ClusterDataClone-Master-prefer-node5 ClusterDataClone $role="Master" 50: node5

> location ClusterIP-prefer-node1 ClusterIP 50: node5
> location WebServer-prefer-node1 WebServer 50: node5

Taken together, the three location statements above (ClusterData, ClusterIP and WebServer) each score 50 for node5, which adds up to 3 x 50 = 150 in favour of node5.  Because the resources are dependent/grouped, the scores are cumulative (it gets more complicated than that, but this is close enough here).  Resource-stickiness is only 100, so the cumulative 150 for node5 wins and the services will fail back whenever they can.
I think what you really want is to score just the ClusterDataClone master role at 50.  Then it will prefer node5 unless it is already running on the other node, where stickiness (100) outweighs the preference (50), so it will stay there.  Everything else is colocated with/dependent on it, so the rest will follow it and stay with it.
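
Putting that together, the constraint part of the config might end up looking roughly like the sketch below. It is untested: the location line is the one suggested earlier in this reply, the colocation, order and rsc_defaults lines are carried over unchanged from your config, and the ClusterIP-prefer-node1 and WebServer-prefer-node1 constraints are simply dropped. The "#" lines are only annotations for this mail.

```
# The only placement preference left: master role prefers node5 with score 50
location ClusterDataClone-Master-prefer-node5 ClusterDataClone $role="Master" 50: node5
# The group follows the DRBD master and starts after promotion (unchanged)
colocation ClusterFS-on-DRBD inf: BaseGroup ClusterDataClone:Master
order ClusterFS-after-Data inf: ClusterDataClone:promote BaseGroup:start
# Default stickiness of 100 now outweighs the single preference of 50
rsc_defaults $id="rsc-options" \
    resource-stickiness="100"
```

With only one preference of 50 against a stickiness of 100, an "unmove" should leave the resources where they are instead of failing back.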

HTH

Jake

> colocation ClusterFS-on-DRBD inf: BaseGroup ClusterDataClone:Master
> order ClusterFS-after-Data inf: ClusterDataClone:promote BaseGroup:start
> property $id="cib-bootstrap-options"
>    dc-version="1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff"
>    cluster-infrastructure="corosync"
>    stonith-enabled="false"
>    no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options"
>    resource-stickiness="100"
> op_defaults $id="op-options"
>    timeout="240s"