[Pacemaker] Unexpected resource restarts after putting a node in standby mode

Dejan Muhamedagic dejanmm at fastmail.fm
Mon Jul 6 06:38:03 EDT 2009


Hi,

On Mon, Jul 06, 2009 at 08:00:38AM +0200, Florian Haas wrote:
> Hello everyone,
> 
> probably a bad time to ask this as Andrew is out on vacation, but maybe
> Dejan or Dominik can help shed some light on this one.
> 
> I'm testing my iSCSITarget and iSCSILogicalUnit agents in a 2-node
> Pacemaker 1.0.4 cluster. If you don't feel like grokking the full config
> that follows, what I have is
> 
> - 2 DRBD Master/Slave resources;
> - 2 resource groups, each holding one LVM VG, one iSCSITarget, and one
> or more iSCSILogicalUnits;
> - A cloned LSB resource managing the SCSI target daemon (tgt),
> - order and colocation constraints to make sure that everything is
> started in the right places and in the correct order.
> 
> What follows is my full configuration; sorry for being this noisy but I
> guess it makes sense to include the full config here:
> 
> node $id="3074cde6-2e91-4259-9868-7ac94007087e" alice \
> 	attributes standby="off"
> node $id="9a4cafd3-fcfc-4de9-9440-10bc8822d9af" bob \
> 	attributes standby="off"
> primitive res_drbd_iscsivg01 ocf:linbit:drbd \
> 	params drbd_resource="iscsivg01" \
> 	op monitor interval="10s"
> primitive res_drbd_iscsivg02 ocf:linbit:drbd \
> 	params drbd_resource="iscsivg02"
> primitive res_lu_iscsivg01_lun1 ocf:heartbeat:iSCSILogicalUnit \
> 	params target_iqn="iqn.2001-04.com.linbit:storage.alicebob.iscsivg01" \
> 		lun="1" path="/dev/iscsivg01/lun1" scsi_id="iscsivg01.lun1" \
> 	op monitor interval="10s"
> primitive res_lu_iscsivg01_lun2 ocf:heartbeat:iSCSILogicalUnit \
> 	params target_iqn="iqn.2001-04.com.linbit:storage.alicebob.iscsivg01" \
> 		lun="2" path="/dev/iscsivg01/lun2" scsi_id="iscsivg01.lun2" \
> 	op monitor interval="10s"
> primitive res_lu_iscsivg01_lun3 ocf:heartbeat:iSCSILogicalUnit \
> 	params target_iqn="iqn.2001-04.com.linbit:storage.alicebob.iscsivg01" \
> 		lun="3" path="/dev/iscsivg01/lun3" scsi_id="iscsivg01.lun3" \
> 	op monitor interval="10s"
> primitive res_lu_iscsivg02_lun1 ocf:heartbeat:iSCSILogicalUnit \
> 	params target_iqn="iqn.2001-04.com.linbit:storage.alicebob.iscsivg02" \
> 		lun="1" path="/dev/iscsivg02/lun1" scsi_id="iscsivg02.lun1" \
> 	op monitor interval="10s"
> primitive res_lvm_iscsivg01 ocf:heartbeat:LVM \
> 	params volgrpname="iscsivg01"
> primitive res_lvm_iscsivg02 ocf:heartbeat:LVM \
> 	params volgrpname="iscsivg02"
> primitive res_target_iscsivg01 ocf:heartbeat:iSCSITarget \
> 	params iqn="iqn.2001-04.com.linbit:storage.alicebob.iscsivg01" \
> 		additional_parameters="DefaultTime2Retain=60" \
> 	op monitor interval="10s"
> primitive res_target_iscsivg02 ocf:heartbeat:iSCSITarget \
> 	params iqn="iqn.2001-04.com.linbit:storage.alicebob.iscsivg02" \
> 		additional_parameters="DefaultTime2Retain=60" \
> 	op monitor interval="10s"
> primitive res_tgtd lsb:tgtd
> group rg_iscsivg01 res_lvm_iscsivg01 res_target_iscsivg01 \
> 	res_lu_iscsivg01_lun1 res_lu_iscsivg01_lun2 res_lu_iscsivg01_lun3 \
> 	meta collocated="true" ordered="true" target-role="Started"
> group rg_iscsivg02 res_lvm_iscsivg02 res_target_iscsivg02 \
> 	res_lu_iscsivg02_lun1
> ms ms_drbd_iscsivg01 res_drbd_iscsivg01 \
> 	meta clone-max="2" clone-node-max="1" master-max="1" \
> 		master-node-max="1" target-role="Started" notify="true"
> ms ms_drbd_iscsivg02 res_drbd_iscsivg02 \
> 	meta master-max="1" clone-max="2" clone-node-max="1" \
> 		master-node-max="1" notify="true" target-role="Started"
> clone cl_tgtd res_tgtd \
> 	meta target-role="Started"
> colocation c_iscsivg01_on_drbd inf: rg_iscsivg01 ms_drbd_iscsivg01:Master
> colocation c_iscsivg01_on_tgtd inf: rg_iscsivg01 cl_tgtd
> colocation c_iscsivg02_on_drbd inf: rg_iscsivg02 ms_drbd_iscsivg02:Master
> colocation c_iscsivg02_on_tgtd inf: rg_iscsivg02 cl_tgtd
> order o_drbd_before_iscsivg01 inf: ms_drbd_iscsivg01:promote \
> 	rg_iscsivg01:start
> order o_drbd_before_iscsivg02 inf: ms_drbd_iscsivg02:promote \
> 	rg_iscsivg02:start
> order o_tgtd_before_iscsivg01 inf: cl_tgtd rg_iscsivg01
> order o_tgtd_before_iscsivg02 inf: cl_tgtd rg_iscsivg02
> property $id="cib-bootstrap-options" \
> 	dc-version="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa" \
> 	cluster-infrastructure="Heartbeat" \
> 	stonith-enabled="false" \
> 	no-quorum-policy="ignore" \
> 	last-lrm-refresh="1246653472" \
> 	default-resource-stickiness="0"
> 
> Now as I switch my node named bob into standby mode, resources are
> transferred to alice as expected. But, and this is the issue that I'm
> having, the resource group that ran on alice all along is (needlessly,
> it seems) restarted in place.
> 
> I played this thing through with ptest:
> 
> cibadmin -Q \
> | sed -e 's/id="nodes-9a4cafd3-fcfc-4de9-9440-10bc8822d9af-standby" value="off"/id="nodes-9a4cafd3-fcfc-4de9-9440-10bc8822d9af-standby" value="on"/' > /tmp/cib.xml
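
The sed substitution above only matches if the id and value attributes appear in exactly that order and spacing. As a rough alternative, here is a minimal Python sketch that sets the same nvpair through an XML parser instead. The function name set_standby is made up for illustration, and the sketch assumes the standby attribute is stored as an nvpair element identified by its id, as the sed pattern suggests.

```python
import xml.etree.ElementTree as ET


def set_standby(cib_xml, nvpair_id, value="on"):
    """Return a copy of the CIB XML with the given nvpair's value changed.

    Hypothetical helper: it looks the nvpair up by its id attribute, so it
    does not depend on attribute order or spacing in the dump.
    """
    root = ET.fromstring(cib_xml)
    matches = [nv for nv in root.iter("nvpair") if nv.get("id") == nvpair_id]
    if not matches:
        # Fail loudly rather than writing out an unchanged CIB.
        raise ValueError("no nvpair with id %r" % nvpair_id)
    for nv in matches:
        nv.set("value", value)
    return ET.tostring(root, encoding="unicode")
```

One possible use: save the output of cibadmin -Q to a file, pass its contents and the id nodes-9a4cafd3-fcfc-4de9-9440-10bc8822d9af-standby to set_standby, and write the result to /tmp/cib.xml for ptest.
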
> 
> [root@alice ~]# ptest -VVV -x /tmp/cib.xml 2>&1 | grep LogActions
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Demote res_drbd_iscsivg01:0	(Master -> Stopped bob)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Stop resource res_drbd_iscsivg01:0	(bob)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Promote res_drbd_iscsivg01:1	(Slave -> Master alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Move resource res_lvm_iscsivg01	(Started bob -> alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Move resource res_target_iscsivg01	(Started bob -> alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Move resource res_lu_iscsivg01_lun1	(Started bob -> alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Move resource res_lu_iscsivg01_lun2	(Started bob -> alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Move resource res_lu_iscsivg01_lun3	(Started bob -> alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Leave resource res_tgtd:0	(Started alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Stop resource res_tgtd:1	(bob)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Leave resource res_drbd_iscsivg02:0	(Master alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Stop resource res_drbd_iscsivg02:1	(bob)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Restart resource res_lvm_iscsivg02	(Started alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Restart resource res_target_iscsivg02	(Started alice)
> ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Restart resource res_lu_iscsivg02_lun1	(Started alice)
> 
> All those actions are fine, except for those restarts of the
> rg_iscsivg02 resource group on alice. What am I doing wrong?
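
When comparing several configuration variants, it can help to pull just the planned restarts out of the ptest output. The following is a minimal Python sketch, not a Pacemaker tool: the names parse_actions and restarted are made up for illustration, and the regular expression assumes the LogActions layout shown in the output above.

```python
import re

# One LogActions entry: a verb, an optional literal "resource", the resource
# name, and a parenthesised detail. \s+ also spans a line wrapped by a mailer.
LOG_ACTION = re.compile(
    r"LogActions:\s+(\w+)(?:\s+resource)?\s+(\S+)\s+\(([^)]*)\)"
)


def parse_actions(ptest_output):
    """Return (action, resource, detail) tuples found in ptest output."""
    return [m.groups() for m in LOG_ACTION.finditer(ptest_output)]


def restarted(ptest_output):
    """Return the names of resources that ptest plans to restart in place."""
    return [res for action, res, _ in parse_actions(ptest_output)
            if action == "Restart"]


# Two entries copied from the ptest run above, used as sample input.
SAMPLE = (
    "ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Leave resource "
    "res_drbd_iscsivg02:0\t(Master alice)\n"
    "ptest[3695]: 2009/07/05_19:51:08 notice: LogActions: Restart resource "
    "res_lvm_iscsivg02\t(Started alice)\n"
)

if __name__ == "__main__":
    print(restarted(SAMPLE))
```

Feeding it the full output of ptest -VVV for each variant makes it easy to see whether a constraint change removes the restarts of rg_iscsivg02.
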

Not sure if there's anything wrong with the configuration.

> I would
> assume there must be a way to avoid these.

I suspect this again has to do with constraints on clones.
How does it behave if you replace cl_tgtd with two separate
resources?

Thanks,

Dejan

> All comments much appreciated. Thanks!
> Cheers,
> Florian