[ClusterLabs] LIO (iSCSI Target) does not release DRBD device?

lukas lukas.kostyan at gmail.com
Mon May 25 14:01:14 EDT 2015


You should assign a role to each monitor operation of the DRBD resource.
Change the config like this:

primitive p_drbd_iscsi ocf:linbit:drbd \
         params drbd_resource="fs01_data" \
         op start timeout="240" interval="0" \
         op stop timeout="180" interval="0" \
         op monitor interval="20" role="Master" \
         op monitor interval="30" role="Slave"
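
To double-check that every monitor operation on the DRBD primitive really carries a role after the edit, something like the following might help. It is only a sketch: the function name check_monitor_roles is made up for illustration, and it assumes the crm shell prints the primitive in the same "op monitor ..." form shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of crmsh or Pacemaker): reads
# "crm configure show"-style text on stdin and prints every
# "op monitor" line that has no role= attribute.
# Exit status: 1 if at least one such line was found, 0 otherwise.
check_monitor_roles() {
    awk '
        /op monitor/ && !/role=/ {
            print "monitor op without role: " $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Intended use on a cluster node (assumes crmsh is installed):
#   crm configure show p_drbd_iscsi | check_monitor_roles
```

Only feed it the DRBD primitive, as in the commented example; monitor operations on ordinary resources such as IPaddr2 do not need a role and would be flagged needlessly.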




On 2015-05-25 17:35, Per von Zweigbergk wrote:
> I'm attempting to get a two-node cluster setup working. The workload is going to be a DRBD-backed iSCSI target. I have chosen the Linux-IO Target (LIO) for this purpose. I'm running on Ubuntu LTS 14.04, with the software as packaged by the distro.
>
> Unfortunately, it doesn't quite work the way I expect. When I do a "crm resource move g_iscsi" to force a move of the iSCSI target, and thereby DRBD, from what I can tell:
>
> First the iSCSI target resource (ocf:heartbeat:iSCSITarget) is torn down. After that, the LUN resource is torn down (ocf:heartbeat:iSCSILogicalUnit). After that, the two IP address resources are torn down (ocf:heartbeat:IPaddr2). This is all as I expect to happen.
>
> Then, it attempts to demote DRBD to secondary, which is where it seems to fail according to:
>
> May 25 16:44:12 node01 kernel: [  702.206628] block drbd1: State change failed: Device is held open by someone
>
> This is despite the fact that I have verified that the LIO LUN is "deactivated" according to:
>
> root@node01:~# targetcli
> targetcli GIT_VERSION (rtslib GIT_VERSION)
> Copyright (c) 2011-2013 by Datera, Inc.
> All rights reserved.
> /> ls
> o- / .................................................................................................... [...]
>    o- backstores ......................................................................................... [...]
>    | o- fileio .............................................................................. [0 Storage Object]
>    | o- iblock .............................................................................. [1 Storage Object]
>    | | o- p_lun_iscsi ................................................................. [/dev/drbd1 deactivated]
>    | o- pscsi ............................................................................... [0 Storage Object]
>    | o- rd_dr ............................................................................... [0 Storage Object]
>    | o- rd_mcp .............................................................................. [0 Storage Object]
>    o- ib_srpt ...................................................................................... [0 Targets]
>    o- iscsi ........................................................................................ [0 Targets]
>    o- loopback ..................................................................................... [0 Targets]
>    o- qla2xxx ...................................................................................... [0 Targets]
>    o- tcm_fc ....................................................................................... [0 Targets]
> />
>
> No joy on using fuser or lsof to check what might be holding /dev/drbd1 open unfortunately (perhaps because LIO lives in the kernel?), but if I go in and manually delete the p_lun_iscsi object, I'm able to demote to secondary, as below:
>
> (in targetcli)
> /backstores/iblock> delete p_lun_iscsi
> Deleted storage object p_lun_iscsi.
> /backstores/iblock> exit
> There are unsaved configuration changes.
> If you exit now, configuration will not be updated and changes will be lost upon reboot.
> Type 'exit' if you want to exit anyway: exit
> root@node01:~# drbdadm secondary fs01_data
> root@node01:~# cat /proc/drbd
> version: 8.4.3 (api:1/proto:86-101)
> srcversion: 6551AD2C98F533733BE558C
>
>   1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
>      ns:7095200 nr:0 dw:2028504 dr:5069232 al:472 bm:312 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> root@node01:~#
>
> The DRBD node is in "standalone" mode because it seems the other node forcibly took over the resource somehow, so this is not unexpected.
>
> So in summary, what I think is happening is that failover fails, because Pacemaker isn't telling LIO to sufficiently let go of the DRBD device, which is causing it to be unable to go into secondary mode. After a bunch of failures, the other node realizes this, and goes into standalone mode to force-take-over the DRBD resource, killing replication.
>
> What can I do to try to get this working?
>
> Finally, here's a dump of "crm configure show", for good measure, with some potentially sensitive data redacted (I'm not actually running production in the RFC3330 TEST-NET subnet, and my node IDs aren't as listed):
>
> node $id="111111111" node01 \
>          attributes maintenance="off"
> node $id="222222222" node02 \
>          attributes maintenance="off"
> primitive p_drbd_iscsi ocf:linbit:drbd \
>          params drbd_resource="fs01_data" \
>          op start timeout="240" interval="0" \
>          op stop timeout="180" interval="0" \
>          op monitor interval="60" timeout="60"
> primitive p_ip1_iscsi ocf:heartbeat:IPaddr2 \
>          params ip="192.0.2.131" cidr_netmask="28" nic="eth1" iflabel="iscsi" \
>          op monitor interval="30s"
> primitive p_ip2_iscsi ocf:heartbeat:IPaddr2 \
>          params ip="192.0.2.147" cidr_netmask="28" nic="eth2" iflabel="iscsi" \
>          op monitor interval="30s"
> primitive p_iscsitarget_iscsi ocf:heartbeat:iSCSITarget \
>          params iqn="iqn.2015-05.com.example:iscsi" implementation="lio" portals="192.0.2.131 192.0.2.146" \
>          meta is-managed="true"
> primitive p_lun_iscsi ocf:heartbeat:iSCSILogicalUnit \
>          params target_iqn="iqn.2015-05.com.example:iscsi" lun="0" path="/dev/drbd1"
> group g_iscsi p_ip1_iscsi p_ip2_iscsi p_lun_iscsi p_iscsitarget_iscsi \
>          meta target-role="Started"
> ms ms_drbd_iscsi p_drbd_iscsi \
>          meta clone-max="2" clone-node-max="1" master-max="1" master-node-max="1" notify="true" is-managed="true" target-role="Started"
> location lo_drbd_iscsi ms_drbd_iscsi \
>          rule $id="lo_drbd_iscsi-rule" -inf: #uname ne node01 and #uname ne node02
> colocation co_iscsitarget_iscsi inf: g_iscsi ms_drbd_iscsi:Master
> order o_iscsi inf: ms_drbd_iscsi:promote g_iscsi:start
> property $id="cib-bootstrap-options" \
>          dc-version="1.1.10-42f2063" \
>          cluster-infrastructure="corosync" \
>          stonith-enabled="false" \
>          last-lrm-refresh="1432564681" \
>          no-quorum-policy="ignore" \
>          default-resource-stickiness="200"




