[ClusterLabs] LIO (iSCSI Target) does not release DRBD device?

Per von Zweigbergk pvz at itassistans.se
Mon May 25 18:56:51 UTC 2015


Thank you, that was an oversight in my configuration. I have now adjusted it accordingly. The new config is at the end of the message.

However, that does not actually resolve my issue; I still have the exact same problem as before.


node $id="111111111" node01
node $id="222222222" node02
primitive p_drbd_iscsi ocf:linbit:drbd \
        params drbd_resource="fs01_data" \
        op start timeout="240" interval="0" \
        op stop timeout="180" interval="0" \
        op monitor interval="20" role="Master" \
        op monitor interval="30" role="Slave"
primitive p_ip1_iscsi ocf:heartbeat:IPaddr2 \
        params ip="192.0.2.131" cidr_netmask="28" nic="eth1" iflabel="iscsi" \
        op monitor interval="30s"
primitive p_ip2_iscsi ocf:heartbeat:IPaddr2 \
        params ip="192.0.2.147" cidr_netmask="28" nic="eth2" iflabel="iscsi" \
        op monitor interval="30s"
primitive p_iscsitarget_iscsi ocf:heartbeat:iSCSITarget \
        params iqn="iqn.2015-05.com.example:iscsi" implementation="lio" portals="192.0.2.131 192.0.2.146" \
        op monitor interval="30s"
primitive p_lun_iscsi ocf:heartbeat:iSCSILogicalUnit \
        params target_iqn="iqn.2015-05.com.example:iscsi" lun="0" path="/dev/drbd1"
group g_iscsi p_ip1_iscsi p_ip2_iscsi p_lun_iscsi p_iscsitarget_iscsi
ms ms_drbd_iscsi p_drbd_iscsi \
        meta clone-max="2" clone-node-max="1" master-max="1" master-node-max="1" notify="true"
location lo_drbd_iscsi ms_drbd_iscsi \
        rule $id="lo_drbd_iscsi-rule" -inf: #uname ne node01 and #uname ne node02
colocation co_iscsitarget_iscsi inf: g_iscsi ms_drbd_iscsi:Master
order o_iscsi inf: ms_drbd_iscsi:promote g_iscsi:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.10-42f2063" \
        cluster-infrastructure="corosync" \
        stonith-enabled="false" \
        last-lrm-refresh="1432579747" \
        no-quorum-policy="ignore" \
        default-resource-stickiness="200"
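
Since fuser and lsof cannot see in-kernel users of the device, one way to check whether LIO still has a backstore pointing at /dev/drbd1 might be to look in configfs directly. The following is only a minimal sketch under assumptions not confirmed in this thread: that configfs is mounted at /sys/kernel/config, and that each backstore directory under target/core/ exposes a udev_path attribute holding the backing device path. The function name find_lio_holders is made up for illustration.

```shell
#!/bin/sh
# Sketch: list LIO backstore objects that still reference a block device.
# Assumptions (not from the thread): configfs is mounted at
# /sys/kernel/config, and every backstore directory below target/core/
# has a udev_path attribute containing the backing device path.

find_lio_holders() {
    dev="$1"
    # Optional second argument overrides the configfs root (useful for testing).
    root="${2:-/sys/kernel/config/target/core}"
    for attr in "$root"/*/*/udev_path; do
        [ -r "$attr" ] || continue          # glob matched nothing readable
        if [ "$(cat "$attr")" = "$dev" ]; then
            dirname "$attr"                 # print the backstore directory
        fi
    done
}
```

Running "find_lio_holders /dev/drbd1" as root after the group has stopped would print any backstore directory that still names the device; no output would suggest the backstore object is gone.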

-----Original Message-----
From: lukas [mailto:lukas.kostyan at gmail.com] 
Sent: 25 May 2015 20:01
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: Re: [ClusterLabs] LIO (iSCSI Target) does not release DRBD device?

You should assign a role to each resource operation.
Change the config as follows:

primitive p_drbd_iscsi ocf:linbit:drbd \
         params drbd_resource="fs01_data" \
         op start timeout="240" interval="0" \
         op stop timeout="180" interval="0" \
         op monitor interval="20" role="Master" \
         op monitor interval="30" role="Slave"




On 2015-05-25 17:35, Per von Zweigbergk wrote:
> I'm attempting to get a two-node cluster setup working. The workload is going to be a DRBD-backed iSCSI target. I have chosen the Linux-IO Target (LIO) for this purpose. I'm running on Ubuntu LTS 14.04, with the software as packaged by the distro.
>
> Unfortunately, it doesn't quite work the way I expect. When I do a "crm resource move g_iscsi" to force a move of the iSCSI target, and thereby DRBD, from what I can tell:
>
> First the iSCSI target resource (ocf:heartbeat:iSCSITarget) is torn down. After that, the LUN resource is torn down (ocf:heartbeat:iSCSILogicalUnit). After that, the two IP address resources are torn down (ocf:heartbeat:IPaddr2). This is all as I expect.
>
> Then, it attempts to demote DRBD to secondary, which is where it seems to fail according to:
>
> May 25 16:44:12 node01 kernel: [  702.206628] block drbd1: State change failed: Device is held open by someone
>
> This is despite the fact that I have verified that the LIO LUN is "deactivated" according to:
>
> root@node01:~# targetcli
> targetcli GIT_VERSION (rtslib GIT_VERSION) Copyright (c) 2011-2013 by Datera, Inc.
> All rights reserved.
> /> ls
> o- / .................................................................................................... [...]
>    o- backstores ......................................................................................... [...]
>    | o- fileio .............................................................................. [0 Storage Object]
>    | o- iblock .............................................................................. [1 Storage Object]
>    | | o- p_lun_iscsi ................................................................. [/dev/drbd1 deactivated]
>    | o- pscsi ............................................................................... [0 Storage Object]
>    | o- rd_dr ............................................................................... [0 Storage Object]
>    | o- rd_mcp .............................................................................. [0 Storage Object]
>    o- ib_srpt ...................................................................................... [0 Targets]
>    o- iscsi ........................................................................................ [0 Targets]
>    o- loopback ..................................................................................... [0 Targets]
>    o- qla2xxx ...................................................................................... [0 Targets]
>    o- tcm_fc ....................................................................................... [0 Targets]
> />
>
> No joy on using fuser or lsof to check what might be holding /dev/drbd1 open unfortunately (perhaps because LIO lives in the kernel?), but if I go in and manually delete the p_lun_iscsi object, I'm able to demote to secondary, as below:
>
> (in targetcli)
> /backstores/iblock> delete p_lun_iscsi
> Deleted storage object p_lun_iscsi.
> /backstores/iblock> exit
> There are unsaved configuration changes.
> If you exit now, configuration will not be updated and changes will be lost upon reboot.
> Type 'exit' if you want to exit anyway: exit
> root@node01:~# drbdadm secondary fs01_data
> root@node01:~# cat /proc/drbd
> version: 8.4.3 (api:1/proto:86-101)
> srcversion: 6551AD2C98F533733BE558C
>
>   1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
>      ns:7095200 nr:0 dw:2028504 dr:5069232 al:472 bm:312 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> root@node01:~#
>
> The DRBD node is in "standalone" mode because it seems the other node forcibly took over the resource somehow, so this is not unexpected.
>
> So in summary, what I think is happening is that failover fails, because Pacemaker isn't telling LIO to sufficiently let go of the DRBD device, which is causing it to be unable to go into secondary mode. After a bunch of failures, the other node realizes this, and goes into standalone mode to force-take-over the DRBD resource, killing replication.
>
> What can I do to try to get this working?
>
> Finally, here's a dump of "crm configure show", for good measure, with some potentially sensitive data redacted (I'm not actually running production in the RFC3330 TEST-NET subnet, and my node IDs aren't as listed):
>
> node $id="111111111" node01 \
>          attributes maintenance="off"
> node $id="222222222" node02 \
>          attributes maintenance="off"
> primitive p_drbd_iscsi ocf:linbit:drbd \
>          params drbd_resource="fs01_data" \
>          op start timeout="240" interval="0" \
>          op stop timeout="180" interval="0" \
>          op monitor interval="60" timeout="60"
> primitive p_ip1_iscsi ocf:heartbeat:IPaddr2 \
>          params ip="192.0.2.131" cidr_netmask="28" nic="eth1" iflabel="iscsi" \
>          op monitor interval="30s"
> primitive p_ip2_iscsi ocf:heartbeat:IPaddr2 \
>          params ip="192.0.2.147" cidr_netmask="28" nic="eth2" iflabel="iscsi" \
>          op monitor interval="30s"
> primitive p_iscsitarget_iscsi ocf:heartbeat:iSCSITarget \
>          params iqn="iqn.2015-05.com.example:iscsi" implementation="lio" portals="192.0.2.131 192.0.2.146" \
>          meta is-managed="true"
> primitive p_lun_iscsi ocf:heartbeat:iSCSILogicalUnit \
>          params target_iqn="iqn.2015-05.com.example:iscsi" lun="0" path="/dev/drbd1"
> group g_iscsi p_ip1_iscsi p_ip2_iscsi p_lun_iscsi p_iscsitarget_iscsi \
>          meta target-role="Started"
> ms ms_drbd_iscsi p_drbd_iscsi \
>          meta clone-max="2" clone-node-max="1" master-max="1" master-node-max="1" notify="true" is-managed="true" target-role="Started"
> location lo_drbd_iscsi ms_drbd_iscsi \
>          rule $id="lo_drbd_iscsi-rule" -inf: #uname ne node01 and #uname ne node02
> colocation co_iscsitarget_iscsi inf: g_iscsi ms_drbd_iscsi:Master
> order o_iscsi inf: ms_drbd_iscsi:promote g_iscsi:start
> property $id="cib-bootstrap-options" \
>          dc-version="1.1.10-42f2063" \
>          cluster-infrastructure="corosync" \
>          stonith-enabled="false" \
>          last-lrm-refresh="1432564681" \
>          no-quorum-policy="ignore" \
>          default-resource-stickiness="200"
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org 
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

