[ClusterLabs] fence_scsi problem
Andrei Borzenkov
arvidjaar at gmail.com
Wed Oct 28 08:36:52 EDT 2020
On Wed, Oct 28, 2020 at 3:18 PM Patrick Vranckx
<patrick.vranckx at uclouvain.be> wrote:
>
> Hi,
>
> I am trying to set up an HA cluster for ZFS. I think fence_scsi is not
> working properly. I can reproduce the problem on two kinds of hardware:
> iSCSI and SAS storage.
>
> Here is what I did:
>
> - set up a storage server with 3 iscsi targets to be accessed by my
> 2-nodes cluster
>
> - set up two nodes with CentOS 8.1. The iSCSI initiators let me see
> three SCSI disks on each node:
>
> /dev/disk/by-id/wwn-0x60014053374288c674e4add8679947ca -> ../../sda
> /dev/disk/by-id/wwn-0x6001405a38892587b3f4a739046b9ff4 -> ../../sdb
> /dev/disk/by-id/wwn-0x6001405b249db0e617b41b19f3147af5 -> ../../sdc
>
> - configured zpool "vol1" on one node
>
> - defined resources on the cluster:
>
> pcs resource create vol1 ZFS pool="vol1" op start timeout="90" op
> stop timeout="90" --group=group-vol1
> pcs resource create vol1-ip IPaddr2 ip=10.30.0.100 cidr_netmask=16
> --group group-vol1
> pcs stonith create fence-vol1 fence_scsi
> pcmk_monitor_action="metadata" pcmk_host_list="node1,node2"
> devices="/dev/disk/by-id/wwn-0x60014053374288c674e4add8679947ca,/dev/disk/by-id/wwn-0x6001405a38892587b3f4a739046b9ff4,/dev/disk/by-id/wwn-0x6001405b249db0e617b41b19f3147af5"
> meta provides=unfencing
>
> Everything seems ok:
>
> [root@node1 /]# pcs status
>
> Cluster name: mycluster
> Cluster Summary:
> * Stack: corosync
> * Current DC: node2 (version 2.0.3-5.el8_2.1-4b1f869f0f) - partition
> with quorum
> * Last updated: Wed Oct 28 12:55:52 2020
> * Last change: Wed Oct 28 12:32:36 2020 by root via crm_resource on
> node2
> * 2 nodes configured
> * 3 resource instances configured
>
> Node List:
> * Online: [ node1 node2 ]
>
> Full List of Resources:
> * fence-vol1 (stonith:fence_scsi): Started node1
> * Resource Group: group-vol1:
> * vol1-ip (ocf::heartbeat:IPaddr2): Started node2
> * vol1 (ocf::heartbeat:ZFS): Started node2
>
> Daemon Status:
> corosync: active/enabled
> pacemaker: active/enabled
> pcsd: active/enabled
>
> Moving the resource to the other node runs fine and the zpool is only
> imported on the active node.
>
> SCSI reservations for scsi disks are as follows:
>
> LIO-ORG disk1 4.0
> Peripheral device type: disk
> PR generation=0x2
> Key=0x6c9d0000
> All target ports bit clear
> Relative port address: 0x1
> << Reservation holder >>
> scope: LU_SCOPE, type: Write Exclusive, registrants only
> Transport Id of initiator:
> iSCSI world wide unique port id: iqn.2020-10.localhost.store:node1
> Key=0x6c9d0001
> All target ports bit clear
> Relative port address: 0x1
> not reservation holder
> Transport Id of initiator:
> iSCSI world wide unique port id: iqn.2020-10.localhost.store:node2
>
> QUESTION : when I move the resource
Which resource?
> to the other node, SCSI reservations
> do not change. I would expect that the Write Exclusive reservation would
> follow the active node. Is it normal ?
>
Yes. The reservation type is "Write Exclusive, registrants only", and the
list of registered initiators does not change when a resource moves. You
probably misunderstand what fence_scsi does: it blocks access to the
device(s) for the victim node by removing that node's registration key; in
the normal state it allows access to all nodes in the cluster.
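The "registrants only" semantics can be sketched with a toy Python model
(the class and method names below are mine, not a real API): any registered
initiator may write, no matter which one holds the reservation, and fencing
works by preempting the victim's registration key rather than by moving the
reservation around.

```python
# Toy model of SCSI-3 "Write Exclusive, registrants only" (WERO)
# persistent reservations, as fence_scsi relies on them.
class WeroDevice:
    def __init__(self):
        self.registrations = set()   # keys of registered initiators
        self.holder = None           # key holding the WERO reservation

    def register(self, key):
        self.registrations.add(key)

    def reserve(self, key):
        assert key in self.registrations
        self.holder = key

    def write_allowed(self, key):
        # WERO: ANY registered initiator may write, not just the holder.
        return self.holder is not None and key in self.registrations

    def preempt_and_abort(self, by_key, victim_key):
        # What fencing does to the victim: drop its registration key.
        assert by_key in self.registrations
        self.registrations.discard(victim_key)
        if self.holder == victim_key:   # reservation moves to the preemptor
            self.holder = by_key


dev = WeroDevice()
dev.register(0x6c9d0000)    # node1 (keys as seen in the sg_persist output)
dev.register(0x6c9d0001)    # node2
dev.reserve(0x6c9d0000)     # node1 holds the reservation

# Normal operation: BOTH nodes may write, so moving the resource group
# does not need to change the reservation at all.
assert dev.write_allowed(0x6c9d0000) and dev.write_allowed(0x6c9d0001)

# Fencing node1: its key is preempted, so only its writes are blocked.
dev.preempt_and_abort(by_key=0x6c9d0001, victim_key=0x6c9d0000)
assert dev.write_allowed(0x6c9d0001)
assert not dev.write_allowed(0x6c9d0000)
```

This is why the reservation holder did not "follow" the active node in your
first test: nothing needs to change until a node is actually fenced.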
> If I fence manually the active node for group-vol1:
>
> [root@node1 /]# pcs stonith fence node1
> Node: node1 fenced
>
> the resource is restarted on the other node BUT THE ZFS FILESYSTEMS ARE
> MOUNTED AND WRITABLE ON BOTH NODES, even if the SCSI reservations seem ok:
Nothing tells node1 to unmount the filesystem, so it remains mounted. Are
you sure it is really writable?
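A buffered write (touch or echo into the mounted filesystem) can appear to
succeed on the fenced node because the data only sits in the page cache;
the reservation conflict surfaces later, when the kernel flushes the dirty
page to the device. One way to get the device's verdict immediately is an
O_DIRECT write straight to the block device. A rough sketch (the helper
name is mine; DESTRUCTIVE, it overwrites the first sector, so point it at a
scratch LUN only):

```python
import mmap
import os

def direct_write_probe(path, nbytes=512):
    """Try one O_DIRECT write of `nbytes` zero bytes at offset 0.

    O_DIRECT bypasses the page cache, so a reservation conflict is
    reported right away instead of on a later flush.  Returns
    (True, 0) if the device accepted the write, else (False, errno),
    e.g. (False, 5) -- EIO -- after losing the registration.
    WARNING: overwrites the first sector of `path`.
    """
    buf = mmap.mmap(-1, mmap.PAGESIZE)      # page-aligned, as O_DIRECT requires
    view = memoryview(buf)[:nbytes]         # aligned prefix, one logical block
    try:
        # getattr: O_DIRECT is Linux-specific
        fd = os.open(path, os.O_WRONLY | getattr(os, "O_DIRECT", 0))
        try:
            os.write(fd, view)
            return True, 0
        finally:
            os.close(fd)
    except OSError as e:
        return False, e.errno
```

On a correctly fenced node I would expect this to return (False, 5) against
the reserved LUN; if it returns (True, 0) on both nodes after fencing, the
reservation really is not being enforced on the write path.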
>
> [root@node2 by-id]# sg_persist -s
> /dev/disk/by-id/wwn-0x60014053374288c674e4add8679947ca
> LIO-ORG disk1 4.0
> Peripheral device type: disk
> PR generation=0x3
> Key=0x6c9d0001
> All target ports bit clear
> Relative port address: 0x1
> << Reservation holder >>
> scope: LU_SCOPE, type: Write Exclusive, registrants only
> Transport Id of initiator:
> iSCSI world wide unique port id: iqn.2020-10.localhost.store:node2
>
> I can reproduce the same thing with a two-node cluster with SAS-attached
> storage running CentOS 7.8.
>
> In my understanding, if node2 has the write exclusive reservation, node1
> shouldn't be able to write to those disks.
>
> What's the problem?
>
> Thanks for your help,
>
> Patrick
>
>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/