[ClusterLabs] fence_scsi no such device

Ken Gaillot kgaillot at redhat.com
Tue Mar 15 14:39:44 UTC 2016


On 03/15/2016 09:10 AM, marvin wrote:
> Hi,
> 
> I'm trying to get fence_scsi working, but I get a "no such device" error.
> It's a two node cluster with nodes called "node01" and "node03". The OS
> is RHEL 7.2.
> 
> here is some relevant info:
> 
> # pcs status
> Cluster name: testrhel7cluster
> Last updated: Tue Mar 15 15:05:40 2016          Last change: Tue Mar 15
> 14:33:39 2016 by root via cibadmin on node01
> Stack: corosync
> Current DC: node03 (version 1.1.13-10.el7-44eb2dd) - partition with quorum
> 2 nodes and 23 resources configured
> 
> Online: [ node01 node03 ]
> 
> Full list of resources:
> 
>  Clone Set: dlm-clone [dlm]
>      Started: [ node01 node03 ]
>  Clone Set: clvmd-clone [clvmd]
>      Started: [ node01 node03 ]
>  fence-node1    (stonith:fence_ipmilan):        Started node03
>  fence-node3    (stonith:fence_ipmilan):        Started node01
>  Resource Group: test_grupa
>      test_ip    (ocf::heartbeat:IPaddr):        Started node01
>      lv_testdbcl        (ocf::heartbeat:LVM):   Started node01
>      fs_testdbcl        (ocf::heartbeat:Filesystem):    Started node01
>      oracle11_baza      (ocf::heartbeat:oracle):        Started node01
>      oracle11_lsnr      (ocf::heartbeat:oralsnr):       Started node01
>  fence-scsi-node1       (stonith:fence_scsi):   Started node03
>  fence-scsi-node3       (stonith:fence_scsi):   Started node01
> 
> PCSD Status:
>   node01: Online
>   node03: Online
> 
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> 
> # pcs stonith show
>  fence-node1    (stonith:fence_ipmilan):        Started node03
>  fence-node3    (stonith:fence_ipmilan):        Started node01
>  fence-scsi-node1       (stonith:fence_scsi):   Started node03
>  fence-scsi-node3       (stonith:fence_scsi):   Started node01
>  Node: node01
>   Level 1 - fence-scsi-node3
>   Level 2 - fence-node3
>  Node: node03
>   Level 1 - fence-scsi-node1
>   Level 2 - fence-node1
> 
> # pcs stonith show fence-scsi-node1 --all
>  Resource: fence-scsi-node1 (class=stonith type=fence_scsi)
>   Attributes: pcmk_host_list=node01 pcmk_monitor_action=metadata
> pcmk_reboot_action=off
>   Meta Attrs: provides=unfencing
>   Operations: monitor interval=60s (fence-scsi-node1-monitor-interval-60s)
> 
> # pcs stonith show fence-scsi-node3 --all
>  Resource: fence-scsi-node3 (class=stonith type=fence_scsi)
>   Attributes: pcmk_host_list=node03 pcmk_monitor_action=metadata
> pcmk_reboot_action=off
>   Meta Attrs: provides=unfencing
>   Operations: monitor interval=60s (fence-scsi-node3-monitor-interval-60s)
> 
> node01 # pcs stonith fence node03
> Error: unable to fence 'node03'
> Command failed: No such device
> 
> node01 # tail /var/log/messages
> Mar 15 14:54:04 node01 stonith-ng[20024]:  notice: Client
> stonith_admin.29191.2b7fe910 wants to fence (reboot) 'node03' with
> device '(any)'
> Mar 15 14:54:04 node01 stonith-ng[20024]:  notice: Initiating remote
> operation reboot for node03: d1df9201-5bb1-447f-9b40-d3d7235c3d0a (0)
> Mar 15 14:54:04 node01 stonith-ng[20024]:  notice: fence-scsi-node3 can
> fence (reboot) node03: static-list
> Mar 15 14:54:04 node01 stonith-ng[20024]:  notice: fence-node3 can fence
> (reboot) node03: static-list
> Mar 15 14:54:04 node01 stonith-ng[20024]:  notice: All fencing options
> to fence node03 for stonith_admin.29191@node01.d1df9201 failed

The above line is the key. Both of the devices registered for node03
(fence-scsi-node3 and fence-node3) returned failure. Pacemaker then
looked for any other device capable of fencing node03, found none, and
so reported "No such device" (an admittedly obscure message).
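
To see that sequence for a single target in one place, something like
the following may help. It is only a sketch: the fence_log_for name is
made up for illustration, and it assumes a syslog-style log such as the
/var/log/messages output quoted in this thread.

```shell
# Hypothetical helper: print stonith-ng messages that mention one fencing
# target. Assumes a syslog-style log; the default path is an assumption.
fence_log_for() {
    target=$1
    logfile=${2:-/var/log/messages}
    # Keep only messages from the fencer, then only those naming the target.
    grep 'stonith-ng\[' "$logfile" | grep -F -- "$target"
}
```

Running "fence_log_for node03" on each node should show which devices
were considered and where the "All fencing options ... failed" message
appears.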

It looks like the fence agents require more configuration options than
you have set. Running "/path/to/fence/agent -o metadata" shows the
available options. It's a good idea to first get the agent running
successfully by hand on the command line (the "status" action is usually
sufficient), then put those same options in the cluster configuration.
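
A manual test along those lines might look like the sketch below. The
run_agent wrapper and the AGENT, NODE and DEVICE defaults are
assumptions for illustration (the device path is the one from your
sg_persist output), and the -n and -d option letters should be checked
against the agent's own metadata or help output before relying on them.

```shell
# Sketch only: AGENT, NODE and DEVICE are assumed values; override them
# in the environment to match your setup.
AGENT=${AGENT:-fence_scsi}
NODE=${NODE:-node03}
DEVICE=${DEVICE:-/dev/disk/by-id/scsi-360060e8013757c005020757c00003f08}

# Hypothetical wrapper: with DRY_RUN=1 only print the command line,
# otherwise run the agent with the given action.
run_agent() {
    action=$1
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$AGENT -o $action -n $NODE -d $DEVICE"
    else
        "$AGENT" -o "$action" -n "$NODE" -d "$DEVICE"
    fi
}
```

"DRY_RUN=1 run_agent status" previews the command; "run_agent metadata"
and then "run_agent status" run the agent for real. Once "status"
succeeds, the options that made it work are the ones to carry over into
the stonith resource, under the parameter names the metadata lists.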

> Mar 15 14:54:04 node01 stonith-ng[20024]:  notice: Couldn't find anyone
> to fence (reboot) node03 with fence-node1
> Mar 15 14:54:04 node01 stonith-ng[20024]:   error: Operation reboot of
> node03 by <no-one> for stonith_admin.29191@node01.d1df9201: No such device
> Mar 15 14:54:04 node01 crmd[20028]:  notice: Peer node03 was not
> terminated (reboot) by <anyone> for node01: No such device
> (ref=d1df9201-5bb1-447f-9b40-d3d7235c3d0a) by client stonith_admin.29191
> 
> node03 # tail /var/log/messages
> Mar 15 14:54:04 node03 stonith-ng[2601]:  notice: fence-scsi-node1 can
> not fence (reboot) node03: static-list
> Mar 15 14:54:04 node03 stonith-ng[2601]:  notice: fence-node1 can not
> fence (reboot) node03: static-list
> Mar 15 14:54:04 node03 stonith-ng[2601]:  notice: Operation reboot of
> node03 by <no-one> for stonith_admin.29191@node01.d1df9201: No such device
> Mar 15 14:54:04 node03 crmd[2605]:  notice: Peer node03 was not
> terminated (reboot) by <anyone> for node01: No such device
> (ref=d1df9201-5bb1-447f-9b40-d3d7235c3d0a) by client stonith_admin.29191
> 
> node01 # stonith_admin -L
>  fence-scsi-node3
>  fence-node3
> 2 devices found
> 
> node03 # stonith_admin -L
>  fence-scsi-node1
>  fence-node1
> 2 devices found
> 
> node01 # sg_persist --in -r -d
> /dev/disk/by-id/scsi-360060e8013757c005020757c00003f08
>   HITACHI   OPEN-V            7303
>   Peripheral device type: disk
>   PR generation=0x6, Reservation follows:
>     Key=0x7b6b0001
>     scope: LU_SCOPE,  type: Write Exclusive, registrants only
> node01 # sg_persist --in -k -d
> /dev/disk/by-id/scsi-360060e8013757c005020757c00003f08
>   HITACHI   OPEN-V            7303
>   Peripheral device type: disk
>   PR generation=0x6, 4 registered reservation keys follow:
>     0x7b6b0000
>     0x7b6b0001
>     0x7b6b0001
>     0x7b6b0000
> 
> node03 # sg_persist --in -r -d
> /dev/disk/by-id/scsi-360060e8013757c005020757c00003f08
>   HITACHI   OPEN-V            7303
>   Peripheral device type: disk
>   PR generation=0x6, Reservation follows:
>     Key=0x7b6b0001
>     scope: LU_SCOPE,  type: Write Exclusive, registrants only
> node03 # sg_persist --in -k -d
> /dev/disk/by-id/scsi-360060e8013757c005020757c00003f08
>   HITACHI   OPEN-V            7303
>   Peripheral device type: disk
>   PR generation=0x6, 4 registered reservation keys follow:
>     0x7b6b0000
>     0x7b6b0001
>     0x7b6b0001
>     0x7b6b0000
> 
> I'm kind of in the dark and don't know how to proceed here.