[ClusterLabs] Using fence_scsi agent and watchdog
liuk001 at gmail.com
Mon Aug 21 12:12:24 EDT 2017
I've set up a two-node PCS lab to test the fence_scsi agent and how it works.
The lab comprises the following VMs, all CentOS 7.3 under VMware:
pcs1 - 192.168.199.101
pcs2 - 192.168.199.102
iscsi - 192.168.199.200 iSCSI server
The iSCSI server exports 3 block volumes like these to both PCS nodes:
/dev/sdb 200 MB fence volume with working SCSI-3 persistent reservations
/dev/sdc 1GB data volume XFS
/dev/sdd 2GB data volume XFS
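Before configuring the agent, one way to confirm that the fence volume really supports SCSI-3 persistent reservations is sg_persist from the sg3_utils package (a sketch, assuming sg3_utils is installed on the nodes):

```shell
# Report the device's persistent-reservation capabilities
sg_persist --in --report-capabilities --device=/dev/sdb

# List currently registered keys (empty until the cluster registers)
sg_persist --in --read-keys --device=/dev/sdb
```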
The fencing agent is configured like this:

pcs stonith create FenceSCSI fence_scsi pcmk_host_list="pcs1 pcs2" \
    devices=/dev/sdb meta provides=unfencing
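After the cluster has unfenced both nodes, the registrations can be checked on the fence device (a sketch using sg_persist; fence_scsi normally holds one "Write Exclusive, registrants only" reservation plus one key per node):

```shell
# Each node's key should appear here once unfencing has run
sg_persist --in --read-keys --device=/dev/sdb

# Exactly one reservation should be held on the fence volume
sg_persist --in --read-reservation --device=/dev/sdb
```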
Then I created 2 resource groups, each with an LVM volume mounted under
/cluster/fs1 and /cluster/fs2 respectively.
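For reference, the two groups could be built roughly like this (a sketch only; the resource, VG, and LV names below are my assumptions, not taken from the actual lab):

```shell
# Group 1: LVM volume group + XFS filesystem on /cluster/fs1
pcs resource create fs1_lvm LVM volgrpname=vg_fs1 exclusive=true --group grp_fs1
pcs resource create fs1_xfs Filesystem device=/dev/vg_fs1/lv_fs1 \
    directory=/cluster/fs1 fstype=xfs --group grp_fs1

# Group 2: same layout on /cluster/fs2
pcs resource create fs2_lvm LVM volgrpname=vg_fs2 exclusive=true --group grp_fs2
pcs resource create fs2_xfs Filesystem device=/dev/vg_fs2/lv_fs2 \
    directory=/cluster/fs2 fstype=xfs --group grp_fs2
```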
PCS is managing the resources as expected.
Coming to fence_scsi, it seems that the only way to be sure the fenced node
is actually rebooted is to install the watchdog rpm and link the
/usr/share/cluster/fence_scsi_check script into the /etc/watchdog.d directory.
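Concretely, the watchdog setup I mean looks like this (a sketch; fence_scsi_check is shipped by the fence-agents-scsi package on CentOS 7):

```shell
# Install and enable the watchdog daemon on each node
yum install -y watchdog

# Hook fence_scsi's check script into watchdog's test directory;
# when the node's key vanishes from the fence device, the check
# fails and watchdog reboots the node
ln -s /usr/share/cluster/fence_scsi_check /etc/watchdog.d/fence_scsi_check

systemctl enable --now watchdog
```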
But I've noticed that there is a significant lag between the resource
takeover on the surviving node and the effective reboot of the fenced node,
which could lead to a dangerous situation, for example:
1. stonith_admin -F pcs1
2. PCS stops on pcs1 and the resources are switched to node pcs2 within a few
seconds.
3. Some time later, watchdog triggers the reboot of the pcs1 node.
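The lag is easy to observe from the surviving node (a sketch; device path as in my lab):

```shell
# From pcs2: fence pcs1 via the stonith API
stonith_admin -F pcs1

# pcs1's key disappears from the fence device almost immediately,
# and the resources fail over within seconds...
sg_persist --in --read-keys --device=/dev/sdb
pcs status

# ...but pcs1 only reboots once its watchdog daemon next runs
# fence_scsi_check and finds its own key missing
```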
I have the following questions:
A. Is this the only possible configuration for using the fence_scsi agent
that guarantees the fenced node is rebooted? If yes, I think the
documentation should be updated accordingly, because it is not very clear.
B. Is there a way to make the surviving node wait until the fenced node has
actually rebooted before taking over its resources?
Thanks in advance for any answers.