[ClusterLabs] is SFEX valid for Pacemaker on VMware with fence_vmware_soap?

Andrei Borzenkov arvidjaar at gmail.com
Fri Sep 14 03:05:36 EDT 2018

14.09.2018 05:38, Satoshi Suzuki wrote:
> Hello,
> Please let me ask whether SFEX is a valid means of exclusive disk access
> control for Pacemaker clusters in a VMware environment.
> My client is planning to configure Pacemaker HA clusters on several
> VMware vSphere 6.5 hosts.
> Each of the HA clusters consists of two VM nodes, one active and one
> standby, on two different ESX hosts, with shared LVM disk resources.
> For exclusive disk control and fencing with Pacemaker,
> our IT vendor is proposing to use SFEX (Shared Disk File EXclusiveness)
> and fence_vmware_soap (to reset the failing node via vCenter).

What is SFEX? I could not find this abbreviation anywhere.

> Here, I am very concerned about the case of an ESX host hanging for over
> a minute, for example due to intermittent HW failures, so that
> fence_vmware_soap would not work. SFEX would force the standby node to
> take over the disk resources, but if the hung node eventually comes
> back, the I/Os that were queued on the previously active node just
> before the ESX host hung would be flushed and corrupt the disk resources
> taken over via SFEX, because there is no SCSI persistent reservation and
> no valid HW watchdog timer for VMs on VMware.

Interesting point. In general, Pacemaker timeouts are supposed to always
be larger than the timeouts of the underlying infrastructure. That is,
you need to account for multipath failover as well as internal ESXi
failover: if an ESXi host is unresponsive long enough, it is kicked out
of the HA cluster and its VMs are restarted elsewhere. Disk access in
this case should be regulated by internal ESXi locking.
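
As a rough illustration of the timeout rule above, here is a minimal
Python sketch. The function name, the layer names and all numbers are
hypothetical; real values have to come from your multipath and vSphere
HA settings. Summing the layers is a conservative assumption (they fail
over one after another), not something Pacemaker computes for you.

```python
def check_timeout_budget(pacemaker_timeout_s, infra_timeouts_s, margin_s=0):
    """Check that a Pacemaker timeout exceeds the infrastructure timeouts.

    infra_timeouts_s maps a layer name to its worst-case failover time in
    seconds. The layers are assumed to fail over sequentially, so their
    timeouts are summed; margin_s is an extra safety margin on top.
    Returns (ok, required_s).
    """
    required_s = sum(infra_timeouts_s.values()) + margin_s
    return pacemaker_timeout_s > required_s, required_s


# Hypothetical example values, in seconds.
infra = {
    "multipath_failover": 30,
    "esxi_ha_failover": 60,
}

for candidate in (60, 120):
    ok, required = check_timeout_budget(candidate, infra, margin_s=10)
    verdict = "ok" if ok else "too short"
    print(f"pacemaker timeout {candidate}s vs required >{required}s: {verdict}")
```

If the check fails, Pacemaker may react before the lower layers have
finished their own recovery.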

> So I think SFEX is valid only if combined with IPMI-based STONITH for
> bare-metal servers or even VMware hosts,
> and on VMware we should use fence_scsi, given recent SPC-3 compliant
> disk storage, together with fence_vmware_soap. Am I right?

This depends on your storage configuration. SCSI-3 reservation across
ESXi hosts is supported only with RDM in physical compatibility mode.

> In addition, is fence_scsi combined with fence_vmware_soap sufficiently
> proven in production environments on RHEL 7.x on VMware?
> Thank you for any responses.
> Satoshi
