<div dir="ltr">

Hi Andrei, <br>

    thanks for your interest and response.<br>

    <br>

    <i>>> As for the disk exclusive control and fencing mechanism

      with Pacemaker,

    </i><i><br>

    </i><i>>> our IT vendor is proposing to use SFEX (Shared Disk

      File EXclusiveness)

    </i><i><br>

    </i><i>>> and fence_vmware_soap (to reset the failing node via

      vCenter).

    </i><i><br>

    </i><br>

    <i>></i><i> </i><i>

      What is SFEX? I could not find this abbreviation anywhere.

    </i><br>

    <br>

    It stands for <i>"Shared Disk File EXclusiveness"</i>. Please take a

    look at the following:<br>

       <a class="gmail-m_5057374519026155446moz-txt-link-freetext" href="http://www.linux-ha.org/wiki/Sfex_(resource_agent)" target="_blank">http://www.linux-ha.org/wiki/Sfex_(resource_agent)</a><br>

    <br>

    <i>> Interesting point. In general pacemaker timeouts are

      supposed to always

    </i><i><br>

    </i><i>> be larger than underlying infrastructure timeouts. I.e.

      you need to

    </i><i><br>

    </i><i>> account for multipath failover as well as internal ESXi

      failover. I.e.

    </i><i><br>

      <br>

    </i>OK. In my case there are two FC paths from each ESXi host to the

    storage,<br>

    so we are considering 60+ s (30 s x 2 paths, plus a few seconds) or 120 s for the timeout.

    <br>
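    To make that arithmetic explicit, here is a minimal sketch. It assumes the "30x2" above means 30 s of failover per path across 2 paths; the 10 s margin and the function name are hypothetical, chosen only for illustration.<br>

```python
def min_pacemaker_timeout(path_failover_s, n_paths, margin_s):
    """Lower bound for a Pacemaker timeout, in seconds.

    Assumes the worst case is every path failing over one after
    another, plus a safety margin on top.
    """
    return path_failover_s * n_paths + margin_s


# 30 s per path and 2 paths come from the figures above;
# the 10 s margin is an assumed value, not a recommendation.
lower_bound = min_pacemaker_timeout(path_failover_s=30, n_paths=2, margin_s=10)
print(lower_bound)
```

    Any ESXi-internal failover time would have to be added on top of this, which is why 120 s is the other candidate.<br>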

    <i><br>

      > if ESXi host is unresponsive long enough, it is kicked out of

      HA cluster

    </i><i><br>

    </i><i>> and VMs are restarted elsewhere. Disk access in this

      case should be

    </i><i><br>

    </i><i>> regulated by internal ESXi locking.</i><br>

    <br>

    Suppose the ESXi host hangs for longer than 120 s and then comes back.<br>

    fence_vmware_soap would not act in time (because the target ESXi host was

    hanging),<br>

    and the Pacemaker timeout itself would not fire immediately (since it is only a software watchdog timer),<br>

    so the I/Os queued before the hang would then overwrite the disks.<br>

    <br>

    Sorry, I don't understand how the "internal ESXi

    locking" you mention, together with SFEX,<br>

    would protect against the scenario I describe above.<br>

    <br>

    >><i> So I think SFEX is valid only if combined with STONITH

      IPMI for

      <br>

    </i>>><i> baremetal servers or even VMware hosts,

    </i><br>

    >><i> and we should use fence_scsi for the recent SPC-3

      compliant disk storage

    </i><br>

    >><i> with fence_vmware_soap on VMware. Am I right?

    </i><br>

    <br>

    <div><i>> 

    This depends on your storage configuration. SCSI-3 reservation

    across <br></i></div><div><i>> ESXi hosts is supported only with RDM in physical compatibility

    mode.</i>

    </div>

    <br>

    Sure, my disk configuration is RDM in physical compatibility mode, as required by VMware.<br>

    <br>

    Thanks in advance for any further guidance or suggestions.

</div>