[ClusterLabs] fence-scsi question

Dan Swartzendruber dswartz at druber.com
Mon Feb 10 07:02:10 EST 2020


On 2020-02-10 00:06, Strahil Nikolov wrote:
> On February 10, 2020 2:07:01 AM GMT+02:00, Dan Swartzendruber
> <dswartz at druber.com> wrote:
>> 
>> I have a 2-node CentOS 7 cluster running ZFS.  The two nodes (vSphere
>> appliances on different hosts) access 2 SAS SSDs in a Supermicro JBOD
>> with 2 mini-SAS connectors.  It all works fine - failover and all.  My
>> quandary was how to implement fencing.  I was able to get both of the
>> VMware SOAP and REST fencing agents to work, but neither is reliable
>> enough.  If the vCenter Server Appliance is busy, fencing requests
>> time out.  I know I can increase the timeouts, but in at least one test
>> run even a minute wasn't enough, and my concern is that if switching
>> over takes too long, VMware will put the datastore into APD (All Paths
>> Down), hosing the guests.
>> I confirmed that both SSDs work properly with the fence_scsi agent.
>> Fencing the host that actively owns the ZFS pool also works perfectly:
>> ZFS flushes data to the datastore every 5 seconds or so, so withdrawing
>> the SCSI-3 persistent reservations causes a fatal write error to the
>> pool, and setting the pool to failmode=panic causes the fenced
>> cluster node to reboot automatically.  The problem (maybe it isn't
>> really one?) is that fencing the node that does *not* own the pool has
>> no effect, since it holds no reservations on the devices in the pool.
>> 
>> I'd love to be sure this isn't an issue at all.
> 
> Hi Dan,
> You can configure multiple fencing mechanisms in your cluster.
> For example, you can set the first fencing mechanism to be via VMware
> and, if it fails (being busy or currently unavailable), the SCSI
> fencing can kick in to ensure a failover can be done.
> 
> What you observe is normal: no SCSI reservations -> no fencing.
> That's why major vendors require, when using
> fence_mpath/fence_scsi, the shared storage to be a dependency (a
> file system in use by the application) and not just an add-on.
> 
> I personally don't like SCSI reservations, as there is no guarantee
> that other resources (services, IPs, etc.) are actually down, but the
> risk is low.
> 
> In your case fence_scsi stonith can be a second layer of protection.
> 
> 
> Best Regards,
> Strahil Nikolov

Okay, thanks.  I'll look into multi-level then.
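
For anyone reading this later, the multi-level idea (Pacemaker calls it a fencing topology) can be pictured roughly as below. This is only a toy Python model of the fallback order, not Pacemaker code: levels are tried in ascending order, and a level counts as successful only if every device in it succeeds. The names fence_with_levels, vmware_busy and scsi_ok are made up for illustration, and the two stand-in functions merely simulate a timed-out VMware request and a successful fence_scsi call.

```python
from typing import Callable, Dict, List, Optional

# A fencing device is modeled as a callable: target node name -> success flag.
FenceDevice = Callable[[str], bool]


def fence_with_levels(levels: Dict[int, List[FenceDevice]], target: str) -> Optional[int]:
    """Try each level in ascending order.

    Return the number of the level that fenced the target,
    or None if every level failed.
    """
    for index in sorted(levels):
        devices = levels[index]
        # all() short-circuits, so a level is abandoned at its first failed device.
        # An empty level is treated as a failure rather than a success.
        if devices and all(device(target) for device in devices):
            return index
    return None


# Hypothetical stand-ins for the real agents (assumptions, not real agent calls).
def vmware_busy(target: str) -> bool:
    return False  # simulates a fencing request that timed out against vCenter


def scsi_ok(target: str) -> bool:
    return True  # simulates fence_scsi succeeding for the target node


if __name__ == "__main__":
    topology = {1: [vmware_busy], 2: [scsi_ok]}
    print(fence_with_levels(topology, "node1"))
```

With the VMware agent at level 1 and fence_scsi at level 2, a busy vCenter just means level 1 fails and level 2 gets its turn, which is the behaviour Strahil describes above.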

