[ClusterLabs] fence-scsi question
dswartz at druber.com
Sun Feb 9 19:07:01 EST 2020
I have a 2-node CentOS 7 cluster running ZFS. The two nodes (vSphere
appliances on different hosts) access two SAS SSDs in a Supermicro JBOD
with 2 mini-SAS connectors. It all works fine - failover and all. My
quandary was how to implement fencing. I was able to get both of the
VMware SOAP and REST fencing agents to work, but they just aren't
reliable enough. If the vCenter Server Appliance is busy, fencing
requests time out. I know I can increase the timeouts, but in at least
one test run even a minute wasn't enough, and my concern is that if
switching over takes too long, VMware will put the datastore into APD
(All Paths Down), hosing guests.
I confirmed that both SSDs work properly with the fence-scsi agent.
Fencing the node that actively owns the ZFS pool also works perfectly:
ZFS flushes data to the datastore every 5 seconds or so, so withdrawing
the SCSI-3 persistent reservations causes a fatal write error to the
pool, and setting the pool to failmode=panic causes the fenced cluster
node to reboot automatically. The problem (maybe it isn't really one?)
is that fencing the node that does *not* own the pool has no effect,
since it holds no reservations on the devices in the pool. I'd love to
be sure this isn't an issue at all.
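
To make the question concrete, here is a toy Python model of the
situation. It touches no real hardware and is not fence-scsi's code. It
assumes the devices use a "Write Exclusive, Registrants Only" style
persistent reservation, where every registered key may write and
fencing removes the victim's registration key. Whether that matches a
given setup is an assumption to check against the actual devices. The
names ToyDevice, simulate, "node1" and "node2" are hypothetical.

```python
class ToyDevice:
    """Toy model of one SCSI device's persistent-reservation state."""

    def __init__(self):
        self.registered = set()  # keys of nodes registered with the device
        self.holder = None       # key of the node that placed the reservation

    def register(self, key):
        self.registered.add(key)

    def reserve(self, key):
        # Only a registered node may place the reservation.
        if key not in self.registered:
            raise PermissionError("key is not registered")
        if self.holder is None:
            self.holder = key

    def preempt_and_abort(self, by_key, victim_key):
        # Assumed fencing action: a surviving registrant removes the
        # victim's key; if the victim held the reservation, it moves over.
        if by_key not in self.registered:
            raise PermissionError("fencing node is not registered")
        self.registered.discard(victim_key)
        if self.holder == victim_key:
            self.holder = by_key

    def can_write(self, key):
        # Assumed semantics: once a reservation exists, only registrants
        # may write, whether or not they are the holder.
        if self.holder is None:
            return True
        return key in self.registered


def simulate():
    # Case 1: node1 owns the pool and holds the reservation; fence node2.
    dev = ToyDevice()
    dev.register("node1")
    dev.register("node2")
    dev.reserve("node1")
    standby_before = dev.can_write("node2")
    dev.preempt_and_abort("node1", "node2")
    result = {
        "standby_before": standby_before,
        "standby_after": dev.can_write("node2"),
        "owner_after": dev.can_write("node1"),
    }

    # Case 2: same starting point, but node2 fences the owner, node1.
    dev2 = ToyDevice()
    dev2.register("node1")
    dev2.register("node2")
    dev2.reserve("node1")
    dev2.preempt_and_abort("node2", "node1")
    result["fenced_owner_after"] = dev2.can_write("node1")
    return result


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, value)
```

Under these assumptions, fencing the standby node looks like a no-op
only because that node was not writing at the time: in the model it
loses its registration and could not write to the pool devices until
it is registered again. If the real devices behave differently, the
model does not apply.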