[ClusterLabs] Preventing multiple resources from moving at the same time.
matt.s at aptalaska.net
Wed Apr 21 12:27:32 EDT 2021
On 4/21/21 12:48 AM, Klaus Wenninger wrote:
> Just to better understand the issue ...
> Does the first resource implement storage that is being used
> by the resource that is being migrated/moved?
> Or is it just the combination of 2 parallel moves that is
> overcommitting storage or network?
> Is it assured that there are no load-scenarios inside these
> resources that create the same issues as if you migrate/move
Thanks for the help Klaus, I'll spell it out more clearly.
I'm using a resource group that sets up a failover IP address, then mounts a
ZFS dataset (which exports a configuration directory over NFS), then a
custom resource called ZFSiSCSI that exports all of the virtual machine disks:
* Resource Group: IP-ZFS-iSCSI:
* fence-datastore (stonith:fence_scsi): Started node1
* failover-ip (ocf::heartbeat:IPaddr): Started node1
* zfs-datastore (ocf::heartbeat:ZFS): Started node1
* ZFSiSCSI (ocf::heartbeat:ZFSiSCSI): Started node1
Then I create a virtual machine with:
primitive vm-testvm VirtualDomain params
config="/nfs/vm/testvm/testvm.xml" meta allow-migrate=true op monitor
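Spelled out fully, the primitive looks something like this in crm shell (the monitor and migration timeouts here are illustrative, not my actual values):

```
primitive vm-testvm ocf:heartbeat:VirtualDomain \
    params config="/nfs/vm/testvm/testvm.xml" \
    meta allow-migrate=true \
    op monitor interval=30s timeout=60s \
    op migrate_to timeout=120s op migrate_from timeout=120s
```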
This works fine because the ZFS storage can be mounted/exported on node1
or node2, which will have an iSCSI target for each VM bound to the shared
IP address. I can move the storage to either node, and while there is a
pause in the storage it works fine because things move around faster than
the iSCSI timeout. I can also migrate the VM to either node because, when
it's started on the target node, it can immediately access its iSCSI
storage regardless of whether the storage is local or not.
The problem is monitoring with VirtualDomain. The
/usr/lib/ocf/resource.d/heartbeat/VirtualDomain script checks whether
/nfs/vm/testvm/testvm.xml is available with these lines:

if [ ! -r $OCF_RESKEY_config ]; then
    if ocf_is_probe; then
        ocf_log info "Configuration file $OCF_RESKEY_config not readable during probe."
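The effect is easy to reproduce outside the cluster. A minimal sketch (a throwaway temp file standing in for the NFS-backed config) shows the same -r test flipping the moment the file disappears:

```shell
# Stand-in for OCF_RESKEY_config; a temp file plays the NFS-backed config.
config=$(mktemp)

[ -r "$config" ] && echo "readable"        # file exists: check passes

# Simulate the window while the IP-ZFS-iSCSI group is mid-move.
rm -f "$config"

[ ! -r "$config" ] && echo "not readable"  # check fails -> monitor failure
```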
That causes the shell to stat the config file, which, if we are in the
middle of an IP-ZFS-iSCSI move, will fail (stat returns -1). VirtualDomain
then views the VM as dead and hard-resets it.
If I set the stickiness to 100 it becomes a race condition: many times we
get the storage layer migrated without VirtualDomain noticing. But if the
stickiness is not set, moving a resource causes the cluster to re-balance,
and the VM fails every time, because validation is one of the first things
we do when migrating the VM, and it happens at the same time as the
IP-ZFS-iSCSI move, so the config file goes away for about 5 seconds.
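For reference, the stickiness I'm referring to is the rsc_defaults meta attribute, set cluster-wide like this in crm shell (100 is just the value from my test):

```
crm configure rsc_defaults resource-stickiness=100
```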
I'm not sure how to fix this. The nodes don't have any local storage
other than the ZFS pool; otherwise I'd just create a local config
directory and glusterfs them together.
I suppose the next step is to see if NFS has some sort of retry mode so
that stat'ing the config file blocks until a timeout. That would certainly
fix my issue, as that's how the iSCSI side works: retry until timeout.
Another option is to rework VirtualDomain, as stat'ing a config file isn't
really a good test of whether the domain is working. It makes more sense
to have it make a virsh call to see if the domain is running, and only
care about the config file when it's starting the domain.
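As a sketch of that idea (the function name is illustrative, not an actual VirtualDomain patch; 0 and 7 mirror the OCF_SUCCESS and OCF_NOT_RUNNING return codes), the monitor could ask libvirt first and leave the config-file check to the start action:

```shell
# Illustrative monitor: trust virsh for liveness; only the start action
# would need to read the config file.
monitor_vm() {
    domain="$1"
    if [ "$(virsh domstate "$domain" 2>/dev/null)" = "running" ]; then
        return 0    # OCF_SUCCESS: domain is alive even if the config blips
    fi
    return 7        # OCF_NOT_RUNNING
}
```

With something like that, a 5-second NFS outage during an IP-ZFS-iSCSI move wouldn't be mistaken for a dead VM.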