[ClusterLabs] Preventing multiple resources from moving at the same time.
matt.s at aptalaska.net
Wed Apr 21 12:27:32 EDT 2021
On 4/21/21 12:48 AM, Klaus Wenninger wrote:
> Just to better understand the issue ...
> Does the first resource implement storage that is being used
> by the resource that is being migrated/moved?
> Or is it just the combination of 2 parallel moves that is
> overcommitting storage or network?
> Is it assured that there are no load-scenarios inside these
> resources that create the same issues as if you migrate/move
Thanks for the help Klaus, I'll spell it out more clearly.
I'm using a resource group that sets up a failover IP address, then mounts a
ZFS dataset (which exports a configuration directory over NFS), then a
custom resource called ZFSiSCSI that exports all of the virtual machine disks:
* Resource Group: IP-ZFS-iSCSI:
* fence-datastore (stonith:fence_scsi): Started node1
* failover-ip (ocf::heartbeat:IPaddr): Started node1
* zfs-datastore (ocf::heartbeat:ZFS): Started node1
* ZFSiSCSI (ocf::heartbeat:ZFSiSCSI): Started node1
Then I create a virtual machine with:
primitive vm-testvm VirtualDomain params
config="/nfs/vm/testvm/testvm.xml" meta allow-migrate=true op monitor
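Spelled out fully, the primitive looks something like this in crm shell (the monitor and migration timeouts here are illustrative, not my actual values):

```
primitive vm-testvm ocf:heartbeat:VirtualDomain \
    params config="/nfs/vm/testvm/testvm.xml" \
    meta allow-migrate=true \
    op monitor interval=30s timeout=60s \
    op migrate_to timeout=120s op migrate_from timeout=120s
```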
This works fine because the ZFS storage can be mounted/exported on node1
or node2, which will have an iSCSI target for each VM bound to the shared
IP address. I can move the storage to either node, and while there is a
pause in the storage it works fine because things move around faster than
the iSCSI timeout. I can also migrate the VM to either node because, when
it's started on the target node, it can immediately access its iSCSI
storage regardless of whether the storage is local or not.
The problem is monitoring with VirtualDomain. The
/usr/lib/ocf/resource.d/heartbeat/VirtualDomain script checks whether
/nfs/vm/testvm/testvm.xml is available with these lines:

if [ ! -r $OCF_RESKEY_config ]; then
    if ocf_is_probe; then
        ocf_log info "Configuration file $OCF_RESKEY_config not readable during probe."
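The effect is easy to reproduce outside the cluster. A minimal sketch (a throwaway temp file standing in for the NFS-backed config) shows the same -r test flipping the moment the file disappears:

```shell
# Stand-in for OCF_RESKEY_config; a temp file plays the NFS-backed config.
config=$(mktemp)

[ -r "$config" ] && echo "readable"        # file exists: check passes

# Simulate the window while the IP-ZFS-iSCSI group is mid-move.
rm -f "$config"

[ ! -r "$config" ] && echo "not readable"  # check fails -> monitor failure
```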
That causes the shell to stat the config file, which, if we are in the
middle of an IP-ZFS-iSCSI move, will fail (stat returns -1). VirtualDomain
then views the VM as dead and hard-resets it.
If I set the stickiness to 100 it becomes a race condition: many times we
get the storage layer migrated without VirtualDomain noticing. But if the
stickiness is not set, moving a resource causes the cluster to re-balance,
and the VM fails every time, because validation is one of the first things
we do when migrating the VM, and it happens at the same time as the
IP-ZFS-iSCSI move, so the config file goes away for about 5 seconds.
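For reference, the stickiness I'm referring to is the rsc_defaults meta attribute, set cluster-wide like this in crm shell (100 is just the value from my test):

```
crm configure rsc_defaults resource-stickiness=100
```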
I'm not sure how to fix this. The nodes don't have any local storage
other than the ZFS pool; otherwise I'd just create a local config
directory and glusterfs them together.
I suppose the next step is to see if NFS has some sort of retry mode so
that stat'ing the config file blocks until a timeout. That would certainly
fix my issue, as that's how the iSCSI side works: retry until timeout.
Another option is to rework VirtualDomain, as stat'ing a config file isn't
really a good test of whether the domain is working. It makes more sense
to have it make a virsh call to see if the domain is running, and only
care about the config file when it's starting the domain.
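As a sketch of that idea (the function name is illustrative, not an actual VirtualDomain patch; 0 and 7 mirror the OCF_SUCCESS and OCF_NOT_RUNNING return codes), the monitor could ask libvirt first and leave the config-file check to the start action:

```shell
# Illustrative monitor: trust virsh for liveness; only the start action
# would need to read the config file.
monitor_vm() {
    domain="$1"
    if [ "$(virsh domstate "$domain" 2>/dev/null)" = "running" ]; then
        return 0    # OCF_SUCCESS: domain is alive even if the config blips
    fi
    return 7        # OCF_NOT_RUNNING
}
```

With something like that, a 5-second NFS outage during an IP-ZFS-iSCSI move wouldn't be mistaken for a dead VM.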