[ClusterLabs] Antw: [EXT] Re: Preventing multiple resources from moving at the same time.

Thu Apr 22 03:41:49 EDT 2021

>>> Matthew Schumacher <matt.s at aptalaska.net> wrote on 21.04.2021 at 18:27 in
message <3dab41c0-d56e-72fa-8a61-70e268a0fa22 at aptalaska.net>:
> On 4/21/21 12:48 AM, Klaus Wenninger wrote:
>> Just to better understand the issue ...
>> Does the first resource implement storage that is being used
>> by the resource that is being migrated/moved?
>> Or is it just the combination of 2 parallel moves that is
>> overcommitting storage or network?
>> Is it assured that there are no load-scenarios inside these
>> resources that create the same issues as if you migrate/move
>> them?
>>
>> Klaus
> 
> Thanks for the help Klaus, I'll spell it out more clearly.
> 
> I'm using a group resource that sets up a failover-ip address, then mounts a 
> ZFS dataset (which exports a configuration directory as NFS), then a 
> custom resource called ZFSiSCSI that exports all virtual machine disks 
> as iSCSI.
> 
> Like this:
> 
>    * Resource Group: IP-ZFS-iSCSI:
>      * fence-datastore    (stonith:fence_scsi):     Started node1
>      * failover-ip    (ocf::heartbeat:IPaddr):     Started node1
>      * zfs-datastore    (ocf::heartbeat:ZFS):     Started node1
>      * ZFSiSCSI    (ocf::heartbeat:ZFSiSCSI):     Started node1
> 
> Then I create a virtual machine with
> 
> primitive vm-testvm VirtualDomain params 
> config="/nfs/vm/testvm/testvm.xml" meta allow-migrate=true op monitor 
> timeout=30 interval=10
> 
> This works fine because the ZFS storage can be mounted/exported on node1 
> or node2 which will have an iSCSI target for each VM bound to the shared 
> IP address.  I can move the storage to either node and while there is a 
> pause in the storage it works fine as things move around faster than the 
> iscsi timeout.  I can also migrate the VM to either node because when 
> it's started on the target node, it can immediately access its iSCSI 
> storage regardless of whether the storage is local or not.
> 
> The problem is monitoring with VirtualDomain.  The 
> /usr/lib/ocf/resource.d/heartbeat/VirtualDomain script goes to check to 
> see if /nfs/vm/testvm/testvm.xml is available with this line:
> 
>          if [ ! -r $OCF_RESKEY_config ]; then
>                  if ocf_is_probe; then
>                          ocf_log info "Configuration file $OCF_RESKEY_config not readable during probe."
> 
> That causes bash to stat the config file, which, if we are in the middle 
> of an IP-ZFS-iSCSI move, will return -1, which then causes VirtualDomain 
> to view the VM as dead and hard reset it.

I think it's unsafe to move an iSCSI target between nodes on the assumption that the initiator won't notice, particularly since iSCSI uses TCP.
Don't the initiators see a "connection reset" when the target moves?
(If your target ran in a VM that is live-migrated, it might succeed if the migration is fast enough.)

> 
> If I set the stickiness to 100 then it's a race condition, many times we 
> get the storage layer migrated without VirtualDomain noticing, but if 
> the stickiness is not set, then moving a resource causes the cluster to 
> re-balance and will cause the VM to fail every time because validation 
> is one of the first things we do when we migrate the VM, and it's at the 
> same time as an IP-ZFS-iSCSI move so the config file goes away for 5 seconds.
> 
> I'm not sure how to fix this.  The nodes don't have local storage that 
> isn't the ZFS pool, otherwise I'd just create a local config directory 
> and glusterfs them together.
> 
> I suppose the next step is to see if NFS has some sort of retry mode so 
> that bash stating the config file is blocked until a timeout. That would 

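NFS (at least before version 4) always had a mode to wait for the server: with a "hard" mount (the default) the client retries requests indefinitely, so a stat would block rather than fail. The "bg" (background) option only controls retrying the mount itself.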
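
If the mount options cannot provide the retry-until-timeout behaviour asked about above, it could perhaps be approximated in a wrapper around the readability test. The following is only a minimal sketch: wait_readable is a hypothetical helper, not part of the VirtualDomain agent, and the 10-second default is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of VirtualDomain): poll until FILE is
# readable, giving up after TIMEOUT seconds.
wait_readable() {
    file=$1
    timeout=${2:-10}    # seconds; assumed default, tune to the failover time
    waited=0
    while [ ! -r "$file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # still unreadable after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

It would be called as `wait_readable "$OCF_RESKEY_config" 10` in place of the bare `[ -r ... ]` test, so a config file that disappears for a few seconds is not reported as missing right away.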

> certainly fix my issue as that's how the iscsi stuff works, retry until 
> timeout.  Another option is to rework VirtualDomain as stating a config 
> file isn't really a good test to see if the domain is working.  It makes 
> more sense to have it make a virsh call to see if it's working and only 
> care about the config file if it's starting the domain.
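
The idea of asking libvirt instead of stat-ing the config file might look roughly like the sketch below. It is not the VirtualDomain implementation: domain_status and the VIRSH override variable are hypothetical names, the mapping of states is an assumption, and only `virsh domstate` itself is an existing command.

```shell
#!/bin/sh
# VIRSH can be overridden so the function can be tried without libvirt.
VIRSH=${VIRSH:-virsh}

# Hypothetical monitor helper: query the domain state from libvirt and
# print "running", "stopped" or "unknown". The config file is not touched.
domain_status() {
    domain=$1
    # If virsh itself fails (for example libvirtd is unreachable), report
    # "unknown" rather than guessing that the VM is dead.
    state=$($VIRSH domstate "$domain" 2>/dev/null) || { echo unknown; return 0; }
    case "$state" in
        running|paused|idle|"in shutdown") echo running ;;
        "shut off"|crashed|pmsuspended)    echo stopped ;;
        *)                                 echo unknown ;;
    esac
}
```

With the domain from above, `domain_status testvm` would print one of the three words, and only the start action would need to read /nfs/vm/testvm/testvm.xml.
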
> 
> Ideas welcome!!!!
> 
> Matt