[ClusterLabs] Preventing multiple resources from moving at the same time.
Matthew Schumacher
matt.s at aptalaska.net
Fri Apr 30 10:04:34 EDT 2021
On 4/21/21 11:04 AM, Matthew Schumacher wrote:
> On 4/21/21 10:21 AM, Andrei Borzenkov wrote:
>>> If I set the stickiness to 100 then it's a race condition; many times we
>>> get the storage layer migrated without VirtualDomain noticing. But if
>>> the stickiness is not set, then moving a resource causes the cluster to
>>> re-balance and will cause the VM to fail every time, because validation
>>> is one of the first things we do when we migrate the VM, and it happens
>>> at the same time as an IP-ZFS-iSCSI move, so the config file goes away
>>> for 5 seconds.
>>>
>>> I'm not sure how to fix this. The nodes don't have local storage that
>> Your nodes must have operating system and pacemaker stack loaded from
>> somewhere before they can import zfs pool.
>
> Yup, and they do. There are plenty of ways to do this: internal SD
> card, USB boot, PXE boot, etc. I prefer this because I don't need
> to maintain a boot drive, the nodes boot from the exact same image,
> and I have gobs of memory so the running system can run in a ramdisk.
> This also makes it possible to boot my nodes with failed
> disks/controllers, which makes troubleshooting easier. I basically
> made a live CD distro that has everything I need.
>
>>> I suppose the next step is to see if NFS has some sort of retry mode so
>> That is what "hard" mount option is for.
>>
> Thanks, I'll take a look.
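For anyone curious, "hard" is a client-side mount option; it looks roughly
like this (the server name and paths here are just placeholders, not my
actual setup):

    # NFS hard mount: the client retries indefinitely instead of returning
    # an I/O error to the application while the server is unreachable.
    mount -t nfs -o hard,vers=4.1 nfs-server.example.com:/datastore /mnt/datastore

    # or the equivalent /etc/fstab line:
    # nfs-server.example.com:/datastore  /mnt/datastore  nfs  hard,vers=4.1  0  0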
For others searching the list, I did figure this out. The problem was
the order in which I was loading the resources.

The following order doesn't work, because the failover IP comes up before
ZFS, and ZFS is what starts the NFS share. That leaves a split second
where the IP is reachable but the NFS server isn't running yet, so the IP
stack answers NFS requests with a RST. The NFS client reports that to the
OS as a hard failure, the VirtualDomain resource sees an invalid config,
and things break.
* Resource Group: IP-ZFS-iSCSI:
* fence-datastore (stonith:fence_scsi): Started node1
* failover-ip (ocf::heartbeat:IPaddr): Started node1
* zfs-datastore (ocf::heartbeat:ZFS): Started node1
* ZFSiSCSI (ocf::heartbeat:ZFSiSCSI): Started node1
If I change it to this, NFS requests simply go unanswered until the share
is up, and the client keeps retrying until its connection is answered.
* Resource Group: IP-ZFS-iSCSI:
* fence-datastore (stonith:fence_scsi): Started node1
* zfs-datastore (ocf::heartbeat:ZFS): Started node1
* ZFSiSCSI (ocf::heartbeat:ZFSiSCSI): Started node1
* failover-ip (ocf::heartbeat:IPaddr): Started node1
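If anyone needs to make the same change, reordering a group with pcs looks
roughly like this (written from memory, so check it against your pcs
version before running it):

    # Resources in a group start in the listed order and stop in reverse,
    # so the IP must be listed last to come up after ZFS/NFS/iSCSI.
    pcs resource group remove IP-ZFS-iSCSI failover-ip
    pcs resource group add IP-ZFS-iSCSI failover-ip --after ZFSiSCSI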
Originally I didn't do it this way because my iSCSI and NFS stack bind
to the failover IP and I was worried they wouldn't start until the IP
was configured, but that doesn't seem to be a problem.
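As an aside, if something really did need to bind() to the VIP before
IPaddr brings it up, Linux has a knob for that; I didn't end up needing
it, so treat this as a pointer rather than a recommendation:

    # Allow services to bind to an IP address that isn't configured on any
    # interface yet (useful for daemons that bind the VIP explicitly).
    sysctl -w net.ipv4.ip_nonlocal_bind=1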
Matt