[ClusterLabs] Antw: Re: Antw: [EXT] Re: Failed migration causing fencing loop

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Apr 4 02:58:18 EDT 2022


>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 04.04.2022 at 06:39 in
message <e351f140-fe35-6b4d-16ce-008aee0d1679 at gmail.com>:
> On 31.03.2022 14:02, Ulrich Windl wrote:
>>>>> "Gao,Yan" <ygao at suse.com> schrieb am 31.03.2022 um 11:18 in Nachricht
>> <67785c2f‑f875‑cb16‑608b‑77d63d9b02c4 at suse.com>:
>>> On 2022/3/31 9:03, Ulrich Windl wrote:
>>>> Hi!
>>>>
>>>> I just wanted to point out one thing that hit us with SLES15 SP3:
>>>> Some failed live VM migration causing node fencing resulted in a fencing
>>>> loop, because of two reasons:
>>>>
>>>> 1) Pacemaker thinks that even _after_ fencing there is some migration to
>>>> "clean up". Pacemaker treats the situation as if the VM is running on
>>>> both nodes, thus (50% chance?) trying to stop the VM on the node that
>>>> just booted after fencing. That's stupid, but shouldn't be fatal IF
>>>> there weren't...
>>>>
>>>> 2) The stop operation of the VM (that actually isn't running) fails,
>>>
>>> AFAICT it could not connect to the hypervisor, but the logic in the RA
>>> is kind of arguable: the probe (monitor) of the VM returned "not
>>> running", but the stop right after that returned failure...
>>>
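
To make the arguable part concrete: below is a minimal sketch of the stop
semantics in question, in shell like the OCF agents. This is NOT the actual
VirtualDomain code, and the function name, URI and domain parameters are
made up. OCF expects "stop" to be idempotent, i.e. to succeed for an
already-stopped resource; the hard case is when the hypervisor is
unreachable and the state cannot be verified at all:

    #!/bin/sh
    # Sketch only -- not the real VirtualDomain agent.
    OCF_SUCCESS=0
    OCF_ERR_GENERIC=1

    # $1 = libvirt URI (e.g. xen:///system), $2 = domain name
    sketch_stop() {
        uri=$1 dom=$2
        state=$(virsh -c "$uri" domstate "$dom" 2>/dev/null)
        if [ -z "$state" ]; then
            # Could not talk to the hypervisor at all: state unknown.
            # Failing is the conservative choice, but on a freshly
            # fenced node it is exactly what keeps the fencing loop going.
            return $OCF_ERR_GENERIC
        fi
        if [ "$state" = "shut off" ]; then
            # Probe said "not running": an idempotent stop succeeds here.
            return $OCF_SUCCESS
        fi
        virsh -c "$uri" destroy "$dom" >/dev/null 2>&1 || return $OCF_ERR_GENERIC
        return $OCF_SUCCESS
    }
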
>>> OTOH, the point about pacemaker is that the stop of the resource on the
>>> fenced and rejoined node is not really necessary. There have been
>>> discussions about this here, and we are trying to figure out a solution
>>> for it:
>>>
>>> https://github.com/ClusterLabs/pacemaker/pull/2146#discussion_r828204919 
>>>
>>> For now it requires the administrator's intervention if the situation happens:
>>> 1) Fix the access to hypervisor before the fenced node rejoins.
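
(One hedged way to enforce that intervention -- sketch only, the URI is an
assumption: keep the cluster stack from rejoining automatically after a
fence, and start it by hand once the hypervisor connection is confirmed:

    # Keep a fenced node from rejoining on its own after reboot:
    systemctl disable pacemaker
    # After boot, verify the hypervisor connection first, then rejoin:
    virsh -c xen:///system list >/dev/null && systemctl start pacemaker

Whether giving up autostart is acceptable depends on the availability
requirements, of course.)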
>> 
>> Thanks for the explanation!
>> 
>> Unfortunately this can be tricky if libvirtd is involved (as it is here):
>> libvirtd uses locking (virtlockd), which in turn needs a cluster-wide
>> filesystem for locks across the nodes.
>> When that filesystem is provided by the cluster, it's hard to delay node
>> joining until the filesystem, virtlockd and libvirtd are running.
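
(For reference, the ordering one would try inside a single cluster looks
roughly like the sketch below -- crmsh syntax, with all resource names,
the device and the paths made up, and the DLM/O2CB plumbing for OCFS2
omitted. It expresses the start dependencies, but it does not help a
rejoining node: the probes, and a stop scheduled right after rejoin, do
not wait for these resources to be started.

    # Hypothetical names/paths; a sketch, not a tested configuration.
    crm configure primitive fs-locks ocf:heartbeat:Filesystem \
        params device=/dev/vg1/locks directory=/var/lib/libvirt/lockd \
        fstype=ocfs2
    crm configure primitive p-libvirtd systemd:libvirtd
    crm configure clone cl-fs-locks fs-locks
    crm configure clone cl-libvirtd p-libvirtd
    crm configure order o-fs-then-libvirtd Mandatory: cl-fs-locks cl-libvirtd
    crm configure order o-libvirtd-then-vm Mandatory: cl-libvirtd vm1
)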
>> 
> 
> So do not use a filesystem provided by the same cluster. Use a separate
> filesystem mounted outside of the cluster, like a separate highly
> available NFS.
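
(Concretely, that suggestion would look something like the sketch below,
with a made-up NFS server and the lockspace path assumed: the lock
directory is mounted from /etc/fstab, i.e. outside cluster control, so it
is available before pacemaker starts.

    # Hypothetical server/export; _netdev defers the mount until the
    # network is up.
    echo 'nfs1:/export/virtlock /var/lib/libvirt/lockd nfs defaults,_netdev 0 0' \
        >> /etc/fstab
    mount /var/lib/libvirt/lockd
    # Locking must also be enabled in libvirt; for Xen via libxl this is,
    # to my knowledge, lock_manager = "lockd" in /etc/libvirt/libxl.conf.
)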

Hi!

Having a second cluster just to provide VM locking seems like massive overkill.
Actually I absolutely regret that I ever followed the advice to use libvirt
and VirtualDomain, as it seems to have no real benefit for Xen and PVMs.
As a matter of fact, after more than 10 years of using Xen PVMs in a cluster,
we will move to VMware, as SLES15 SP3 is the most unstable SLES I have ever
seen (I started with SLES 8).
SUSE support seems unable either to fix the memory corruption or to provide a
kernel that does not have it (it seems SP2 did not have it).

Regards,
Ulrich

