[ClusterLabs] Antw: [EXT] Re: Failed migration causing fencing loop

Andrei Borzenkov arvidjaar at gmail.com
Mon Apr 4 00:39:15 EDT 2022


On 31.03.2022 14:02, Ulrich Windl wrote:
>>>> "Gao,Yan" <ygao at suse.com> wrote on 31.03.2022 at 11:18 in message
> <67785c2f-f875-cb16-608b-77d63d9b02c4 at suse.com>:
>> On 2022/3/31 9:03, Ulrich Windl wrote:
>>> Hi!
>>>
>>> I just wanted to point out one thing that hit us with SLES15 SP3:
>>> A failed live VM migration that caused node fencing resulted in a fencing
>>> loop, for two reasons:
>>>
>>> 1) Pacemaker thinks that even _after_ fencing there is some migration to
>>> "clean up". Pacemaker treats the situation as if the VM were running on both
>>> nodes, thus (50% chance?) trying to stop the VM on the node that just booted
>>> after fencing. That's stupid but shouldn't be fatal IF there weren't...
>>>
>>> 2) The stop operation of the VM (which actually isn't running) fails,
>>
>> AFAICT it could not connect to the hypervisor, but the logic in the RA
>> is somewhat arguable: the probe (monitor) of the VM returned "not
>> running", yet the stop right after that returned a failure...
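
To make that pattern explicit, here is a minimal shell sketch (not the actual
VirtualDomain RA code; the "hypervisor"/"domain" resource parameter names are
only assumptions for illustration):

    #!/bin/sh
    # Sketch only: a probe that cannot reach the hypervisor reports
    # "not running", while a stop hitting the same connection failure
    # reports an error, which then escalates to fencing again.
    . "${OCF_FUNCTIONS_DIR:-/usr/lib/ocf/lib/heartbeat}/ocf-shellfuncs"

    URI="${OCF_RESKEY_hypervisor:-qemu:///system}"   # assumed parameter name
    DOM="${OCF_RESKEY_domain:-example-vm}"           # assumed parameter name

    sketch_monitor() {
        state=$(virsh -c "$URI" domstate "$DOM" 2>/dev/null) \
            || return $OCF_NOT_RUNNING        # cannot connect -> "not running"
        [ "$state" = "running" ] && return $OCF_SUCCESS
        return $OCF_NOT_RUNNING
    }

    sketch_stop() {
        virsh -c "$URI" domstate "$DOM" >/dev/null 2>&1 \
            || return $OCF_ERR_GENERIC        # the same failure -> stop "fails"
        virsh -c "$URI" destroy "$DOM" >/dev/null 2>&1
        return $OCF_SUCCESS
    }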
>>
>> OTOH, the point about pacemaker is that the stop of the resource on the
>> fenced and rejoined node is not really necessary. There have been
>> discussions about this here and we are trying to figure out a solution
>> for it:
>>
>> https://github.com/ClusterLabs/pacemaker/pull/2146#discussion_r828204919 
>>
>> For now it requires the administrator's intervention if the situation happens:
>> 1) Fix access to the hypervisor before the fenced node rejoins.
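
For example, that manual step could look like this on the fenced node,
assuming pacemaker is not enabled to start automatically at boot (the
commands and URI are just an illustration):

    # verify that the hypervisor connection works again ...
    virsh -c qemu:///system list --all
    # ... and only then let the node rejoin the cluster
    systemctl start pacemaker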
> 
> Thanks for the explanation!
> 
> Unfortunately this can be tricky if libvirtd is involved (as it is here):
> libvirtd uses locking (virtlockd), which in turn needs a cluster-wide filesystem for locks across the nodes.
> When that filesystem is provided by the cluster, it's hard to delay node joining until the filesystem, virtlockd and libvirtd are running.
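
For reference, that dependency typically comes from a lockd setup along
these lines (the lockspace path here is just an example):

    # /etc/libvirt/qemu.conf
    lock_manager = "lockd"

    # /etc/libvirt/qemu-lockd.conf
    # the lockspace directory has to be visible from all nodes
    file_lockspace_dir = "/shared/virtlockd"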
> 

So do not use a filesystem provided by the same cluster. Use a separate
filesystem mounted outside of the cluster, such as a separate highly available NFS.
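
A minimal sketch of that, reusing the example lockspace path from above and a
placeholder NFS server and export name:

    # /etc/fstab entry: mounted by the OS at boot, not managed as a cluster resource
    nfsserver:/export/virtlockd  /shared/virtlockd  nfs  defaults,_netdev  0  0

That way virtlockd and libvirtd can be started by systemd before pacemaker,
independent of any cluster-managed resources.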


