[ClusterLabs] Unstable SLES15 SP3 kernel
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Mon Apr 4 09:13:12 EDT 2022
>>> "Gao,Yan" <ygao at suse.com> schrieb am 04.04.2022 um 11:58 in Nachricht
<0d0f2b5e-3238-22df-4105-31e5a640d924 at suse.com>:
> On 2022/4/4 8:58, Ulrich Windl wrote:
>>>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 04.04.2022 at 06:39 in
>> message <e351f140-fe35-6b4d-16ce-008aee0d1679 at gmail.com>:
>>> On 31.03.2022 14:02, Ulrich Windl wrote:
>>>>>>> "Gao,Yan" <ygao at suse.com> schrieb am 31.03.2022 um 11:18 in Nachricht
>>>> <67785c2f‑f875‑cb16‑608b‑77d63d9b02c4 at suse.com>:
>>>>> On 2022/3/31 9:03, Ulrich Windl wrote:
>>>>>> Hi!
>>>>>>
>>>>>> I just wanted to point out one thing that hit us with SLES15 SP3:
>>>>>> Some failed live VM migration causing node fencing resulted in a
>>>>>> fencing loop, because of two reasons:
>>>>>>
>>>>>> 1) Pacemaker thinks that even _after_ fencing there is some migration
>>>>>> to "clean up". Pacemaker treats the situation as if the VM is running
>>>>>> on both nodes, thus (50% chance?) trying to stop the VM on the node
>>>>>> that just booted after fencing. That's stupid but shouldn't be fatal
>>>>>> IF there weren't...
>>>>>>
>>>>>> 2) The stop operation of the VM (that actually isn't running) fails,
>>>>>
>>>>> AFAICT it could not connect to the hypervisor, but the logic in the RA
>>>>> is kind of arguable: the probe (monitor) of the VM returned "not
>>>>> running", but the stop right after that returned failure...
>>>>>
>>>>> OTOH, the point about pacemaker is that the stop of the resource on the
>>>>> fenced and rejoined node is not really necessary. There have been
>>>>> discussions about this here and we are trying to figure out a solution
>>>>> for it:
>>>>>
>>>>> https://github.com/ClusterLabs/pacemaker/pull/2146#discussion_r828204919
>>>>>
>>>>> For now it requires an administrator's intervention if the situation happens:
>>>>> 1) Fix the access to the hypervisor before the fenced node rejoins.
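For illustration, such an intervention might boil down to something like the
following sketch on the rejoined node (the xen:///system connection URI and the
resource name are placeholders, not part of the procedure described above):
---
# Check that the hypervisor connection works again, then clear the failed
# stop so pacemaker re-probes the VM (resource name is an example):
virsh -c xen:///system version
crm resource cleanup vm-example
---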
>>>>
>>>> Thanks for the explanation!
>>>>
>>>> Unfortunately this can be tricky if libvirtd is involved (as it is here):
>>>> libvirtd uses locking (virtlockd), which in turn needs a cluster-wide
>>>> filesystem for locks across the nodes.
>>>> When that filesystem is provided by the cluster, it's hard to delay node
>>>> joining until the filesystem, virtlockd and libvirtd are running.
>>>>
>>>
>>> So do not use a filesystem provided by the same cluster. Use a separate
>>> filesystem mounted outside of the cluster, like a separate highly available NFS.
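A rough sketch of that setup, assuming the NFS server name, export path and
lock directory are placeholders and that the distribution's virtlockd and
libvirtd unit names apply (untested, just to show the idea):
---
# /etc/fstab entry: mount the virtlockd lock directory from an NFS server
# that does not depend on this cluster (server and options are examples):
#   nfs.example.com:/vmlocks  /var/lib/libvirt/lockd  nfs  defaults,_netdev  0 0

# Only start pacemaker after the remote filesystem, virtlockd and libvirtd,
# so a rejoining node can reach the hypervisor and the locks right away:
mkdir -p /etc/systemd/system/pacemaker.service.d
cat > /etc/systemd/system/pacemaker.service.d/after-libvirt.conf <<'EOF'
[Unit]
After=remote-fs.target virtlockd.service libvirtd.service
Wants=virtlockd.service libvirtd.service
EOF
systemctl daemon-reload
---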
>>
>> Hi!
>>
>> Having a second cluster just to provide VM locking seems like overkill.
>> Actually I absolutely regret that I ever followed the advice to use libvirt
>> and VirtualDomain, as it seems to have no real benefit for Xen and PVMs.
>> As a matter of fact, after more than 10 years of using Xen PVMs in a cluster
>> we will move to VMware, as SLES15 SP3 is the most unstable SLES I have ever
>> seen (I started with SLES 8).
>> SUSE support seems unable to either fix the memory corruption or to provide
>> a kernel that does not have it (it seems SP2 did not have it).
>
> Sounds like there's a certain kernel issue related to Xen? Probably ask
> SUSE support to raise the priority of the ticket?
Hi!
Actually it's sufficient to either use rear to create a recovery image or to
copy a large file from OCFS2 to trigger the bug.
Unfortunately support isn't really making progress, it seems (we have a PTF
kernel, but that isn't any better).
To prevent kernel panics and lots of failing VMs I'm running this script as a
cron job:
---
# cat /etc/crontabs/reboot-before-panic.sh
#!/usr/bin/sh
# Detect RAM corruption. If detected, log a message and reboot
# to prevent a kernel panic.
# cron jobs need a PATH
PATH=/sbin:/usr/sbin:/usr/bin:/bin
if journalctl -b -g 'Code: Bad RIP value|BUG: Bad rss-counter state mm:' >/dev/null
then
    MSG='RAM corruption detected, starting pro-active reboot'
    logger -t reboot-before-panic -p local0.notice "$MSG"
    shutdown -r +1 "$MSG"
fi
if journalctl -b -k | grep -q 'kernel: OCFS2: File system is now read-only\.'
then
    MSG='OCFS2 problem detected, stopping cluster node, then reboot'
    logger -t reboot-before-panic -p local0.notice "$MSG"
    crm cluster stop
    shutdown -r +1 "$MSG"
fi
---
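For completeness, the cron entry for such a script could look roughly like this
(the 5-minute interval and the /etc/cron.d location are assumptions, not what I
necessarily use):
---
# /etc/cron.d/reboot-before-panic  (interval and path are examples)
# Run the detection script every 5 minutes as root.
*/5 * * * * root /usr/bin/sh /etc/crontabs/reboot-before-panic.sh
---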
Regards,
Ulrich
>
> Regards,
> Yan
>
>
>>
>> Regards,
>> Ulrich