[ClusterLabs] Q: fence_kdump and fence_kdump_send
Roger Zhou
zzhou at suse.com
Fri Feb 25 04:11:43 EST 2022
On 2/24/22 20:21, Ulrich Windl wrote:
> Hi!
>
> After reading about fence_kdump and fence_kdump_send I wonder:
> Does anybody use that in production?
> Having the networking and bonding in initrd does not sound like a good idea to me.
I assume one motivation for fence_kdump is to reduce the dependency on the
shared disk, which is the fundamental infrastructure for SBD.
> Wouldn't it be easier to integrate that functionality into sbd?
sbd does support "crashdump", though you may want some further
improvements.
> I mean: Let sbd wait for a "kdump-ed" message that initrd could send when kdump is complete.
> Basically that would be the same mechanism, but using storage instead of networking.
>
> If I get it right, the original fence_kdump would also introduce an extra fencing delay, and I wonder what happens with a hardware watchdog while a kdump is in progress...
>
> The background of all this is that our nodes kernel-panic, and support says the kdumps are all incomplete.
> The events are most likely:
> node1: panics (kdump)
> other_node: sees that node1 has failed and fences it (via sbd).
>
> However, sbd fencing won't work while kdump is executing (IMHO).
>
Setting up both sbd and fence_kdump does not sound like good practice.
I understand the sbd watchdog is tricky in this combination.
> So what happens most likely is that the watchdog terminates the kdump.
> In that case all the mess with fence_kdump won't help, right?
>
The sbd crashdump functionality deals with the watchdog properly.
Here is a knowledge base article as well:
https://www.suse.com/support/kb/doc/?id=000019873
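For illustration, here is a minimal sketch of how the crashdump path can be enabled on the sbd side. The SBD_TIMEOUT_ACTION option and the "crashdump" message type are as I know them from the sbd documentation; the device path and node name are placeholders, not values from this thread, so please verify against the sbd man page of your version.

```shell
# /etc/sysconfig/sbd (fragment)
# When sbd hits its timeout action locally, flush and trigger a crash dump
# instead of the default "flush,reboot".
SBD_TIMEOUT_ACTION=flush,crashdump

# To request a crash dump of a peer by hand through the shared device
# (placeholder device path and node name):
# sbd -d /dev/disk/by-id/EXAMPLE-DEVICE message node1 crashdump
```

Either way, kdump itself has to be configured and loaded on the node, otherwise there is nothing to capture the dump.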
> Regards,
> Ulrich