[ClusterLabs] Q: fence_kdump and fence_kdump_send

Roger Zhou zzhou at suse.com
Fri Feb 25 04:11:43 EST 2022


On 2/24/22 20:21, Ulrich Windl wrote:
> Hi!
> 
> After reading about fence_kdump and fence_kdump_send I wonder:
> Does anybody use that in production?
> Having the networking and bonding in initrd does not sound like a good idea to me.

I assume one motivation for fence_kdump is to reduce the dependency on the 
shared disk, which is the fundamental infrastructure for SBD.

> Wouldn't it be easier to integrate that functionality into sbd?

sbd does support "crashdump", though you may want some further 
improvements.
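
For illustration only, a minimal sketch of what enabling this might look like, 
assuming a stock /etc/sysconfig/sbd. The device path and node name below are 
placeholders, not values from this thread, and timeouts are left untouched:

```shell
# /etc/sysconfig/sbd (fragment)
# Assumption: on a watchdog/watcher timeout, trigger a crashdump
# instead of the default reboot.
SBD_TIMEOUT_ACTION=flush,crashdump

# A crashdump can also be requested by hand through the node's
# message slot on the shared disk (placeholder device and node name):
#   sbd -d /dev/disk/by-id/EXAMPLE-sbd-device message node1 crashdump
```

Whether the dump completes still depends on how the watchdog timeout compares 
with the time kdump needs, so this should be tested on a non-production node.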

> I mean: Let sbd wait for a "kdump-ed" message that initrd could send when kdump is complete.
> Basically that would be the same mechanism, but using storage instead of networking.
> 
> If I get it right, the original fence_kdump would also introduce an extra fencing delay, and I wonder what happens with a hardware watchdog while a kdump is in progress...
> 
> The background of all this is that our nodes kernel-panic, and support says the kdumps are all incomplete.
> The events are most likely:
> node1: panics (kdump)
> other_node: sees that node1 has failed and fences it (via sbd).
> 
> However, sbd fencing won't work while kdump is executing (IMHO)
> 

Setting up both sbd and fence_kdump does not sound like good practice.
I understand the sbd watchdog is tricky in this combination.

> So what happens most likely is that the watchdog terminates the kdump.
> In that case all the mess with fence_kdump won't help, right?
> 

The sbd crashdump functionality deals with the watchdog properly.

Here is a knowledge base article as well:
https://www.suse.com/support/kb/doc/?id=000019873


> Regards,
> Ulrich


