[ClusterLabs] Antw: [EXT] Re: Q: fence_kdump and fence_kdump_send
Ulrich.Windl at rz.uni-regensburg.de
Fri Feb 25 03:07:18 EST 2022
>>> "Walker, Chris" <christopher.walker at hpe.com> wrote on 24.02.2022 at 17:26
in message
<PH0PR84MB1648635CAAF31E85D62595B2973D9 at PH0PR84MB1648.NAMPRD84.PROD.OUTLOOK.COM>:
> We use the fence_kdump* code extensively in production and have never had any
> problems with it (other than the normal initial configuration challenges).
> Kernel panic + kdump is our most common failure mode, so we exercise this
> code quite a bit.
Would you like to share your configuration, specifically the fencing
mechanisms and watchdogs you are using?
> From: Users <users-bounces at clusterlabs.org>
> Date: Thursday, February 24, 2022 at 7:22 AM
> To: users at clusterlabs.org <users at clusterlabs.org>
> Subject: [ClusterLabs] Q: fence_kdump and fence_kdump_send
> After reading about fence_kdump and fence_kdump_send I wonder:
> Does anybody use that in production?
> Having the networking and bonding in initrd does not sound like a good idea
> to me.
> Wouldn't it be easier to integrate that functionality into sbd?
> I mean: Let sbd wait for a "kdump-ed" message that initrd could send when
> kdump is complete.
> Basically that would be the same mechanism, but using storage instead of
> the network.
> If I get it right, the original fence_kdump would also introduce an extra
> fencing delay, and I wonder what happens with a hardware watchdog while a
> kdump is in progress...
> The background of all this is that our nodes kernel-panic, and support says
> the kdumps are all incomplete.
> The events are most likely:
> node1: panics (kdump)
> other_node: sees that node1 has failed and fences it (via sbd).
> However, sbd fencing won't work while kdump is executing (IMHO).
> So what happens most likely is that the watchdog terminates the kdump.
> In that case all the mess with fence_kdump won't help, right?
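
The sequence of events suspected in the quoted message can be written down as a small timing model. This is only a sketch under stated assumptions, not a description of how sbd, fence_kdump, or any particular watchdog driver actually behaves: the class `PanicScenario`, the function `kdump_outcome`, and all numbers are hypothetical, and the model assumes the hardware watchdog keeps counting after the panic unless something in the kdump environment feeds it.

```python
from dataclasses import dataclass


@dataclass
class PanicScenario:
    """Hypothetical timings for one node that panics and starts kdump."""
    kdump_duration_s: float    # assumed time kdump needs to write a full dump
    watchdog_timeout_s: float  # assumed hardware watchdog timeout
    # Assumption: by default nothing feeds the watchdog inside the kdump kernel.
    watchdog_fed_during_kdump: bool = False


def kdump_outcome(s: PanicScenario) -> str:
    """Return 'complete' if the dump finishes before the watchdog resets the node."""
    if s.watchdog_fed_during_kdump:
        # The watchdog never fires, so the dump can run to completion.
        return "complete"
    if s.kdump_duration_s < s.watchdog_timeout_s:
        # The dump finishes before the watchdog expires.
        return "complete"
    # The watchdog resets the node while the dump is still being written.
    return "incomplete"


if __name__ == "__main__":
    # Illustrative numbers only: a dump that takes minutes versus a
    # watchdog timeout of a few seconds.
    print(kdump_outcome(PanicScenario(kdump_duration_s=300, watchdog_timeout_s=15)))
```

In this model a dump only completes if it is shorter than the watchdog timeout or if the watchdog is handled during kdump; a fencing mechanism that merely waits for a "kdump finished" notification does not change that outcome, which is the concern raised in the question.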