[ClusterLabs] Q: callback hooks for sbd?

Klaus Wenninger kwenning at redhat.com
Fri Mar 12 03:08:18 EST 2021


On 3/11/21 7:03 PM, Strahil Nikolov wrote:
> What about creating a custom fencing agent that does what you need and always reports that it failed?
> Using fencing topology with a custom fencing script as the first stonith device (one that always fails), you will be able to execute some commands (or at least try to) before the next fencing mechanism kicks in.
This is exactly what I was suggesting with the pseudo-fence-agent
below. Whether you put it on a separate level and have it fail,
or on the same level and have it report success, shouldn't
really make a difference.
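
To make the idea concrete, here is a rough sketch of such a
pseudo-fence-agent, not a tested or complete agent. It assumes the
usual fence-agent convention of name=value options on stdin; the
hook path, the ssh transport, the timeouts and the option names
looked up for the target (nodename, port, plug) are all assumptions
for illustration. A real agent would also have to answer the
"metadata" action with proper XML, which is left out here.

```python
#!/usr/bin/env python3
"""Sketch of a pseudo fence agent: try a hook, then report failure."""
import subprocess
import sys

# Hypothetical command run on the node that is about to be fenced.
HOOK_TEMPLATE = ["ssh", "-o", "ConnectTimeout=5", "{target}",
                 "/usr/local/bin/pre-fence-hook"]
HOOK_TIMEOUT = 10  # seconds; assumed value, keep it short


def parse_options(lines):
    """Parse name=value lines as handed to fence agents on stdin."""
    opts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


def run_hook(target, runner=subprocess.run):
    """Best effort only: errors and timeouts are deliberately ignored."""
    cmd = [part.format(target=target) for part in HOOK_TEMPLATE]
    try:
        runner(cmd, timeout=HOOK_TIMEOUT, check=False)
    except (subprocess.TimeoutExpired, OSError):
        pass


def handle(opts, runner=subprocess.run):
    """Return the exit code for the requested action."""
    action = opts.get("action", "reboot")
    if action in ("monitor", "status"):
        return 0
    if action in ("reboot", "off"):
        # Assumed option names for the node to be fenced.
        target = opts.get("nodename") or opts.get("port") or opts.get("plug")
        if target:
            run_hook(target, runner)
        # Always fail, so the next topology level does the real fencing.
        return 1
    return 1  # anything else is unsupported in this sketch


def main():
    sys.exit(handle(parse_options(sys.stdin)))
```

The script entry point would simply call main(). Returning 1 for
reboot/off is the "separate level, always fail" variant; for the
"same level" variant it would return 0 instead and be listed before
the real device on that level.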

The drawback is that this requires that the fencing infrastructure
can be reached by the node that initiates fencing. If the
network link is down and we just have the shared disk to pass
the reboot request, or if watchdog fencing is kicking in (we do
detect that in the daemon as well, before the hardware watchdog
fires), this wouldn't work.

Btw., to ensure that the hook is invoked when we are running
into the watchdog case (provided that the daemon is still
alive, of course), a second hook at warning level could be
implemented.


Regards,
Klaus
>
>
> Best Regards,
> Strahil Nikolov
>
> On Thursday, 11 March 2021, 19:16:04 GMT+2, Klaus Wenninger <kwenning at redhat.com> wrote:
>
> On 3/11/21 12:30 PM, Ulrich Windl wrote:
>> Hi!
>>
>> I wonder: Is it possible to register some callback to sbd that is called whenever a fencing operation is to be executed?
>> I would like to run some command on the node that is going to be fenced.
> I don't know of anything existing that you could use for that purpose.
> Because of the basic principle, it is never going to be reliable
> that the command is executed anyway. When the watchdog takes
> down the node, you can't execute anything beforehand, of course.
> And we couldn't wait for your callback to return, as
> we don't have a reliable channel back to the fencing party
> and thus have to execute with a strict timeout.
>
> What you can of course always do is put some pseudo-fence-agent
> on a topology level with the poison-pill agent that first tries to
> contact the party to be fenced in order to execute your command.
>
> What I could imagine is triggering execution of one or more
> customizable commands in parallel with the sequence of reboot / sysrq,
> or as a replacement for that. We would still have the watchdog
> to take care of a strict timeout.
> I suppose the idea is to try to do something graceful before being
> taken down by the watchdog.
> Or were you thinking along those lines:
> https://bugzilla.redhat.com/show_bug.cgi?id=1869728
>
>
> Regards,
> Klaus
>> Regards,
>> Ulrich
>>
>>
>>


