[ClusterLabs] Fencing agent fence_xvm using multicast

Klaus Wenninger kwenning at redhat.com
Mon Jul 28 13:29:06 UTC 2025


On Mon, Jul 28, 2025 at 2:40 PM Jehan-Guillaume de Rorthais <jgdr at dalibo.com>
wrote:

> On Mon, 28 Jul 2025 14:10:04 +0200
> Klaus Wenninger <kwenning at redhat.com> wrote:
>
> > On Mon, Jul 28, 2025 at 12:56 PM Jehan-Guillaume de Rorthais <
> > jgdr at dalibo.com> wrote:
> >
> > > Hi Klaus,
> > >
> > > On Thu, 24 Jul 2025 17:56:31 +0200
> > > Klaus Wenninger <kwenning at redhat.com> wrote:
> > >
> > > > […]
> > > > If you have a single hypervisor that you have access to - some
> > > > sort of access at least - going with SBD will probably give you
> > > > more issues than it will help you.
> > > > […]
> > > > But again, I would encourage you to try something different
> > > > unless you need any of the points where SBD shines.
> > >
> > > I would be interested if you could elaborate a bit on that.
> > >
> > > Is it that an SBD architecture using only watchdog self-fencing is
> > > considered unstable or insecure? If so, how?
> >
> > No, given a reliable watchdog and properly configured timeouts, SBD
> > should be safe - both with and without shared disks.
>
> Ok, thanks.
>
> > The shared disks don't actually add additional safety, because SBD
> > has to rely on the watchdog taking the node down reliably anyway,
> > should it no longer be able to access the disk(s).
>
> My understanding is that SBD with a shared disk is interesting:
>
> * in a shared-disk cluster scenario
> * to get faster cluster reactions in some circumstances
>

"In some circumstances" is true ;-)
In general the fencing side will have to wait, because it might have to
fall back to the target being taken down by the watchdog, and that isn't
any faster than with watchdog-fencing. If the target is able to read the
poison pill it will probably reboot almost instantaneously, but the
fencing side still has to wait. Even the node coming back will probably
not speed things up, as the fencing will still be pending. But of course
the time in between can be used for the startup of the fenced node, and
it will be available to run services - if a reboot recovers it.
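
For illustration, a rough sketch of how the timeouts relate on the
fencing side in a poison-pill (disk-based) setup - the device path and
numbers below are placeholders, and the usual default of msgwait being
twice the watchdog timeout is assumed:

    # hypothetical values - adjust to your storage and environment
    sbd -d /dev/disk/by-id/my-sbd-disk -1 10 -4 20 create  # watchdog 10s, msgwait 20s
    sbd -d /dev/disk/by-id/my-sbd-disk dump                # verify what is on the device

    # The fencer has to wait out msgwait before it can report the fence
    # as successful, so stonith-timeout should cover msgwait plus a margin:
    pcs property set stonith-timeout=40s

So even if the pill is read and the node reboots right away, the fencing
side still sits out roughly the msgwait window before the fence counts
as done.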

Regards,
Klaus

>
> I should probably get back to the second point though, as I'm not really
> sure about it.
>
> > > And - apart from the single hv node case - what's wrong with SBD on
> > > "virtualized" raw shared storage? Any bad field experience?
> >
> > Nothing is basically wrong. Of course a reliable watchdog might be
> > an issue in virtual environments, and a fallback to softdog will never
> > give you the reliability of a piece of hardware ticking down
> > independently from the CPU and everything else.
>
> Check.
>
> > What I meant was that if you are running all your VMs on a single
> > hypervisor there is really no need to be able to cope with a
> > split-network scenario or anything like this. So why add something
> > additional that needs careful arrangement of timeouts, possibly
> > disk(s), ... if your hypervisor already offers an interface that
> > allows you to control a VM, gives you reliable feedback on its
> > status, and is probably roughly as available as the hypervisor
> > itself (see the fence_xvm sketch after the quoted text).
>
> Well, OK, that was my understanding as well. I was curious whether I was
> missing something else 😅
>
> Thank you for the details!
>
> Have a good day,
>
>
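
Since the thread subject mentions fence_xvm: as a rough, hedged example
of the kind of hypervisor-side fencing discussed in the quoted paragraph
above, a stonith setup on the guests could look roughly like this
(domain and node names are made up, and it assumes fence_virtd is
running on the hypervisor with the key distributed to the guests):

    # on the cluster nodes (guests) - hypothetical domain/node names
    pcs stonith create fence_node1 fence_xvm \
        port="vm-node1" pcmk_host_list="node1" \
        key_file=/etc/cluster/fence_xvm.key
    pcs stonith create fence_node2 fence_xvm \
        port="vm-node2" pcmk_host_list="node2" \
        key_file=/etc/cluster/fence_xvm.key

    # quick check that the multicast path to fence_virtd works:
    fence_xvm -o list -k /etc/cluster/fence_xvm.key

The agent gets the VM status back from the hypervisor, which is the
reliable feedback mentioned above.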