[ClusterLabs] Antw: Re: Antw: [EXT] Re: Stonith

Klaus Wenninger kwenning at redhat.com
Wed Dec 21 12:06:01 EST 2022


On Wed, Dec 21, 2022 at 4:51 PM Ken Gaillot <kgaillot at redhat.com> wrote:

> On Wed, 2022-12-21 at 10:45 +0100, Ulrich Windl wrote:
> > Ken Gaillot <kgaillot at redhat.com> wrote on 20.12.2022 at 16:21 in
> > message
> > <3a5960c2331f97496119720f6b5a760b3fe3bbcf.camel at redhat.com>:
> > > On Tue, 2022-12-20 at 11:33 +0300, Andrei Borzenkov wrote:
> > > > On Tue, Dec 20, 2022 at 10:07 AM Ulrich Windl
> > > > <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> > > > > > But keep in mind that if the whole site is down (or
> > > > > > inaccessible) you will not have access to IPMI/PDU/whatever
> > > > > > on this site, so your stonith agents will fail ...
> > > > >
> > > > > But, considering the design, such a site won't have quorum and
> > > > > should commit suicide, right?
> > > > >
> > > >
> > > > Not by default.
> > >
> > > And even if it does, the rest of the cluster can't assume that it did,
> > > so resources can't be recovered. It could work with sbd, but the poster
> > > said that the physical hosts aren't accessible.
> >
> > Why? Assuming fencing is configured, the nodes that are part of the
> > quorum should wait for the fencing delay and then assume that fencing
> > (or suicide) was done. Then they can manage resources. OK, a non-working
> > fencing or suicide mechanism is a different story...
> >
> > Regards,
> > Ulrich
>
> Right, that would be using watchdog-based SBD for self-fencing, but the
> poster can't use SBD in this case.
>

I read the original post as describing just a PoC setup.
Just as ssh-fencing can stand in for a real fencing device, one can
use softdog (or whatever watchdog device the virtual environment offers
that the kernel supports) with watchdog-fencing, at least for PoC
purposes.
I guess it depends on how the final setup is going to differ from the PoC
setup. Things like live-migration, pausing a machine, running on heavily
overcommitted hosts, snapshots, ... would be critical for this scenario,
so one could simply avoid them during PoC tests if they are not relevant
for the final production setup.
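
Before relying on watchdog-fencing in such a PoC, it can help to check
which watchdog devices the kernel actually exposes. What follows is only
a rough sketch, not part of sbd or pacemaker: it assumes the kernel was
built with watchdog sysfs support, so that /sys/class/watchdog/ exists,
and the function name list_watchdogs is made up for illustration.

```python
from pathlib import Path


def list_watchdogs(sysfs_root="/sys/class/watchdog"):
    """Return (device_name, identity) pairs for watchdogs listed in sysfs.

    Assumption: the kernel exposes watchdogs under sysfs_root; if it does
    not, an empty list is returned.
    """
    root = Path(sysfs_root)
    found = []
    if not root.is_dir():
        return found
    for entry in sorted(root.iterdir()):
        try:
            # The "identity" attribute holds the driver's description string.
            identity = (entry / "identity").read_text().strip()
        except OSError:
            identity = "unknown"
        found.append((entry.name, identity))
    return found


if __name__ == "__main__":
    devices = list_watchdogs()
    if not devices:
        print("no watchdog found in sysfs; softdog may need to be loaded")
    for name, identity in devices:
        print(f"/dev/{name}: {identity}")
```

If nothing is listed inside the VM, loading the softdog module (for
example with modprobe softdog) should normally make a watchdog device
appear, which watchdog-fencing can then be pointed at.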

Klaus


> --
> Ken Gaillot <kgaillot at redhat.com>