[ClusterLabs] Antw: [EXT] Stonith failing
Ken Gaillot
kgaillot at redhat.com
Mon Aug 17 11:19:45 EDT 2020
On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
> Thanks to all your suggestions, I now have the systems with stonith
> configured on ipmi.
A word of caution: if the IPMI is on-board -- i.e. it shares the same
power supply as the computer -- power becomes a single point of
failure. If the node loses power, the other node can't fence because
the IPMI is also down, and the cluster can't recover.
Some on-board IPMI controllers can also share an Ethernet port with the
main computer, which makes that network link a similar single point of
failure.
It's best to have a backup fencing method when using IPMI as the
primary fencing method. An example would be an intelligent power switch
or sbd.
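For example, a fencing topology can try the IPMI first and fall back to
the power switch. A sketch only, assuming the pcs shell -- the device
names fence-ipmi-node1 and fence-pdu-node1 are placeholders, and crmsh
has equivalent syntax:

    # Level 1: try the on-board IPMI device first
    pcs stonith level add 1 node1 fence-ipmi-node1
    # Level 2: if level 1 fails, fall back to the intelligent power switch
    pcs stonith level add 2 node1 fence-pdu-node1

Pacemaker only moves on to level 2 after the level 1 attempt has failed.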
> Two questions:
> - how can I simulate a stonith situation to check that everything is
> ok?
> - considering that each node has stonith configured against the
> other, once the two nodes can no longer communicate, how can I be
> sure they will not try to stonith each other?
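To test fencing, you can request it manually with stonith_admin, or
simulate a real failure by killing the cluster stack on one node and
watching the other node fence it. A sketch (node names are
placeholders):

    # From node1, ask the cluster to fence node2
    stonith_admin --reboot node2

    # Or, on node2's local console (not over ssh -- the session dies),
    # kill corosync so node1 sees a sudden failure and must fence node2
    pkill -9 corosync

Either way, verify in the logs that the fence action completed and that
resources recovered on the surviving node.

As for mutual fencing: when the nodes lose contact, both may try to
shoot at the same moment (a "fence race"). A common approach is a
static delay on the device that fences the node you want to survive, so
the two shots can't land simultaneously. Again a sketch with the same
placeholder device names, assuming pcs:

    # Fencing OF node1 waits 10s, so in a race node1's shot at node2
    # lands first and node1 survives
    pcs stonith update fence-ipmi-node1 pcmk_delay_base=10s
    pcs stonith update fence-ipmi-node2 pcmk_delay_base=0

pcmk_delay_max adds a random delay instead, which also breaks the tie.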
>
> :)
> Thanks!
> Gabriele
>
> Sonicle S.r.l. : http://www.sonicle.com
> Music: http://www.gabrielebulfon.com
> Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon
>
> From: Gabriele Bulfon <gbulfon at sonicle.com>
> To: Cluster Labs - All topics related to open-source clustering
> welcomed <users at clusterlabs.org>
> Date: 29 July 2020 14.22.42 CEST
> Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing
>
>
> >
> > It is a ZFS-based illumos system.
> > I don't think SBD is an option.
> > Is there a reliable ZFS-based stonith?
> >
> > Gabriele
> >
> > From: Andrei Borzenkov <arvidjaar at gmail.com>
> > To: Cluster Labs - All topics related to open-source clustering
> > welcomed <users at clusterlabs.org>
> > Date: 29 July 2020 9.46.09 CEST
> > Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing
> >
> > >
> > > On Wed, Jul 29, 2020 at 9:01 AM Gabriele Bulfon <
> > > gbulfon at sonicle.com> wrote:
> > > > That one was taken from a specific implementation on Solaris
> > > > 11.
> > > > The situation is a dual-node server with a shared storage
> > > > controller: both nodes see the same disks concurrently.
> > > > Here we must be sure that the two nodes never import/mount the
> > > > same zpool at the same time, or we will encounter data
> > > > corruption:
> > > >
> > >
> > > ssh-based "stonith" cannot guarantee it.
> > >
> > > > node 1 will be preferred for pool 1, and node 2 for pool 2;
> > > > only when one of the nodes goes down or is taken offline should
> > > > the resources first be released by the leaving node and then
> > > > taken over by the other node.
> > > >
> > > > Would you suggest one of the available stonith agents in this case?
> > > >
> > >
> > > IPMI, managed PDU, SBD ...
> > > In practice, the only stonith method that still works when a node
> > > suffers a complete outage, including loss of its power supply, is
> > > SBD.
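(For completeness, since sbd keeps coming up: on platforms that support
it, the disk-based setup is roughly the following -- a sketch, where
/dev/sdX stands for a small LUN visible to both nodes:

    # Initialize the shared disk as an sbd message area (once, either node)
    sbd -d /dev/sdX create
    # Confirm each node can read the slots
    sbd -d /dev/sdX list
    # Send a test message to a node's slot
    sbd -d /dev/sdX message node2 test

plus the sbd daemon started alongside the cluster and a fence_sbd
stonith resource on top. As noted earlier in the thread, though, sbd
doesn't appear to be an option on illumos.)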
--
Ken Gaillot <kgaillot at redhat.com>