[ClusterLabs] Antw: [EXT] Correctly stop pacemaker on 2-node cluster with SBD and failed devices?

Andrei Borzenkov arvidjaar at gmail.com
Wed Jun 16 03:03:27 EDT 2021


On Wed, Jun 16, 2021 at 9:05 AM Ulrich Windl
<Ulrich.Windl at rz.uni-regensburg.de> wrote:
>
> >>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 15.06.2021 at 17:20 in
> message
> <CAA91j0XaGFRrYvum=Do3qoPFe5YUj9s_4VoEHcAH72QAHyGBew at mail.gmail.com>:
> > We had the following situation:
> >
> > A 2-node cluster with a single SBD device (only a single external
> > storage was available). The storage failed, so SBD lost access to
> > the device. The cluster was still up and both nodes were running.
>
> Shouldn't sbd fence then (after some delay)?
>

No. That is what pacemaker integration is for.
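
The behaviour described in this thread can be modelled in a few lines. This is only a simplified sketch of the decision sbd makes when pacemaker integration is enabled, not its actual implementation. The function name and parameters are hypothetical, and the real daemon applies further checks (for example corosync's own node count):

```python
def should_feed_watchdog(devices_ok: int, devices_total: int,
                         pacemaker_healthy: bool) -> bool:
    """Simplified model: return True if the node stays up.

    Assumption: with pacemaker integration, losing the device majority
    is tolerated as long as pacemaker reports the node as quorate and
    healthy. All names here are illustrative, not taken from sbd itself.
    """
    # A majority of the configured devices must be reachable.
    device_majority = devices_total > 0 and devices_ok * 2 > devices_total
    # Either condition alone is enough to keep feeding the watchdog.
    return device_majority or pacemaker_healthy


# Single device lost, pacemaker still running and quorate: no self-fence.
print(should_feed_watchdog(0, 1, pacemaker_healthy=True))
# Same device state after pacemaker is stopped: the node self-fences.
print(should_feed_watchdog(0, 1, pacemaker_healthy=False))
```

In this model the second case corresponds to the reboot reported in the original message: with the device unusable, pacemaker was the only thing keeping the watchdog fed.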

> >
> > We thought that access to the storage had been restored, but one
> > step was missing, so the devices appeared empty.
> >
> > At this point I tried to restart pacemaker. But as soon as I
> > stopped pacemaker, SBD rebooted the nodes, which is logical, as
> > quorum was now lost.
> >
> > How can pacemaker be stopped cleanly in this case while keeping the nodes up?
>
> Unconfigure the sbd devices, I guess.
>

Do you have *practical* suggestions on how to do that online in a
running pacemaker cluster? Can you explain how it would help, given
that the lack of an sbd device was not the problem in the first place?

