[ClusterLabs] Antw: Re: Antw: [EXT] Cluster breaks after pcs unstandby node

Andrei Borzenkov arvidjaar at gmail.com
Mon Jan 18 04:59:02 EST 2021


On Mon, Jan 18, 2021 at 12:00 PM Steffen Vinther Sørensen
<svinther at gmail.com> wrote:
>
> Hi,
>
> I have a persistent journal, but 'journalctl -b -1' was empty in this
> case, so it might not be optimally configured. Centralized logging
> is on the todo list.
>
>
> By the way, about the fencing: I have set 'HandlePowerKey=ignore' in
> /etc/systemd/logind.conf
> (for this hardware I can find no BIOS setting for how to react to the
> power key being pressed, so it cannot be set to instant-off).
>
> Now when a node is fenced it goes down more quickly, and its only
> journal output is:
> Jan 18 09:33:19 kvm03-node03 systemd-logind[4354]: Power key pressed.
> Jan 18 09:33:24 kvm03-node03 systemd-logind[4354]: Power key pressed.
>

By default, pressing the power key initiates a graceful operating system
shutdown. That is not what is usually expected of fencing: fencing
should turn the system off as quickly as possible, and with a graceful
shutdown you have no way to know when it has completed. You would need
to check your hardware documentation for how to force a hardware power
off/reset instead of a soft shutdown.
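
For reference, the logind setting you describe would look roughly like
the fragment below. This is only a sketch of the option named in your
mail, assuming the stock file layout. It stops the operating system from
starting a graceful shutdown on a power key event; it does not by itself
make the hardware power off immediately.

```ini
# /etc/systemd/logind.conf
# Sketch only: the option and value are the ones quoted in this thread.
[Login]
# Tell systemd-logind to take no action when the power key is pressed.
HandlePowerKey=ignore
```

The change takes effect once systemd-logind has been restarted or the
node has been rebooted.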

> So it seems it needs to be pressed twice with a 5 sec delay, and
> looking at the hardware console, the system does not reboot before
> about 09:33:27 (8 secs in total)
>
> When the node is back online, 'journalctl -b -1' only reports the first
> Jan 18 09:33:19 kvm03-node03 systemd-logind[4354]: Power key pressed.
>
> The second line was never written to persistent journal
>
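
A likely explanation, though only a guess from here, is that journald
does not sync every message to disk immediately: by default it syncs
every 5 minutes, and right away only for messages of priority CRIT or
higher. A sketch of the relevant settings follows; the 5s value is an
illustrative assumption, not a recommendation, and a hard power off can
still lose the last few messages.

```ini
# /etc/systemd/journald.conf
# Sketch only: SyncIntervalSec value below is an assumed example.
[Journal]
# Keep the journal on disk under /var/log/journal across reboots.
Storage=persistent
# Flush journal files to disk more often than the 5 minute default.
SyncIntervalSec=5s
```

After restarting systemd-journald, 'journalctl --list-boots' shows which
boots the persistent journal actually contains.
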
>
>
> On Mon, Jan 18, 2021 at 8:49 AM Ulrich Windl
> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >
> > >>> Steffen Vinther Sørensen <svinther at gmail.com> wrote on 16.01.2021 at
> > 19:28 in message
> > <CALhdMBho79Kd7XjV2BvD+-J5i+94vKejnJYB5UEjG=w_hG1Scg at mail.gmail.com>:
> > > Hi and thank you for the insights
> >
> > Hi!
> > ...
> >
> > > I just did a test after the latest adjustments with colocations etc.
> > > trying to standby node02, ends up with node02 being fenced before
> > > migrations complete. Unfortunately, logs from node02 were lost.
> >
> > Don't you have a persistent journal on node2? Maybe it's a good idea to make
> > all nodes log to an external syslog server, at least until your problems are
> > fixed. That would also have the benefit that you get better global insight into
> > the sequence of events...
> >
> > ...
> >
> > Regards,
> > Ulrich
> >
> > _______________________________________________
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> >
> > ClusterLabs home: https://www.clusterlabs.org/

