[ClusterLabs] Antw: [EXT] Coming in Pacemaker 2.1.2: better display of internal failures
Ken Gaillot
kgaillot at redhat.com
Wed Oct 20 09:52:23 EDT 2021
On Wed, 2021-10-20 at 09:35 +0200, Ulrich Windl wrote:
> >>> Ken Gaillot <kgaillot at redhat.com> wrote on 19.10.2021 at 19:16 in
> message <edb5ca2fd29f3f514679c39a71513fa5a4b2f88d.camel at redhat.com>:
> > Hi all,
> >
> > I hope to get the first release candidate for Pacemaker 2.1.2 out in a
> > couple of weeks.
> >
> > One improvement will be in status displays (crm_mon, and the
> > crm_resource --force-* options) for failed actions.
> >
> > OCF resource agents already have the ability to output an "exit reason"
> > for failures. These are displayed in the status, to give more detailed
> > information than just "error".
> >
> > Now, Pacemaker will set exit reasons for internal failures as well.
> > This includes problems such as an agent or systemd unit not being
> > installed, timeouts in Pacemaker communication as opposed to the agent
> > itself, an agent process being killed by a signal, etc.
> >
> > As an example, sending a kill -9 to a running agent monitor would
> > previously result in status with no explanation, requiring some log
> > diving to figure it out:
> >
> > * rsc1_monitor_60000 on node1 'error' (1): call=188, status='Error',
> > exitreason='', last-rc-change='Fri Sep 24 14:45:02 2021', queued=0ms,
> > exec=0ms
> >
> > Now, the exit reason will plainly say what happened:
> >
> > * rsc1_monitor_60000 on node1 'error' (1): call=188, status='Error',
> > exitreason='Process interrupted by signal', last-rc-change='Fri Sep 24
> > 14:45:02 2021', queued=0ms, exec=0ms
>
> Oops: When you detected that a process was terminated by a signal you would
> also know _which_ signal; why not log it then?
> And: Do you also detect and log when a core-dump was created?
>
> That would just sound logical to me.
>
> Regards,
> Ulrich
Yes, the log messages do have more detail -- the crm_mon display has to
be more concise, but it should at least give a strong pointer to what
to look for in the logs or elsewhere.
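
For anyone curious how the signal number and core-dump flag are available
to a parent process in general, here is a minimal Python sketch of the POSIX
wait-status mechanism. It is not Pacemaker's code; the function names
describe_exit() and kill_and_describe() and the message wording are
hypothetical, chosen only for illustration, and it assumes a POSIX system.

```python
import os
import signal


def describe_exit(status: int) -> str:
    """Turn a raw wait status (as returned by os.waitpid) into a short reason."""
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        name = signal.Signals(sig).name
        # WCOREDUMP is only meaningful when the process was killed by a signal.
        core = " (core dumped)" if os.WCOREDUMP(status) else ""
        return f"Process interrupted by signal {sig} ({name}){core}"
    if os.WIFEXITED(status):
        return f"Process exited with status {os.WEXITSTATUS(status)}"
    return "Process state unknown"


def kill_and_describe() -> str:
    """Fork a child that just waits (a stand-in for an agent monitor),
    send it SIGKILL, and describe how it ended."""
    pid = os.fork()
    if pid == 0:
        # Child: block until a signal arrives, then leave without cleanup.
        signal.pause()
        os._exit(0)
    # Parent: kill the child and collect its raw wait status.
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return describe_exit(status)


if __name__ == "__main__":
    print(kill_and_describe())
```

The same wait status carries the exit code, the terminating signal, and the
core-dump flag, so a concise display string and a more detailed log message
can both be derived from it.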
--
Ken Gaillot <kgaillot at redhat.com>