[ClusterLabs] Possible idea for 2.0.0: renaming the Pacemaker daemons

Tue Apr 3 08:18:32 EDT 2018

On Tue, 3 Apr 2018 09:58:50 +1000
Andrew Beekhof <abeekhof at redhat.com> wrote:
> On Fri, Mar 30, 2018 at 8:36 PM, Jehan-Guillaume de Rorthais <
> jgdr at dalibo.com> wrote:  
> > On Thu, 29 Mar 2018 09:32:41 +1100
> > Andrew Beekhof <abeekhof at redhat.com> wrote:  
> > > On Thu, Mar 29, 2018 at 8:07 AM, Jehan-Guillaume de Rorthais <  
> > > jgdr at dalibo.com> wrote:  
[...]
> > > Though by now there is surely a decent library for logging to files with
> > > sub-second timestamps - if we could incorporate that into libqb and have
> > > corosync use it too...  
> >
> > In my opinion, this is neither the role of libqb  
> 
> 
> libqb has the logging library that pacemaker and corosync use.
> it is absolutely where this change should happen

I meant that this could be handled entirely by some external dedicated daemon,
e.g. syslog or journald.

I was thinking about code simplification.

[...]

> > > then we could consider 1 log per daemon.
> > > In which case, the outcome of the PREFIX-SUFFIX discussion above could
> > > instead be used for /var/log/pacemaker/SUFFIX  
> >
> > I think the best would be to have one log for Corosync and one log for
> > Pacemaker (and all its sub-processes/children).
> >
> > Another good path toward an understandable log file would be to hide which
> > process is speaking. Experienced users will still know that "LOG: setting
> > failcount to 3" comes from CRMd and "DEBUG1: failcount setted to 3" comes
> > from attrd.
> >
> > However, this would probably be a mess...because again, the cause might be
> > logged AFTER the effects/reaction :/
> 
> why?  i've never seen that be the case

Please find attached a demonstration of such behavior, which I found last week.
Note that this comes from a SLES 12 SP1 system running Pacemaker 1.1.13. People
there were not able to upgrade the servers before we built the PoC together.

The first column is the order in the log file. The second column is the order
in which I would expect the messages to appear in the log.

E.g. I would expect L.11

  "pengine: notice: process_pe_message: Calculated Transition 29: [...]"

before CRMd begins to process it at L.6-10.

Another example: I would expect LRMd L.35

  "lrmd:  notice: log_finished:  finished - rsc:pgsqld action:notify"

before CRMd receives the result at L.26...

Maybe this is something that was fixed in 1.1.18 or 2.0.0; I just couldn't find
any related commit messages when searching through them quickly.
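
To illustrate what I would hope for: if every daemon stamped its lines with a
sub-second timestamp at the moment the message is emitted, a mixed file could
be re-sorted into chronological order afterwards. Here is a minimal Python
sketch under that assumption. The timestamps and the crmd placeholder text are
made up for the example and are not taken from the attached log; merge_logs is
a hypothetical helper, not something Pacemaker or libqb provides.

```python
import heapq
from datetime import datetime


def parse(line):
    """Split 'TIMESTAMP rest-of-line' into (datetime, rest-of-line)."""
    stamp, _, rest = line.partition(" ")
    return datetime.fromisoformat(stamp), rest


def merge_logs(*streams):
    """Merge per-daemon line lists into one chronological list.

    Each input list is assumed to be in time order already, so a k-way
    merge on the parsed timestamp is enough.
    """
    parsed = ([parse(line) for line in stream] for stream in streams)
    merged = heapq.merge(*parsed, key=lambda item: item[0])
    return [rest for _, rest in merged]


# Hypothetical input: timestamps are invented, the crmd text is a placeholder.
PENGINE_LINES = [
    "2018-01-01T00:00:00.100000 pengine: Calculated Transition 29",
]
CRMD_LINES = [
    "2018-01-01T00:00:00.250000 crmd: <placeholder: processes transition 29>",
]

MERGED = merge_logs(CRMD_LINES, PENGINE_LINES)
```

This only helps if the timestamp is taken when the message is produced, not
when it is finally written to the file, and if all daemons share the same clock.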

> > Maybe the solution is to log only messages from CRMD, where all the
> > orchestration comes from. Everything else might go to some debug level if
> > needed.
> 
> sorry, that is a terrible idea

I was just throwing out random ideas, as I'm not familiar with the internal
architecture. Maybe pacemakerd should gather messages from all the other
processes and write them to stderr so they are captured by journald or
redirected to a file...
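
A rough sketch of that last idea, in Python only because it is short to write.
Nothing here reflects how pacemakerd actually works; gather and relay are
hypothetical names, and the child commands are whatever the caller passes in.

```python
import subprocess
import sys
import threading
from datetime import datetime


def relay(name, stream, out, lock):
    """Copy each line read from one child's pipe to `out`, tagged with its name."""
    for line in stream:
        # Timestamp taken when the parent reads the line, not when the child wrote it.
        stamp = datetime.now().isoformat(timespec="microseconds")
        with lock:
            out.write(f"{stamp} {name}: {line.rstrip()}\n")
            out.flush()


def gather(commands, out=sys.stderr):
    """Start every command and funnel its stdout/stderr into a single stream.

    `commands` maps a label to an argv list (both chosen by the caller).
    Returns the children's exit codes in the order they were started.
    """
    lock = threading.Lock()
    procs, threads = [], []
    for name, argv in commands.items():
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        thread = threading.Thread(target=relay,
                                  args=(name, proc.stdout, out, lock))
        thread.start()
        procs.append(proc)
        threads.append(thread)
    for thread in threads:
        thread.join()
    return [proc.wait() for proc in procs]
```

Note that the parent stamps lines in the order it reads them from the pipes, so
this alone would not guarantee that a cause is logged before its effect.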

Regards,
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mixed_messages.log
Type: text/x-log
Size: 5457 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20180403/4f45f368/attachment-0002.bin>