[ClusterLabs] Antw: Re: Pacemaker 2.0.3-rc3 now available

Jehan-Guillaume de Rorthais jgdr at dalibo.com
Mon Nov 18 12:13:57 EST 2019


On Mon, 18 Nov 2019 10:45:25 -0600
Ken Gaillot <kgaillot at redhat.com> wrote:

> On Fri, 2019-11-15 at 14:35 +0100, Jehan-Guillaume de Rorthais wrote:
> > On Thu, 14 Nov 2019 11:09:57 -0600
> > Ken Gaillot <kgaillot at redhat.com> wrote:
> >   
> > > On Thu, 2019-11-14 at 15:22 +0100, Ulrich Windl wrote:  
> > > > > > > Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote on
> > > > > > > 14.11.2019 at 15:17 in message <20191114151719.6cbf4e38 at firost>:
> > > > > On Wed, 13 Nov 2019 17:30:31 -0600
> > > > > Ken Gaillot <kgaillot at redhat.com> wrote:
> > > > > ...    
> > > > > > A longstanding pain point in the logs has been improved.
> > > > > > Whenever the scheduler processes resource history, it logs a
> > > > > > warning for any failures it finds, regardless of whether they are
> > > > > > new or old, which can confuse anyone reading the logs. Now, the
> > > > > > log will contain the time of the failure, so it's obvious whether
> > > > > > you're seeing the same event or not. The log will also contain
> > > > > > the exit reason if one was provided by the resource agent, for
> > > > > > easier troubleshooting.
> > > > > 
> > > > > I've been hurt by this in the past, and I was wondering: what is
> > > > > the point of warning again and again in the logs about past
> > > > > failures during scheduling? What does this information bring to the
> > > > > administrator?
> > > 
> > > The controller will log an event just once, when it happens.
> > > 
> > > The scheduler, on the other hand, uses the entire recorded resource
> > > history to determine the current resource state. Old failures (that
> > > haven't been cleaned) must be taken into account.  
> > 
> > OK, I wasn't aware of this. If you have a few minutes, I would be
> > interested to know why the full history is needed rather than just the
> > latest entry. Or maybe there are some comments in the source code that
> > already cover this question?
> 
> The full *recorded* history consists of the most recent operation that
> affects the state (like start/stop/promote/demote), the most recent
> failed operation, and the most recent results of any recurring
> monitors.
> 
> For example there may be a failed monitor, but whether the resource is
> considered failed or not would depend on whether there was a more
> recent successful stop or start. Even if the failed monitor has been
> superseded, it needs to stay in the history for display purposes until
> the user has cleaned it up.

OK, understood.
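
If I read you correctly, that recorded history is what shows up as the
lrm_rsc_op entries in the CIB status section? Something like this, I guess
(heavily trimmed, ids and values invented):

  # cibadmin --query --xpath "//lrm_resource[@id='my-rsc']"
  <lrm_resource id="my-rsc" class="ocf" ...>
    <lrm_rsc_op id="my-rsc_last_0" operation="start" rc-code="0" .../>
    <lrm_rsc_op id="my-rsc_last_failure_0" operation="monitor" rc-code="7"
                exit-reason="No process state file found" .../>
    <lrm_rsc_op id="my-rsc_monitor_10000" operation="monitor" rc-code="0" .../>
  </lrm_resource>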

Maybe that's why "FAILED" appears briefly in crm_mon during a resource move on
a clean resource that has past failures? Maybe I should dig into this weird
behavior and write up a bug report if I confirm it?
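
Those stale entries can of course be flushed explicitly, something along these
lines (resource and node names made up):

  # clear the recorded failure history for one resource on one node
  crm_resource --cleanup --resource my-rsc --node node1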

> > > Every run of the scheduler is completely independent, so it doesn't
> > > know about any earlier runs or what they logged. Think of it like
> > > Frosty the Snowman saying "Happy Birthday!" every time his hat is put
> > > on.
> > 
> > I don't have this ref :)  
> 
> I figured not everybody would, but it was too fun to pass up :)
> 
> The snowman comes to life every time his magic hat is put on, but to
> him each time feels like he's being born for the first time, so he says
> "Happy Birthday!"
> 
> https://www.youtube.com/watch?v=1PbWTEYoN8o

heh :)

> > > As far as each run is concerned, it is the first time it's seen the
> > > history. This is what allows the DC role to move from node to node, and
> > > the scheduler to be run as a simulation using a saved CIB file.
> > > 
> > > We could change the wording further if necessary. The previous version
> > > would log something like:
> > > 
> > > warning: Processing failed monitor of my-rsc on node1: not running
> > > 
> > > and this latest change will log it like:
> > > 
> > > warning: Unexpected result (not running: No process state file found)
> > > was recorded for monitor of my-rsc on node1 at Nov 12 19:19:02 2019
> > 
> > /result/state/ ?  
> 
> It's the result of a resource agent action, so it could be for example
> a timeout or a permissions issue.

ok
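
For the record, if I'm not mistaken, the agent side of that exit reason is the
ocf_exit_reason helper shipped with resource-agents. A rough sketch, with
PIDFILE being a made-up variable:

  # inside an OCF agent's monitor action, after sourcing ocf-shellfuncs
  if [ ! -f "$PIDFILE" ]; then
      ocf_exit_reason "No process state file found"
      exit $OCF_NOT_RUNNING
  fi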

> > > I wanted to be explicit about the message being about processing
> > > resource history that may or may not be the first time it's been
> > > processed and logged, but everything I came up with seemed too long for
> > > a log line. Another possibility might be something like:
> > > 
> > > warning: Using my-rsc history to determine its current state on node1:
> > > Unexpected result (not running: No process state file found) was
> > > recorded for monitor at Nov 12 19:19:02 2019
> > 
> > I like the first one better.
> > 
> > However, it feels like implementation details exposed to the world,
> > doesn't it? How useful is this information for the end user? What can the
> > user do with it? There's nothing to fix, and this is not actually an
> > error in the currently running process.
> > 
> > I still fail to understand why the scheduler doesn't process the history
> > silently, whatever it finds there, and only warn about something really
> > important if the final result is not the expected one...
> 
> From the scheduler's point of view, it's all relevant information that
> goes into the decision making. Even an old failure can cause new
> actions, for example if quorum was not held at the time but has now
> been reached, or if there is a failure-timeout that just expired. So
> any failure history is important to understanding whatever the
> scheduler says needs to be done.
> 
> Also, the scheduler is run on the DC, which is not necessarily the node
> that executed the action. So it's useful for troubleshooting to present
> a picture of the whole cluster on the DC, rather than just what's the
> situation on the local node.

OK, kind of got it: the scheduler needs to summarize the chain of events to
determine the state of a resource based on the last event.
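
That also explains why an old failure can suddenly matter again, e.g. when its
failure-timeout expires. For the archives, that meta attribute can be set with
something like this (resource name and value are just an illustration):

  # let recorded failures of my-rsc expire after 10 minutes
  crm_resource --resource my-rsc --meta \
               --set-parameter failure-timeout --parameter-value 10min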

> I could see an argument for lowering it from warning to notice, but
> it's a balance between what's most useful during normal operation and
> what's most useful during troubleshooting.

So, in my humble opinion, the messages should definitely be at notice level.
Maybe they should even go to debug level. I have never had to troubleshoot a
bad decision from the scheduler caused by a bad state summary.
Moreover, if needed, the admin can still study the history from the CIB backed
up on disk, can't they?
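
For instance, by replaying one of the scheduler inputs kept on disk (file name
made up; /var/lib/pacemaker/pengine is the default location on my systems):

  # replay a saved scheduler input and show the transition it computes
  crm_simulate --simulate --xml-file=/var/lib/pacemaker/pengine/pe-input-42.bz2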

An alternative would be to spit out the event chain in detail only when the
result of the summary differs from what the scheduler was expecting?

