[ClusterLabs] low-cost ways to make Pacemaker more usable?
Ken Gaillot
kgaillot at redhat.com
Thu Dec 7 12:41:25 EST 2017

On Thu, 2017-12-07 at 17:15 +0000, Adam Spiers wrote:
> Ken Gaillot <kgaillot at redhat.com> wrote:
> > On Thu, 2017-12-07 at 12:13 +0000, Adam Spiers wrote:
> > > https://gocardless.com/blog/incident-review-api-and-dashboard-outage-on-10th-october/
> > >
> > > It's a great write-up, although a little frustrating that it is still
> > > not fully understood why a -inf colocation failed whereas a +inf
> > > succeeded. (I actually have a vague memory of discovering something
> > > very similar a while back, but I can't find the details.)
> >
> > That is an excellent post. I'll contact them directly to discuss it
> > further.
>
> Cool, thanks!
>
> > > IMHO this serves as a good example of the difficulty Pacemaker faces,
> > > and consequently as valuable feedback for how Pacemaker needs to
> > > improve: it's all too easy to do one tiny misconfiguration which can
> > > potentially bring the whole house of cards tumbling down, and it's
> > > often really hard to understand what went wrong.
> > >
> > > So FWIW, my personal view is that more than anything else right now,
> > > Pacemaker needs to be made easier to understand. I know this is a
> >
> > Agreed, but there are about a dozen things that are more important than
> > anything else right now ;)
>
> Heheh yeah, I can relate to that feeling ;-)
>
> > Personally, my current focus is technical debt: stripping out all the
> > legacy features that were deprecated in 1.1.18, so we can release 2.0.0
> > with a smaller code base that is easier to maintain going forward. The
> > hope is that this pays off in greater time savings down the road, but
> > it sucks up a lot of time in the near term.
> >
> > There are a large number of outstanding bug reports that bother me,
> > several of them quite serious, and I would like to spend more time on
> > those before new features, but ...
> >
> > There is constant demand for new features from paying customers, and we
> > can't stay relevant without trying to keep up at least to an extent.
> > Several recent projects (bundles, alerts, versioned attributes) could
> > really benefit from some follow-up work, and more major projects are
> > right on the horizon (failure handling configuration overhaul, crm_mon
> > overhaul, containerization of pacemaker/corosync, corosync 3/knet
> > compatibility).
> >
> > And of course usability is, indeed, an incredibly important area to be
> > addressed, spanning log messages, documentation, and tooling.
>
> Yep, totally understood.
>
> > Which is to say, volunteers welcome :-)
>
> ... which is the cue for everyone to run away, leaving tumbleweed
> silence ;-)
>
> Seriously though, I acknowledge the lack of resources, so maybe just
> aim for a few small steps forward here and there?
>
> For example, making a few of the most crucial existing log messages
> less cryptic could maybe go a long way. Or if "dumbing down" log
> messages would make life harder for developers who are familiar with
> Pacemaker internals and need to be able to track all the gory details,
> recognise the fact that the kind of logs which developers and users
> need to read are vastly different, and consequently provide a way of
> distinguishing between the two kinds. Making all developer logs DEBUG
> level and non-developer other levels might be one way, but there are
> probably better approaches (e.g. tag all developer logs with a certain
> string which can be filtered out).

You're late to the party on this one :)

We do try to keep all messages of interest to novice users at the
critical-to-notice levels (which go to syslog by default), messages of
interest to more advanced users at the info level and to developers at
the debug-to-trace levels (which go to pacemaker.log by default).

There was a big push a few releases back to improve the wording of the
most user-visible log messages. You should have seen them before. ;)

In a 2015 release (libqb + pacemaker), we added support for a single
message to go into both syslog and pacemaker.log with different levels
of detail. The syslog message has plain English for users, and the
pacemaker.log message has added debugging information tacked onto the
end. For an example, see the pacemakerd "Starting Pacemaker" message in
each log.
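
To make the idea concrete, here is a minimal sketch of one message going
to two logs at different levels of detail. It uses Python's standard
logging module purely for illustration; it is not how libqb or Pacemaker
implement this. The in-memory streams stand in for syslog and
pacemaker.log, and the names NOTICE, DetailFormatter, build_logger and
the "detail" field are all hypothetical.

```python
import io
import logging

# Hypothetical "notice" level between INFO (20) and WARNING (30);
# Python's logging module has no built-in equivalent of syslog's notice.
NOTICE = 25
logging.addLevelName(NOTICE, "NOTICE")


class DetailFormatter(logging.Formatter):
    """Append the optional 'detail' field of a record to its message."""

    def format(self, record):
        text = super().format(record)
        detail = getattr(record, "detail", None)
        return f"{text} ({detail})" if detail else text


def build_logger(user_stream, detail_stream, name="demo.cluster"):
    """Return a logger writing a plain and a detailed copy of each message."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)
    logger.propagate = False

    # Stand-in for syslog: notice and above, plain wording only.
    user = logging.StreamHandler(user_stream)
    user.setLevel(NOTICE)
    user.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

    # Stand-in for the detail log: everything, with extra detail appended.
    detail = logging.StreamHandler(detail_stream)
    detail.setLevel(logging.DEBUG)
    detail.setFormatter(DetailFormatter("%(levelname)s: %(message)s"))

    logger.addHandler(user)
    logger.addHandler(detail)
    return logger


if __name__ == "__main__":
    user_log, detail_log = io.StringIO(), io.StringIO()
    log = build_logger(user_log, detail_log)
    # The detail string is made up for the example.
    log.log(NOTICE, "Starting Pacemaker", extra={"detail": "build=example"})
    log.debug("internal state dump")
    print(user_log.getvalue(), end="")
    print(detail_log.getvalue(), end="")
```

In this sketch the user-facing stream receives only the plain notice,
while the detail stream receives the same notice with the extra field
appended, plus the debug message.
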

This is definitely ongoing, and it would be really helpful to have
examples of particular messages, showing how they read now vs. what they
should say.

> Another simple change would be to adopt a policy that rather than
> sharing information on this list in response to questions which arise,
> add the answers to the documentation and then just give a short reply
> to the list saying "here's the link to the documentation I just
> updated". I'm sure that the archives of this list are an absolute
> gold mine of useful information, but list archives make for really
> poor documentation ...

Agreed in principle, but again it goes back to time. Better a mailing
list post than something at the end of the to-do list. (Wiki edits
don't take much more time than mailing lists, so I could see taking
more advantage of that.)

> And BTW, lest I come across as a constant whinger ... I think you're
> doing an absolutely fantastic job as maintainer! ;-)

Thanks, and I definitely encourage your comments; they're helpful.
--
Ken Gaillot <kgaillot at redhat.com>