[ClusterLabs Developers] RA as a systemd wrapper -- the right way?

Mon May 22 13:26:36 EDT 2017

Resurrecting an old thread, because I stumbled on something relevant ...

There had been some discussion about having the ability to run a more
useful monitor operation on an otherwise systemd-based resource. We had
talked about a couple of approaches, each with advantages and
disadvantages.

I had completely forgotten about an older capability of pacemaker that
could be repurposed here: the (undocumented) "container" meta-attribute.

It was originally designed for running nagios checks on services inside
a virtual domain. The idea is that you can create an OCF VirtualDomain
resource, then create a nagios resource with its container set to the
VirtualDomain.

The effect is this: a resource with the container meta-attribute will be
started, stopped, and monitored normally, but if its monitor fails, it
will be recovered by recovering its container instead. Also, the
resource will be colocated with its container resource, and ordered
relative to it.
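
As a toy model of the recovery rule just described (this is not
pacemaker code; the resource names "vm" and "vm-check" and the dict
layout are made up for illustration), the decision could be sketched
like this:

```python
# Toy model only: resources are a dict of id -> meta-attributes.
# The ids "vm" and "vm-check" are hypothetical.
resources = {
    "vm": {},                          # e.g. the VirtualDomain resource
    "vm-check": {"container": "vm"},   # resource with container set
}


def recovery_target(resources, failed_id):
    """Return the id of the resource to recover when failed_id's
    monitor fails: its container if one is set, otherwise itself."""
    container = resources[failed_id].get("container")
    return container if container is not None else failed_id


def colocated_pairs(resources):
    """Return (resource, container) pairs that end up on the same node."""
    return sorted(
        (rsc_id, meta["container"])
        for rsc_id, meta in resources.items()
        if "container" in meta
    )
```

With this model, a monitor failure of "vm-check" selects "vm" for
recovery, while a failure of "vm" selects "vm" itself.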

This works with the nagios use case because start and stop are
essentially no-ops for nagios resources. The nagios resource can "start"
on the same host that the VirtualDomain starts on, and the host will run
the nagios check at each monitor interval. If the monitor fails,
pacemaker will recover the VirtualDomain.

I haven't tested it, but this approach should work identically with a
systemd resource and a custom OCF resource with the extended monitor.
The OCF resource would function as a dummy resource (to know when it's
"running" or not), so start/stop would only set up the dummy state. If
the monitor fails, the systemd resource should be recovered.
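
A minimal, untested-in-a-cluster sketch of what such a dummy agent
might look like, written in Python for brevity. The state file
location and the extended_check() placeholder are assumptions for
illustration, and the meta-data and validate-all actions that a real
OCF agent must provide are left out:

```python
import os
import tempfile

# Standard OCF exit codes
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


def default_state_file():
    # OCF_RESOURCE_INSTANCE is set by the cluster when it calls an agent.
    # Using the system temp directory is an assumption of this sketch.
    name = os.environ.get("OCF_RESOURCE_INSTANCE", "systemd-check")
    return os.path.join(tempfile.gettempdir(), name + ".state")


def start(state_file):
    # "Starting" only records the dummy state; nothing is launched.
    open(state_file, "w").close()
    return OCF_SUCCESS


def stop(state_file):
    # "Stopping" only clears the dummy state; already stopped is fine.
    try:
        os.remove(state_file)
    except FileNotFoundError:
        pass
    return OCF_SUCCESS


def monitor(state_file, check):
    # Not started on this node: report "not running", not a failure.
    if not os.path.exists(state_file):
        return OCF_NOT_RUNNING
    # Started: the result depends entirely on the extended check.
    return OCF_SUCCESS if check() else OCF_ERR_GENERIC


def extended_check():
    # Hypothetical placeholder: a real check would probe the service
    # managed by the systemd resource (e.g. an application-level query).
    return True


def main(argv):
    action = argv[1] if len(argv) > 1 else ""
    state_file = default_state_file()
    if action == "start":
        return start(state_file)
    if action == "stop":
        return stop(state_file)
    if action == "monitor":
        return monitor(state_file, extended_check)
    return OCF_ERR_UNIMPLEMENTED
```

A real agent would end with sys.exit(main(sys.argv)) so the exit code
reaches the cluster. The point of the design is that a failed
extended check returns a generic error rather than "not running", so
it is treated as a monitor failure of a resource that was started.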

If someone wants to verify that this works, I'll make sure that the
documentation gets updated.
-- 
Ken Gaillot <kgaillot at redhat.com>