[ClusterLabs Developers] RA as a systemd wrapper -- the right way?
Adam Spiers
aspiers at suse.com
Sat Oct 22 00:40:08 UTC 2016
Ken Gaillot <kgaillot at redhat.com> wrote:
> On 09/26/2016 09:15 AM, Adam Spiers wrote:
> > [Sending this as a separate mail, since the last one was already (too)
> > long and focused on specific details, whereas this one takes a step
> > back to think about the bigger picture again.]
> >
> > Adam Spiers <aspiers at suse.com> wrote:
> >>>>>>> On 09/21/2016 03:25 PM, Adam Spiers wrote:
> >>>>>>>> As a result I have been thinking about the idea of changing the
> >>>>>>>> start/stop/status actions of these RAs so that they wrap around
> >>>>>>>> service(8) (which would be even more portable across distros than
> >>>>>>>> systemctl).
> >
> > [snipped discussion of OCF wrapper RA idea]
> >
> >> The fact that I don't see any problems where you apparently do makes
> >> me deeply suspicious of my own understanding ;-) Please tell me what
> >> I'm missing.
> >
> > [snipped]
> >
> > To clarify: I am not religiously defending this "wrapper OCF RA" idea
> > of mine to the death. It certainly sounds like it's not as clean as I
> > originally thought. But I'm still struggling to see any dealbreaker.
> >
> > OTOH, I'm totally open to better ideas.
> >
> > For example, could Pacemaker be extended to allow hybrid resources,
> > where some actions (such as start, stop, status) are handled by (say)
> > the systemd backend, and other actions (such as monitor) are handled
> > by (say) the OCF backend? Then we could cleanly rely on dbus for
> > collaborating with systemd, whilst adding arbitrarily complex
> > monitoring via OCF RAs. That would have several advantages:
> >
> > 1. Get rid of grotesque layering violations and maintenance boundaries
> > where the OCF RA duplicates knowledge of all kinds of things which
> > are distribution-specific, e.g.:
> >
> > https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/apache#L56
>
> A simplified agent will likely still need distro-specific intelligence
> to do even a limited subset of actions, so I'm not sure there's a gain
> there.

What distro-specific intelligence would it need? If the OCF RA were
only responsible for monitoring, it wouldn't need to know many of the
things which are only required for starting / stopping the service and
checking whether it's running, e.g.:

- Name of the daemon executable
- uid/gid it should be started as
- Daemon CLI arguments
- Location of pid file

Instead, it would only need to know how to talk to the service, which
is not typically distro-specific; in the REST API case, it only needs
to know the endpoint URL, which would be configured via Pacemaker
resource parameters anyway.

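To make the REST API case concrete, here is a minimal sketch of what a
monitor-only check could look like. It is not a complete OCF agent (no
meta-data, no validation), and the parameter name "endpoint_url" and the
mapping of failures onto exit codes are my assumptions, not an existing
agent's behaviour. The only inputs are the URL, which Pacemaker passes
to agents in the environment as OCF_RESKEY_<parameter>, and the standard
OCF exit codes.

```python
import urllib.error
import urllib.request

# Standard OCF exit codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_CONFIGURED = 6
OCF_NOT_RUNNING = 7


def http_status(url, timeout=5.0):
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # A 4xx/5xx reply still proves that something answered.
        return err.code


def monitor(url, fetch=http_status):
    """Map an application-level probe onto an OCF exit code."""
    try:
        status = fetch(url)
    except OSError:
        # Connection refused, timeout, DNS failure: nothing is answering.
        return OCF_NOT_RUNNING
    return OCF_SUCCESS if 200 <= status < 300 else OCF_ERR_GENERIC


def main(environ, fetch=http_status):
    """Entry point for the monitor action; environ is a mapping like os.environ."""
    # "endpoint_url" is a hypothetical resource parameter name.
    url = environ.get("OCF_RESKEY_endpoint_url")
    if not url:
        return OCF_ERR_CONFIGURED
    return monitor(url, fetch=fetch)
```

A real script would finish with sys.exit(main(os.environ)). Note that
nothing in it refers to a daemon path, a uid, or a pid file.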
> > 2. Drastically simplify OCF RAs by delegating start/stop/status etc.
> > to systemd, thereby increasing readability and reducing maintenance
> > burden.
> >
> > 3. OCF RAs are more likely to work out of the box with any distro,
> > or at least require less work to get working.
> >
> > 4. Services behave more similarly regardless of whether managed by
> > Pacemaker or the standard pid 1 service manager. For example, they
> > will always use the same pidfile, run as the same user, in the
> > right cgroup, be invoked with the same arguments etc.
> >
> > 5. Pacemaker can still monitor services accurately at the
> > application-level, rather than just relying on naive pid-level
> > monitoring.
> >
> > Or is this a terrible idea? ;-)
>
> I considered this, too. I don't think it's a terrible idea, but it does
> pose its own questions.
>
> * What hybrid actions should be allowed? It seems dangerous to allow
> starting from one code base and stopping from another, or vice versa,
> and really dangerous to allow something like migrate_to/migrate_from to
> be reimplemented. At one extreme, we allow anything and leave that
> responsibility on the user; at the other, we only allow higher-level
> monitors (i.e. using OCF_CHECK_LEVEL) to be hybridized.

Just monitors would be good enough for me.

> * Should the wrapper's actions be done instead of, or in addition to,
> the main resource's actions? Or maybe even allow the user to choose? I
> could see some wrappers intended to replace the native handling, and
> others to supplement it.

For my use case: in addition to. The only motivation is to delegate
start/stop/status to systemd (as currently happens with systemd:* RAs)
whilst retaining the ability to do service-level testing of the
resource via the OCF RA. So it wouldn't really be a wrapper, but
rather an extension.


In contrast, with the wrapper approach, it sounds like the delegation
would have to happen via systemctl rather than via Pacemaker's dbus
code. And if systemctl start/stop really are asynchronous, the wrapper
would need to put a polling loop around these calls, as previously
mentioned, in order to make them synchronous (which is the behaviour I
think most people would expect).

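A rough sketch of such a polling loop, assuming the start request
returns before the unit is up. To make that case explicit, the sketch
passes --no-block to systemctl and then polls "systemctl is-active".
The function names, timeout and interval are arbitrary choices of mine.

```python
import subprocess
import time


def run_systemctl(*args):
    """Run systemctl with the given arguments and return its exit status."""
    return subprocess.run(["systemctl", *args], check=False).returncode


def start_and_wait(unit, timeout=60.0, interval=1.0,
                   systemctl=run_systemctl, sleep=time.sleep,
                   clock=time.monotonic):
    """Request a start, then poll until the unit is active or time runs out."""
    # --no-block queues the start job and returns immediately.
    if systemctl("start", "--no-block", unit) != 0:
        return False
    deadline = clock() + timeout
    while True:
        # is-active exits with status 0 only when the unit is active.
        if systemctl("is-active", "--quiet", unit) == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The stop case would be the mirror image, polling until is-active
returns non-zero. This naive version keeps polling until the timeout
even if the unit has already entered a failed state; a real wrapper
would probably want to detect that and give up early.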
> * The answers to the above will help decide whether the wrapper is a
> separate resource (with its own parameters, operations, timeouts, etc.),
> or just a property of the main resource.
>
> * If we allow anything other than monitors to be hybridized, I think we
> get into a pacemaker-specific implementation. I don't think it's
> feasible to include this in the OCF standard -- it would essentially
> mandate pacemaker's "resource class" mechanism on all OCF users (which
> is beyond OCF's scope), and would likely break manual/scripted use
> altogether. We could possibly modify OCF so that no
> actions are mandatory, and it's up to the OCF-using software to verify
> that any actions it requires are supported. Or maybe wrappers just
> implement some actions as no-ops, and it's up to the user to know the
> limitations.
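For what it's worth, the "OCF-using software verifies the actions it
requires" option could lean on the <actions> section that agents
already advertise in their meta-data. A minimal sketch, where the
sample meta-data describes a hypothetical monitor-only agent and the
helper names are made up:

```python
import xml.etree.ElementTree as ET

# Hypothetical meta-data for a monitor-only agent (trimmed to the essentials).
SAMPLE_METADATA = """
<resource-agent name="example-monitor-only">
  <version>0.1</version>
  <parameters/>
  <actions>
    <action name="monitor" timeout="20s" interval="10s"/>
    <action name="meta-data" timeout="5s"/>
  </actions>
</resource-agent>
"""


def advertised_actions(metadata_xml):
    """Return the set of action names listed in an agent's meta-data."""
    root = ET.fromstring(metadata_xml.strip())
    return {action.get("name") for action in root.findall("./actions/action")}


def missing_actions(metadata_xml, required):
    """Return, sorted, the required actions that the agent does not advertise."""
    return sorted(set(required) - advertised_actions(metadata_xml))
```

Here missing_actions(SAMPLE_METADATA, ["start", "stop", "monitor"])
reports start and stop as missing, so the caller knows it has to get
those from somewhere else (or refuse to use the agent).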
Sure. Hopefully you will be in Barcelona so we can discuss more?