[ClusterLabs Developers] Opinions wanted: OCF agent types

Thu Aug 18 08:08:46 UTC 2022

On Thu, Aug 18, 2022 at 1:06 AM Reid Wahl <nwahl at redhat.com> wrote:
>
> On Wed, Aug 17, 2022 at 1:01 PM Ken Gaillot <kgaillot at redhat.com> wrote:
> >
> > Hi all,
> >
> > OCF 1.1 hasn't been out that long but I'm already looking ahead to OCF
> > 1.2 (which would remain backward-compatible).
> >
> > One big addition I'm contemplating is defining OCF resource agent
> > types, to address these problems:
> >
> > * Fence agents have a completely different standard from OCF resource
> > agents, and lack some of the features available to OCF agents (such as
> > meaningful error statuses and exit reasons for failures).
> >
> > * Pacemaker's node health feature uses OCF agents to monitor node
> > conditions, but there are some user pain points involved since they are
> > indistinguishable from regular OCF agents.
> >
> > * In the past there has been discussion of implementing "storage
> > agents" to help manage replication of external storage devices,
> > primarily for disaster recovery purposes.
> >
> > Visually, the agent type would be another field in
> > the agent specification, for example ocf:fence:heartbeat:iscsi or
> > ocf:health:pacemaker:cpu.
> >
> > "Regular" OCF agents would be (for example)
> > ocf:service:heartbeat:apache in full, but for backward compatibility
> > "service" would be the default, and ocf:heartbeat:apache would continue
> > to work.
> >
> > Alternatively, if we want to keep it to three fields, we could do
> > something like ocf-fence:heartbeat:iscsi and ocf-health:pacemaker:cpu.
> >
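To make the two naming options above concrete, here is a minimal sketch of how a tool might parse them. It is not part of any released OCF standard or of Pacemaker: the names `AgentSpec`, `parse_agent_spec`, `KNOWN_TYPES`, and `DEFAULT_TYPE` are hypothetical, and the set of types is limited to the three mentioned in the proposal.

```python
from typing import NamedTuple

# Assumption: only the agent types named in the proposal are recognized.
KNOWN_TYPES = {"service", "fence", "health"}
# Assumption: "service" is the default type, as the proposal suggests.
DEFAULT_TYPE = "service"


class AgentSpec(NamedTuple):
    standard: str
    agent_type: str
    provider: str
    agent: str


def parse_agent_spec(spec: str) -> AgentSpec:
    """Parse a four-field, legacy three-field, or 'ocf-<type>' spec."""
    fields = spec.split(":")
    if any(not field for field in fields):
        raise ValueError(f"empty field in agent spec: {spec!r}")

    if len(fields) == 4:
        # Four-field form, e.g. ocf:fence:heartbeat:iscsi
        standard, agent_type, provider, agent = fields
    elif len(fields) == 3:
        # Three-field form: either legacy (ocf:heartbeat:apache) or the
        # alternative with the type folded in (ocf-fence:heartbeat:iscsi).
        standard, provider, agent = fields
        standard, sep, agent_type = standard.partition("-")
        if not sep:
            agent_type = DEFAULT_TYPE
    else:
        raise ValueError(f"expected 3 or 4 fields in agent spec: {spec!r}")

    if standard != "ocf":
        raise ValueError(f"not an OCF agent spec: {spec!r}")
    if agent_type not in KNOWN_TYPES:
        raise ValueError(f"unknown agent type {agent_type!r} in {spec!r}")
    return AgentSpec(standard, agent_type, provider, agent)
```

Under these assumptions, `ocf:heartbeat:apache` and `ocf:service:heartbeat:apache` parse to the same result, which is the backward compatibility described above.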
> > The OCF standard would have a shared section that all agent types would
> > be required to support. This could include things like exit status
> > codes, environment variables, and the meta-data action. Each agent type
> > would then have its own section with anything specific to that type --
> > for example, service agents need to support start and stop actions,
> > while fence agents need to support off and optionally reboot.
> >
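The split between a shared section and per-type sections could be modelled roughly as below. This is only a sketch under stated assumptions: it encodes just the actions named in the paragraph above (meta-data as shared, start/stop for service agents, off and optionally reboot for fence agents), and the names `SHARED_REQUIRED`, `TYPE_REQUIRED`, `TYPE_OPTIONAL`, and `missing_actions` are hypothetical. A real standard would likely list more actions.

```python
# Assumption: the shared section requires only the meta-data action.
SHARED_REQUIRED = {"meta-data"}

# Assumption: per-type requirements limited to what the proposal names.
TYPE_REQUIRED = {
    "service": {"start", "stop"},
    "fence": {"off"},
}

# Optional actions are listed for reference; they are never reported missing.
TYPE_OPTIONAL = {
    "fence": {"reboot"},
}


def missing_actions(agent_type, advertised):
    """Return the required actions an agent of this type fails to advertise."""
    if agent_type not in TYPE_REQUIRED:
        raise ValueError(f"no requirements sketched for type {agent_type!r}")
    required = SHARED_REQUIRED | TYPE_REQUIRED[agent_type]
    return required - set(advertised)
```

For example, a service agent advertising only start and meta-data would be reported as missing stop, while a fence agent advertising off and meta-data would pass without reboot.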
> > The benefits would include:
> >
> > * Agent writers would have fewer differences to worry about and
> > libraries to learn.
> >
> > * Pacemaker and higher-level tools could easily distinguish agent types
> > and respond intelligently. For example, higher-level shells could list
> > all health agents and clone them automatically when used, and Pacemaker
> > could automatically exempt health agents from health restrictions so
> > that the agent can automatically detect when the node becomes healthy
> > again.
> >
> > * We would have a framework for adding new types if the need arises.
> >
> > Thoughts?
>
> It sounds like a good idea.
>
> With regard to "service" as the default OCF resource agent type, this
> may be confusing since we already have a "service" standard.

I stumbled over that as well. Maybe simply 'resource' instead?

Klaus

>
> > --
> > Ken Gaillot <kgaillot at redhat.com>
> >
> > _______________________________________________
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/developers
> >
> > ClusterLabs home: https://www.clusterlabs.org/
> >
>
>
> --
> Regards,
>
> Reid Wahl (He/Him)
> Senior Software Engineer, Red Hat
> RHEL High Availability - Pacemaker