[ClusterLabs Developers] Opinions wanted: OCF agent types

Wed Aug 17 23:05:24 UTC 2022

On Wed, Aug 17, 2022 at 1:01 PM Ken Gaillot <kgaillot at redhat.com> wrote:
>
> Hi all,
>
> OCF 1.1 hasn't been out that long but I'm already looking ahead to OCF
> 1.2 (which would remain backward-compatible).
>
> One big addition I'm contemplating is defining OCF resource agent
> types, to address these problems:
>
> * Fence agents have a completely different standard from OCF resource
> agents, and lack some of the features available to OCF agents (such as
> meaningful error statuses and exit reasons for failures).
>
> * Pacemaker's node health feature uses OCF agents to monitor node
> conditions, but there are some user pain points involved since they are
> indistinguishable from regular OCF agents.
>
> * In the past there has been discussion of implementing "storage
> agents" to help manage replication of external storage devices,
> primarily for disaster recovery purposes.
>
> Visually, the agent type would be another field in
> the agent specification, for example ocf:fence:heartbeat:iscsi or
> ocf:health:pacemaker:cpu.
>
> "Regular" OCF agents would be (for example)
> ocf:service:heartbeat:apache in full, but for backward compatibility
> "service" would be the default, and ocf:heartbeat:apache would continue
> to work.
>
> Alternatively, if we want to keep it to three fields, we could do
> something like ocf-fence:heartbeat:iscsi and ocf-health:pacemaker:cpu.
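
To make the naming options above concrete, here is a minimal Python sketch of how a tool might parse both the four-field form and the three-field "ocf-TYPE" alternative while still accepting today's ocf:provider:agent. Nothing here is part of any released standard: the names parse_agent_spec, AgentSpec, KNOWN_TYPES, and DEFAULT_TYPE are hypothetical, and the set of types is just the three mentioned in this thread.

```python
from typing import NamedTuple

# Assumption: only the types named in this thread; not a released standard.
KNOWN_TYPES = {"service", "fence", "health"}
DEFAULT_TYPE = "service"  # proposed default for backward compatibility


class AgentSpec(NamedTuple):
    standard: str
    agent_type: str
    provider: str
    agent: str


def parse_agent_spec(spec: str) -> AgentSpec:
    """Parse 'ocf:[TYPE:]PROVIDER:AGENT' or 'ocf-TYPE:PROVIDER:AGENT'."""
    fields = spec.split(":")
    standard = fields[0]
    if len(fields) == 4 and standard == "ocf":
        # Four-field form, e.g. ocf:fence:heartbeat:iscsi
        _, agent_type, provider, agent = fields
    elif len(fields) == 3 and standard == "ocf":
        # Existing three-field form; the type falls back to the default
        agent_type = DEFAULT_TYPE
        _, provider, agent = fields
    elif len(fields) == 3 and standard.startswith("ocf-"):
        # Three-field alternative, e.g. ocf-health:pacemaker:cpu
        standard, agent_type = standard.split("-", 1)
        _, provider, agent = fields
    else:
        raise ValueError(f"not a recognized OCF agent spec: {spec!r}")
    if agent_type not in KNOWN_TYPES:
        raise ValueError(f"unknown agent type {agent_type!r} in {spec!r}")
    return AgentSpec(standard, agent_type, provider, agent)
```

In this sketch the field count alone decides whether the second field is a type or a provider, so an existing provider that happened to be named "fence" or "health" would still parse as a provider in the three-field form.
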
>
> The OCF standard would have a shared section that all agent types would
> be required to support. This could include things like exit status
> codes, environment variables, and the meta-data action. Each agent type
> would then have its own section with anything specific to that type --
> for example, service agents need to support start and stop actions,
> while fence agents need to support off and optionally reboot.
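
As a rough illustration of the shared-plus-type-specific layout described above, the following Python sketch checks an agent's advertised actions against a shared requirement set and a per-type set. The tables are assumptions built only from the examples in this message (meta-data as a possible shared action, start/stop for service agents, off and optionally reboot for fence agents); the names SHARED_REQUIRED, TYPE_REQUIRED, TYPE_OPTIONAL, and missing_actions are hypothetical.

```python
# Assumption: the shared section requires only the meta-data action.
SHARED_REQUIRED = {"meta-data"}

# Assumption: per-type requirements taken from the examples in this thread.
TYPE_REQUIRED = {
    "service": {"start", "stop"},
    "fence": {"off"},
}

# Optional actions are listed for documentation only; they are never reported
# as missing.
TYPE_OPTIONAL = {
    "fence": {"reboot"},
}


def missing_actions(agent_type: str, advertised: list[str]) -> list[str]:
    """Return the required actions that the agent does not advertise."""
    if agent_type not in TYPE_REQUIRED:
        raise ValueError(f"no requirements defined for type {agent_type!r}")
    required = SHARED_REQUIRED | TYPE_REQUIRED[agent_type]
    return sorted(required - set(advertised))
```

For example, missing_actions("fence", ["meta-data", "reboot"]) reports that "off" is missing, while a service agent advertising meta-data, start, stop, and monitor passes with an empty list.
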
>
> The benefits would include:
>
> * Agent writers would have fewer differences to worry about and
> libraries to learn.
>
> * Pacemaker and higher-level tools could easily distinguish agent types
> and respond intelligently. For example, higher-level shells could list
> all health agents and clone them automatically when used, and Pacemaker
> could automatically exempt health agents from health restrictions so
> that the agent can automatically detect when the node becomes healthy
> again.
>
> * We would have a framework for adding new types if the need arises.
>
> Thoughts?

It sounds like a good idea.

Using "service" as the name of the default OCF agent type may be
confusing, though, since "service" is already the name of an existing
resource standard.

> --
> Ken Gaillot <kgaillot at redhat.com>

-- 
Regards,

Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker