[ClusterLabs Developers] Proposal for OCF 1.2: Agent types

Ken Gaillot kgaillot at redhat.com
Thu Feb 24 17:02:12 UTC 2022


Hi all,

OCF 1.1 hasn't been out that long but I'm already looking ahead to OCF
1.2 (which would remain backward-compatible). I have a proposal that
would tackle these issues:

* Fence agents have a completely different standard from OCF resource
agents, and lack some of the features available to OCF agents (such as
meaningful error statuses and exit reasons for failures).

* Pacemaker's node health feature uses OCF agents to monitor node
conditions, but this causes some user pain points because health agents
are indistinguishable from regular OCF agents.

* In the past there has been discussion of implementing "storage
agents" to help manage replication of external storage devices,
primarily for disaster recovery purposes.

My proposal is "agent types". Visually this would be another field in
the agent specification, for example ocf:fence:heartbeat:iscsi or
ocf:health:pacemaker:cpu.

"Regular" OCF agents would be (for example)
ocf:service:heartbeat:apache in full, but for backward compatibility
"service" would be the default, and ocf:heartbeat:apache would continue
to work.

Alternatively, if we want to keep it at three fields, we could do
something like ocf-fence:heartbeat:iscsi and ocf-health:pacemaker:cpu.
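To make the two candidate syntaxes concrete, here is a minimal sketch of how a tool might parse them. It is only an illustration under stated assumptions: the names AgentSpec and parse_agent_spec are hypothetical, nothing like this exists in Pacemaker today, and the only behavior taken from the proposal is the field layout and the "service" default.

```python
from typing import NamedTuple

# Default type named in the proposal, used when no type field is given.
DEFAULT_TYPE = "service"


class AgentSpec(NamedTuple):
    """Hypothetical container for a parsed agent specification."""
    standard: str
    agent_type: str
    provider: str
    name: str


def parse_agent_spec(spec: str) -> AgentSpec:
    """Parse either proposed syntax into its four logical parts.

    Accepted forms (all taken from the examples in the proposal):
      ocf:fence:heartbeat:iscsi   four fields, explicit type
      ocf:heartbeat:apache        three fields, type defaults to "service"
      ocf-health:pacemaker:cpu    three fields, type attached to the standard
    """
    fields = spec.split(":")
    if len(fields) == 4:
        standard, agent_type, provider, name = fields
    elif len(fields) == 3:
        standard, provider, name = fields
        # Split "ocf-health" into ("ocf", "health"); plain "ocf" has no type.
        standard, sep, agent_type = standard.partition("-")
        if not sep:
            agent_type = DEFAULT_TYPE
    else:
        raise ValueError(f"expected 3 or 4 fields: {spec!r}")
    if standard != "ocf" or not all((agent_type, provider, name)):
        raise ValueError(f"not a valid OCF agent specification: {spec!r}")
    return AgentSpec(standard, agent_type, provider, name)
```

With this sketch, ocf:heartbeat:apache and ocf:service:heartbeat:apache parse to the same result, which is the backward-compatibility property described above.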

The OCF standard would have a shared section that all agent types would
be required to support. This could include things like exit status
codes, environment variables, and the meta-data action. Each agent type
would then have its own section with anything specific to that type --
for example, service agents need to support start and stop actions,
while fence agents need to support off and optionally reboot.
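As a rough sketch of how the shared and type-specific sections could fit together, a conformance check might combine a common action set with a per-type one. The table contents below only restate the examples given in this message (meta-data as a possible shared action, start/stop for service, off for fence); the names SHARED_ACTIONS, TYPE_REQUIRED, and missing_actions are hypothetical, and a real standard would define the actual lists.

```python
# Actions every agent type might be required to support (assumption:
# the proposal only says the shared section "could include" meta-data).
SHARED_ACTIONS = {"meta-data"}

# Type-specific required actions, restating the examples in the proposal.
# Types not listed here (such as "health") get only the shared actions.
TYPE_REQUIRED = {
    "service": {"start", "stop"},
    "fence": {"off"},  # "reboot" is optional, so it is not listed
}


def missing_actions(agent_type, supported):
    """Return the required actions that an agent does not advertise.

    agent_type: the type field of the agent specification
    supported:  iterable of action names the agent claims to support
    """
    required = SHARED_ACTIONS | TYPE_REQUIRED.get(agent_type, set())
    return sorted(required - set(supported))
```

For example, a fence agent advertising only meta-data and off would pass this check, while a service agent without stop would be reported as missing it.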

The benefits would include:

* Agent writers would have fewer differences to worry about and fewer
libraries to learn.

* Pacemaker and higher-level tools could easily distinguish agent types
and respond intelligently. For example, a tool could list all health
agents and clone them automatically when used, and Pacemaker could
exempt health agents from health restrictions so that the agent can
detect when the node becomes healthy again.

* We would have a model for adding new types when needed.
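To illustrate the second benefit, a higher-level tool could pick out health agents from the specification string alone. This is a minimal sketch assuming the four-field syntax with "service" as the default; agent_type and health_agents are hypothetical helper names, not part of any existing tool.

```python
def agent_type(spec, default="service"):
    """Return the type field of a four-field spec, else the default."""
    fields = spec.split(":")
    return fields[1] if len(fields) == 4 else default


def health_agents(specs):
    """Filter a list of agent specifications down to health agents."""
    return [spec for spec in specs if agent_type(spec) == "health"]
```

A tool could then, for instance, offer to clone every agent returned by health_agents() without the user having to know which agents are meant for node health.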

Thoughts?
-- 
Ken Gaillot <kgaillot at redhat.com>


