[ClusterLabs Developers] Resurrecting OCF
Ken Gaillot
kgaillot at redhat.com
Wed Sep 21 23:26:03 CEST 2016
On 09/21/2016 10:55 AM, Jan Pokorný wrote:
> On 21/09/16 14:50 +1000, Andrew Beekhof wrote:
>> I like where this is going.
>> Although I don’t think we want to get into the business of trying to
>> script config changes from one agent to another, so I’d drop #4
>
> Not agent parameter changes, just its specification -- to reflect
> formally what the proposed symlink-based delegation scheme does when
> the old one is still in use. If the old and new are incompatible,
> such automatic delegation is not possible anyway (that's one of
> the reasons "description" would come handy).
>
> I see there's much bigger potential (parameter renames, ...) but for
> that, each agent should be responsible on its own (somehow, subject
> of further evolution).
>
> Also, supposing there are more consumers of RA, the suggestion to
> run the script should be more generic ("when used from under
> pacemaker, ...").
>
>> I would make .deprecated a nested directory so that if we want to
>> retire (for example) a ClusterLabs agent in the future we can create
>> .deprecated/clusterlabs/ and put the agent there. Rather than make
>> this heartbeat specific.
>
> Good point; it would also prevent clashes when single directory should
> serve all the providers.
I don't understand the desire to treat "deprecated" agents any
differently. It should be sufficient to just mention it in their help
text / man page / meta-data / other documentation. Pacemaker isn't going
to run a "deprecated" agent any differently.
When users see ocf:whatever:whatever, they know where to look for the
script. Why frustrate them by making them waste time figuring out how a
"nonexistent" RA is being used and finding it?
If the goal is to let users know that an agent is deprecated (which is
the only reason that I can think of), then we can add an attribute in
the meta-data, and UIs/pacemaker can report/log it if present.
<resource-agent
name="Evmsd"
deprecated="No longer actively maintained"
>
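As a rough illustration of how a UI or pacemaker could report such an attribute, here is a minimal Python sketch. The "deprecated" attribute is only the proposal above and is not part of any published OCF schema; the helper name deprecation_notice and the completed sample meta-data are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Sample meta-data using the *proposed* (not standardized) "deprecated"
# attribute; the <version> child is filled in only to make it well-formed.
META_DATA = """\
<resource-agent name="Evmsd" deprecated="No longer actively maintained">
  <version>1.0</version>
</resource-agent>
"""


def deprecation_notice(meta_data_xml):
    """Return a warning string if the agent declares itself deprecated.

    Returns None when the hypothetical "deprecated" attribute is absent.
    """
    root = ET.fromstring(meta_data_xml)
    reason = root.get("deprecated")
    if reason is None:
        return None
    return "agent %s is deprecated: %s" % (root.get("name"), reason)


if __name__ == "__main__":
    notice = deprecation_notice(META_DATA)
    if notice:
        print("WARNING: " + notice)
```

The agent itself stays where users expect it; only the consumer of the meta-data decides whether and how loudly to report the notice.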
>> I wonder if some of this should live in pacemaker itself though…
>
> This runs directly to the other side of the RA-pacemaker bias,
> pacemaker caring about RA evolutionary internals :-)
>
> In the outlook, that would make any separated OCF standard efforts
> worthless and we could just call it pacemaker resource standard
> right away and forget about any sort of self-containment
> (the proposed procedure aims to align with).
>
> I am not sure that would be the best thing.
Agreed, anything we come up with should be explicit in the OCF standard.
But I think this behavior could be specified in the standard.
>> If resources_action_create() cannot find ocf:${provider}:${agent} in
>> its usual location, look up
>> ${OCF_ROOT_DIR}/.compat/${provider}/__entries__
>>
>> Format for __entries__:
>> # old, replacement
>> # ${agent} , ${new_provider}:${new_agent} , ${description}
>> IPaddr , clusterlabs:IP , Replaced with different semantics
>> IPaddr2 , clusterlabs:IP , Moved
>> drbd , linbit:drbd , Moved
>> eDirectory , , Deleted
>
> Additional "what happened" field might work well in the update
> suggestions.
>
>> Assuming an entry is found:
>> - If .compat/${old_provider}/${old_agent} exists, notify the user
>> “somehow”, then call it.
>> - Otherwise, return OCF_ERR_NOT_INSTALLED and use ${description} and
>> ${replacement} as the exit reason (which shows up in pcs status).
>>
>> Perhaps the “somehow” is creating PCMK_OCF_DEPRECATED (with the same
>> semantics as PCMK_OCF_DEGRADED) and prepending ${description} to the
>> output (assuming it's not a metadata op) and/or the exit reason[1].
>> Maybe only on successful start operations to minimise the noise?
>>
>> [1] Shouldn’t be too hard with some extra fields for ‘struct
>> svc_action_private_s’ or svc_action_t
I like the idea of intelligent aliasing. I'm hoping we can do it without
a separate directory structure and meta-meta-data files.
What about continuing to use symlinks from the old name to the new name,
and adding an aliases section to the agent meta-data?
<resource-agent name="IP" version="1.5">
  <version>2.0</version>
  <aliases>
    <alias name="ocf:heartbeat:IPaddr"
           reason="Replaced with different semantics"/>
    <alias name="ocf:heartbeat:IPaddr2"
           reason="Superseded by clusterlabs provider"/>
  </aliases>
</resource-agent>
The file would be where people expect to find it, and the intent would
be readable.
If pacemaker loads an RA's metadata and finds the configured name in the
aliases section, it could log a warning with the reason.
The only drawback I see is that there is no inherent coordination
between the symlinks and the aliases. But either would be fine without
the other, so I don't see that as serious.
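A minimal Python sketch of that lookup, assuming the aliases section looks like the example above. The <aliases>/<alias> elements are only the proposal in this message, and alias_reason is a hypothetical helper name, not an existing pacemaker function.

```python
import logging
import xml.etree.ElementTree as ET

# Meta-data following the *proposed* aliases layout (not a published schema).
META_DATA = """\
<resource-agent name="IP" version="1.5">
  <version>2.0</version>
  <aliases>
    <alias name="ocf:heartbeat:IPaddr"
           reason="Replaced with different semantics"/>
    <alias name="ocf:heartbeat:IPaddr2"
           reason="Superseded by clusterlabs provider"/>
  </aliases>
</resource-agent>
"""


def alias_reason(meta_data_xml, configured_name):
    """Return the alias reason if configured_name is listed as an alias.

    Returns None when the configured name is not an alias of this agent.
    """
    root = ET.fromstring(meta_data_xml)
    for alias in root.findall("./aliases/alias"):
        if alias.get("name") == configured_name:
            return alias.get("reason")
    return None


if __name__ == "__main__":
    configured = "ocf:heartbeat:IPaddr2"
    reason = alias_reason(META_DATA, configured)
    if reason is not None:
        logging.warning("%s is an alias: %s", configured, reason)
```

Because the check uses only the meta-data, it works whether or not the symlink exists, which matches the point that either mechanism is fine without the other.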
>>
>>> On 19 Aug 2016, at 6:59 PM, Jan Pokorný <jpokorny at redhat.com> wrote:
>>>
>>> On 18/08/16 17:27 +0200, Klaus Wenninger wrote:
>>>> On 08/18/2016 05:16 PM, Ken Gaillot wrote:
>>>>> On 08/18/2016 08:31 AM, Kristoffer Grönlund wrote:
>>>>>> Jan Pokorný <jpokorny at redhat.com> writes:
>>>>>>
>>>>>>> Thinking about that, ClusterLabs may be considered a brand established
>>>>>>> well enough for "clusterlabs" provider to work better than anything
>>>>>>> general such as previously proposed "core". Also, it's not expected
>>>>>>> there will be more RA-centered projects under this umbrella than
>>>>>>> resource-agents (pacemaker deserves to be a provider on its own),
>>>>>>> so it would be pretty unambiguous pointer.
>>>>>> I like this suggestion as well.
>>>>> Sounds good to me.
>>>>>
>>>>>>> And for new, not well-tested agents within resource-agents, there could
>>>>>>> also be a provider schema akin to "clusterlabs-staging" introduced.
>>>>>>>
>>>>>>> 1 CZK
>>>>>> ...and this too.
>>>>> I'd rather not see this. If the RA gets promoted to "well-tested",
>>>>> everyone's configuration has to change. And there's never a clear line
>>>>> between "not well-tested" and "well-tested", so things wind up staying
>>>>> in "beta" status long after they're widely used in production, which
>>>>> unnecessarily makes people question their reliability.
>>>>>
>>>>> If an RA is considered experimental, say so in the documentation
>>>>> (including the man page and help text), and give it an "0.x" version number.
>>>>>
>>>>>> Here is another one: While we are moving agents into a new namespace,
>>>>>> perhaps it is time to clean up some of the legacy agents that are no
>>>>>> longer recommended or of questionable quality? Off the top of my head,
>>>>>> there are
>>>>>>
>>>>>> * heartbeat/Evmsd
>>>>>> * heartbeat/EvmsSCC
>>>>>> * heartbeat/LinuxSCSI
>>>>>> * heartbeat/pingd
>>>>>> * heartbeat/IPaddr
>>>>>> * heartbeat/ManageRAID
>>>>>> * heartbeat/vmware
>>>>>>
>>>>>> A pet peeve of mine would also be to move heartbeat/IPaddr2 to
>>>>>> clusterlabs/IP, to finally get rid of that weird 2 in the name...
>>>>> +1!!! (or is it -2?)
>>>>>
>>>>>> Cheers,
>>>>>> Kristoffer
>>>>> Obviously, we need to keep the ocf:heartbeat provider around for
>>>>> backward compatibility, for the extensive existing uses both in cluster
>>>>> configurations and in the zillions of how-to's scattered around the web.
>>>>>
>>>>> Also, despite the recommendation of creating your own provider, many
>>>>> people drop custom RAs in the heartbeat directory.
>>>>>
>>>>> The simplest approach would be to just symlink heartbeat to clusterlabs,
>>>>> but I think that's a bad idea. If a custom RA deployment or some package
>>>>> other than resource-agents puts an RA there, resource-agents will try to
>>>>> make it a symlink and the other package will try to make it a directory.
>>>>> Plus, people may have configuration management systems and/or file
>>>>> integrity systems that need it to be a directory.
>>>>>
>>>>> So, I'd recommend we keep the heartbeat directory, and keep the old RAs
>>>>> you list above in it, move the rest of the RAs to the new clusterlabs
>>>>> directory, and symlink each one back to the heartbeat directory. At the
>>>>> same time, we can announce the heartbeat provider as deprecated, and
>>>>> after a very long time (when it's difficult to find references to it via
>>>>> google), we can drop it.
>>>>
>>>> Maybe a way to go for the staging-RAs as well:
>>>> Have them in clusterlabs-staging and symlinked (during install
>>>> or package-generation) into clusterlabs ... while they are
>>>> cleanly separated in the source-tree.
>>>
>>> So, having some more thoughts on this, here's the possible action
>>> plan (just for heartbeat -> clusterlabs transition + deprecating
>>> some agents, but clusterlabs-staging -> clusterlabs would be similar):
>>>
>>> # (adapt and) move original heartbeat agents
>>>
>>> 1. have a resource.d subdirectory "clusterlabs" and move (possibly under
>>> new names) agents that were a priori updated to reflect new revision
>>> of OCF there
>>>
>>> 2. have a resource.d subdirectory ".deprecated" (for instance) and
>>> move the RAs that are going to be sunset over there (i.e.,
>>> original heartbeat agents = agents moved to clusterlabs + agents
>>> moved to .deprecated + agents that remained under heartbeat, pending
>>> to be moved under clusterlabs)
>>>
>>> # preparation for backward compatibility
>>>
>>> 3. have a file with old heartbeat name -> new clusterlabs name mapping
>>> for the agents from 1., i.e., those that physically changed directory;
>>> the format can be as simple as CSV with "old name; [new name]" lines
>>> where an omitted new name means that the actual name hasn't changed
>>> (unlike the proposed IPaddr2 -> IP)
>>>
>>> 4. have an XSL template that will convert resource references per the
>>> translation file from 3. (this XSLT should be automatically
>>> generated based on that file) and a script that will call
>>> something like:
>>> cibadmin -Q | xsltproc <XSLT> - | cibadmin --replace --xml-pipe
XSL to replace some agent names? I think sed is enough for that :)
Perhaps we could expand any RA aliases as part of "cibadmin --upgrade"
(or have a separate option for it). That would trigger resource
restarts, though, unless we added intelligence to allow that in pacemaker.
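To show what expanding agent names in the configuration amounts to, here is a small Python sketch that rewrites the provider and type attributes of ocf primitives in a CIB-like XML string. The RENAMES mapping (heartbeat:IPaddr2 to clusterlabs:IP) reflects only the rename discussed in this thread, the sample CIB is trimmed down for illustration, and expand_aliases is a hypothetical helper, not something cibadmin provides.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: (old provider, old type) -> (new provider, new type).
RENAMES = {("heartbeat", "IPaddr2"): ("clusterlabs", "IP")}

# Heavily trimmed sample configuration, for illustration only.
CIB = """\
<cib><configuration><resources>
  <primitive id="vip" class="ocf" provider="heartbeat" type="IPaddr2"/>
  <primitive id="db" class="ocf" provider="heartbeat" type="pgsql"/>
</resources></configuration></cib>
"""


def expand_aliases(cib_xml, renames):
    """Rewrite renamed ocf agents; return (new XML text, number changed)."""
    root = ET.fromstring(cib_xml)
    changed = 0
    for prim in root.iter("primitive"):
        if prim.get("class") != "ocf":
            continue  # only ocf resources have a provider to rename
        key = (prim.get("provider"), prim.get("type"))
        if key in renames:
            new_provider, new_type = renames[key]
            prim.set("provider", new_provider)
            prim.set("type", new_type)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


if __name__ == "__main__":
    new_xml, count = expand_aliases(CIB, RENAMES)
    print("rewrote %d resource(s)" % count)
```

This only transforms the XML text. Loading the result back into a live cluster would still look like a resource definition change, so the restart concern above applies unchanged.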
>>> 5. have a shell script "__cl_compat__" (for instance, name clearly
>>> distinguishable will become handy later on), that will:
>>> - figure which symlink it was called under ("$0") and figure out
>>> how it should behave based on file from 3.:
>>> . $0 found as old name with new name -> clusterlabs/<new name>
>>> will be called
>>> . $0 found as old name without new name -> clusterlabs/<old name>
>>> will be called
>>> . $0 not found as old name -> .deprecated/<old name> will be
>>> called if exists (otherwise fail early)
>>> - if "$HA_RSCTMP/$(basename $0)_compat" exists, just run:
>>> $0 "$@"; exit $?
>>> the purpose here is to avoid excessive spamming in the logs
>>> - touch "$HA_RSCTMP/$(basename $0)_compat"
>>> - emit a warning "Your configuration refers to the agent with
>>> an obsolete specification", followed with corresponding:
>>> . "please consider changing ocf:heartbeat:<old name> to
>>> ocf:clusterlabs:<new name>, you may use <script from 4.>
>>> to ease such transition"
>>> . "please consider changing ocf:heartbeat:<old name> to
>>> ocf:clusterlabs:<old name>, you may use <script from 4.>
>>> to ease such transition"
>>> . "please consider finding another alternative for
>>> ocf:heartbeat:<old name> as this agent is not actively
>>> maintained and will be dropped in the next major release;
>>> alternatively, if you volunteer to maintain it,
>>> please reach the developers at clusterlabs.org mailing list"
>>>
>>> # plugging it all together
>>>
>>> 6. for agents moved from heartbeat in any of clusterlabs/.deprecated,
>>> (items 1. and 2.), provide respective symlinks from heartbeat
>>> pointing to __cl_compat__ script from 5.
>>>
>>> Possibly recycle for clusterlabs-staging idea.
>>>
>>>
>>> Now, for the higher level tools (crm, pcs), they should avoid listing
>>> or suggesting agents that are symlinks to files matching wildcard
>>> "__*__", and perhaps even actively suggest the alternative if
>>> such an agent is to be used -- this could be reached by making the
>>> __cl_compat__ script from 5. handle one new action (to be reflected in the OCF
>>> revision as optional), say "new-alias" that would output what
>>> to use instead (based on file from 3. it works with anyway).
>>>
>>>>> I wouldn't even want to update ClusterLabs docs to use the new name
>>>>> until all major distros have the new resource-agents, which would
>>>>> probably be at least a couple of years (I'm looking at you, Debian).
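For reference, the name resolution described in steps 3 and 5 of the quoted action plan can be sketched in a few lines of Python. This is only one reading of that plan: the "old name; [new name]" line format comes from step 3, while the "#" comment handling, the function names, and passing the deprecated agents as a set are assumptions made for the example.

```python
import os


def parse_mapping(text):
    """Parse 'old name; [new name]' lines into a dict.

    An omitted new name means the agent kept its name. Blank lines and
    lines starting with '#' are skipped (the latter is an assumption).
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        old, _, new = line.partition(";")
        old = old.strip()
        mapping[old] = new.strip() or old
    return mapping


def resolve_target(invoked_as, mapping, deprecated_names):
    """Decide which agent a compatibility symlink should delegate to.

    Returns a path relative to resource.d, or None to fail early.
    """
    old = os.path.basename(invoked_as)
    if old in mapping:
        return "clusterlabs/" + mapping[old]
    if old in deprecated_names:
        return ".deprecated/" + old
    return None


if __name__ == "__main__":
    mapping = parse_mapping("IPaddr2; IP\nFilesystem;\n")
    for name in ("heartbeat/IPaddr2", "heartbeat/Filesystem", "heartbeat/Evmsd"):
        print(name, "->", resolve_target(name, mapping, {"Evmsd"}))
```

The sketch covers only the lookup; the once-per-agent warning and the actual delegation would sit on top of it.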