[ClusterLabs Developers] Resurrecting OCF
Jan Pokorný
jpokorny at redhat.com
Thu Sep 22 09:28:33 UTC 2016
On 21/09/16 16:26 -0500, Ken Gaillot wrote:
> On 09/21/2016 10:55 AM, Jan Pokorný wrote:
>> On 21/09/16 14:50 +1000, Andrew Beekhof wrote:
>>> I like where this is going.
>>> Although I don’t think we want to get into the business of trying to
>>> script config changes from one agent to another, so I’d drop #4
>>
>> Not agent parameter changes, just its specification -- to reflect
>> formally what the proposed symlink-based delegation scheme does when
>> the old one is still in use. If the old and new are incompatible,
>> such automatic delegation is not possible anyway (that's one of
>> the reasons "description" would come handy).
>>
>> I see there's much bigger potential (parameter renames, ...) but for
>> that, each agent should be responsible on its own (somehow, subject
>> of further evolution).
>>
>> Also, supposing there are more consumers of RA, the suggestion to
>> run the script should be more generic ("when used from under
>> pacemaker, ...").
>>
>>> I would make .deprecated a nested directory so that if we want to
>>> retire (for example) a ClusterLabs agent in the future we can create
>>> .deprecate/clusterlabs/ and put the agent there. Rather than make
>>> this heartbeat specific.
>>
>> Good point; it would also prevent clashes when single directory should
>> serve all the providers.
>
> I don't understand the desire to treat "deprecated" agents any
> differently. It should be sufficient to just mention it in their help
> text / man page / meta-data / other documentation. Pacemaker isn't going
> to run a "deprecated" agent any differently.
And I don't understand how you came to the conclusion that anything
changes from the outside view, besides an occasional note about
deprecation being emitted to the logs.
It'd be an implementation detail self-contained in resource-agents.
> When users see ocf:whatever:whatever, they know where to look for the
> script. Why frustrate them by making them waste time figuring out how a
> "nonexistent" RA is being used and finding it?
A symlink makes a clear connection and, besides, I proposed the
"new-alias" action. I think you overestimate how often the agents are
physically investigated (I guess the project would have more committers
if that were the case).
> If the goal is to let users know that an agent is deprecated (which is
> the only reason that I can think of), then we can add an attribute in
> the meta-data, and UIs/pacemaker can report/log it if present.
>
> <resource-agent
> name="Evmsd"
> deprecated="No longer actively maintained"
> >
>
>>> I wonder if some of this should live in pacemaker itself though…
>>
>> This runs directly to the other side of the RA-pacemaker bias,
>> pacemaker caring about RA evolutionary internals :-)
>>
>> In the outlook, that would make any separated OCF standard efforts
>> worthless and we could just call it pacemaker resource standard
>> right away and forget about any sort of self-containment
>> (the proposed procedure aims to align with).
>>
>> I am not sure that would be the best thing.
>
> Agreed, anything we come up with should be explicit in the OCF standard.
> But I think this behavior could be specified in the standard.
As the standard provides guarantees for the outer interface, there's no
real need to externalize otherwise self-contained subtleties, in this
case beyond saying that symlinks to __formatted__ files should be
excluded from agent lists (which might be overridden on demand).
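
To illustrate that exclusion rule, here is a minimal sketch of how a
listing tool could skip such symlinks. The helper name list_agents is
made up for this example, nothing like it exists in resource-agents, and
hiding the __*__ wrapper file itself is my assumption on top of the rule
stated above. A real tool would probably also check that the file is
executable.

```sh
# Sketch only: list the agents found in one provider directory.
# list_agents is a hypothetical helper, not an existing interface.
list_agents() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip non-files, dangling symlinks
        name=${f##*/}
        # assumption: the wrapper itself (e.g. __cl_compat__) is hidden too
        case "$name" in __*__) continue ;; esac
        if [ -L "$f" ]; then
            target=$(readlink "$f")
            # symlink to a __formatted__ file: a compat alias, do not list
            case "${target##*/}" in __*__) continue ;; esac
        fi
        printf '%s\n' "$name"
    done
}
```

Called as list_agents /path/to/resource.d/heartbeat, it would print only
the agents that are real files or symlinks to ordinary agents.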
>>> If resources_action_create() cannot find ocf:${provider}:${agent} in
>>> its usual location, look up
>>> ${OCF_ROOT_DIR}/.compat/${provider}/__entries__
>>>
>>> Format for __entries__:
>>> # old, replacement
>>> # ${agent} , ${new_provider}:${new_agent} , ${description}
>>> IPaddr , clusterlabs:IP , Replaced with different semantics
>>> IPaddr2 , clusterlabs:IP , Moved
>>> drbd , linbit:drbd , Moved
>>> eDirectory , , Deleted
>>
>> Additional "what happened" field might work well in the update
>> suggestions.
>>
>>> Assuming an entry is found:
>>> - If .compat/${old_provider}/${old_agent} exists, notify the user
>>> “somehow”, then call it.
>>> - Otherwise, return OCF_ERR_NOT_INSTALLED and use ${description} and
>>> ${replacement} as the exit reason (which shows up in pcs status).
>>>
>>> Perhaps the “somehow” is creating PCMK_OCF_DEPRECATED (with the same
>>> semantics as PCMK_OCF_DEGRADED) and prepending ${description} to the
>>> output (assuming it's not a metadata op) and/or the exit reason[1].
>>> Maybe only on successful start operations to minimise the noise?
>>>
>>> [1] Shouldn’t be too hard with some extra fields for 'struct
>>> svc_action_private_s’ or svc_action_t
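
For reference, the lookup in such an __entries__ file is simple enough
to sketch in a few lines of shell plus awk. This is only an illustration
of the format quoted above, not how pacemaker would do it in C;
lookup_entry and its "replacement|description" output are made up for
the example, and a description containing a comma would get truncated.

```sh
# Sketch only: look up AGENT in an __entries__ file using the
# "agent , new_provider:new_agent , description" format quoted above.
# Prints "replacement|description"; returns 1 when there is no entry.
# lookup_entry is a hypothetical helper name.
lookup_entry() {
    awk -F',' -v agent="$2" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        /^[ \t]*#/ || NF < 3 { next }        # comments, malformed lines
        trim($1) == agent {
            print trim($2) "|" trim($3)      # empty replacement = deleted
            found = 1
            exit
        }
        END { exit !found }
    ' "$1"
}
```

An empty replacement field (the eDirectory case) comes out as a leading
"|", which the caller can treat as "deleted, no successor".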
>
> I like the idea of intelligent aliasing. I'm hoping we can do it without
> a separate directory structure and meta-meta-data files.
>
> What about continuing to use symlinks from the old name to the new name,
> and adding an aliases section to the agent meta-data?
>
> <resource-agent name="IP" version="1.5">
> <version>2.0</version>
> <aliases>
> <alias name="ocf:heartbeat:IPaddr"
> reason="Replaced with different semantics"/>
> <alias name="ocf:heartbeat:IPaddr2"
> reason="Superseded by clusterlabs provider"/>
> </aliases>
> </resource-agent>
>
> The file would be where people expect to find it, and the intent would
> be readable.
>
> If pacemaker loads an RA's metadata and finds the configured name in the
> aliases section, it could log a warning with the reason.
>
> The only drawback I see is that there is no inherent coordination
> between the symlinks and the aliases. But either would be fine without
> the other, so I don't see that as serious.
>
>>>
>>>> On 19 Aug 2016, at 6:59 PM, Jan Pokorný <jpokorny at redhat.com> wrote:
>>>>
>>>> On 18/08/16 17:27 +0200, Klaus Wenninger wrote:
>>>>> On 08/18/2016 05:16 PM, Ken Gaillot wrote:
>>>>>> On 08/18/2016 08:31 AM, Kristoffer Grönlund wrote:
>>>>>>> Jan Pokorný <jpokorny at redhat.com> writes:
>>>>>>>
>>>>>>>> Thinking about that, ClusterLabs may be considered a brand established
>>>>>>>> well enough for "clusterlabs" provider to work better than anything
>>>>>>>> general such as previously proposed "core". Also, it's not expected
>>>>>>>> there will be more RA-centered projects under this umbrella than
>>>>>>>> resource-agents (pacemaker deserves to be a provider on its own),
>>>>>>>> so it would be pretty unambiguous pointer.
>>>>>>> I like this suggestion as well.
>>>>>> Sounds good to me.
>>>>>>
>>>>>>>> And for new, not well-tested agents within resource-agents, there could
>>>>>>>> also be a provider schema akin to "clusterlabs-staging" introduced.
>>>>>>>>
>>>>>>>> 1 CZK
>>>>>>> ...and this too.
>>>>>> I'd rather not see this. If the RA gets promoted to "well-tested",
>>>>>> everyone's configuration has to change. And there's never a clear line
>>>>>> between "not well-tested" and "well-tested", so things wind up staying
>>>>>> in "beta" status long after they're widely used in production, which
>>>>>> unnecessarily makes people question their reliability.
>>>>>>
>>>>>> If an RA is considered experimental, say so in the documentation
>>>>>> (including the man page and help text), and give it an "0.x" version number.
>>>>>>
>>>>>>> Here is another one: While we are moving agents into a new namespace,
>>>>>>> perhaps it is time to clean up some of the legacy agents that are no
>>>>>>> longer recommended or of questionable quality? Off the top of my head,
>>>>>>> there are
>>>>>>>
>>>>>>> * heartbeat/Evmsd
>>>>>>> * heartbeat/EvmsSCC
>>>>>>> * heartbeat/LinuxSCSI
>>>>>>> * heartbeat/pingd
>>>>>>> * heartbeat/IPaddr
>>>>>>> * heartbeat/ManageRAID
>>>>>>> * heartbeat/vmware
>>>>>>>
>>>>>>> A pet peeve of mine would also be to move heartbeat/IPaddr2 to
>>>>>>> clusterlabs/IP, to finally get rid of that weird 2 in the name...
>>>>>> +1!!! (or is it -2?)
>>>>>>
>>>>>>> Cheers,
>>>>>>> Kristoffer
>>>>>> Obviously, we need to keep the ocf:heartbeat provider around for
>>>>>> backward compatibility, for the extensive existing uses both in cluster
>>>>>> configurations and in the zillions of how-to's scattered around the web.
>>>>>>
>>>>>> Also, despite the recommendation of creating your own provider, many
>>>>>> people drop custom RAs in the heartbeat directory.
>>>>>>
>>>>>> The simplest approach would be to just symlink heartbeat to clusterlabs,
>>>>>> but I think that's a bad idea. If a custom RA deployment or some package
>>>>>> other than resource-agents puts an RA there, resource-agents will try to
>>>>>> make it a symlink and the other package will try to make it a directory.
>>>>>> Plus, people may have configuration management systems and/or file
>>>>>> integrity systems that need it to be a directory.
>>>>>>
>>>>>> So, I'd recommend we keep the heartbeat directory, and keep the old RAs
>>>>>> you list above in it, move the rest of the RAs to the new clusterlabs
>>>>>> directory, and symlink each one back to the heartbeat directory. At the
>>>>>> same time, we can announce the heartbeat provider as deprecated, and
>>>>>> after a very long time (when it's difficult to find references to it via
>>>>>> google), we can drop it.
>>>>>
>>>>> Maybe a way to go for the staging-RAs as well:
>>>>> Have them in clusterlabs-staging and symlinked (during install
>>>>> or package-generation) into clusterlabs ... while they are
>>>>> cleanly separated in the source-tree.
>>>>
>>>> So, having some more thoughts on this, here's the possible action
>>>> plan (just for heartbeat -> clusterlabs transition + deprecating
>>>> some agents, but clusterlabs-staging -> clusterlabs would be similar):
>>>>
>>>> # (adapt and) move original heartbeat agents
>>>>
>>>> 1. have a resource.d subdirectory "clusterlabs" and move (possibly under
>>>> new names) agents that were a priori updated to reflect new revision
>>>> of OCF there
>>>>
>>>> 2. have a resource.d subdirectory ".deprecated" (for instance) and
>>>> move the RAs that are going to be sunset over there (i.e.,
>>>> original heartbeat agents = agents moved to clusterlabs + agents
>>>> moved to .deprecated + agents that remained under heartbeat, pending
>>>> to be moved under clusterlabs)
>>>>
>>>> # preparation for backward compatibility
>>>>
>>>> 3. have a file with old heartbeat name -> new clusterlabs name mapping
>>>> for the agents from 1., i.e., those that physically changed directory;
>>>> the format can be as simple as CSV with "old name; [new name]" lines
>>>> where omitted new name means that actual name hasn't changed
>>>> (unlike proposed IPaddr2 -> IP)
>>>>
>>>> 4. have an XSL template that will convert resource references per the
>>>> translation file from 3. (this XSLT should be automatically
>>>> generated based on that file) and a script that will call
>>>> something like:
>>>> cibadmin -Q | xsltproc <XSLT> - | cibadmin --replace --xml-pipe
>
> XSL to replace some agent names? I think sed is enough for that :)
>
> Perhaps we could expand any RA aliases as part of "cibadmin --upgrade"
> (or have a separate option for it). That would trigger resource
> restarts, though, unless we added intelligence to allow that in pacemaker.
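
If one wanted to go the sed way, the expressions could be generated from
the mapping file of item 3. A rough sketch follows; gen_sed is a made-up
name, and the sketch assumes that provider="heartbeat" is immediately
followed by type="..." in the CIB text, which XML does not guarantee
(one reason a structure-aware tool is safer). Agent names are also
assumed to contain no regex metacharacters.

```sh
# Sketch only: turn "old name; [new name]" lines into sed expressions
# that rewrite heartbeat references to clusterlabs ones.
# gen_sed is a hypothetical helper; see the caveats above.
gen_sed() {
    while IFS=';' read -r old new; do
        old=$(printf '%s' "$old" | tr -d ' \t')
        new=$(printf '%s' "$new" | tr -d ' \t')
        case "$old" in ''|'#'*) continue ;; esac   # blank lines, comments
        if [ -z "$new" ]; then
            new=$old                     # omitted new name: name unchanged
        fi
        printf 's/provider="heartbeat" type="%s"/provider="clusterlabs" type="%s"/g\n' \
            "$old" "$new"
    done < "$1"
}
```

The output could be saved to a file and used with sed -f in the middle
of the pipeline from item 4 in place of xsltproc.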
>
>>>> 5. have a shell script "__cl_compat__" (for instance, name clearly
>>>> distinguishable will become handy later on), that will:
>>>> - figure which symlink it was called under ("$0") and figure out
>>>> how it should behave based on file from 3.:
>>>> . $0 found as old name with new name -> clusterlabs/<new name>
>>>> will be called
>>>> . $0 found as old name without new name -> clusterlabs/<old name>
>>>> will be called
>>>> . $0 not found as old name -> .deprecated/<old name> will be
>>>> called if exists (otherwise fail early)
>>>> - if "$HA_RSCTMP/$(basename $0)_compat" exists, just run:
>>>> $0 "$@"; exit $?
>>>> the purpose here is to avoid excessive spamming in the logs
>>>> - touch "$HA_RSCTMP/$(basename $0)_compat"
>>>> - emit a warning "Your configuration refers to the agent with
>>>> an obsolete specification", followed with corresponding:
>>>> . "please consider changing ocf:heartbeat:<old name> to
>>>> ocf:clusterlabs:<new name>, you may use <script from 4.>
>>>> to ease such transition"
>>>> . "please consider changing ocf:heartbeat:<old name> to
>>>> ocf:clusterlabs:<old name>, you may use <script from 4.>
>>>> to ease such transition"
>>>> . "please consider finding another alternative for
>>>> ocf:heartbeat:<old name> as this agent is not actively
>>>> maintained and will be dropped in the next major release;
>>>> alternatively, if you volunteer to maintain it,
>>>> please reach developers at clusterlabs.org mailing list"
>>>>
>>>> # plugging it all together
>>>>
>>>> 6. for agents moved from heartbeat in any of clusterlabs/.deprecated,
>>>> (items 1. and 2.), provide respective symlinks from heartbeat
>>>> pointing to __cl_compat__ script from 5.
>>>>
>>>> Possibly recycle for clusterlabs-staging idea.
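
The core of the wrapper from item 5 could look roughly like the sketch
below. It only covers resolving the target and the warn-once marker; the
function names, the argument order and the /tmp fallback for HA_RSCTMP
are assumptions made for the example, not an agreed interface.

```sh
# Sketch only: the two building blocks of a __cl_compat__-like wrapper.
# resolve_target NAME MAPFILE RESOURCE_D prints the agent to run:
#   NAME mapped to a new name -> clusterlabs/<new name>
#   NAME mapped, no new name  -> clusterlabs/<old name>
#   NAME not in the mapping   -> .deprecated/<old name> if it exists
# and returns 1 otherwise (so the caller can fail early).
resolve_target() {
    name=$1; map=$2; root=$3
    while IFS=';' read -r old new; do
        old=$(printf '%s' "$old" | tr -d ' \t')
        if [ "$old" = "$name" ]; then
            new=$(printf '%s' "$new" | tr -d ' \t')
            printf '%s\n' "$root/clusterlabs/${new:-$name}"
            return 0
        fi
    done < "$map"
    if [ -e "$root/.deprecated/$name" ]; then
        printf '%s\n' "$root/.deprecated/$name"
        return 0
    fi
    return 1
}

# warn_once NAME MESSAGE: print MESSAGE to stderr only on the first call,
# using the "$HA_RSCTMP/<name>_compat" marker to avoid spamming the logs.
# The /tmp fallback is an assumption for running outside a cluster.
warn_once() {
    marker="${HA_RSCTMP:-/tmp}/${1}_compat"
    if [ -e "$marker" ]; then
        return 0
    fi
    : > "$marker"
    printf 'WARNING: %s\n' "$2" >&2
}
```

The wrapper itself would then call resolve_target with
"$(basename "$0")", emit the appropriate message through warn_once, and
exec the resolved agent with "$@".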
>>>>
>>>>
>>>> Now, for the higher level tools (crm, pcs), they should avoid listing
>>>> or suggesting agents that are symlinks to files matching wildcard
>>>> "__*__", and perhaps even actively suggest the alternative if
>>>> such one is to be used -- this could be reached by making __cl_compat__
>>>> script from 5. handle one new action (to be reflected in the OCF
>>>> revision as optional), say "new-alias" that would output what
>>>> to use instead (based on file from 3. it works with anyway).
>>>>
>>>>>> I wouldn't even want to update ClusterLabs docs to use the new name
>>>>>> until all major distros have the new resource-agents, which would
>>>>>> probably be at least a couple of years (I'm looking at you, Debian).
--
Jan (Poki)