[ClusterLabs] Antw: Coming in 1.1.16: versioned resource parameters

Ken Gaillot kgaillot at redhat.com
Thu Aug 11 14:12:23 UTC 2016


On 08/11/2016 03:35 AM, Klaus Wenninger wrote:
> On 08/11/2016 09:13 AM, Ulrich Windl wrote:
>>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 10.08.2016 at 22:36 in message
>> <804dd911-56a6-328c-00a4-43133f59d39f at redhat.com>:
>>> Have you ever changed a resource agent in a backward-incompatible way,
>>> and found yourself wishing you could do a rolling upgrade?
>> Hi!
>>
>> It seems you are fighting the consequence, not the cause: Why do such a thing? Why not make an intermediate RA that complains about the old parameters being obsolete (BTW: Does the XML metadata have an "obsolete" attribute for parameters?) while also supporting the new parameters? Then you would first update your RAs, then the configuration. Everything will continue to work. Then you can make the next generation of RAs that drop support for the obsolete parameters.
>>
>> I'm afraid adding more and more features to pacemaker will make it bloatware, instead of being small, efficient and reliable.
> 
> There are probably a lot of reasons why the feature could be considered
> a good idea.
> Two that come to my mind are:
> 
> - everything that allows us to reduce complexity in RAs is usually a
>   good idea. Having a generic feature that is used in a lot of use-cases
>   with a lot of RAs, you will get this feature tested well. Everything
>   you implement in an RA will in the worst case just be tested by you
>   if it is something custom that is not of generic use. In any case it
>   won't be tested as well as a pacemaker-feature.
>   My personal experience is that most of the time when something is not
>   behaving as expected it is rather a shortcoming of some RA than a
>   pacemaker problem - not saying pacemaker is perfect and doesn't have
>   problems - don't get me wrong here.
> 
> - when it is about updating RAs you are not maintaining by yourself you
>   usually don't want to touch them. If you want them to already have the
>   checking built in then this will result in needing some kind of
>   synchronization of the update-cycles of the different RA-sources among
>   each other and with your installation.
>>
>> Regards,
>> Ulrich

Bloat is definitely an issue to consider when adding features. I try to
weigh how many users might be interested, how isolated the new code can
be from other code, whether the feature has any performance impact when
not configured, what alternative approaches are available, and how well
it fits with pacemaker's existing design.

In this case, the main thing that reassured me was that the code is
reasonably well isolated and should have no significant effect when the
feature is not used in the configuration, and that it fits very well
with the existing rules capability.

Klaus' comments about the limitations of handling it in the RA are a
reasonable argument for handling it within pacemaker.

Certainly, I agree the best approach is to maintain backward
compatibility in RAs, but that's not always under the control of the
cluster administrator.

>>> With a new feature in the current master branch, which will be part of
>>> the next Pacemaker release, you will be able to specify different
>>> resource parameters to be used with different versions of a resource agent.
>>>
>>> Pacemaker already supports using rules to select resource parameters.
>>> You can, for example, use different parameters on different nodes, or at
>>> different times of the day:
>>>
>>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_using_rules_to_control_resource_options
>>>
>>> The new feature allows the special node attribute #ra-version in these
>>> rules (comparable to the built-in #uname for node name). When a resource
>>> has a rule with #ra-version, Pacemaker will evaluate the rule against
>>> the installed version of the resource agent, as defined by
>>> <resource-agent ... version=... > in the agent's metadata.
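
For illustration, here is a minimal sketch of agent metadata showing where
that version comes from. The agent name "myRA" and the two parameter names
are borrowed from the example configuration in this message; the version
value "1.1", the descriptions, and the timeouts are assumptions made up for
this sketch, not taken from any real agent.

```xml
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<!-- The version attribute on <resource-agent> is the agent's own version,
     which is what a #ra-version expression is compared against.
     "1.1" is a hypothetical value chosen for illustration. -->
<resource-agent name="myRA" version="1.1">
  <!-- By convention this element is the version of the OCF RA API,
       not the version of the agent itself. -->
  <version>1.0</version>
  <longdesc lang="en">Hypothetical agent used only for this example.</longdesc>
  <shortdesc lang="en">Example agent</shortdesc>
  <parameters>
    <parameter name="widget">
      <longdesc lang="en">Parameter understood by all versions.</longdesc>
      <shortdesc lang="en">Widget setting</shortdesc>
      <content type="integer" default="1"/>
    </parameter>
    <parameter name="super-new-param">
      <longdesc lang="en">Parameter that only newer versions accept.</longdesc>
      <shortdesc lang="en">New setting</shortdesc>
      <content type="integer" default="10"/>
    </parameter>
  </parameters>
  <actions>
    <action name="start" timeout="20"/>
    <action name="stop" timeout="20"/>
    <action name="monitor" timeout="20" interval="10"/>
    <action name="meta-data" timeout="5"/>
  </actions>
</resource-agent>
```

With metadata like this, a rule testing #ra-version "gt" 1.0 would match on
the node where this agent is installed.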
>>>
>>> For example, the following XML configuration creates a resource "A", and
>>> passes the options "widget=1 really-old-param=5" if the resource agent
>>> version is 1.0 or older, and the options "widget=1 super-new-param=10"
>>> if the version is newer:
>>>
>>> <primitive id="A" class="ocf" provider="me" type="myRA">
>>>
>>>     <instance_attributes id="id01" score="3">
>>>         <rule id="id02" score="INFINITY" >
>>>              <expression id="id03"
>>>                  type="version" attribute="#ra-version"
>>>                  operation="gt" value="1.0"/>
>>>         </rule>
>>>         <nvpair id="id04" name="super-new-param" value="10"/>
>>>     </instance_attributes>
>>>
>>>     <instance_attributes id="id05" score="2">
>>>         <rule id="id06" score="INFINITY" >
>>>              <expression id="id07"
>>>                  type="version" attribute="#ra-version"
>>>                  operation="lte" value="1.0"/>
>>>         </rule>
>>>         <nvpair id="id08" name="really-old-param" value="5"/>
>>>     </instance_attributes>
>>>
>>>     <instance_attributes id="id09" score="1">
>>>         <nvpair id="id10" name="widget" value="1"/>
>>>     </instance_attributes>
>>>
>>> </primitive>
>>>
>>> Of course, higher-level tools may provide a more convenient interface.
>>>
>>> This allows for a rolling upgrade of a resource agent that changed
>>> parameters. Some nodes can have the older version, and others can have
>>> the newer version, and the correct parameters will be used wherever the
>>> resource is placed.
>>>
>>> Some considerations before using:
>>>
>>> * All nodes must be upgraded to a Pacemaker version supporting this
>>> feature before it can be used.
>>>
>>> * The version is re-checked whenever the resource is started. A stop
>>> action is always executed with the same parameters as the previous
>>> start. Therefore, it is still not recommended to upgrade a resource
>>> agent while the resource is active on that node -- each node should be
>>> put into standby when it is upgraded (or if only the resource agent is
>>> being upgraded, ensure the resource is not running on the node).
>>>
>>> * The version check requires an extra metadata call when starting the
>>> resource.
>>>
>>> * Live (hot) migration is disabled when versioned parameters are in use
>>> (otherwise, half the migration could be performed with one set of
>>> parameters and the other half with another set).
>>>
>>> The impact of the last two points can be minimized by using versioned
>>> parameters only while upgrades are being done, and using normal
>>> (unversioned) parameters otherwise.
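
As a sketch of that last point: once every node runs an agent newer than
1.0, the example resource "A" could presumably be reduced to a single
unversioned block along these lines. The ids are reused from the example
where possible; "id11" is a made-up id for this sketch.

```xml
<primitive id="A" class="ocf" provider="me" type="myRA">

    <!-- No rule and no #ra-version expression: the parameters are
         ordinary (unversioned), so the extra metadata call and the
         live migration restriction described above should not apply. -->
    <instance_attributes id="id09">
        <nvpair id="id10" name="widget" value="1"/>
        <nvpair id="id11" name="super-new-param" value="10"/>
    </instance_attributes>

</primitive>
```

The versioned blocks would only need to be put back for the duration of the
next incompatible agent upgrade.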
>>>
>>> Special thanks to Igor Tsiglyar and Mikhail Ksenofontov, who created
>>> this feature as part of a student project with EMC under the supervision
>>> of Victoria Cherkalova.



