[ClusterLabs] Antw: Doing reload right

Wed Jul 13 23:45:16 EDT 2016

On Fri, Jul 1, 2016 at 7:48 PM, Jan Pokorný <jpokorny at redhat.com> wrote:
> On 01/07/16 09:23 +0200, Ulrich Windl wrote:
>>>>> Ken Gaillot <kgaillot at redhat.com> schrieb am 30.06.2016 um 18:58 in Nachricht
>> <57754F9F.8070908 at redhat.com>:
>>> I've been meaning to address the implementation of "reload" in Pacemaker
>>> for a while now, and I think the next release will be a good time, as it
>>> seems to be coming up more frequently.
>>>
>>> In the current implementation, Pacemaker considers a resource parameter
>>> "reloadable" if the resource agent supports the "reload" action, and the
>>> agent's metadata marks the parameter with "unique=0". If (only) such
>>> parameters get changed in the resource's pacemaker configuration,
>>> pacemaker will call the agent's reload action rather than the
>>> stop-then-start it usually does for parameter changes.
>>>
>>> This is completely broken for two reasons:
>>
>> I agree ;-)
>>
>>>
>>> 1. It relies on "unique=0" to determine reloadability. "unique" was
>>> originally intended (and is widely used by existing resource agents) as
>>> a hint to UIs to indicate which parameters uniquely determine a resource
>>> instance. That is, two resource instances should never have the same
>>> value of a "unique" parameter. For this purpose, it makes perfect sense
>>> that (for example) the path to a binary command would have unique=0 --
>>> multiple resource instances could (and likely would) use the same
>>> binary. However, such a parameter could never be reloadable.
>>
>> I tought unique=0 were reloadable (unique=1 were not)...

Exactly

> I see a doubly-distorted picture here:
> - actually "unique=1" on a RA parameter (together with this RA supporting
>   "reload") currently leads to reload-on-change

Are you 100% sure about that?

> - also the provided example shows why reload for "unique=0" is wrong,
>   but as the opposite applies as of current state, it's not an argument
>   why something is broken
>
> See also:
> https://github.com/ClusterLabs/pacemaker/commit/2f5d44d4406e9a8fb5b380cb56ab8a70d7ad9c23

What about it?

>
>>> 2. Every known resource agent that implements a reload action does so
>>> incorrectly. Pacemaker uses reload for changes in the resource's
>>> *pacemaker* configuration, while all known RAs use reload for a
>>> service's native reload capability of its own configuration file. As an
>>> example, the ocf:heartbeat:named RA calls "rndc reload" for its reload
>>> action, which will have zero effect on any pacemaker-configured
>>> parameters -- and on top of that, the RA uses "unique=0" in its correct
>>> UI sense, and none of those parameters are actually reloadable.
>
> (per the last subclause, applicable also, after mentioned inversion, for
> "unique=1", such as a pid file path, which cannot be reloadable for
> apparent reason)
>
>> Maybe LSB confusion...
>
> That's not entirely fair vindication, as when you have to do some
> extra actions with parameters in LSB-aliased "start" action in the
> RA, you should do such reflections also for "reload".
>
>>> My proposed solution is:
>>>
>>> * Add a new "reloadable" attribute for resource agent metadata, to
>>> indicate reloadable parameters. Pacemaker would use this instead of
>>> "unique".
>>
>> No objections if you change the XML metadata version number this time ;-)
>
> Good point, but I guess everyone's a bit scared to open this Pandora
> box as there's so much technical debt connected to that (unifying FA/RA
> metadata if possible, adding new UI-oriented annotations, pacemaker's
> silent additions like "private" parameter).
> I'd imagine an established authority for OCF matters (and maintaing
> https://github.com/ClusterLabs/OCF-spec) and at least partly formalized
> process inspired by Python PEPs for coordinated development:
> https://www.python.org/dev/peps/pep-0001/
>
> </dreams>
>
>>> * Add a new "reload-options" RA action for the ability to reload
>>> Pacemaker-configured options. Pacemaker would call this instead if "reload".
>>
>> Why not "reload-parameters"?
>
> That came to my mind as well.  Or not wasting time/space on too many
> letters, just "reload-params", perhaps.

The changes for when a reload should take place make plenty of sense,
it is clearly an ongoing source of confusion.  However I'm not so sure
about this part.

Would it not be better to have a single reload operation that took
into account the new config and any changed parameters?  When would we
want to update from only one source of changes?

Splitting the functionality into two functions seems like it would
increase not decrease confusion.  It took even me some time to realise
what you had in mind.

Or is the intention that since most RA writers only think of config
file reload, we are protecting users from incomplete agents? I would
have though requiring the new 'reloadable' to be added to attributes
would be sufficient for this purpose  (call it
'this-attribute-is-reloadable' if you really want to hammer it home
:-).

>
> </bike-shedding>
>
>>> * Formalize that "reload" means reload the service's own configuration,
>>> legitimizing the most common existing RA implementations. (Pacemaker
>>> itself will not use this, but tools such as crm_resource might.)
>>
>> Maybe be precise what your "reload-options" is expected to do,
>> compared to the "reload" action.  I'm still a bit confused. Maybe a
>> working example...
>
> IIUIC, reload-options should first reflect the parameters as when
> "start" is invoked, then delegate the responsibility to something
> that triggers as native reload as possible (which was mentioned
> is commonly [and problematically] implemented directly in current
> "reload" actions of common RAs).
>
>>> * Review all ocf:pacemaker and ocf:heartbeat agents to make sure they
>>> use unique, reloadable, reload, and reload-options properly.
>>>
>>> The downside is that this breaks backward compatibility. Any RA that
>>> actually implements unique and reload so that reload works will lose
>>> reload capability until it is updated to the new style.
>>
>> Maybe there's a solution that is even simpler: Keep the action name,
>> but don't use "relaod" unless the RA indicated at least one
>> "reloadable" parameter. Naturally old RAs don't do it. And once an
>> author touches the RA to fix thing, he/she should do so (not just
>> adding "reloadable" to the metadata).
>
> IMHO, idea worth pursuing
>
>>> While we usually go to great lengths to preserve backward compatibility,
>>> I think it is OK to break it in this case, because most RAs that
>>> implement reload do so wrongly: some implement it as a service reload, a
>>> few advertise reload but don't actually implement it, and others map
>>> reload to start, which might theoretically work in some cases (I'm not
>>> familiar enough with iSCSILogicalUnit and iSCSITarget to be sure), but
>>> typically won't, as the previous service options are not reverted (for
>>> example, I think Route would incorrectly leave the old route in the old
>>> table).
>>>
>>> So, I think breaking backward compatibility is actually a good thing
>>> here, since the most reload can do with existing RAs is trigger bad
>>> behavior.
>>
>> See my comment above.
>>
>>>
>>> The opposing view would be that we shouldn't punish any RA writer who
>>> implemented this correctly. However, there's no solution that preserves
>>
>> The software could give a warning if the RA provides "reload", but
>> does not define any reloadble parameter. So to notify users and
>> developers that some change is needed.
>>
>>> backward compatibility with both UI usage of unique and reload usage of
>>> unique. Plus, the worst that would happen is that the RA would stop
>>> being reloadable -- not as bad as the current possibilities from
>>> mis-implemented reload.
>>
>> Agree on that.
>>
>>>
>>> My questions are:
>>>
>>> Does anyone know of an RA that uses reload correctly? Dummy doesn't
>>> count ;-)
>>
>> Revalidation by the RA authors (and adding "reloadable" attribute to
>> metadata parameters) ;-)
>>
>>>
>>> Does anyone object to the (backward-incompatible) solution proposed here?
>>
>> I commented inline.
>
> --
> Jan (Poki)
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>