[ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

Thu May 19 15:29:47 EDT 2016

On Thu, 19 May 2016 13:15:20 -0500,
Ken Gaillot <kgaillot at redhat.com> wrote:

> On 05/19/2016 11:43 AM, Jehan-Guillaume de Rorthais wrote:
>> On Thu, 19 May 2016 10:53:31 -0500,
>> Ken Gaillot <kgaillot at redhat.com> wrote:
>> 
>>> A recent thread discussed a proposed new feature, a new environment
>>> variable that would be passed to resource agents, indicating whether a
>>> stop action was part of a recovery.
>>>
>>> Since that thread was long and covered a lot of topics, I'm starting a
>>> new one to focus on the core issue remaining:
>>>
>>> The original idea was to pass the number of restarts remaining before
>>> the cluster will no longer try to start the resource on the same node.
>>> This involves calculating (migration-threshold - fail-count), and that
>>> implies certain limitations: (1) it will only be set when the cluster
>>> checks migration-threshold; (2) it will only be set for the failed
>>> resource itself, not for other resources that may be recovered due to
>>> dependencies on it.
>>>
>>> Ulrich Windl proposed an alternative: setting a boolean value instead. I
>>> forgot to cc the list on my reply, so I'll summarize now: We would set a
>>> new variable like OCF_RESKEY_CRM_recovery=true whenever a start is
>>> scheduled after a stop on the same node in the same transition. This
>>> would avoid the corner cases of the previous approach; instead of being
>>> tied to migration-threshold, it would be set whenever a recovery was
>>> being attempted, for any reason. And with this approach, it should be
>>> easier to set the variable for all actions on the resource
>>> (demote/stop/start/promote), rather than just the stop.
>> 
>> I can see the value of having such a variable during various actions.
>> However, we can also deduce that the transition is a recovery during the
>> notify actions with the notify variables (the only information we lack is
>> the order of the actions). A more flexible approach would be to make sure
>> the notify variables are always available during the whole transition for
>> **all** actions, not just notify. It seems like this is already the case,
>> but a recent discussion emphasized that this is just a side effect of the
>> current implementation. My understanding is that they were sometimes
>> available outside of notifications "by accident".
> 
> It does seem that a recovery could be implied from the
> notify_{start,stop}_uname variables, but notify variables are only set
> for clones that support the notify action. I think the goal here is to
> work with any resource type. Even for clones, if they don't otherwise
> need notifications, they'd have to add the overhead of notify calls on
> all instances, that would do nothing.

Right, notify variables are presently only available for clones. What I was
suggesting is that the notify variables always be available, whether the
resource is a clone, a master/slave (ms) resource, or a standard one.

And I didn't mean that the notify *action* should be enabled all the time for
all resources. The notify switch for clones/ms could still default to false,
so that the notify action itself is not called during transitions.

>> Also, I can see the benefit of having the number of remaining attempts for
>> the current action before hitting the migration-threshold. I might
>> misunderstand something here, but it seems to me these are two different
>> pieces of information.
> 
> I think the use cases that have been mentioned would all be happy with
> just the boolean. Does anyone need the actual count, or just whether
> this is a stop-start vs a full stop?

I was thinking of a use case where a graceful demote or stop action has failed
multiple times, and the RA is given a chance to choose another method to stop
the resource before a migration is required. For instance, PostgreSQL has 3
different kinds of stop, the last one not being graceful, but still better
than a kill -9.
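
To make that use case concrete, here is a minimal shell sketch of how an RA
might escalate through PostgreSQL's three shutdown modes (smart, fast and
immediate, as accepted by "pg_ctl stop -m"). The variable carrying the number
of remaining attempts is purely hypothetical: nothing like it exists in
Pacemaker today, and the thresholds and the function name are arbitrary
choices for illustration.

```shell
# Map a number of remaining stop attempts to a pg_ctl shutdown mode.
# ASSUMPTION: the caller gets the count from a hypothetical variable such as
# OCF_RESKEY_CRM_meta_attempts_left; Pacemaker does not provide it.
pick_stop_mode() {
    attempts_left="${1:-}"
    case "$attempts_left" in
        ''|*[!0-9]*) printf '%s\n' fast ;;      # unknown count: middle ground
        0|1)         printf '%s\n' immediate ;; # last chance: not graceful
        2)           printf '%s\n' fast ;;      # disconnect clients, clean stop
        *)           printf '%s\n' smart ;;     # plenty of attempts: be gentle
    esac
}

# Possible use inside a stop action (PGDATA is assumed to be set by the RA):
#   pg_ctl stop -D "$PGDATA" -m "$(pick_stop_mode "$OCF_RESKEY_CRM_meta_attempts_left")"
```

With only a boolean "recovery" flag, the RA could not make this kind of
graduated choice, which is why the count and the flag look like two different
features to me.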

> The problem with the migration-threshold approach is that there are
> recoveries that will be missed because they don't involve
> migration-threshold. If the count is really needed, the
> migration-threshold approach is necessary, but if recovery is the really
> interesting information, then a boolean would be more accurate.

I think I misunderstood the original use cases you are trying to address. It
seems to me we are talking about different features.

>> Basically, what we need is a better understanding of the transition itself
>> from the RA actions.
>> 
>> If you are still brainstorming on this, as an RA dev, what I would
>> suggest is:
>> 
>>   * provide and enforce the notify variables in all actions
>>   * add the order of the actions during the current transition to these
>>     variables, using e.g. OCF_RESKEY_CRM_meta_notify_*_actionid
> 
> The action ID would be different for each node being acted on, so it
> would be more complicated (maybe *_actions="NODE1:ID1,NODE2:ID2,..."?).

Following the principle adopted for the other variables, each ID would apply to
the corresponding resource and node in OCF_RESKEY_CRM_meta_notify_*_uname and
OCF_RESKEY_CRM_meta_notify_*_resource.
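
As a rough illustration of that positional convention, the sketch below walks
three space-separated lists in lockstep and prints one "resource node id" line
per entry. The resource and node lists correspond to the existing notify
variables (Pacemaker documents them as notify_*_resource and notify_*_uname);
the ID list stands for the *_actionid variable proposed above, which does not
exist. The function name is made up for this example.

```shell
# zip_notify_lists "rsc list" "node list" "id list"
# Prints "resource node id" for each position. The body runs in a subshell
# (parentheses instead of braces) so its variables do not leak into the RA.
zip_notify_lists() (
    rscs="$1"; nodes="$2"; ids="$3"
    # Normalise whitespace so every list is separated by single spaces.
    set -- $nodes; nodes="$*"
    set -- $ids;   ids="$*"
    for rsc in $rscs; do
        # Print the head of each companion list, then drop it.
        printf '%s %s %s\n' "$rsc" "${nodes%% *}" "${ids%% *}"
        case "$nodes" in *' '*) nodes="${nodes#* }" ;; *) nodes="" ;; esac
        case "$ids"   in *' '*) ids="${ids#* }"     ;; *) ids=""   ;; esac
    done
)

# Possible use (the third variable is HYPOTHETICAL):
#   zip_notify_lists "$OCF_RESKEY_CRM_meta_notify_stop_resource" \
#                    "$OCF_RESKEY_CRM_meta_notify_stop_uname" \
#                    "$OCF_RESKEY_CRM_meta_notify_stop_actionid"
```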

> Also, RA writers would need to be aware that some actions may be
> initiated in parallel. Probably more complex than it's worth.

Oh, right. But I was actually thinking about the hypothetical transition
where a resource is started and then stopped. Without this ordering
information, it might look like a recovery, and these actions cannot be
performed in parallel.
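
Here is a small sketch of that ambiguity, assuming the notify variables were
available in every action (today they are only set for clones with notify
enabled). It reports a "recovery" when the same resource appears on the same
node in both the stop and the start lists, and it cannot tell a stop followed
by a start from a start followed by a stop. The function names are
illustrative only.

```shell
# has_pair RSC NODE "rsc list" "node list"
# Succeeds if RSC sits at the same position as NODE in the two lists.
has_pair() (
    want_rsc="$1"; want_node="$2"; rscs="$3"; nodes="$4"
    set -- $nodes                      # positional parameters = node list
    for rsc in $rscs; do
        if [ "$rsc" = "$want_rsc" ] && [ "${1:-}" = "$want_node" ]; then
            exit 0
        fi
        if [ $# -gt 0 ]; then shift; fi
    done
    exit 1
)

# looks_like_recovery RSC NODE
# True when RSC is both stopped and started on NODE in this transition.
# The order of the two actions is NOT known from these variables.
looks_like_recovery() {
    has_pair "$1" "$2" \
        "${OCF_RESKEY_CRM_meta_notify_stop_resource:-}" \
        "${OCF_RESKEY_CRM_meta_notify_stop_uname:-}" &&
    has_pair "$1" "$2" \
        "${OCF_RESKEY_CRM_meta_notify_start_resource:-}" \
        "${OCF_RESKEY_CRM_meta_notify_start_uname:-}"
}
```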

Anyway, maybe another approach would be to expose the simplified transition
itself (for instance, like the information pengine already provides in the log
files) in a completely new, easy-to-parse way.

Regards,