[ClusterLabs] FR: send failcount to OCF RA start/stop actions

Jehan-Guillaume de Rorthais jgdr at dalibo.com
Wed May 4 08:22:33 EDT 2016


On Wed, 4 May 2016 13:09:04 +0100,
Adam Spiers <aspiers at suse.com> wrote:

> Hi all,

Hello,

> As discussed with Ken and Andrew at the OpenStack summit last week, we
> would like Pacemaker to be extended to export the current failcount as
> an environment variable to OCF RA scripts when they are invoked with
> 'start' or 'stop' actions.  This would mean that if you have
> start-failure-is-fatal=false and migration-threshold=3 (say), then you
> would be able to implement a different behaviour for the third and
> final 'stop' of a service executed on a node, which is different to
> the previous 'stop' actions executed just prior to attempting a
> restart of the service.  (In the non-clone case, this would happen
> just before migrating the service to another node.)
> 
> One use case for this is to invoke "nova service-disable" if Pacemaker
> fails to restart the nova-compute service on an OpenStack compute
> node.
> 
> Is it feasible to squeeze this in before the 1.1.15 release?

Wouldn't it be possible to run the following command from the RA to get its
current failcount?

  crm_failcount --resource "$OCF_RESOURCE_INSTANCE" -G
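
To illustrate, here is a rough sketch of how an RA could wrap that command to
check whether the failcount has reached a given threshold (for instance the
migration-threshold=3 from your example). The helper names get_failcount and
failcount_reached are made up for this example, and the parsing assumes that
crm_failcount prints a line ending in something like "value=2"; I have not
checked the exact output format on every version.

```shell
# Sketch only, not a tested resource agent.
# get_failcount and failcount_reached are hypothetical helper names.

# Print the failcount of this resource on the local node.
# ASSUMPTION: crm_failcount prints a line containing value=N
# (optionally quoted), where N is a number or INFINITY.
get_failcount() {
    crm_failcount --resource "$OCF_RESOURCE_INSTANCE" -G 2>/dev/null |
        sed -n 's/.*value="\{0,1\}\([0-9A-Za-z]*\).*/\1/p'
}

# Return 0 when the failcount has reached the threshold given as $1.
failcount_reached() {
    threshold="$1"
    count="$(get_failcount)"
    [ "$count" = "INFINITY" ] && return 0
    case "$count" in
        ''|*[!0-9]*) return 1 ;;  # unknown or unparsable: not reached
    esac
    [ "$count" -ge "$threshold" ]
}
```

A stop action could then call "failcount_reached 3" and run its special
cleanup only when that returns success.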

Moreover, how would you know that the previous failures all came from the start
action? I suppose you would have to track the failcount internally yourself,
wouldn't you? Maybe you could track failures in some fashion using private
attributes (e.g. start_attempt and last_start_ts)?
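
As a rough idea of what I mean by private attributes, something along these
lines could work. It is only a sketch: the helper names are made up, the
attribute names are the two mentioned above, and I am assuming that
"attrd_updater -Q" prints the attribute as value="..." on its output line.

```shell
# Sketch only, not a tested resource agent.
# All function names here are hypothetical helpers.

# Print the value of a node attribute, or nothing if it is unset.
# ASSUMPTION: "attrd_updater -Q" prints a line containing value="...".
get_private_attr() {
    attrd_updater -n "$1" -Q 2>/dev/null |
        sed -n 's/.*value="\([^"]*\)".*/\1/p'
}

# Set a private attribute (-p: kept by attrd, never written to the CIB).
set_private_attr() {
    attrd_updater -n "$1" -U "$2" -p
}

# Call at the beginning of the start action: bump the attempt counter
# and remember when this attempt happened.
record_start_attempt() {
    attempts="$(get_private_attr start_attempt)"
    case "$attempts" in
        ''|*[!0-9]*) attempts=0 ;;  # unset or unparsable: start from 0
    esac
    set_private_attr start_attempt "$((attempts + 1))"
    set_private_attr last_start_ts "$(date +%s)"
}

# Call once the start action has succeeded.
reset_start_attempt() {
    set_private_attr start_attempt 0
}
```

The stop action could then read start_attempt to tell whether the failures
that led to it came from start. A real RA would probably add the resource
name to the attribute names to avoid clashes between resources.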

Regards,



