[ClusterLabs] FR: send failcount to OCF RA start/stop actions
Jehan-Guillaume de Rorthais
jgdr at dalibo.com
Wed May 4 12:22:33 UTC 2016
On Wed, 4 May 2016 13:09:04 +0100,
Adam Spiers <aspiers at suse.com> wrote:
> Hi all,
Hello,
> As discussed with Ken and Andrew at the OpenStack summit last week, we
> would like Pacemaker to be extended to export the current failcount as
> an environment variable to OCF RA scripts when they are invoked with
> 'start' or 'stop' actions. This would mean that if you have
> start-failure-is-fatal=false and migration-threshold=3 (say), then you
> would be able to implement a different behaviour for the third and
> final 'stop' of a service executed on a node, which is different to
> the previous 'stop' actions executed just prior to attempting a
> restart of the service. (In the non-clone case, this would happen
> just before migrating the service to another node.)
>
> One use case for this is to invoke "nova service-disable" if Pacemaker
> fails to restart the nova-compute service on an OpenStack compute
> node.
>
> Is it feasible to squeeze this in before the 1.1.15 release?
Wouldn't it be possible to run the following command from the RA to get its
current failcount?
crm_failcount --resource "$OCF_RESOURCE_INSTANCE" -G
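As a rough sketch of how an RA might use that command in its stop action: the helper names get_failcount and is_final_attempt are hypothetical, the threshold is passed in by the caller, and the parsing assumes the query prints a line containing "value=<N>", which may differ between Pacemaker versions.

```shell
# Sketch only: get_failcount and is_final_attempt are hypothetical helper names.
# Assumes "crm_failcount -G" prints a line containing "value=<N>".
get_failcount() {
    crm_failcount --resource "$OCF_RESOURCE_INSTANCE" -G 2>/dev/null \
        | sed -n 's/.*value=\([0-9][0-9]*\).*/\1/p'
}

# Return 0 (true) when the failcount has reached the threshold given as $1.
is_final_attempt() {
    threshold="$1"
    count="$(get_failcount)"
    # An empty count means the query failed or the value was not numeric
    # (for example INFINITY); treat that as "unknown", not as final.
    [ -n "$count" ] && [ "$count" -ge "$threshold" ]
}
```

The stop action could then call something like "is_final_attempt 3" and only run the extra cleanup (e.g. "nova service-disable") when it returns true.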
Moreover, how would you know whether the previous failures all came from the
start action? I suppose you would have to track the failcount internally
yourself, wouldn't you? Maybe you could track failures in some fashion using
private attributes (e.g. start_attempt and last_start_ts)?
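To illustrate the private-attribute idea, here is a minimal sketch. The helper names are hypothetical, and it assumes that attrd_updater on your Pacemaker version accepts -p for private attributes and that its -Q query output contains value="<N>".

```shell
# Sketch only: get_start_attempts and bump_start_attempt are hypothetical helpers.
# Assumes "attrd_updater -Q" prints a line containing value="<N>".
get_start_attempts() {
    attrd_updater -Q -n start_attempt 2>/dev/null \
        | sed -n 's/.*value="\([0-9][0-9]*\)".*/\1/p'
}

# Increment the private start_attempt counter and record when it happened.
bump_start_attempt() {
    current="$(get_start_attempts)"
    # Treat a missing or unreadable attribute as zero previous attempts
    next=$(( ${current:-0} + 1 ))
    attrd_updater -p -n start_attempt -U "$next"
    attrd_updater -p -n last_start_ts -U "$(date +%s)"
}
```

The start action would call bump_start_attempt on each attempt, and a successful start would need to reset the counter; with several resources on one node the attribute names would probably need the resource name appended.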
Regards,