[ClusterLabs] How to stop resource after N failures?

Fri Oct 27 15:32:28 EDT 2017

On Fri, 2017-10-27 at 13:52 -0500, David Kent wrote:
> Hello,
> 
> Is there any way to configure a pacemaker cluster so that a resource
> will restart N times and then stop?
> 
> This is essentially a mix of migration-threshold and on-failure=stop.

Not yet. It's a planned future enhancement. It's relatively high on the
list, but it's a long list.

> Let me describe my setup and what I've tried.
> 
> My cluster is a set of independent apps sharing a VIP (all apps are
> co-located with the VIP). If an app fails, I want to try a restart
> since the issue might have been ephemeral. For safety, I don't want
> to keep restarting indefinitely; three attempts is a reasonable max.
> I also don't want one app to cause everything (VIP and all apps) to
> move. This would interrupt all established connections. It makes more
> sense to leave the one app down and let the working apps stay put.
> 
> If I use on-failure=stop, Pacemaker won't attempt a restart. If I use
> migration-threshold, one app failing causes all apps and the VIP to
> move. I've read through all the documentation, man pages, and message
> boards I can find, but I don't see a solution. Any ideas? Is this
> setup possible in Pacemaker or do I need to pick between
> on-failure=stop and migration-threshold=3?
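
To make the difference between the three behaviors concrete, here is a small
Python model of what happens after each successive failure of one app on a
node. It is only an illustrative sketch, not Pacemaker code or configuration.
The names Policy, action_after_failure, and timeline are hypothetical, and
"restart-then-stop" stands for the requested behavior, which Pacemaker does not
offer as of this thread. The stop policy corresponds to the operation option
that Pacemaker spells on-fail=stop (written on-failure=stop in the question).
The model assumes a limit of 3 and that a resource becomes ineligible on a
node once its fail count reaches migration-threshold.

```python
from enum import Enum


class Policy(Enum):
    ON_FAIL_STOP = "on-fail=stop"
    MIGRATION_THRESHOLD = "migration-threshold"
    RESTART_THEN_STOP = "restart-then-stop"  # hypothetical: the behavior asked for


def action_after_failure(policy, fail_count, limit=3):
    """Modeled action once the app has failed fail_count times on its node."""
    if policy is Policy.ON_FAIL_STOP:
        # The first failure already stops the app; no restart is attempted.
        return "stop"
    if policy is Policy.MIGRATION_THRESHOLD:
        # Restart in place until the fail count reaches the threshold, then
        # move; in the setup described, the VIP and the other apps move too.
        return "restart" if fail_count < limit else "move"
    if policy is Policy.RESTART_THEN_STOP:
        # Assumed semantics: allow `limit` restarts, then leave the app down.
        return "restart" if fail_count <= limit else "stop"
    raise ValueError(f"unknown policy: {policy!r}")


def timeline(policy, failures=5, limit=3):
    """Actions for failures 1..failures, ending at the first non-restart."""
    actions = []
    for n in range(1, failures + 1):
        actions.append(action_after_failure(policy, n, limit))
        if actions[-1] != "restart":
            break
    return actions


if __name__ == "__main__":
    for policy in Policy:
        print(f"{policy.value:22} {timeline(policy)}")
```

Under these assumptions, only the hypothetical third policy both retries the
app and leaves the VIP and the other apps where they are.
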
-- 
Ken Gaillot <kgaillot at redhat.com>