[ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

Wed Sep 21 09:23:21 CEST 2016

Ken Gaillot <kgaillot at redhat.com> writes:

> Hi everybody,
>
> Currently, Pacemaker's on-fail property allows you to configure how the
> cluster reacts to operation failures. The default "restart" means try to
> restart on the same node, optionally moving to another node once
> migration-threshold is reached. Other possibilities are "ignore",
> "block", "stop", "fence", and "standby".
>
> Occasionally, we get requests to have something like migration-threshold
> for values besides restart. For example, try restarting the resource on
> the same node 3 times, then fence.
>
> I'd like to get your feedback on two alternative approaches we're
> considering.
>
> ###
>
> Our first proposed approach would add a new hard-fail-threshold
> operation property. If specified, the cluster would first try restarting
> the resource on the same node, before doing the on-fail handling.
>
> For example, you could configure a promote operation with
> hard-fail-threshold=3 and on-fail=fence, to fence the node after 3 failures.
>
> One point that's not settled is whether failures of *any* operation
> would count toward the 3 failures (which is how migration-threshold
> works now), or only failures of the specified operation.
>
> Currently, if a start fails (but is retried successfully), then a
> promote fails (but is retried successfully), then a monitor fails, the
> resource will move to another node if migration-threshold=3. We could
> keep that behavior with hard-fail-threshold, or only count monitor
> failures toward monitor's hard-fail-threshold. Each alternative has
> advantages and disadvantages.
>
> ###
>
> The second proposed approach would add a new on-restart-fail resource
> property.
>
> Same as now, on-fail set to anything but restart would be done
> immediately after the first failure. A new value, "ban", would
> immediately move the resource to another node. (on-fail=ban would behave
> like on-fail=restart with migration-threshold=1.)
>
> When on-fail=restart, and restarting on the same node doesn't work, the
> cluster would do the on-restart-fail handling. on-restart-fail would
> allow the same values as on-fail (minus "restart"), and would default to
> "ban".
>
> So, if you want to fence immediately after any promote failure, you
> would still configure on-fail=fence; if you want to try restarting a few
> times first, you would configure on-fail=restart and on-restart-fail=fence.
>
> This approach keeps the current threshold behavior -- failures of any
> operation count toward the threshold. We'd rename migration-threshold to
> something like hard-fail-threshold, since it would apply to more than
> just migration, but unlike the first approach, it would stay a resource
> property.
>
> ###
>
> Comparing the two approaches, the first is more flexible, but also more
> complex and potentially confusing.
>
> With either approach, we would deprecate the start-failure-is-fatal
> cluster property. start-failure-is-fatal=true would be equivalent to
> hard-fail-threshold=1 with the first approach, and on-fail=ban with the
> second approach. This would be both simpler and more useful -- it allows
> the value to be set differently per resource.

Apologies for quoting the entire mail, but I had a hard time picking out
which parts were most relevant when replying.

First of all, is there a use case where fence-after-3-failures is a
useful behavior? I seem to recall a case where someone expected that
to be the behavior and was surprised by how pacemaker works, but that
problem wouldn't be helped by adding another option for them not to know
about.

My second comment would be that to me, the first option sounds less
complex, but then I don't know the internals of pacemaker that
well. Having a special case on-fail for restarts seems inelegant,
somehow.
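
As a rough illustration of how the two proposals could be read, the
sketch below models each one as a function from the current failure
count to the action the cluster would take. The function and parameter
names are hypothetical and this is not Pacemaker code. It assumes that
"restarting on the same node doesn't work" means the failure count has
reached the threshold, and that None stands for "no threshold set".

```python
def approach_one(fail_count, on_fail="restart", hard_fail_threshold=None):
    """First proposal: hard-fail-threshold as an operation property.

    If a threshold is set, restart on the same node until the failure
    count reaches it, then apply the on-fail handling.
    """
    if hard_fail_threshold is not None and fail_count < hard_fail_threshold:
        return "restart"
    return on_fail


def approach_two(fail_count, on_fail="restart", on_restart_fail="ban",
                 threshold=None):
    """Second proposal: on-restart-fail as a resource property.

    Anything other than on-fail=restart happens on the first failure.
    With on-fail=restart, on-restart-fail (default "ban") applies once
    the resource-level threshold is reached.
    """
    if on_fail != "restart":
        return on_fail
    if threshold is not None and fail_count >= threshold:
        return on_restart_fail
    return "restart"
```

Under these assumptions, approach_one(3, on_fail="fence",
hard_fail_threshold=3) and approach_two(3, on_fail="restart",
on_restart_fail="fence", threshold=3) both yield "fence", while either
call with a failure count of 2 yields "restart".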

If implementing the first option, I would prefer to keep the behavior of
migration-threshold of counting all failures, not just
monitors. Otherwise there would be two closely related thresholds with
subtly divergent behavior, which seems confusing indeed.
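
To make the difference between the two counting rules concrete, here is
a minimal sketch using the start/promote/monitor example from the
quoted mail. The helper name and the count_all flag are hypothetical,
chosen only for illustration.

```python
from collections import Counter


def threshold_reached(failed_ops, op, threshold, count_all=True):
    """Return True if the failures so far reach the threshold for `op`.

    failed_ops: names of the operations that have failed, in order.
    count_all=True counts failures of any operation (how
    migration-threshold works now); count_all=False counts only
    failures of `op` itself.
    """
    counts = Counter(failed_ops)
    relevant = sum(counts.values()) if count_all else counts[op]
    return relevant >= threshold


# A start fails, then a promote fails, then a monitor fails.
history = ["start", "promote", "monitor"]
reached_counting_all = threshold_reached(history, "monitor", 3, count_all=True)
reached_per_operation = threshold_reached(history, "monitor", 3, count_all=False)
```

With a threshold of 3, counting all failures reaches the threshold on
the monitor failure, while counting only monitor failures does not,
since just one monitor has failed.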

Cheers,
Kristoffer

> -- 
> Ken Gaillot <kgaillot at redhat.com>

-- 
// Kristoffer Grönlund
// kgronlund at suse.com