[ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

Thu Sep 22 13:58:13 EDT 2016

Ken Gaillot <kgaillot at redhat.com> writes:
>
> "restart" is the only on-fail value that it makes sense to escalate.
>
> block/stop/fence/standby are final. Block means "don't touch the
> resource again", so there can't be any further response to failures.
> Stop/fence/standby move the resource off the local node, so failure
> handling is reset (there are 0 failures on the new node to begin with).

Hrm. If a restart can migrate the resource to a different node, is the
failcount reset then as well? If so, wouldn't that complicate the
hard-fail-threshold variable too? The resource could keep migrating
between nodes, and since the failcount is reset on each migration, it
would never reach the hard-fail-threshold. (Or am I missing something?)
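
To make the scenario concrete, here is a toy Python model of it. This is
only a sketch and not Pacemaker code: the function name, the single
counter, and the assumption that every restart lands on a different node
are all hypothetical, chosen just to illustrate the question.

```python
def failures_before_hard_fail(threshold, reset_on_move, max_failures=50):
    """Toy model: the resource fails every time it is started, and each
    "restart" places it on a different node (an assumption).

    Returns the number of failures seen when the threshold is reached,
    or None if it is never reached within max_failures.
    """
    count = 0  # failures counted toward the (hypothetical) threshold
    for total in range(1, max_failures + 1):
        count += 1  # the resource fails on its current node
        if count >= threshold:
            return total  # escalation would happen here
        # on-fail=restart: in this model the resource moves to another node
        if reset_on_move:
            count = 0  # failure handling starts from zero on the new node
    return None


# Count carried across moves: the threshold is reached after 3 failures.
print(failures_before_hard_fail(threshold=3, reset_on_move=False))
# Count reset on every move: the threshold is never reached.
print(failures_before_hard_fail(threshold=3, reset_on_move=True))
```

If the count really is reset on each move, the second call shows the
resource bouncing between nodes without ever escalating, for any
threshold greater than 1.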

-- 
// Kristoffer Grönlund
// kgronlund at suse.com