[ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.
kgaillot at redhat.com
Wed Sep 21 15:45:26 EDT 2016
On 09/21/2016 02:23 AM, Kristoffer Grönlund wrote:
> First of all, is there a use case for when fence-after-3-failures is a
> useful behavior? I seem to recall some case where someone expected that
> to be the behavior and were surprised by how pacemaker works, but that
> problem wouldn't be helped by adding another option for them not to know

I've most often encountered this with ignore/block. Sometimes users have
one particular service that's buggy and not really important, so they
set it to ignore errors (or block). But they would like to try
restarting it a few times first.

I think fence-after-3-failures would make as much sense as
fence-immediately. The idea behind restarting a few times before taking
a more drastic action is that a restart handles the case where the
service crashed or is in a buggy state; if that doesn't work, maybe
something's wrong with the node.
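
To make the escalation idea concrete, here is a rough sketch in Python of
how such a policy might pick an action. This is not Pacemaker code: the
names RecoveryPolicy, max_failures and final_action are hypothetical, and
I'm assuming "fence-after-3-failures" means the final action is taken on
the third failure, with a plain restart for the failures before it.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPolicy:
    """Hypothetical policy: restart softly, then escalate.

    max_failures and final_action are illustrative names only;
    they are not existing Pacemaker options.
    """

    max_failures: int  # failure count at which the final action is taken
    final_action: str  # e.g. "ignore", "block" or "fence"

    def action_for(self, failure_count: int) -> str:
        """Return the action to take on the Nth failure (1-based)."""
        if failure_count < 1:
            raise ValueError("failure_count must be >= 1")
        if failure_count < self.max_failures:
            # Soft recovery: the service may just have crashed or hung.
            return "restart"
        # Restarting didn't help, so take the more drastic action.
        return self.final_action


policy = RecoveryPolicy(max_failures=3, final_action="fence")
for n in range(1, 5):
    print(n, policy.action_for(n))
```

Under that assumption, fence-immediately is just the special case
max_failures=1, and the same shape covers restart-then-ignore or
restart-then-block by changing final_action.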