[ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

Kristoffer Grönlund kgronlund at suse.com
Thu Sep 22 06:42:46 UTC 2016


Ken Gaillot <kgaillot at redhat.com> writes:

> I'm not saying it's a bad idea, just that it's more complicated than it
> first sounds, so it's worth thinking through the implications.

Thinking about it and seeing how complicated it gets, maybe what you'd
really want, to make it clearer for the user, is the ability to
configure the recovery behavior explicitly, either globally or
per-resource. So instead of having to tweak a set of variables that
interact in complex ways, you'd configure an ordered list of recovery
steps, something like rule expressions:

<on_fail>
  <restart repeat="3" />
  <migrate timeout="60s" />
  <fence/>
</on_fail>

So: try to restart the service 3 times; if that fails, migrate the
service; if it still fails, fence the node.

(obviously the details and XML syntax are just an example)

This would then replace on-fail, migration-threshold, etc.
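
One way to read the example is as an ordered escalation ladder, where
the Nth failure of a resource selects the Nth rung. The Python sketch
below is purely illustrative and is not how Pacemaker works:
parse_policy and action_for_failure are made-up names, the <on_fail>
syntax is the hypothetical one from above, and it assumes that a step
without repeat="N" is tried once and that the last step keeps applying
once the ladder is exhausted. The timeout attribute is ignored here.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy copied from the example above; not real Pacemaker syntax.
POLICY_XML = """
<on_fail>
  <restart repeat="3" />
  <migrate timeout="60s" />
  <fence/>
</on_fail>
"""


def parse_policy(xml_text):
    """Return the escalation ladder as a list of (action, attempts) pairs."""
    root = ET.fromstring(xml_text.strip())
    # Assumption: a step without repeat="N" is attempted exactly once.
    return [(step.tag, int(step.get("repeat", "1"))) for step in root]


def action_for_failure(ladder, failure_count):
    """Pick the recovery action for the Nth failure (1-based)."""
    if not ladder:
        raise ValueError("policy has no recovery steps")
    if failure_count < 1:
        raise ValueError("failure_count must be >= 1")
    remaining = failure_count
    for action, attempts in ladder:
        if remaining <= attempts:
            return action
        remaining -= attempts
    # Assumption: once the ladder is exhausted, keep applying the last step.
    return ladder[-1][0]


ladder = parse_policy(POLICY_XML)
for n in range(1, 7):
    print(n, action_for_failure(ladder, n))
```

With the policy above, failures 1 to 3 map to restart, failure 4 to
migrate, and failures 5 and later to fence. The point is that the order
of escalation is readable straight from the configuration instead of
being implied by several interacting options.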

-- 
// Kristoffer Grönlund
// kgronlund at suse.com



