[ClusterLabs] About the Pacemaker
kgaillot at redhat.com
Tue Oct 23 09:59:32 EDT 2018
On Tue, 2018-10-23 at 21:20 +0800, T. Ladd Omar wrote:
> For question one, I don't think start-failure-is-fatal is a good
> option for me. It leaves almost no interval between retries and can
> easily flood the log output in a short time.
> On Tue, Oct 23, 2018 at 9:06 PM, T. Ladd Omar <mwang911 at gmail.com> wrote:
> > Hi all, I am sending this message to get some answers to my
> > questions about Pacemaker.
> > 1. In order to clean up start-failed resources automatically, I added
> > the failure-timeout attribute to my resources. However, the usual way
> > to trigger the recovery is the cluster recheck, whose interval is
> > 15 minutes by default. I wonder how low I can set
> > cluster-recheck-interval. I need failed resources to recover
> > fairly quickly, with little impact from the more frequent
> > cluster recheck.
> > Or is there another way to automatically clean up start-failed
> > resources?
failure-timeout with a lower cluster-recheck-interval is fine. I don't
think there's ever been solid testing on what a lower bound for the
interval is. I've seen users set it as low as 1 minute, but that seems
low to me. My gut feeling is 5 minutes is a good trade-off. The simpler
your cluster is (# nodes / # resources / features used), the lower the
number could be.
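
As a rough illustration of the trade-off, here is a minimal Python sketch
of the arithmetic. It assumes, as described in the question, that an
expired failure is only acted on at the next cluster recheck, so the worst
case is the failure-timeout plus one full recheck interval. The function
name and the 60-second failure-timeout are hypothetical, chosen only for
the example.

```python
def worst_case_recovery_seconds(failure_timeout_s, recheck_interval_s):
    """Upper bound on the delay between a failure and the cluster acting on
    its expiry, assuming expiry is only noticed at the next recheck."""
    return failure_timeout_s + recheck_interval_s


# Hypothetical failure-timeout of 60 s, compared across the recheck
# intervals mentioned in this thread (15 min default, 5 min, 1 min).
for minutes in (15, 5, 1):
    delay = worst_case_recovery_seconds(60, minutes * 60)
    print(f"recheck every {minutes} min -> up to {delay} s before recovery")
```

Any other event that makes the cluster recalculate its state would shorten
the delay, so this is only an upper bound under the stated assumption.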
> > 2. Is Pacemaker suitable for a master/slave HA model? I had some
> > production problems when using Pacemaker. If only one resource
> > stops on a node, should I fail over everything on that node? If
> > not, transactions arriving on that node's ports may fail because
> > of the stopped resource. If yes, that seems like a big action for
> > just one resource failure.
Definitely, master/slave operation is one of the most commonly used
Pacemaker features. You have the flexibility of failing over any
combination of resources you want. Look into clone resources,
master/slave clones, colocation constraints, and the on-fail property.
Ken Gaillot <kgaillot at redhat.com>