[ClusterLabs] Antw: About the Pacemaker

Wed Oct 24 07:46:57 UTC 2018

>>> "T. Ladd Omar" <mwang911 at gmail.com> wrote on 23.10.2018 at 15:06 in message
<CAKT+O1P+bvYYZzjCjE5B7qcSv8fezNgf1tj9sZoJB=FDLg_stg at mail.gmail.com>:
> Hi all, I am sending this message to get answers to my questions about
> Pacemaker.
> 1. To clean up start-failed resources automatically, I added the
> failure-timeout attribute to my resources. However, the usual way to trigger
> the recovery is the cluster recheck, whose interval is 15 min by default. I
> wonder how low I can set cluster-recheck-interval. I need
> failed resources to recover fairly quickly, with little impact
> from the more frequent cluster recheck.

I think if your agents fail periodically, and you need to do a periodic cleanup of failed actions, your configuration is not stable enough for production. Also, if you always clean up failed actions, resources may not move to a good node if one node has a problem.

> Or is there another way to clean up start-failed resources automatically?
> 2. Is Pacemaker suitable for the master-slave HA model? I had some
> production problems when using Pacemaker. If only one resource stops on
> one node, should I fail over that whole node? If not,
> transactions arriving on that node's ports may fail because of this failure. If
> yes, it seems a big action for just one resource failure.