[ClusterLabs] What is the mechanism for pacemaker to recover resources

lkxjtu lkxjtu at 163.com
Thu May 10 14:02:47 UTC 2018


Great! These two parameters (batch-limit & node-action-limit) solve my problem. Thank you very much!

By the way, is there any way to find out how many actions are running in parallel on each node and across the cluster?




At 2018-05-10 20:56:27, "lkxjtu" <lkxjtu at 163.com> wrote:

On Tue, 2018-05-08 at 23:52 +0800, lkxjtu wrote:
> I have a three-node cluster with about 50 resources. When I reboot
> all three nodes at the same time and watch the resources with
> "crm status", I find that pacemaker starts 3-5 resources at a time,
> from top to bottom, rather than starting them all at once. Is there
> any parameter that controls this?
> That seems acceptable. But if one resource cannot start because of
> an exception, recovery of the later resources becomes very slow. I
> don't understand the principle by which pacemaker recovers
> resources, in particular the order and priority. Do you have any
> suggestions? Thank you very much!
There are a few things affecting start-up order.

First (obviously) is your constraints. If you have any ordering
constraints, they will enforce the configured order.

Second is internal constraints. Pacemaker has certain built-in
constraints for safety. This includes obvious logical requirements such
as starting a resource before promoting it. Pacemaker will do a probe
(one-time monitor) of each resource on each node to find its initial
state; everything is ordered after those probes. A clone won't be
promoted until all pending starts complete.

Last is throttling. By default Pacemaker computes a maximum number of
jobs that can be executed at once across the entire cluster, and for
each node. The number is based on observed CPU load on the nodes (and
thus depends partly on the number of CPU cores).

Usually it is best to allow Pacemaker to calculate the throttling, but
you can force particular values by setting:

- node-action-limit: a cluster-wide property specifying the maximum
  number of actions that can be executed at once on any one node.

- PCMK_node_action_limit: an environment variable specifying the same
  thing but can be configured differently per node.

- batch-limit: a cluster-wide property specifying the maximum number of
  actions that can be executed at once across the entire cluster.

The purpose of throttling is to keep Pacemaker from overloading the
nodes such that actions might start timing out, causing unnecessary
recovery.
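
To make the effect of the two limits concrete, here is a toy model in Python. It is only a sketch and not Pacemaker's actual scheduler: it ignores ordering constraints, probes, and the load-based calculation described above, and it assumes every action takes one "wave" to finish. The function name simulate_waves and the resource/node names are hypothetical. It shows how a cluster-wide cap (batch-limit) and a per-node cap (node-action-limit) together split 50 pending starts into several rounds.

```python
from collections import Counter


def simulate_waves(actions, batch_limit, node_action_limit):
    """Split pending (resource, node) actions into successive waves.

    Toy model only: each wave holds at most batch_limit actions in total
    and at most node_action_limit actions per node. Real Pacemaker also
    applies ordering constraints and load-based throttling.
    """
    if batch_limit < 1 or node_action_limit < 1:
        raise ValueError("both limits must be at least 1 in this model")

    pending = list(actions)
    waves = []
    while pending:
        wave, deferred = [], []
        per_node = Counter()
        for resource, node in pending:
            # Admit the action only if both caps still have room.
            if len(wave) < batch_limit and per_node[node] < node_action_limit:
                wave.append((resource, node))
                per_node[node] += 1
            else:
                deferred.append((resource, node))
        waves.append(wave)
        pending = deferred
    return waves


# Hypothetical layout: 50 resources spread round-robin over 3 nodes.
nodes = ["node1", "node2", "node3"]
actions = [(f"rsc{i}", nodes[i % 3]) for i in range(50)]

waves = simulate_waves(actions, batch_limit=6, node_action_limit=2)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {len(wave)} actions")
```

In this model, raising both limits to 50 or more puts all starts into a single wave, which is the behaviour the limits exist to prevent on nodes that cannot handle that much work at once.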

