[Pacemaker] (LRMD|PCMK)_MAX_CHILDREN?

Lars Marowsky-Bree lmb at suse.com
Wed Sep 11 07:33:17 EDT 2013


On 2013-09-11T19:55:38, Andrew Beekhof <andrew at beekhof.net> wrote:

> > sorry for being thick, but I can't find this in the code now. Did this
> > slip through again in April?
> Apparently. But before we add it, I'd like to see if we can do something coherent.
> Having 3 (or more) different variables (batch-limit, migration-limit and this) for controlling these things doesn't seem optimal or user friendly.

Well, they're all doing something completely different.

A cluster-wide limit on operations (batch-limit) limits the total
cluster and network/storage load.

max_children prevents a given node from being overloaded by
concurrent operations. (Reducing batch-limit to emulate this kills
cluster-wide parallelism and is not optimal.) Clearly, it's not perfect
either (since it assumes all rsc ops on a node are identical in
weight; whereas in reality we may want to limit VM start-up to 4, but
would happily see 32 IP addresses go up at once, or 48 monitors ...),
but it is an appropriate simplification.
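
To illustrate how the two limits differ, here is a minimal Python sketch
of an admission check. It is not Pacemaker code; the function name
admit() and the (op_id, node) tuples are made up for this example, and
it assumes one max_children value shared by all nodes.

```python
from collections import Counter


def admit(pending, running, batch_limit, max_children):
    """Return the pending ops that may start now (hypothetical helper).

    pending and running are lists of (op_id, node) tuples; pending is
    assumed to be in priority order.
    """
    per_node = Counter(node for _, node in running)
    total = len(running)
    started = []
    for op_id, node in pending:
        if total >= batch_limit:
            break  # cluster-wide cap reached, nothing more may start
        if per_node[node] >= max_children:
            continue  # this node is saturated, other nodes may still proceed
        per_node[node] += 1
        total += 1
        started.append((op_id, node))
    return started


pending = [("a1", "node-a"), ("a2", "node-a"), ("a3", "node-a"),
           ("b1", "node-b"), ("b2", "node-b")]

# Per-node cap of 2 with a generous cluster-wide cap: both nodes stay busy.
both_limits = admit(pending, [], batch_limit=4, max_children=2)

# Emulating the per-node cap by lowering batch-limit starves node-b.
batch_only = admit(pending, [], batch_limit=2, max_children=99)
```

With both limits, node-a is capped at two ops while node-b still gets
its two. With only a reduced batch-limit, node-a takes both slots and
node-b sits idle, which is the loss of parallelism described above.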

migration-limit is indeed a special case (needed to keep nodes from
being overloaded by migrations, which were at the time the only ops that
affect two nodes at once - batch-limit="4" was too coarse a hammer). I
do recall that we discussed making it more generic - so that one could
configure cluster-/node-wide limits for certain operations of specific
resource types, but that was (rightly) judged to be a rather complex can
of worms by you.

> If anything, we should likely be putting work into auto-tuning this
> stuff instead.  Somehow.

I'm not sure how batch-limit can be auto-tuned.
migration-limit is mostly a function of the network bandwidth, too.

MAX_CHILDREN did, sort of, auto-tune (by defaulting to number of cores,
or something similar, which was appropriate enough[1]).

It can all be made into a generic, powerful, flexible mechanism that
describes them all. But I'm afraid that it'd also be quite complex. I'm
happy to think about it, but the three limits we have/had seemed
sufficient for the real world.


Regards,
    Lars

[1] the main complaint was that it was configured via sysconfig, and not
dynamically via a node attribute as it should be. When we reintroduce
it, we may want to make nodes default to PCMK/LRMD_MAX_CHILDREN if the
attribute is unset in the CIB, and otherwise have the CIB value override
the environment variable?  That'd be a benefit now that pcmk and lrmd
are more closely married.
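
A rough Python sketch of that lookup order, purely as illustration. The
node attribute name "max-children" is hypothetical, the preference of
PCMK_MAX_CHILDREN over LRMD_MAX_CHILDREN is an assumption, and the final
fallback to the core count just mirrors the old default mentioned above.

```python
import os


def resolve_max_children(node_attrs, environ=None):
    """Pick the per-node limit (hypothetical helper, not Pacemaker API).

    node_attrs stands in for the node's attributes from the CIB.
    """
    env = os.environ if environ is None else environ
    candidates = [
        node_attrs.get("max-children"),  # hypothetical attribute name
        env.get("PCMK_MAX_CHILDREN"),    # assumed to win over LRMD_
        env.get("LRMD_MAX_CHILDREN"),
    ]
    for raw in candidates:
        try:
            value = int(raw)
        except (TypeError, ValueError):
            continue  # unset or not a number, try the next source
        if value > 0:
            return value
    # Nothing configured: fall back to the number of cores.
    return os.cpu_count() or 1
```

For example, resolve_max_children({"max-children": "8"},
{"PCMK_MAX_CHILDREN": "4"}) picks the CIB value, while an empty
attribute set falls through to the environment.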


-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde




