[Pacemaker] Large cluster

Fri Jan 6 03:43:30 EST 2012

Hi,

On Thu, Jan 5, 2012 at 6:43 PM, Graantik <graantik at gmail.com> wrote:
> Hi all,
>
> I have a task that I think can logically be implemented using a
> pacemaker/corosync cluster with many nodes (e.g. 15) and maybe a thousand or
> more resources. Most of the resources are parametrized processes controlled
> by a custom resource agent. The resources are added and removed dynamically,
> typically many (e.g. 100) at one time.
>
> My first tests in a VM environment show that - even after some tuning of
> lrmd max-children and custom-batch-limit, optimizing the RA and having the
> processes idle - adding so many resources in one step (xml based) appears to
> bring the cluster to its knees, i.e. nodes become unresponsive, DC and other
> nodes have very high load, and the operation takes an hour or longer.
>
> Does this mean that the design limit of this software/hardware is reached or
> are there ways like tuning or best practices to make such a scenario work?

In terms of performance testing on large clusters there is an article
that may be interesting to read
http://theclusterguy.clusterlabs.org/post/1241986422/large-cluster-performance

The article covers 10000 resources, which is well above your use
case, so you can compare the timings you have measured with the ones
presented there and go from there.

Bear in mind that when dealing with so many resources and nodes it
might help to tweak certain things, such as the maximum message size
for corosync (the article mentions using 256k). The corosync token
timeout might have to be increased, as high load on the systems may
delay replies in network traffic. Also, having to sync the CIB onto
~15 nodes as you mentioned means that you _should_ use multicast;
the switches must support IGMP snooping and have it enabled and
properly configured. The entire cluster should be in a separate VLAN,
or have some form of dedicated network, to ensure both throughput and
low latency and to prevent interference from other network traffic.
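
As a rough sketch of where the token timeout and multicast settings
live, assuming corosync 1.x and the usual /etc/corosync/corosync.conf
layout, a totem section could look something like the fragment below.
All addresses and numbers are placeholders that I have not tested on
a cluster of this size, so treat them as assumptions to adapt. The
message size setting is left out on purpose; see the article for how
it was changed there.

```
# Sketch only: values are illustrative assumptions, not recommendations.
totem {
    version: 2

    # Token timeout in milliseconds, raised so that a heavily loaded
    # node is not declared lost too early (10000 is a placeholder).
    token: 10000

    interface {
        ringnumber: 0
        # Network address of the dedicated cluster VLAN (hypothetical).
        bindnetaddr: 192.168.100.0
        # Multicast group and port (hypothetical); the switches need
        # IGMP snooping working for this group.
        mcastaddr: 239.255.1.1
        mcastport: 5405
    }
}
```

The same totem settings have to be present on every node, and
corosync needs a restart to pick them up.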

>
> Are there known implementations of comparable size?

In terms of nodes, the largest clusters I know of have ~10-12 nodes.
In terms of resources, none that I know of.

HTH,
Dan

>
> Thanks
> Gerhard

-- 
Dan Frincu
CCNA, RHCE