[ClusterLabs] corosync/pacemaker on ~100 nodes cluster

Ken Gaillot kgaillot at redhat.com
Tue Aug 23 13:00:44 EDT 2016


On 08/23/2016 11:46 AM, Klaus Wenninger wrote:
> On 08/23/2016 06:26 PM, Radoslaw Garbacz wrote:
>> Hi,
>>
>> I would like to ask for settings (and hardware requirements) to have
>> corosync/pacemaker running on a cluster of about 100 nodes.
> Actually I had thought that 16 would be the limit for full
> pacemaker-cluster-nodes.
> For larger deployments pacemaker-remote should be the way to go. Were
> you speaking of a cluster with remote-nodes?
> 
> Regards,
> Klaus
>>
>> For now some nodes freeze completely (high CPU, high network usage),
>> so that even login is not possible. By manipulating
>> corosync/pacemaker/kernel parameters I managed to run it on a ~40-node
>> cluster, but I am not sure which parameters are critical, how to make
>> it more responsive, or how to increase the number of nodes further.

16 is a practical limit without special hardware and tuning, so that's
often the maximum that companies offering cluster support will accept.

I know people have gone well beyond 16 with a lot of optimization, but I
think that somewhere between 32 and 64 nodes, corosync can't keep up with
the messages. Your 40 nodes sounds about right. I'd be curious to hear
what you had to do (with hardware, OS tuning, and corosync tuning) to get
that far.

As Klaus mentioned, Pacemaker Remote is the preferred way to go beyond
that currently:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Remote/index.html
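
To make the idea concrete, here is a rough Python sketch of how the ~100
hosts could be split into a small set of full corosync/pacemaker nodes plus
Pacemaker Remote nodes. It is only an illustration under assumptions: the
host names are made up, the cutoff of 16 is just the practical limit
mentioned above (not a hard cap), and the generated pcs command follows the
"ocf:pacemaker:remote" resource form from the Pacemaker Remote document,
whose exact syntax may differ between pcs versions. The script only prints
the commands; it does not run them.

```python
from typing import List, Tuple

# Practical limit discussed in this thread; an assumption, not a hard cap.
MAX_FULL_NODES = 16


def split_nodes(hosts: List[str],
                max_full: int = MAX_FULL_NODES) -> Tuple[List[str], List[str]]:
    """Return (full corosync members, Pacemaker Remote nodes)."""
    if max_full < 1:
        raise ValueError("need at least one full cluster node")
    return hosts[:max_full], hosts[max_full:]


def remote_commands(remote_hosts: List[str]) -> List[str]:
    """Build one pcs command string per remote node (not executed here)."""
    return [
        f"pcs resource create {host} ocf:pacemaker:remote server={host}"
        for host in remote_hosts
    ]


if __name__ == "__main__":
    # Hypothetical host names node001 .. node100.
    hosts = [f"node{i:03d}" for i in range(1, 101)]
    full, remote = split_nodes(hosts)
    print(f"{len(full)} full nodes, {len(remote)} remote nodes")
    for cmd in remote_commands(remote):
        print(cmd)
```

The remote nodes run only pacemaker_remote and do not take part in corosync
membership, so the corosync message load stays tied to the number of full
nodes rather than to the total node count.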

>> Thanks,
>>
>> -- 
>> Best Regards,
>>
>> Radoslaw Garbacz
>> XtremeData Incorporation



