[ClusterLabs] Antw: Re: large cluster with corosync

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed Jan 4 08:14:12 EST 2017


>>> Jan Friesse <jfriesse at redhat.com> wrote on 04.01.2017 at 13:52 in message
<586CEFF9.7070904 at redhat.com>:
[...]
> 
> No, they are not enforced. 16/32 are the officially supported numbers of nodes. 
> Basically, this is the number that was tested and is known to work reliably. 
> This doesn't mean corosync doesn't work with a bigger number of nodes, 
> even though I'm quite surprised that 64 nodes really work.
> 
> Variables you can try to tweak:
> - Definitely start by increasing totem.token (default is 1000, you can 
> try 10000).
> - If that doesn't help, try increasing totem.join (default is 50, 1000 may 
> work) and consider increasing totem.send_join (default is 0, 100 may be a 
> good idea).
> - As a last variable, increase totem.merge (default is 200, 2000 may 
> do the job).
> 
> And definitely let us know about the results. It's quite hard to test such 
> a large number of nodes, so some of the defaults may be sub-optimal. When 
> we know which of the variables are the culprits, we can change their defaults.
> 
> Regards,
>    Honza
[...]
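
The escalation order in the quoted advice can be written down as a small sketch. This is only an illustration: the helper name render_totem_section and the TUNING_STEPS list are hypothetical, the values are the ones suggested above (not tested recommendations), and the printed fragment is meant to be merged by hand into an existing totem section of corosync.conf, not used as a complete configuration.

```python
# Escalation steps taken from the quoted advice. All values are in
# milliseconds, as for the other totem timeouts in corosync.conf.
# Each step keeps the settings of the previous one and adds more.
TUNING_STEPS = [
    {"token": 10000},
    {"token": 10000, "join": 1000, "send_join": 100},
    {"token": 10000, "join": 1000, "send_join": 100, "merge": 2000},
]


def render_totem_section(settings):
    """Render a totem { ... } fragment (hypothetical helper, not a corosync tool)."""
    lines = ["totem {"]
    for key, value in settings.items():
        lines.append(f"    {key}: {value}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    for number, settings in enumerate(TUNING_STEPS, start=1):
        print(f"# step {number}")
        print(render_totem_section(settings))
```

Trying one step at a time, and only moving on when the previous one did not help, makes it easier to report back which variable actually made the difference.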

I have expressed this wish a few times before: given the poor documentation, can't somebody make a kind of spreadsheet where the user enters some parameters (e.g. network delay, number of nodes, number of cores, tolerable network delays, tolerable "dead time" for a node) and the suggested corosync/TOTEM parameters are then calculated, taking their inter-dependencies into account?

Or maybe the reverse direction: enter your configuration parameters, and the resulting values are output (supported number of nodes, maximum network delay, etc.).
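
One small piece of such a "reverse direction" calculation can already be sketched from corosync.conf(5): as far as I understand the man page, when a nodelist with at least three nodes is configured, the real token timeout is token + (number_of_nodes - 2) * token_coefficient, with token_coefficient defaulting to 650 ms. The function name below is hypothetical, and the sketch covers only this one inter-dependency; it says nothing about network delay, cores, or the other inputs wished for above.

```python
def effective_token_timeout(token_ms=1000, node_count=2, token_coefficient_ms=650):
    """Return the real token timeout in milliseconds.

    Assumption (from my reading of corosync.conf(5)): the coefficient is
    applied only when a nodelist with at least 3 nodes is configured.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if node_count < 3:
        # With fewer than 3 nodes the configured token is used unchanged.
        return token_ms
    return token_ms + (node_count - 2) * token_coefficient_ms


if __name__ == "__main__":
    for nodes in (2, 16, 32, 64):
        print(nodes, effective_token_timeout(node_count=nodes))
```

Under these assumptions a 64-node cluster with the default token would already run with an effective timeout of 41300 ms, which is worth keeping in mind before raising totem.token further.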

Ulrich





