[ClusterLabs] Buffer overflow (re-transmission list)

Jan Friesse jfriesse at redhat.com
Mon Jan 2 03:35:59 EST 2017

> Hello,
> I have a four-node cluster. Each node is connected to a centralized switch.
> The MTU size is the default, 1500. On each node, a program continuously
> tries to multi-cast as many messages as possible. With the default settings
> (corosync.conf), buffer overflow does *not* occur as long as the program
> runs on at most three nodes. However, as soon as the fourth node starts
> multi-casting, overflow occurs and significantly reduces performance.
> Why is there a buffer overflow with just four nodes?
> Is the hardware topology, a centralized switch, not correct?

A centralized switch is OK.

> Later, I reduced window_size and max_messages to 20 and 5 respectively.
> There is no overflow, but I am not sure whether performance is as expected.
> I would like to better understand these two parameters. Let's say in a
> cluster of four nodes, I have window_size 50 and max_messages also set to
> 50. Does that mean only one node will be able to multi-cast in a single
> token rotation? If one node was not able to multi-cast because window_size

Not exactly. The token contains a "backlog" field which is updated by all
members. This backlog is then used to find out whether there is a member
that wasn't able to send its messages (whose backlog is too big). So let's
say one node sent a lot of messages in one round and a second node sent no
messages. In the next round, the first node finds out this information and
the number of messages it is allowed to send is lowered (or in the extreme
case goes to zero), so the second node can send some messages.
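
To make the idea concrete, here is a rough Python sketch of backlog-based
throttling. It is a simplified model and not the actual corosync code: the
function names (transmits_allowed, rotate_once) are made up for illustration,
and it assumes each node's share of the window is proportional to its share of
the total backlog, with the total taken once at the start of the rotation.

```python
def transmits_allowed(window_size, max_messages, sent_this_rotation,
                      my_backlog, total_backlog):
    """Hypothetical helper: messages one node may send on this token visit."""
    # Never exceed the per-visit cap (max_messages) or whatever is left of
    # the rotation-wide window (window_size).
    allowed = min(max_messages, max(window_size - sent_this_rotation, 0))
    # Assumption: the window is shared in proportion to the backlogs, so a
    # node with a small backlog yields to nodes with a large one.
    if total_backlog > 0:
        share = window_size * my_backlog // total_backlog
        allowed = min(allowed, share)
    # A node cannot send more than it has queued.
    return min(allowed, my_backlog)


def rotate_once(backlogs, window_size, max_messages):
    """Pass the token once around the ring; return messages sent per node."""
    total_backlog = sum(backlogs)  # simplification: fixed for the rotation
    sent_this_rotation = 0
    sent = []
    for my_backlog in backlogs:
        n = transmits_allowed(window_size, max_messages, sent_this_rotation,
                              my_backlog, total_backlog)
        sent.append(n)
        sent_this_rotation += n
    return sent


if __name__ == "__main__":
    # window_size = max_messages = 50, as in the question.
    print(rotate_once([10, 90], 50, 50))  # [5, 45]
    print(rotate_once([1, 99], 50, 50))   # [0, 49]
```

In this model the first node does not consume the whole window even though
max_messages would allow it: with backlogs of 10 and 90 it sends only 5
messages and leaves 45 for the second node, and with backlogs of 1 and 99 it
sends nothing in that rotation.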


> is reached, when and how will this node get the opportunity to send messages?
> Thanks,
> Satish
