[ClusterLabs] Questions about the infamous TOTEM retransmit list

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Tue Jan 12 03:23:31 EST 2021


Hi!

Before setting up our first Pacemaker cluster, we thought a low-speed redundant network would be a good addition to the normal high-speed network.
However, as it seems now (SLES15 SP2), there is NO reasonable RRP mode to drive such a configuration with corosync.

Passive RRP mode with UDPU still sends each packet through both nets, so traffic is throttled by the slower network.
(Originally we were using multicast, but that was even worse.)

I have now noticed that, even under modest load, I see messages about "Retransmit List", like this:
Jan 08 10:57:56 h16 corosync[3562]:   [TOTEM ] Retransmit List: 3e2
Jan 08 10:57:56 h16 corosync[3562]:   [TOTEM ] Retransmit List: 3e2 3e4
Jan 08 11:13:21 h16 corosync[3562]:   [TOTEM ] Retransmit List: 60e 610 612 614
Jan 08 11:13:21 h16 corosync[3562]:   [TOTEM ] Retransmit List: 610 614
Jan 08 11:13:21 h16 corosync[3562]:   [TOTEM ] Retransmit List: 614
Jan 08 11:13:41 h16 corosync[3562]:   [TOTEM ] Retransmit List: 6ed
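
To gauge how often individual messages are being re-requested, lines like those above can be tallied with a short script. This is only a sketch: the function name `retransmit_counts` is made up for illustration, and it assumes that the values after "Retransmit List:" are hexadecimal message sequence numbers, as their format in the excerpt suggests.

```python
import re
from collections import Counter

# Matches "[TOTEM ] Retransmit List: <ids>" as it appears in the log excerpt.
# Assumption: the ids are hexadecimal sequence numbers separated by spaces.
RETRANSMIT_RE = re.compile(
    r"\[TOTEM\s*\]\s+Retransmit List:((?:\s+[0-9a-fA-F]+)+)\s*$"
)


def retransmit_counts(lines):
    """Count how many times each sequence number shows up in retransmit lists."""
    counts = Counter()
    for line in lines:
        match = RETRANSMIT_RE.search(line)
        if match:
            for seq in match.group(1).split():
                counts[int(seq, 16)] += 1
    return counts


# The log lines quoted in this message.
sample = [
    "Jan 08 10:57:56 h16 corosync[3562]:   [TOTEM ] Retransmit List: 3e2",
    "Jan 08 10:57:56 h16 corosync[3562]:   [TOTEM ] Retransmit List: 3e2 3e4",
    "Jan 08 11:13:21 h16 corosync[3562]:   [TOTEM ] Retransmit List: 60e 610 612 614",
    "Jan 08 11:13:21 h16 corosync[3562]:   [TOTEM ] Retransmit List: 610 614",
    "Jan 08 11:13:21 h16 corosync[3562]:   [TOTEM ] Retransmit List: 614",
    "Jan 08 11:13:41 h16 corosync[3562]:   [TOTEM ] Retransmit List: 6ed",
]

counts = retransmit_counts(sample)
for seq, n in sorted(counts.items()):
    print(f"{seq:x}: listed {n} time(s)")
```

A sequence number that stays on the list across several consecutive lines (614 in the excerpt) was still missing after earlier retransmit requests, which would fit a slow or lossy link rather than a one-off drop.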

Questions on that:
Will the situation be much better with knet?
Is there a smooth migration path from UDPU to knet?

Regards,
Ulrich