[ClusterLabs] Corosync permanently desyncs in face of packet loss

Jan Friesse jfriesse at redhat.com
Mon Jan 18 03:51:36 EST 2021


Mariusz,

> Hi,
> 
> We had a hardware problem causing asynchronous packet drop on one of
> our nodes. It put the cluster into an unrecoverable state (we had to
> restart corosync on both nodes), and the same thing happened again the
> next day. A log of the events is attached.
> 
> It did recover a few times, but when the failure happened it
> just spammed:
> 
> Jan 13 14:28:30 [20833] node-db2 corosync notice  [TOTEM ] A new membership (2:72076) was formed. Members
> Jan 13 14:28:30 [20833] node-db2 corosync warning [CPG   ] downlist left_list: 0 received
> Jan 13 14:28:30 [20833] node-db2 corosync notice  [QUORUM] Members[1]: 2
> Jan 13 14:28:30 [20833] node-db2 corosync notice  [MAIN  ] Completed service synchronization, ready to provide service.
> 
> I've also seen some of these:
> 
>   corosync warning [KNET  ] pmtud: possible MTU misconfiguration detected. kernel is reporting MTU: 1500 bytes for host 1 link 0 but the other node is not acknowledging packets of this size.
>   corosync warning [KNET  ] pmtud: This can be caused by this node interface MTU too big or a network device that does not support or has been misconfigured to manage MTU of this size, or packet loss. knet will continue to run but performances might be affected.
> 
> in the previous failure.
> 
> After the cause of the packet loss was fixed, it still did not recover without a restart.
> 
> In limited testing with the udpu transport the problem did not occur, but that test period was much shorter because we fixed the networking issue in the meantime.
> 
> We're using the stable version from Debian Buster (3.0.1).
> 
> Is that a known problem/bug?

There were quite a few bugs in libknet < 1.15, and some of them may 
explain the behavior you see. I would suggest trying backports (where 1.16 
seems to be available) or the Proxmox repositories (which also package a 
newer corosync).
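
To check quickly whether a node still runs a libknet older than 1.15, 
something along these lines may help. It is only a rough sketch: the 
package name "libknet1" and the helper names are my assumptions, and the 
parsing covers only simple Debian-style version strings (optional epoch, 
numeric upstream version, optional revision).

```python
import re
import subprocess

# Threshold from the advice above: libknet releases older than 1.15.
FIXED_IN = (1, 15)


def parse_upstream_version(version):
    """Return the numeric upstream part of a Debian-style version as a tuple."""
    # Drop an optional epoch ("1:") in front of the version.
    without_epoch = version.split(":", 1)[-1]
    # Drop the Debian revision, which follows the last hyphen.
    upstream = without_epoch.rsplit("-", 1)[0]
    match = re.match(r"\d+(?:\.\d+)*", upstream)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def predates_fixes(version, fixed_in=FIXED_IN):
    """True if the given version string is older than the threshold."""
    return parse_upstream_version(version) < fixed_in


def installed_version(package="libknet1"):
    """Ask dpkg for the installed version. The package name is an assumption."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Version}", package],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

On a Debian node, predates_fixes(installed_version()) should then report 
whether an upgrade from backports is worth trying. The version strings 
accepted by the parser are illustrative; unusual formats raise ValueError 
instead of being guessed at.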

Regards,
   Honza

> 
> 
> Cheers,
> Mariusz