[ClusterLabs] corosync 3.0.1 on Debian/Buster reports some MTU errors

wferi at niif.hu wferi at niif.hu
Fri Nov 22 06:10:52 EST 2019


Jean-Francois Malouin <Jean-Francois.Malouin at bic.mni.mcgill.ca> writes:

> * christine caulfield <ccaulfie at redhat.com> [20191121 03:19]:
>
>> On 18/11/2019 21:31, Jean-Francois Malouin wrote:
>>
>>> However the system log on the nodes reports those much more frequently, a few
>>> times a day:
>>> 
>>> Nov 17 23:26:20 node1 corosync[2258]:   [KNET  ] link: host: 2 link: 1 is down
>>> Nov 17 23:26:20 node1 corosync[2258]:   [KNET  ] host: host: 2 (passive) best link: 0 (pri: 0)
>>> Nov 17 23:26:26 node1 corosync[2258]:   [KNET  ] rx: host: 2 link: 1 is up
>>> Nov 17 23:26:26 node1 corosync[2258]:   [KNET  ] host: host: 2 (passive) best link: 1 (pri: 1)
>> 
>> Those don't look good. Having a link down for 6 seconds looks like a
>> serious network outage that needs looking into, especially if they
>> are that frequent, or it could be a bug. You don't say which version
>> of libknet you have installed but make sure it's the latest one.
>
> libknet1 is 1.8-2, the latest one in the Debian buster distribution.

If no other solution emerges, try installing libknet1_1.13-1 from
bullseye (all of its dependencies are satisfied in buster).  There are
important fixes in that version, but I can't tell whether those are
relevant in your case.  If this proves successful, though, that will
provide me with some ammunition for pushing for a stable update.
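
To compare behaviour before and after such an upgrade, it may help to
measure how often and for how long links go down.  Below is a rough
sketch, not anything shipped with corosync or kronosnet: the function
name link_outages and its year parameter are my own invention, and the
regular expression assumes the syslog line format quoted above (which
carries no year, hence the parameter).  It also assumes an English
locale for the month names and ignores outages that span New Year.

```python
import re
from datetime import datetime

# Matches the two KNET line shapes quoted in this thread, e.g.
#   Nov 17 23:26:20 node1 corosync[2258]:   [KNET  ] link: host: 2 link: 1 is down
#   Nov 17 23:26:26 node1 corosync[2258]:   [KNET  ] rx: host: 2 link: 1 is up
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+corosync\[\d+\]:\s+"
    r"\[KNET\s*\]\s+(?:link|rx): host: (?P<host>\d+) "
    r"link: (?P<link>\d+) is (?P<state>up|down)"
)


def link_outages(lines, year=2019):
    """Return (host, link, down_time, seconds_down) for each down/up pair.

    `year` is an assumption supplied by the caller, because classic
    syslog timestamps do not include one.
    """
    down_since = {}  # (host, link) -> time the link was reported down
    outages = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a link up/down message
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        key = (int(m["host"]), int(m["link"]))
        if m["state"] == "down":
            # Keep the earliest "down" if it is reported more than once.
            down_since.setdefault(key, ts)
        elif key in down_since:
            start = down_since.pop(key)
            outages.append((key[0], key[1], start, (ts - start).total_seconds()))
    return outages
```

Feeding it the four log lines quoted above should yield a single
outage of 6 seconds for host 2, link 1.  On a real node the input
could come from the system log file or from "journalctl -u corosync",
provided the lines look like the ones in this thread.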
-- 
Feri

