[ClusterLabs] corosync 3.0.1 on Debian/Buster reports some MTU errors
wferi at niif.hu
Fri Nov 22 06:10:52 EST 2019
Jean-Francois Malouin <Jean-Francois.Malouin at bic.mni.mcgill.ca> writes:
> * christine caulfield <ccaulfie at redhat.com> [20191121 03:19]:
>
>> On 18/11/2019 21:31, Jean-Francois Malouin wrote:
>>
>>> However the system log on the nodes reports those much more frequently, a few
>>> times a day:
>>>
>>> Nov 17 23:26:20 node1 corosync[2258]: [KNET ] link: host: 2 link: 1 is down
>>> Nov 17 23:26:20 node1 corosync[2258]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 0)
>>> Nov 17 23:26:26 node1 corosync[2258]: [KNET ] rx: host: 2 link: 1 is up
>>> Nov 17 23:26:26 node1 corosync[2258]: [KNET ] host: host: 2 (passive) best link: 1 (pri: 1)
>>
>> Those don't look good. Having a link down for 6 seconds looks like a
>> serious network outage that needs looking into, especially if they
>> are that frequent, or it could be a bug. You don't say which version
>> of libknet you have installed but make sure it's the latest one.
>
> libknet1 is 1.8-2 and is the latest one from the Debian buster distro.

If no other solution emerges, try installing libknet1_1.13-1 from
bullseye (all of its dependencies are satisfied in buster). There are
important fixes in that version, but I can't tell whether those are
relevant in your case. If this proves successful, though, that will
provide me with some ammunition for pushing for a stable update.
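
To judge whether the newer libknet actually helps, it may be useful to
compare link outage durations before and after the upgrade. The
following is only a minimal sketch, assuming the syslog format quoted
above; the function name link_outages and the year parameter are
hypothetical (syslog timestamps carry no year), and month names are
assumed to be parsed in an English/C locale.

```python
import re
from datetime import datetime

# Matches the KNET link state lines quoted above, for example:
#   Nov 17 23:26:20 node1 corosync[2258]: [KNET ] link: host: 2 link: 1 is down
#   Nov 17 23:26:26 node1 corosync[2258]: [KNET ] rx: host: 2 link: 1 is up
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+corosync\[\d+\]:\s+"
    r"\[KNET\s*\]\s+(?:link|rx):\s+host:\s+(?P<host>\d+)\s+"
    r"link:\s+(?P<link>\d+)\s+is\s+(?P<state>up|down)\s*$"
)


def link_outages(lines, year=2019):
    """Return a list of (host, link, down_time, seconds_down) tuples.

    The year is an assumption supplied by the caller, because the
    classic syslog timestamp format does not include it.
    """
    down_since = {}  # (host, link) -> time of the first unmatched "down"
    outages = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a link up/down line (e.g. "best link" lines)
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        key = (int(m["host"]), int(m["link"]))
        if m["state"] == "down":
            # Keep the earliest "down" if several arrive in a row.
            down_since.setdefault(key, ts)
        elif key in down_since:
            start = down_since.pop(key)
            outages.append((key[0], key[1], start, (ts - start).total_seconds()))
    return outages
```

Fed the four log lines quoted above, this reports a single outage of 6.0
seconds for host 2, link 1, which matches the figure Christine mentioned.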
--
Feri