[ClusterLabs] corosync 3.0.1 on Debian/Buster reports some MTU errors
Jean-Francois Malouin
Jean-Francois.Malouin at bic.mni.mcgill.ca
Mon Nov 18 16:31:34 EST 2019
Hi,
This may not be strictly a pacemaker question, but perhaps some of you have
seen this problem:
A two-node pacemaker cluster running corosync 3.0.1 with dual communication
rings sometimes reports errors like these in the corosync log file:
[KNET ] pmtud: PMTUD link change for host: 2 link: 0 from 470 to 1366
[KNET ] pmtud: PMTUD link change for host: 2 link: 1 from 470 to 1366
[KNET ] pmtud: Global data MTU changed to: 1366
[CFG ] Modified entry 'totem.netmtu' in corosync.conf cannot be changed at run-time
[CFG ] Modified entry 'totem.netmtu' in corosync.conf cannot be changed at run-time
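To see how often these PMTUD changes occur, the log file can be scanned for them. What follows is only a minimal sketch, assuming Python 3 and log lines in exactly the format shown above; the function name `pmtud_changes` is hypothetical and not part of corosync or knet.

```python
import re

# Matches lines such as:
# [KNET ] pmtud: PMTUD link change for host: 2 link: 0 from 470 to 1366
PMTUD_RE = re.compile(
    r"PMTUD link change for host: (\d+) link: (\d+) from (\d+) to (\d+)"
)


def pmtud_changes(lines):
    """Return (host, link, old_mtu, new_mtu) tuples found in the log lines."""
    changes = []
    for line in lines:
        m = PMTUD_RE.search(line)
        if m:
            changes.append(tuple(int(g) for g in m.groups()))
    return changes


# Sample input copied from the log excerpt above.
sample = [
    "[KNET ] pmtud: PMTUD link change for host: 2 link: 0 from 470 to 1366",
    "[KNET ] pmtud: PMTUD link change for host: 2 link: 1 from 470 to 1366",
    "[KNET ] pmtud: Global data MTU changed to: 1366",
]

for host, link, old, new in pmtud_changes(sample):
    print(f"host {host} link {link}: MTU {old} -> {new}")
```

On a real system the same function could be fed the open log file (here /var/log/corosync/corosync.log) instead of the sample list.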
These do not happen very frequently, once a week or so.
However, the system log on the nodes reports link down/up events much more
frequently, a few times a day:
Nov 17 23:26:20 node1 corosync[2258]: [KNET ] link: host: 2 link: 1 is down
Nov 17 23:26:20 node1 corosync[2258]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 0)
Nov 17 23:26:26 node1 corosync[2258]: [KNET ] rx: host: 2 link: 1 is up
Nov 17 23:26:26 node1 corosync[2258]: [KNET ] host: host: 2 (passive) best link: 1 (pri: 1)
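The length of each outage can be worked out by pairing every "is down" line with the next "is up" line for the same host and link. The following is a rough sketch under stated assumptions: Python 3, syslog lines in exactly the format shown above, and a year supplied by the caller because syslog timestamps omit it. The names `flap_durations` and `EVENT_RE` are hypothetical.

```python
import re
from datetime import datetime

# Matches the "link ... is down" and "rx ... is up" lines from the syslog
# excerpt; the "best link" lines are deliberately ignored.
EVENT_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ corosync\[\d+\]: "
    r"\[KNET\s*\] (?:link|rx): host: (\d+) link: (\d+) is (up|down)"
)


def flap_durations(lines, year=2019):
    """Return (host, link, seconds_down) for each down/up pair.

    Assumption: all lines fall within the given year, since syslog
    timestamps carry no year of their own.
    """
    down_since = {}
    flaps = []
    for line in lines:
        m = EVENT_RE.match(line)
        if not m:
            continue
        stamp, host, link, state = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        key = (int(host), int(link))
        if state == "down":
            down_since[key] = when
        elif key in down_since:
            seconds = (when - down_since.pop(key)).total_seconds()
            flaps.append((key[0], key[1], seconds))
    return flaps


# Sample input copied from the syslog excerpt above.
sample = [
    "Nov 17 23:26:20 node1 corosync[2258]: [KNET ] link: host: 2 link: 1 is down",
    "Nov 17 23:26:20 node1 corosync[2258]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 0)",
    "Nov 17 23:26:26 node1 corosync[2258]: [KNET ] rx: host: 2 link: 1 is up",
    "Nov 17 23:26:26 node1 corosync[2258]: [KNET ] host: host: 2 (passive) best link: 1 (pri: 1)",
]

for host, link, seconds in flap_durations(sample):
    print(f"host {host} link {link} was down for {seconds:.0f}s")
```

For the excerpt above this gives a single six-second outage on link 1 to host 2.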
Can these be dismissed, or do they indicate a network misconfiguration or
problem?
I tried setting 'knet_transport: udpu' in the totem section (the default value)
but it didn't seem to make a difference. Hard-coding netmtu to 1500 and
allowing a longer (10s) token timeout also didn't seem to affect the issue.
Corosync config follows:
/etc/corosync/corosync.conf
totem {
    version: 2
    cluster_name: bicha
    transport: knet
    link_mode: passive
    ip_version: ipv4
    token: 10000
    netmtu: 1500
    knet_transport: sctp
    crypto_model: openssl
    crypto_hash: sha256
    crypto_cipher: aes256
    keyfile: /etc/corosync/authkey
    interface {
        linknumber: 0
        knet_transport: udp
        knet_link_priority: 0
    }
    interface {
        linknumber: 1
        knet_transport: udp
        knet_link_priority: 1
    }
}
quorum {
    provider: corosync_votequorum
    two_node: 1
    # expected_votes: 2
}
nodelist {
    node {
        ring0_addr: xxx.xxx.xxx.xxx
        ring1_addr: zzz.zzz.zzz.zzx
        name: node1
        nodeid: 1
    }
    node {
        ring0_addr: xxx.xxx.xxx.xxy
        ring1_addr: zzz.zzz.zzz.zzy
        name: node2
        nodeid: 2
    }
}
logging {
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/corosync/corosync.log
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}