<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Dear List,</p>
<p>I have been running two separate 3-node clusters for some time.
I am hitting a fatal problem with corosync: after a node failure,
the rebooted node does NOT start corosync.</p>
<p>Clusters:</p>
<ul>
<li>All nodes are running Ubuntu Server 24.04</li>
<li>corosync is 3.1.7</li>
<li>corosync-qdevice is 3.0.3</li>
<li>pacemaker is 2.1.6</li>
<li>The third node in both clusters is a quorum device. The clusters
use the ffsplit algorithm.</li>
<li>All nodes are bare metal &amp; attached to a dedicated
kronosnet network.</li>
<li>STONITH is enabled in one of the clusters and disabled in the
other.</li>
</ul>
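<p>For context, a qdevice setup using ffsplit normally corresponds to a
quorum section in <font size="5" face="monospace">corosync.conf</font> along the lines of the sketch below.
This is only an illustration, not a copy from my nodes: the host name
<font size="5" face="monospace">qnetd.example</font> is a placeholder, and the vote count is an assumed value.</p>
<pre>
quorum {
    provider: corosync_votequorum
    device {
        model: net
        # assumed value, shown for illustration only
        votes: 1
        net {
            # placeholder host, not the real quorum device address
            host: qnetd.example
            algorithm: ffsplit
        }
    }
}
</pre>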
<p>The corosync &amp; pacemaker systemd services are disabled at boot. I
start each cluster with the command <font size="5" face="monospace">pcs cluster start</font>.</p>
<p>corosync NEVER starts AFTER a node failure (once the node has been rebooted).
There is nothing beyond the following lines in <font size="5" face="monospace">/var/log/corosync/corosync.log</font>;
the service freezes at:</p>
<p><font size="5" face="monospace">Aug 01 12:54:56 [3193] charon
corosync notice [MAIN ] Corosync Cluster Engine 3.1.7 starting
up<br>
Aug 01 12:54:56 [3193] charon corosync info [MAIN ] Corosync
built-in features: dbus monitoring watchdog augeas systemd
xmlconf vqsim nozzle snmp pie relro bindnow</font></p>
<p>corosync never starts kronosnet. I checked the kronosnet interfaces;
they are all OK, and there is IP connectivity between the nodes. If I run <font size="5" face="monospace">corosync -t</font>, it freezes in the
same way.<br>
</p>
<p>The ONLY way I have managed to start corosync is by reinstalling it: <font size="5" face="monospace">apt reinstall corosync ; pcs cluster
start</font>.</p>
<p>This issue has recurred at least 5-6 times. I do NOT see
anything in syslog either. I would be grateful for any guidance on how to
solve this.</p>
<p>Thanks,</p>
<p>Murat<br>
</p>
</body>
</html>