[ClusterLabs] corosync won't start after node failure

Jehan-Guillaume de Rorthais jgdr at dalibo.com
Mon Aug 19 10:31:23 UTC 2024


On Mon, 19 Aug 2024 12:58:09 +0300
Murat Inal <mrt_nl at hotmail.com> wrote:

> [Resending the below due to message format problem]
> 
> 
> Dear List,
> 
> I have been running two different 3-node clusters for some time. I am 
> having a fatal problem with corosync: After a node failure, rebooted 
> node does NOT start corosync.
> 
> Clusters;
> 
>   * All nodes are running Ubuntu Server 24.04
>   * corosync is 3.1.7
>   * corosync-qdevice is 3.0.3
>   * pacemaker is 2.1.6
>   * The third node at both clusters is a quorum device. Cluster is on
>     ffsplit algorithm.
>   * All nodes are baremetal & attached to a dedicated kronosnet network.
>   * STONITH is enabled in one of the clusters and disabled for the other.
> 
> corosync & pacemaker service starts (systemd) are disabled. I am 
> starting any cluster with the command pcs cluster start.

Sorry if I misunderstood your mail, but if a service is disabled, that means
that it is not started on boot. You have to start it by hand.

I would advise enabling corosync on boot, but not Pacemaker:

  # enable corosync on boot:
  systemctl enable corosync

  # start corosync right now:
  systemctl start corosync
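
To check the current boot-time state of both services before changing anything, something like the sketch below could help. It relies only on `systemctl is-enabled`; the `describe_state` helper and its wording are hypothetical, made up for this illustration, and the unit names `corosync` and `pacemaker` are assumed to match your packaging.

```shell
#!/bin/sh
# Hypothetical helper (not part of corosync or pcs): turn the state printed
# by `systemctl is-enabled` into a short hint.
describe_state() {
    case "$1" in
        enabled)  echo "starts on boot" ;;
        disabled) echo "does not start on boot; start it by hand or enable it" ;;
        *)        echo "state: ${1:-unknown}" ;;
    esac
}

# Report the boot-time state of each unit; skip the query if systemctl is absent.
for unit in corosync pacemaker; do
    if command -v systemctl >/dev/null 2>&1; then
        state=$(systemctl is-enabled "$unit" 2>/dev/null)
    else
        state=""
    fi
    echo "$unit: $(describe_state "$state")"
done
```

If corosync shows up as disabled, that alone would explain why it does not come back after a reboot.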

> I could ONLY manage to start corosync by reinstalling it

That's because, by default, the package starts the service itself after
installation.

Regards,

