[ClusterLabs] corosync not able to form cluster

Prasad Nagaraj prasad.nagaraj76 at gmail.com
Fri Jun 8 09:37:13 UTC 2018


Hi Christine - Thanks for looking into the logs.
I also see that the node eventually comes out of GATHER state here:

Jun 07 16:56:10 corosync [TOTEM ] entering GATHER state from 0.
Jun 07 16:56:10 corosync [TOTEM ] Creating commit token because I am the rep.

Does this mean it timed out or gave up, and then left the GATHER state?

Second, I saw some unexpected entries when I ran tcpdump on node coro.4
(also pasted in one of the earlier threads). You can see that it was
receiving messages like:

10:23:17.117347 IP 172.22.0.13.50468 > 172.22.0.4.netsupport: UDP, length 332
10:23:17.140960 IP 172.22.0.8.50438 > 172.22.0.4.netsupport: UDP, length 82
10:23:17.141319 IP 172.22.0.6.38535 > 172.22.0.4.netsupport: UDP, length 156

Please note that 172.22.0.8 and 172.22.0.6 are not part of my group, and I
am wondering why these messages are arriving.
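
For anyone who wants to repeat this check on a longer capture, here is a
rough sketch (not a definitive tool) that scans tcpdump text output and
lists source addresses outside the expected membership. The
EXPECTED_MEMBERS set and the function name unexpected_senders are
assumptions made for illustration; the real member list would come from
your own corosync.conf. As far as I know, "netsupport" is simply the
/etc/services name for port 5405, which is the corosync default mcastport.

```python
import re

# ASSUMPTION: hypothetical membership list for illustration only. The
# thread only states that 172.22.0.8 and 172.22.0.6 are NOT members.
EXPECTED_MEMBERS = {"172.22.0.4", "172.22.0.13"}

# Matches the "IP src.port > dst.port:" part of a tcpdump text line.
# The destination port may be numeric or a service name (e.g. netsupport).
LINE_RE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>[\w-]+):"
)


def unexpected_senders(lines, members=EXPECTED_MEMBERS):
    """Return the sorted source addresses that are not in `members`."""
    found = set()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("src") not in members:
            found.add(match.group("src"))
    return sorted(found)


# The three lines quoted above, with the wrapped first line rejoined.
SAMPLE = [
    "10:23:17.117347 IP 172.22.0.13.50468 > 172.22.0.4.netsupport: UDP, length 332",
    "10:23:17.140960 IP 172.22.0.8.50438 > 172.22.0.4.netsupport: UDP, length 82",
    "10:23:17.141319 IP 172.22.0.6.38535 > 172.22.0.4.netsupport: UDP, length 156",
]

if __name__ == "__main__":
    for address in unexpected_senders(SAMPLE):
        print("unexpected sender:", address)
```

Run against the three lines above, it reports 172.22.0.6 and 172.22.0.8,
which matches what I saw by eye.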

Thanks!

On Fri, Jun 8, 2018 at 2:34 PM, Christine Caulfield <ccaulfie at redhat.com>
wrote:

> On 07/06/18 18:32, Prasad Nagaraj wrote:
> > Hi Christine - Got it:)
> >
> > I have collected a few seconds of debug logs from all nodes after startup.
> > Please find them attached.
> > Please let me know if this will help us to identify the root cause.
> >
>
> The problem is on the node coro.4 - it never gets out of the JOIN
> process:
>
> "Jun 07 16:55:37 corosync [TOTEM ] entering GATHER state from 11."
>
> So something is wrong on that node, either a rogue routing table
> entry, dangling iptables rule or even a broken NIC.
>
> Chrissie
>
>