[ClusterLabs] corosync not able to form cluster

Christine Caulfield ccaulfie at redhat.com
Thu Jun 7 11:13:51 EDT 2018


On 07/06/18 15:53, Prasad Nagaraj wrote:
> Hi - As you can see in the corosync.conf details, I have already set
> debug: on
> 

But only in the (disabled) AMF subsystem, not for corosync as a whole :)

    logger_subsys {
        subsys: AMF
        debug: on
    }
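
For illustration, a minimal sketch of a logging stanza with debugging
enabled for corosync as a whole might look something like the following.
The surrounding options and the logfile path are assumptions for the sake
of the example, not values taken from the posted corosync.conf:

    logging {
        to_syslog: yes
        to_logfile: yes
        # Assumed path; keep whatever the existing config already uses.
        logfile: /var/log/cluster/corosync.log
        timestamp: on
        # Global flag: turns on debug output for corosync as a whole.
        debug: on
        logger_subsys {
            # Per-subsystem setting: this only affects AMF.
            subsys: AMF
            debug: off
        }
    }

The point is only that the debug: on line sits directly inside
logging {}, outside any logger_subsys {} block.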


Chrissie


> 
> On Thu, 7 Jun 2018, 8:03 pm Christine Caulfield <ccaulfie at redhat.com> wrote:
> 
>     On 07/06/18 15:24, Prasad Nagaraj wrote:
>     >
>     > No iptables rules or other firewalls are set up on these nodes.
>     >
>     > One observation is that each node sends messages with its own ring
>     > sequence number, and the numbers are not converging. I have seen that
>     > in a good cluster, when nodes respond with the same sequence number,
>     > the membership is formed automatically. That is not happening here.
>     >
> 
>     That's just a side effect of the cluster not forming; it's not the
>     cause. Can you enable full corosync debugging (just add debug: on to the
>     end of the logging {} stanza) and see if that gives any more useful
>     information? I only need the corosync bits, not the pcmk ones.
> 
>     Chrissie
> 
>     > Example: we can see that one node sends:
>     > Jun 07 07:55:04 corosync [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 71084: memb=1, new=0, lost=0
>     > .....
>     > Jun 07 07:55:16 corosync [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 71096: memb=1, new=0, lost=0
>     > Jun 07 07:55:16 corosync [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 71096: memb=1, new=0, lost=0
>     >
>     > The other node sends messages with its own numbers:
>     > Jun 07 07:55:12 corosync [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 71088: memb=1, new=0, lost=0
>     > Jun 07 07:55:12 corosync [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 71088: memb=1, new=0, lost=0
>     > .......
>     > Jun 07 07:55:24 corosync [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 71100: memb=1, new=0, lost=0
>     > Jun 07 07:55:24 corosync [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 71100: memb=1, new=0, lost=0
>     >
>     > Any idea why this happens, and why the sequence numbers from
>     > different nodes are not converging?
>     >
>     > Thanks!
>     >
>     >
>     >
>     >
>     >
>     > _______________________________________________
>     > Users mailing list: Users at clusterlabs.org
>     > https://lists.clusterlabs.org/mailman/listinfo/users
>     >
>     > Project Home: http://www.clusterlabs.org
>     > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>     > Bugs: http://bugs.clusterlabs.org
>     >
> 



