[ClusterLabs] corosync not able to form cluster

Christine Caulfield ccaulfie at redhat.com
Thu Jun 7 10:03:50 UTC 2018


On 07/06/18 09:21, Prasad Nagaraj wrote:
> Hi - I am running corosync on 3 nodes of CentOS release 6.9 (Final).
> Corosync version is corosync-1.4.7.
> The nodes are not seeing each other and are not able to form a membership.
> What I see is a continuous stream of messages saying "A processor joined
> or left the membership and a new membership was formed."
> For example, on node vm2883711991:
> 

I can't draw any conclusions from the logs; we'd need to see what
corosync thought it was binding to and the IP addresses of the hosts.
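
As an aside on the IP addresses: the numeric IDs printed next to each node
name in the quoted logs (184555180 and 67114668) look like the ring 0 IPv4
address packed into a 32-bit integer, which is how corosync 1.x is generally
understood to derive a nodeid when none is configured. The Python sketch
below decodes them. The helper name nodeid_to_ip is made up for illustration,
and the little-endian byte order is an assumption that happens to fit these
particular logs, not something confirmed from the corosync source.

```python
import ipaddress


def nodeid_to_ip(nodeid: int) -> str:
    """Read a 32-bit node ID as an IPv4 address packed in little-endian order.

    The byte order is an assumption chosen to match the quoted logs.
    """
    packed = nodeid.to_bytes(4, "little")
    return str(ipaddress.IPv4Address(packed))


# Node IDs copied from the pcmk_peer_update lines in the quoted logs.
for nodeid in (184555180, 67114668):
    print(nodeid, "->", nodeid_to_ip(nodeid))
```

Under that assumption the two IDs decode to 172.22.0.11 and 172.22.0.4, the
same addresses shown in the "chosen downlist" lines, which suggests each node
is at least binding to its own address on the expected network.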

Have a look at the start of the logs and see if they match what you'd
expect (i.e. they are similar to the ones on the working clusters). Also
check, using lsof, which addresses corosync is bound to. Running tcpdump on
port 5405 will show you whether traffic is leaving the nodes and being
received.

Also check firewall settings and make sure the nodes can ping each other.
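
If tcpdump is not convenient, a rough way to check plain UDP reachability
between two nodes is a throwaway sender and listener. The Python sketch below
is only that: it does not speak the totem protocol, the function names are
invented for illustration, and port 5405 is taken from the mcastport setting
in the quoted corosync.conf.

```python
import socket


def open_listener(bind_addr: str, port: int, timeout: float = 10.0) -> socket.socket:
    """Bind a UDP socket on bind_addr:port with a receive timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    sock.settimeout(timeout)
    return sock


def wait_for_probe(sock: socket.socket):
    """Return (payload, sender) for one datagram, or None if the timeout expires."""
    try:
        return sock.recvfrom(1024)
    except socket.timeout:
        return None


def send_probe(dest_addr: str, port: int, payload: bytes = b"probe") -> int:
    """Send one UDP datagram to dest_addr:port and return the bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (dest_addr, port))
```

For example, with corosync stopped on 172.22.0.11 (otherwise the port is
already in use), call wait_for_probe(open_listener("172.22.0.11", 5405))
there and send_probe("172.22.0.11", 5405) from 172.22.0.4. A None result
on the listening side would point towards a firewall or routing problem
rather than a corosync one.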

If you're still stumped then feel free to post more info here for us to
look at, though if you have that configuration working on other nodes it
might be something in your environment.

Chrissie


> 
> Jun 07 07:54:52 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vm2883711991 184555180
> Jun 07 07:54:52 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:54:52 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.11) ; members(old:1 left:0)
> Jun 07 07:54:52 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:55:04 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71084: memb=1, new=0, lost=0
> Jun 07 07:55:04 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vm2883711991 184555180
> Jun 07 07:55:04 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71084: memb=1, new=0, lost=0
> Jun 07 07:55:04 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vm2883711991 184555180
> Jun 07 07:55:04 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:55:04 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.11) ; members(old:1 left:0)
> Jun 07 07:55:04 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:55:16 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71096: memb=1, new=0, lost=0
> Jun 07 07:55:16 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vm2883711991 184555180
> Jun 07 07:55:16 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71096: memb=1, new=0, lost=0
> Jun 07 07:55:16 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vm2883711991 184555180
> Jun 07 07:55:16 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:55:16 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.11) ; members(old:1 left:0)
> Jun 07 07:55:16 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:55:28 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71108: memb=1, new=0, lost=0
> Jun 07 07:55:28 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vm2883711991 184555180
> Jun 07 07:55:28 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71108: memb=1, new=0, lost=0
> Jun 07 07:55:28 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vm2883711991 184555180
> Jun 07 07:55:28 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:55:28 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.11) ; members(old:1 left:0)
> Jun 07 07:55:28 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:55:40 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71120: memb=1, new=0, lost=0
> Jun 07 07:55:40 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vm2883711991 184555180
> Jun 07 07:55:40 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71120: memb=1, new=0, lost=0
> Jun 07 07:55:40 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vm2883711991 184555180
> Jun 07 07:55:40 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:55:40 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.11) ; members(old:1 left:0)
> Jun 07 07:55:40 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:55:52 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71132: memb=1, new=0, lost=0
> Jun 07 07:55:52 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vm2883711991 184555180
> Jun 07 07:55:52 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71132: memb=1, new=0, lost=0
> Jun 07 07:55:52 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vm2883711991 184555180
> Jun 07 07:55:52 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:55:52 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.11) ; members(old:1 left:0)
> Jun 07 07:55:52 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:56:04 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71144: memb=1, new=0, lost=0
> Jun 07 07:56:04 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vm2883711991 184555180
> Jun 07 07:56:04 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71144: memb=1, new=0, lost=0
> Jun 07 07:56:04 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vm2883711991 184555180
> Jun 07 07:56:04 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:56:17 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71156: memb=1, new=0, lost=0
> Jun 07 07:56:17 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vm2883711991 184555180
> Jun 07 07:56:17 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71156: memb=1, new=0, lost=0
> Jun 07 07:56:17 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vm2883711991 184555180
> Jun 07 07:56:17 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:56:17 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.11) ; members(old:1 left:0)
> Jun 07 07:56:17 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> 
> Similarly, on node vme6c95166f0:
> 
> Jun 07 07:55:12 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71088: memb=1, new=0, lost=0
> Jun 07 07:55:12 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vme6c95166f0 67114668
> Jun 07 07:55:12 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71088: memb=1, new=0, lost=0
> Jun 07 07:55:12 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vme6c95166f0 67114668
> Jun 07 07:55:12 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:55:12 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.4) ; members(old:1 left:0)
> Jun 07 07:55:12 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:55:24 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71100: memb=1, new=0, lost=0
> Jun 07 07:55:24 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vme6c95166f0 67114668
> Jun 07 07:55:24 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71100: memb=1, new=0, lost=0
> Jun 07 07:55:24 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vme6c95166f0 67114668
> Jun 07 07:55:24 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:55:24 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.4) ; members(old:1 left:0)
> Jun 07 07:55:24 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:55:37 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71112: memb=1, new=0, lost=0
> Jun 07 07:55:37 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vme6c95166f0 67114668
> Jun 07 07:55:37 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71112: memb=1, new=0, lost=0
> Jun 07 07:55:37 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vme6c95166f0 67114668
> Jun 07 07:55:37 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:55:49 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71124: memb=1, new=0, lost=0
> Jun 07 07:55:49 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vme6c95166f0 67114668
> Jun 07 07:55:49 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71124: memb=1, new=0, lost=0
> Jun 07 07:55:49 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vme6c95166f0 67114668
> Jun 07 07:55:49 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:55:49 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.4) ; members(old:1 left:0)
> Jun 07 07:55:49 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:56:02 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71136: memb=1, new=0, lost=0
> Jun 07 07:56:02 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vme6c95166f0 67114668
> Jun 07 07:56:02 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71136: memb=1, new=0, lost=0
> Jun 07 07:56:02 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vme6c95166f0 67114668
> Jun 07 07:56:02 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:56:02 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.4) ; members(old:1 left:0)
> Jun 07 07:56:02 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> Jun 07 07:56:14 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71148: memb=1, new=0, lost=0
> Jun 07 07:56:14 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vme6c95166f0 67114668
> Jun 07 07:56:14 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71148: memb=1, new=0, lost=0
> Jun 07 07:56:14 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vme6c95166f0 67114668
> Jun 07 07:56:14 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:56:27 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71160: memb=1, new=0, lost=0
> Jun 07 07:56:27 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vme6c95166f0 67114668
> Jun 07 07:56:27 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71160: memb=1, new=0, lost=0
> Jun 07 07:56:27 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vme6c95166f0 67114668
> Jun 07 07:56:27 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:56:39 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 71172: memb=1, new=0, lost=0
> Jun 07 07:56:39 corosync [pcmk  ] info: pcmk_peer_update: memb:
> vme6c95166f0 67114668
> Jun 07 07:56:39 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 71172: memb=1, new=0, lost=0
> Jun 07 07:56:39 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> vme6c95166f0 67114668
> Jun 07 07:56:39 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Jun 07 07:56:39 corosync [CPG   ] chosen downlist: sender r(0)
> ip(172.22.0.4) ; members(old:1 left:0)
> Jun 07 07:56:39 corosync [MAIN  ] Completed service synchronization,
> ready to provide service.
> 
> and similar messages appear on the third node as well.
> 
> My corosync.conf is as follows:
> compatibility: whitetank
> totem {
>     version: 2
>     secauth: off
>     threads: 0
>     interface {
>         member {
>             memberaddr: 172.22.0.4
>         }
>         member {
>             memberaddr: 172.22.0.11
>         }
>         member {
>             memberaddr: 172.22.0.13
>         }
> 
>         bindnetaddr: 172.22.0.4
> 
>         ringnumber: 0
>         mcastport: 5405
>         ttl: 1
>     }
>     transport: udpu
>     token: 10000
>     token_retransmits_before_loss_const: 10
> }
> 
> logging {
>     fileline: off
>     to_stderr: yes
>     to_logfile: yes
>     to_syslog: no
>     logfile: /var/log/cluster/corosync.log
>     timestamp: on
>     logger_subsys {
>         subsys: AMF
>         debug: on
>     }
> }
> service {
>     name: pacemaker
>     ver: 1
> }
> amf {
>     mode: disabled
> }
> 
> In general, this configuration has worked for me in other clusters I
> have, but this particular setup is running into this issue. I would
> appreciate help on how to debug and resolve this condition.
> 
> Thanks
> 
> 
> 


