[Pacemaker] Impossible to add a 4th node to a cluster

Dejan Muhamedagic dejanmm at fastmail.fm
Fri Oct 29 06:55:36 EDT 2010


Hi,

On Thu, Oct 28, 2010 at 04:09:36PM +0200, Guillaume Chanaud wrote:
>  Hello,
> 
> I have a cluster of two master/slave DRBD servers (filer1 and filer2)
> running in a VLAN (the machines are dedicated servers).
> I added a third node to the cluster (server1, a "blank node" for the
> moment), and that worked correctly.
> When I add a 4th node to the cluster (server2, which is a "mirror" of
> server1), this node starts as standalone. Here is the message.log:
> 
> Oct 28 15:59:27 ns209045 corosync[16543]:   [TOTEM ] A processor
> joined or left the membership and a new membership was formed.
> Oct 28 15:59:28 ns209045 corosync[16543]:   [pcmk  ] notice:
> pcmk_peer_update: Transitional membership event on ring 945392:
> memb=1, new=0, lost=0
> Oct 28 15:59:28 ns209045 corosync[16543]:   [pcmk  ] info:
> pcmk_peer_update: memb: server2 16820416
> Oct 28 15:59:28 ns209045 corosync[16543]:   [pcmk  ] notice:
> pcmk_peer_update: Stable membership event on ring 945392: memb=1,
> new=0, lost=0
> Oct 28 15:59:28 ns209045 corosync[16543]:   [pcmk  ] info:
> pcmk_peer_update: MEMB: server2 16820416
> Oct 28 15:59:28 ns209045 corosync[16543]:   [TOTEM ] A processor
> joined or left the membership and a new membership was formed.
> Oct 28 15:59:29 ns209045 corosync[16543]:   [pcmk  ] notice:
> pcmk_peer_update: Transitional membership event on ring 945416:
> memb=1, new=0, lost=0
> Oct 28 15:59:29 ns209045 corosync[16543]:   [pcmk  ] info:
> pcmk_peer_update: memb: server2 16820416
> Oct 28 15:59:29 ns209045 corosync[16543]:   [pcmk  ] notice:
> pcmk_peer_update: Stable membership event on ring 945416: memb=1,
> new=0, lost=0
> Oct 28 15:59:29 ns209045 corosync[16543]:   [pcmk  ] info:
> pcmk_peer_update: MEMB: server2 16820416
> 
> [...] The messages repeat many, many times
> 
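
For what it is worth, the number after the node name in the "memb:"
lines (16820416) can be decoded. This is only a rough sketch: it assumes
no nodeid is set in corosync.conf, so that corosync derived the id from
the ring 0 IPv4 address, and it assumes the value is the four address
bytes read as a little-endian integer (which is how such ids usually look
in logs from x86 hosts). The helper name nodeid_to_ipv4 is made up for
illustration.

```python
import ipaddress


def nodeid_to_ipv4(nodeid: int) -> str:
    """Decode an auto-generated corosync nodeid into a dotted-quad address.

    Assumption: the id is the 4 raw IPv4 bytes interpreted as a
    little-endian 32-bit integer. If nodeid is set explicitly in
    corosync.conf, this decoding is meaningless.
    """
    # Recover the four address bytes in network order.
    raw = nodeid.to_bytes(4, byteorder="little")
    return str(ipaddress.IPv4Address(raw))


if __name__ == "__main__":
    # Node id taken from the quoted "memb: server2 16820416" log line.
    print(nodeid_to_ipv4(16820416))
```

Under those assumptions the id above corresponds to 192.168.0.1. Doing
the same with the id that server1 logs might show whether both nodes end
up bound to the same, or an unexpected, address.
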
> Now, if I stop server1 and start server2, server2 starts correctly
> and is added to the cluster, but when I then want to start server1,
> the same thing happens. (So things are inverted but the result is
> the same: when I start one serverX, the other can't start.)
> 
> My corosync.conf is configured for broadcast, not multicast. I have
> lots of problems with multicast because many bridged VMs on the VLAN
> don't see the multicast packets, or don't join the multicast
> group correctly.
> 
> Any hints on this?

I vaguely recall problems with VLANs and corosync (with
multicast). Any chance of trying this without the VLAN?

Otherwise, you will find a better audience on the openais mailing
list for corosync/openais issues.

Thanks,

Dejan

> Guillaume
> 



