[Pacemaker] Impossible to add a 4th node to a cluster

Fri Oct 29 12:15:03 EDT 2010

  On 28/10/2010 22:08, Pavlos Parissis wrote:
> On 28 October 2010 18:30, Guillaume Chanaud
> <guillaume.chanaud at connecting-nature.com>  wrote:
> [...snip...]
>>> corosync and auth files are the same on server2?
>>>
>> Yes of course :D (copied by scp). As I said, server1 can join when server2 is
>> offline, and server2 can join when server1 is offline, but if one is
>> online, the other can't join and logs the above messages in a loop...
> Hmm, you said that your server2 is a clone of server1; check whether they have
> different uuids
Sorry, when I said "clone" I meant that they are "almost" the same. In
fact it is not a filesystem clone: each server
was installed and configured separately. I already checked the uuids
(server1 is id=16820416, server2 is id=33597632).
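A side note on those two ids: they look like values derived from each node's IPv4 address rather than arbitrary numbers. The sketch below decodes them under two assumptions that are not confirmed in this thread: that no explicit nodeid is configured, so the id comes from the ring 0 address, and that the 32-bit value is stored in little-endian byte order. The helper name `nodeid_to_ip` is made up for illustration.

```python
import socket
import struct


def nodeid_to_ip(nodeid: int) -> str:
    """Interpret a 32-bit node id as an IPv4 address.

    Assumption: the id holds the address bytes in little-endian order.
    """
    # Pack the integer as 4 little-endian bytes, then format as a dotted quad.
    return socket.inet_ntoa(struct.pack("<I", nodeid))


if __name__ == "__main__":
    for name, nodeid in (("server1", 16820416), ("server2", 33597632)):
        print(name, nodeid, nodeid_to_ip(nodeid))
```

If the assumptions hold, the two ids decode to two different addresses on the 192.168.0.0 network, which would be consistent with the ids being distinct.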
>> In fact I have lots of problems with
>> corosync/pacemaker... multicast/broadcast between physical
>> and virtual servers... lots of different shit everywhere, and the error logs are always
>> different depending on what I try...
> Try to go step by step: make sure you have correct rings, and check
> related threads about rings
The config file is exactly the same on each server; it's a pretty basic
config. The only particularity is that it uses broadcast instead of multicast.
Multicast fails, I think because of the network. My infrastructure is made
of two physical filers plus two physical servers hosting 5 VMs each.
Everything communicates through a VLAN in a dedicated hosting environment
(I do not own or control the physical layer of the system, so I
have to rely on what my host provides. But communicating through a VLAN
should not complicate multicasting/broadcasting, I think.)
Here it is:
compatibility: whitetank
aisexec {
         # Run as root - this is necessary to be able to manage resources with Pacemaker
         user: root
         group: root
}
service {
         # Load the Pacemaker Cluster Resource Manager
         name: pacemaker
         ver: 0
         use_logd: yes
}
totem {
         version: 2
         secauth: off
         threads: 0
         interface {
                 ringnumber: 0
                 bindnetaddr: 192.168.0.0
#              mcastaddr: 239.192.168.1
                 broadcast: yes
                 mcastport: 5405
         }
}

logging {
         fileline: off
         to_stderr: yes
         to_logfile: yes
         to_syslog: yes
         logfile: /tmp/corosync.log
         debug: off
         timestamp: on
         logger_subsys {
                 subsys: AMF
                 debug: off
         }
}

amf {
         mode: disabled
}
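One thing that can be checked offline for a config like this is whether `bindnetaddr` really is the network address of the interface each node uses. The sketch below is only an illustration: the host addresses and the /24 prefix length are assumptions, since the netmask is not given in this thread, and the function names are hypothetical.

```python
import ipaddress


def expected_bindnetaddr(host_ip: str, prefix_len: int) -> str:
    """Return the network address for a host IP and prefix length."""
    iface = ipaddress.ip_interface(f"{host_ip}/{prefix_len}")
    return str(iface.network.network_address)


def matches_bindnetaddr(host_ip: str, prefix_len: int, bindnetaddr: str) -> bool:
    """True if bindnetaddr equals the network address of the host's subnet."""
    return expected_bindnetaddr(host_ip, prefix_len) == bindnetaddr


if __name__ == "__main__":
    # Assumed example addresses and an assumed /24 netmask.
    for host in ("192.168.0.1", "192.168.0.2"):
        print(host, matches_bindnetaddr(host, 24, "192.168.0.0"))
```

With a different netmask the expected network address may differ, so the real prefix length of the VLAN interface should be substituted before drawing any conclusion.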
>> The strange thing is that filer1, filer2, server1 and server2 are all
>> running the same distro (Gentoo) with the same tools and are on the same VLAN
>> (which works for lots of services like NFS...)
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs:
>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>