[Pacemaker] Nodes unable to connect / find each other

Thu Mar 15 13:27:22 UTC 2012

On 03/15/2012 01:57 PM, Regendoerp, Achim wrote:
> As a status update, not got any further...
> Confirmed with the Networks people that Multicast is enabled, but no luck.
> Using unicast crm is not able to connect to the cluster (below is the unicast config used).
> With unicast, there's no traffic at all (observed via tcpdump)

No traffic at all? Are UDP ports 5404/5405 actually bound when you
check with netstat or lsof? Is SELinux enabled and logging denials in
the audit log, or could this be a firewall issue?
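
If netstat or lsof are not at hand, a rough way to check the same thing
is to try binding the ports yourself. The sketch below is only an
illustration: the helper name udp_port_in_use is made up, and it assumes
that a bind failure with EADDRINUSE means some process (presumably
corosync) already holds the port. Run it on each node while corosync is
supposed to be up.

```python
import errno
import socket


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if binding a UDP socket to (host, port) fails with EADDRINUSE."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            # Another socket already holds this address/port pair.
            return True
        # Any other error (e.g. permissions) is not what we are testing for.
        raise
    finally:
        sock.close()
    return False


if __name__ == "__main__":
    # 5404/5405 are the ports asked about above (mcastport 5405 and the one below it).
    for port in (5404, 5405):
        state = "in use" if udp_port_in_use(port) else "free (nothing bound)"
        print(f"UDP {port}: {state}")
```

If both ports show up as free while corosync is supposedly running, it
never bound its sockets, which would fit with the empty tcpdump.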

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

> 
> Beginning to wonder if there's either a version mismatch problem between those packages perhaps.
> 
> These are the package versions:
> 
> corosynclib-1.4.1-4.el6_2.1.x86_64
> clusterlib-3.0.12.1-23.el6.x86_64
> pacemaker-cluster-libs-1.1.6-3.el6.x86_64
> resource-agents-3.9.2-7.el6.x86_64
> cluster-glue-libs-1.0.5-2.el6.x86_64
> pacemaker-libs-1.1.6-3.el6.x86_64
> corosync-1.4.1-4.el6_2.1.x86_64
> pacemaker-cli-1.1.6-3.el6.x86_64
> cluster-glue-1.0.5-2.el6.x86_64
> pacemaker-1.1.6-3.el6.x86_64
> 
> Between yesterday and today updated corosync from
> 1.4.1-4.el6 to 1.4.1-4.el6_2.1
> No change however :|
> 
> OS is CentOS 6.2
> 
> Colleague's now compiling a few heartbeat packages instead to see if that gets the job done.
> 
> This being the config for unicast
> 
> compatibility: whitetank
> 
> totem {
>         version: 2
>         secauth: off
>         threads: 0
>         join:   1000
>         consensus: 7500
>         max_messages: 20
>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 10.26.29.0
>                 #mcastaddr: 226.94.1.1
>                 mcastport: 5405
>                 member
>                 {
>                        memberaddr: 10.26.29.238
>                 }
>                 member
>                 {
>                        memberaddr: 10.26.29.239
>                 }
>                 ttl: 3
>         }
>         transport: udpu
>         clear_node_high_bit:    yes
> }
> 
> logging {
>         fileline: off
>         to_stderr: off
>         to_logfile: yes
>         to_syslog: yes
>         logfile: /var/log/cluster/corosync.log
>         debug: on
>         timestamp: on
>         logger_subsys {
>                 subsys: AMF
>                 debug: off
>         }
> }
> 
> amf {
>         mode: disabled
> }
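
On bindnetaddr: the corosync.conf man page says it should be the network
address of the local interface, and that depends on the netmask. The
netmask is not shown anywhere in this thread, so the /24 and /25 values
below are pure assumptions, and bindnetaddr_for is a hypothetical helper,
not part of corosync. The sketch just computes which network address a
given host IP and prefix length yield.

```python
import ipaddress


def bindnetaddr_for(host_ip, prefix_len):
    """Return the network address for host_ip under the given prefix length."""
    iface = ipaddress.ip_interface(f"{host_ip}/{prefix_len}")
    return str(iface.network.network_address)


if __name__ == "__main__":
    # Member addresses are taken from the quoted config; prefix lengths are assumed.
    for prefix in (24, 25):
        for ip in ("10.26.29.238", "10.26.29.239"):
            print(f"{ip}/{prefix} -> bindnetaddr {bindnetaddr_for(ip, prefix)}")
```

With a /24 both members give 10.26.29.0, which matches the config above.
With a /25 they give 10.26.29.128, in which case 10.26.29.0 would not
match any local interface.
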
> 
> 
> Any further ideas perhaps? Since I'm losing the plot myself somehow :(
> 
> Cheers!
> 
> Achim
> 
> -----Original Message-----
> From: Regendoerp, Achim [mailto:Achim.Regendoerp at galacoral.com] 
> Sent: 15 March 2012 10:56
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Nodes unable to connect / find each other
> 
> Hi,
> 
> According to the network guy the multicast is all there, but we've got no step further, so we're trying your suggested method now :)
> 
> Didn't know that corosync now supports unicast too, must've missed that...
> 
> Thanks for the heads up!
> 
> Achim
> 
> 
> 
> -----Original Message-----
> From: Andreas Kurz [mailto:andreas at hastexo.com] 
> Sent: 14 March 2012 21:56
> To: pacemaker at oss.clusterlabs.org
> Subject: Re: [Pacemaker] Nodes unable to connect / find each other
> 
> On 03/14/2012 08:22 PM, mark - pacemaker list wrote:
>> Hi,
>>
>> On Wed, Mar 14, 2012 at 1:43 PM, Regendoerp, Achim
>> <Achim.Regendoerp at galacoral.com> wrote:
>>
>>     Hi,
>>
>>     Below is a cut out from the tcpdump run on both boxes. The tcpdump
>>     is the same on both boxes.
>>
>>     The traffic only appears if I set the bindnetaddr in
>>     /etc/corosync/corosync.conf to the machines' individual IP instead
>>     of to 10.26.29.0 (as advised by howtos).
>>
>>     Having the latter IP as bindnetaddr, there's no traffic at all.
>>
>>
>>
>> I initially had some communication issues like these, then I realized 
>> I'd forgotten to set up my switch/vlan to handle multicast.  Oops.  
>> With that fixed, everything came up and has worked wonderfully since.
> 
> Or use udpu (unicast UDP) if you don't want to, or can't, change your network setup.
> 
> Regards,
> Andreas
> 
> --
> Need help with Pacemaker?
> http://www.hastexo.com/now
> 
>>
>> Something to check, anyhow.
>>
>> Regards,
>> Mark
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org 
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
> 
> 
> 
