[ClusterLabs] getting "Totem is unable to form a cluster" error

Andrei Borzenkov arvidjaar at gmail.com
Fri Apr 8 18:51:01 CEST 2016


08.04.2016 17:51, Jan Friesse wrote:
>> On 04/08/16 13:01, Jan Friesse wrote:
>>  >> pacemaker 1.1.12-11.12
>>  >> openais 1.1.4-5.24.5
>>  >> corosync 1.4.7-0.23.5
>>  >>
>>  >> It's a two-node active/passive cluster and we just upgraded from
>>  >> SLES 11 SP 3 to SLES 11 SP 4 (nothing else), but when we try to start
>>  >> the cluster service we get the following error:
>>  >>
>>  >> "Totem is unable to form a cluster because of an operating system or
>>  >> network fault."
>>  >>
>>  >> Firewall is stopped and disabled on both the nodes. Both nodes can
>>  >> ping/ssh/vnc each other.
>>  >
>>  > Hard to help. First of all, I would recommend asking SUSE support,
>>  > because I don't really have access to the source code of the corosync
>>  > 1.4.7-0.23.5 package, so I really don't know what patches were added.
>>  >
>>  >
>> Yup, ticket opened with SUSE Support.
>>
>>  >>
>>  >>
>>  >>
>>  >> /var/log/messages:
>>  >> Apr  6 17:51:49 prd1 corosync[8672]:  [MAIN  ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
>>  >> Apr  6 17:51:49 prd1 corosync[8672]:  [MAIN  ] Corosync built-in features: nss
>>  >> Apr  6 17:51:49 prd1 corosync[8672]:  [MAIN  ] Successfully configured openais services to load
>>  >> Apr  6 17:51:49 prd1 corosync[8672]:  [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
>>  >> Apr  6 17:51:49 prd1 corosync[8672]:  [TOTEM ] Initializing transport (UDP/IP Unicast).
>>  >> Apr  6 17:51:49 prd1 corosync[8672]:  [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>>  >> Apr  6 17:51:49 prd1 corosync[8672]:  [TOTEM ] The network interface is down.
>>  >
>>  > ^^^ This is the important line. It means corosync was unable to find
>>  > an interface for bindnetaddr 192.168.150.0. Make sure an interface
>>  > with this network address exists.
>>  >
>>  >
>> This machine has two IP addresses assigned on interface bond0:
>>
>> bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
>>      link/ether 74:e6:e2:73:e5:61 brd ff:ff:ff:ff:ff:ff
>>      inet 10.150.20.91/24 brd 10.150.20.55 scope global bond0
>>      inet 192.168.150.12/22 brd 192.168.151.255 scope global bond0:cluster
>>      inet6 fe80::76e6:e2ff:fe73:e561/64 scope link
>>         valid_lft forever preferred_lft forever
> 
> This is ifconfig output? I'm just wondering how you were able to set two
> ipv4 addresses (in this format, I would expect another interface like
> bond0:1 or nothing at all)?
> 

That is how the Linux network stack has worked for the last 10 or 15
years. A name like bond0:1 is legacy emulation for ifconfig addicts.

ip addr add 10.150.20.91/24 dev bond0
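
To illustrate that a label such as bond0:cluster is only a name attached
to an address on the same device, here is a minimal Python sketch that
parses the two inet lines quoted above. The function name parse_inet_line
is made up for this example, and the parsing assumes the `ip addr` line
layout shown in this thread, with the label as the last field.

```python
def parse_inet_line(line):
    """Split an `inet ...` line from `ip addr` output into its parts.

    Assumes the layout quoted in this thread:
    inet ADDR/PREFIX brd BROADCAST scope SCOPE LABEL
    """
    fields = line.split()
    if not fields or fields[0] != "inet":
        raise ValueError("not an inet line: %r" % line)
    address = fields[1]
    label = fields[-1]
    # Assumption: the label is either the device name itself or the
    # device name followed by a colon and a suffix.
    device = label.split(":", 1)[0]
    return {"address": address, "device": device, "label": label}


lines = [
    "inet 10.150.20.91/24 brd 10.150.20.55 scope global bond0",
    "inet 192.168.150.12/22 brd 192.168.151.255 scope global bond0:cluster",
]
parsed = [parse_inet_line(line) for line in lines]
for entry in parsed:
    print(entry["address"], "on", entry["device"], "label", entry["label"])
```

Both addresses resolve to the same device, bond0; only the label differs.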

> Anyway, I tried creating a bonding interface and setting a second ipv4
> address (via ip addr), and corosync (flatiron, which is 1.4.8 + 4 patches
> completely unrelated to your problem) was able to detect it without any
> problem.
> 
> I would recommend trying:
> - Set bindnetaddr to the IP address of the given node (so you have to
>   change bindnetaddr on both nodes)
> - Try upstream corosync 1.4.8/flatiron
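
As a reference point when picking a bindnetaddr value, the following
sketch uses Python's standard ipaddress module to work out which network
the 192.168.150.12/22 address from the quoted output belongs to, and how
the bindnetaddr 192.168.150.0 mentioned earlier relates to it. It only
does the arithmetic; whether this corosync build treats the two values
differently is not established in this thread.

```python
import ipaddress

# Address and prefix taken from the `ip addr` output quoted above.
iface = ipaddress.ip_interface("192.168.150.12/22")
# bindnetaddr value mentioned earlier in the thread.
bindnetaddr = ipaddress.ip_address("192.168.150.0")

network_address = iface.network.network_address

print("node address:   ", iface.ip)
print("network address:", network_address)
print("bindnetaddr inside that network:", bindnetaddr in iface.network)
print("bindnetaddr equals network address:", bindnetaddr == network_address)
```

With a /22 prefix the network address comes out as 192.168.148.0, so
192.168.150.0 lies inside the network but is not its network address.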
> 
> Regards,
>   Honza
> 
>>
>> And I can ping 192.168.150.12 from this machine and from other machines
>> on network
>>
>>
>>
>> -- 
>> Regards,
>>
>> Muhammad Sharfuddin
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org



