[ClusterLabs] getting "Totem is unable to form a cluster" error
Jan Friesse
jfriesse at redhat.com
Fri Apr 8 14:51:23 UTC 2016
> On 04/08/16 13:01, Jan Friesse wrote:
> >> pacemaker 1.1.12-11.12
> >> openais 1.1.4-5.24.5
> >> corosync 1.4.7-0.23.5
> >>
> >> It's a two-node active/passive cluster and we just upgraded SLES 11
> >> SP 3 to SLES 11 SP 4 (nothing else), but when we try to start the
> >> cluster service we get the following error:
> >>
> >> "Totem is unable to form a cluster because of an operating system or
> >> network fault."
> >>
> >> Firewall is stopped and disabled on both the nodes. Both nodes can
> >> ping/ssh/vnc each other.
> >
> > Hard to help. First of all, I would recommend asking SUSE support,
> > because I don't have access to the source code of the corosync
> > 1.4.7-0.23.5 package, so I don't know what patches have been added.
> >
> >
> Yup, ticket opened with SUSE Support.
>
> >>
> >>
> >>
> >> /var/log/messages:
> >> Apr 6 17:51:49 prd1 corosync[8672]: [MAIN ] Corosync Cluster Engine
> >> ('1.4.7'): started and ready to provide service.
> >> Apr 6 17:51:49 prd1 corosync[8672]: [MAIN ] Corosync built-in
> >> features: nss
> >> Apr 6 17:51:49 prd1 corosync[8672]: [MAIN ] Successfully configured
> >> openais services to load
> >> Apr 6 17:51:49 prd1 corosync[8672]: [MAIN ] Successfully read main
> >> configuration file '/etc/corosync/corosync.conf'.
> >> Apr 6 17:51:49 prd1 corosync[8672]: [TOTEM ] Initializing transport
> >> (UDP/IP Unicast).
> >> Apr 6 17:51:49 prd1 corosync[8672]: [TOTEM ] Initializing
> >> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> >> Apr 6 17:51:49 prd1 corosync[8672]: [TOTEM ] The network interface is
> >> down.
> >
> > ^^^ This is the important line. It means corosync was unable to find
> > an interface for bindnetaddr 192.168.150.0. Make sure an interface
> > with this network address exists.
> >
> >
> This machine has two IP addresses assigned on interface bond0:
>
> bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
> link/ether 74:e6:e2:73:e5:61 brd ff:ff:ff:ff:ff:ff
> inet 10.150.20.91/24 brd 10.150.20.55 scope global bond0
> inet 192.168.150.12/22 brd 192.168.151.255 scope global bond0:cluster
> inet6 fe80::76e6:e2ff:fe73:e561/64 scope link
> valid_lft forever preferred_lft forever
Is this ifconfig output? I'm just wondering how you were able to set two
IPv4 addresses (in this format, I would expect another interface like
bond0:1, or nothing at all).
Anyway, I tried creating a bonding interface and setting a second IPv4
address (via ip addr), and corosync (flatiron, which is 1.4.8 plus 4
patches completely unrelated to your problem) was able to detect it
without any problem.
I recommend trying the following:
- Set bindnetaddr to the IP address of the given node (so you have to
  change bindnetaddr on both nodes)
- Try upstream corosync 1.4.8/flatiron
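As a quick sanity check before changing bindnetaddr, the network address
that each configured address/prefix pair belongs to can be computed and
compared with the configured value. The sketch below uses Python's
standard ipaddress module and the two addresses from the bond0 output
quoted above. The helper name network_address is made up for
illustration, and the assumption that corosync matches bindnetaddr
against the address masked by the interface netmask is not confirmed
for the SUSE 1.4.7-0.23.5 package.

```python
import ipaddress

# bindnetaddr value mentioned earlier in this thread
BINDNETADDR = "192.168.150.0"

def network_address(cidr: str) -> str:
    """Return the network address for an interface address in CIDR form.

    Hypothetical helper for illustration only; it is not part of corosync.
    """
    return str(ipaddress.ip_interface(cidr).network.network_address)

# Addresses as shown on bond0 in the quoted output
for cidr in ("10.150.20.91/24", "192.168.150.12/22"):
    net = network_address(cidr)
    status = "matches" if net == BINDNETADDR else "does not match"
    print(f"{cidr}: network {net} {status} bindnetaddr {BINDNETADDR}")
```

With a /22 prefix, 192.168.150.12 falls in network 192.168.148.0, not
192.168.150.0, so neither address on bond0 yields the configured
bindnetaddr under this assumption. That may be worth checking alongside
the two suggestions above.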
Regards,
Honza
>
> And I can ping 192.168.150.12 from this machine and from other machines
> on network
>
>
>
> --
> Regards,
>
> Muhammad Sharfuddin
>