[Pacemaker] CMAN and Pacemaker with IPv6

Teerapatr Kittiratanachai maillist.tk at gmail.com
Tue Jul 15 06:53:25 UTC 2014


Dear Honza,

Sorry to say this, but I have found a new error again. LOL

This time I have installed 1.4.1-17 as you advised,
and the node name, without an altname, is mapped to an IPv6 address via the hosts file.
Everything was fine, except that the two nodes couldn't communicate with each other.
So I set the multicast address manually, using the command `ccs -f
/etc/cluster/cluster.conf --setmulticast ff::597` on both nodes.
After that, CMAN cannot start:

Starting cluster:
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... Timed-out waiting for cluster Check cluster logs for details
                                                           [FAILED]
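One thing that may be worth double-checking here is the multicast address itself: in shortened IPv6 notation the `::` expands with leading zeros, so `ff::597` actually means `00ff::597`, which lies outside the IPv6 multicast range `ff00::/8`. A quick sanity check (just a diagnostic sketch using Python's standard library, not part of any cluster tooling; `ff05::597` is shown only as an example of an address that *is* multicast):

```python
# Diagnostic sketch: check whether an IPv6 address really falls in the
# multicast range (ff00::/8). Standard library only.
import ipaddress

for addr in ("ff::597", "ff05::597"):
    ip = ipaddress.IPv6Address(addr)
    # .exploded shows the full 8-group form; "::" expands with leading
    # zeros, so "ff::597" is really 00ff:0000:...:0597.
    print(addr, "->", ip.exploded, "multicast:", ip.is_multicast)
```

An address outside `ff00::/8` is treated by the kernel as an ordinary unicast address, not a multicast group.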

There is a lot in the logs, but I think this is where the problem occurs:

Jul 15 13:36:14 corosync [MAIN  ] Corosync Cluster Engine ('1.4.1'):
started and ready to provide service.
Jul 15 13:36:14 corosync [MAIN  ] Corosync built-in features: nss dbus rdma snmp
Jul 15 13:36:14 corosync [MAIN  ] Successfully read config from
/etc/cluster/cluster.conf
Jul 15 13:36:14 corosync [MAIN  ] Successfully parsed cman config
Jul 15 13:36:14 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Jul 15 13:36:14 corosync [TOTEM ] Initializing transmit/receive
security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jul 15 13:36:14 corosync [TOTEM ] Unable to bind the socket to receive
multicast packets: Cannot assign requested address (99)
Jul 15 13:36:14 corosync [TOTEM ] Could not set traffic priority:
Socket operation on non-socket (88)
Jul 15 13:36:14 corosync [TOTEM ] The network interface
[2001:db8::151] is now up.
Jul 15 13:36:14 corosync [QUORUM] Using quorum provider quorum_cman
Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync
cluster quorum service v0.1
Jul 15 13:36:14 corosync [CMAN  ] CMAN 3.0.12.1 (built Apr 14 2014
09:36:10) started
Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync CMAN
membership service 2.90
Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: openais
checkpoint service B.01.01
Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync
extended virtual synchrony service
Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync
configuration service
Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync
cluster closed process group service v1.01
Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync
cluster config database access v1.01
Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync
profile loading service
Jul 15 13:36:14 corosync [QUORUM] Using quorum provider quorum_cman
Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync
cluster quorum service v0.1
Jul 15 13:36:14 corosync [MAIN  ] Compatibility mode set to whitetank.
Using V1 and V2 of the synchronization engine.
Jul 15 13:36:17 corosync [MAIN  ] Totem is unable to form a cluster
because of an operating system or network fault. The most common cause
of this message is that the local firewall is configured improperly.
Jul 15 13:36:19 corosync [MAIN  ] Totem is unable to form a cluster
because of an operating system or network fault. The most common cause
of this message is that the local firewall is configured improperly.
Jul 15 13:36:20 corosync [MAIN  ] Totem is unable to form a cluster
because of an operating system or network fault. The most common cause
of this message is that the local firewall is configured improperly.

I cannot find a solution on the Internet for "[TOTEM ] Unable to bind
the socket to receive multicast packets: Cannot assign requested
address (99)".
Do you have any ideas?
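For reference, errno 99 is EADDRNOTAVAIL, which bind() returns when the requested address is neither a multicast group nor an address configured on a local interface. The failure can be reproduced outside corosync with a small sketch (assumptions: Linux with IPv6 enabled, Python 3; the port here is illustrative only, corosync itself uses 5405 by default):

```python
# Sketch reproducing "Cannot assign requested address (99)": binding a
# UDP socket to an IPv6 address that is neither multicast nor assigned
# to a local interface fails with EADDRNOTAVAIL, just like the corosync
# log above. Assumes Linux with IPv6 enabled.
import errno
import socket

def try_bind(addr: str, port: int = 0) -> int:
    """Return 0 if bind() succeeds, otherwise the errno it failed with."""
    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    try:
        s.bind((addr, port))
        return 0
    except OSError as e:
        return e.errno
    finally:
        s.close()

# "ff::597" is really the unicast address 00ff::597, which is not on
# any local interface, so the kernel refuses the bind (EADDRNOTAVAIL).
print(try_bind("ff::597"))
# The wildcard address binds fine, so the socket layer itself is OK.
print(try_bind("::"))
```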

Teenigma

On Tue, Jul 15, 2014 at 10:02 AM, Teerapatr Kittiratanachai
<maillist.tk at gmail.com> wrote:
> Honza
>
> Great, Thank you very much.
>
> But the terrible thing for me is that I'm using the package from the OpenSUSE repo.
> When I turned back to the CentOS repo, which carries an older version, a
> dependency problem occurred.
>
> Anyway, thank you for your help.
>
> Teenigma
>
> On Mon, Jul 14, 2014 at 8:51 PM, Jan Friesse <jfriesse at redhat.com> wrote:
>>> Honza,
>>>
>>> How do I include the patch with my CentOS package?
>>> Do I need to compile them manually?
>>
>>
>> Yes. Also, the official CentOS version was never 1.4.5. If you are using CentOS,
>> just use the stock 1.4.1-17.1. The patch is included there.
>>
>> Honza
>>
>>
>>>
>>> TeEniGMa
>>>
>>> On Mon, Jul 14, 2014 at 3:21 PM, Jan Friesse <jfriesse at redhat.com> wrote:
>>>>
>>>> Teerapatr,
>>>>
>>>>
>>>>> For more information,
>>>>>
>>>>>
>>>>> these are LOG from /var/log/messages
>>>>> ...
>>>>> Jul 14 10:28:07 wh00 kernel: : DLM (built Mar 25 2014 20:01:13)
>>>>> installed
>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Corosync Cluster
>>>>> Engine ('1.4.5'): started and ready to provide service.
>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Corosync built-in
>>>>> features: nss
>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Successfully read
>>>>> config from /etc/cluster/cluster.conf
>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Successfully parsed cman
>>>>> config
>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [TOTEM ] Initializing transport
>>>>> (UDP/IP Multicast).
>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [TOTEM ] Initializing
>>>>> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [TOTEM ] The network interface is
>>>>> down.
>>>>
>>>>
>>>> ^^^ This line is important. It means corosync was unable to find an
>>>> interface with the given IPv6 address. There was a regression in v1.4.5 causing
>>>> this behavior. It's fixed in v1.4.6 (the patch is
>>>>
>>>> https://github.com/corosync/corosync/commit/d76759ec26ecaeb9cc01f49e9eb0749b61454d27).
>>>> So you can either apply the patch or (recommended) upgrade to 1.4.7.
>>>>
>>>> Regards,
>>>>    Honza
>>>>
>>>>
>>>>
>>>>> Jul 14 10:28:10 wh00 pacemaker: Aborting startup of Pacemaker Cluster
>>>>> Manager
>>>>> ...
>>>>>
>>>>> Te
>>>>>
>>>>> On Mon, Jul 14, 2014 at 10:07 AM, Teerapatr Kittiratanachai
>>>>> <maillist.tk at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> Dear Honza,
>>>>>>
>>>>>> Sorry for the late reply.
>>>>>> I have now tested with a completely new configuration:
>>>>>> IPv6 only, and with no altname.
>>>>>>
>>>>>> I face with error below,
>>>>>>
>>>>>> Starting cluster:
>>>>>>      Checking if cluster has been disabled at boot...        [  OK  ]
>>>>>>      Checking Network Manager...                             [  OK  ]
>>>>>>      Global setup...                                         [  OK  ]
>>>>>>      Loading kernel modules...                               [  OK  ]
>>>>>>      Mounting configfs...                                    [  OK  ]
>>>>>>      Starting cman... corosync died with signal: 6 Check cluster logs
>>>>>> for
>>>>>> details
>>>>>>                                                              [FAILED]
>>>>>>
>>>>>> And, to be clear, there is no firewall enabled; I have also configured the
>>>>>> multicast address manually.
>>>>>> Could you advise me on a solution?
>>>>>>
>>>>>> Many thanks in advance.
>>>>>> Te
>>>>>>
>>>>>> On Thu, Jul 10, 2014 at 6:14 PM, Jan Friesse <jfriesse at redhat.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Teerapatr,
>>>>>>>
>>>>>>>> Hi Honza,
>>>>>>>>
>>>>>>>> As you said, I use the node name identified by hostname (which is
>>>>>>>> accessed via IPv6), and the node also has an altname (which is an IPv4 address).
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> This doesn't work. Both the hostname and the altname have to be the same IP
>>>>>>> version.
>>>>>>>
>>>>>>>> Now I configure the mcast address for both the nodename and the altname
>>>>>>>> manually. CMAN and Pacemaker can start as well, but they don't
>>>>>>>> communicate with the other node.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Please make sure (as I wrote in a previous email) that your firewall
>>>>>>> doesn't block mcast and corosync traffic (just disable it) and that the
>>>>>>> switch doesn't block multicast (this is very often the case). If these are
>>>>>>> VMs, make sure to properly configure the bridge (just disable the firewall)
>>>>>>> and allow mcast_querier.
>>>>>>>
>>>>>>> Honza
>>>>>>>
>>>>>>>> On node0, crm_mon shows node1 offline. In the same way, node1 shows
>>>>>>>> node0 as down, so a split-brain problem occurs here.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Te
>>>>>>>>
>>>>>>>> On Thu, Jul 10, 2014 at 2:50 PM, Jan Friesse <jfriesse at redhat.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Teerapatr,
>>>>>>>>>
>>>>>>>>>> OK, some problems are solved.
>>>>>>>>>> I was using an incorrect hostname.
>>>>>>>>>>
>>>>>>>>>> Now a new problem has occurred.
>>>>>>>>>>
>>>>>>>>>>     Starting cman... Node address family does not match multicast
>>>>>>>>>> address family
>>>>>>>>>> Unable to get the configuration
>>>>>>>>>> Node address family does not match multicast address family
>>>>>>>>>> cman_tool: corosync daemon didn't start Check cluster logs for
>>>>>>>>>> details
>>>>>>>>>>
>>>>>>>>>> [FAILED]
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This looks like one of your nodes is also reachable via IPv4 and IPv4
>>>>>>>>> resolution is preferred. Please make sure to set only an IPv6 address
>>>>>>>>> and try it again. Of course, setting the mcast addr by hand may be
>>>>>>>>> helpful (even though I don't believe it will solve the problem you are
>>>>>>>>> hitting).
>>>>>>>>>
>>>>>>>>> Also make sure ip6tables are properly configured and your switch is
>>>>>>>>> able
>>>>>>>>> to pass ipv6 mcast traffic.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>     Honza
>>>>>>>>>
>>>>>>>>>> How can I fix it? Or should I just assign the multicast address in the
>>>>>>>>>> configuration?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Te
>>>>>>>>>>
>>>>>>>>>> On Thu, Jul 10, 2014 at 7:52 AM, Teerapatr Kittiratanachai
>>>>>>>>>> <maillist.tk at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I did not find much in the log messages:
>>>>>>>>>>>
>>>>>>>>>>> /var/log/messages
>>>>>>>>>>> ...
>>>>>>>>>>> Jul 10 07:44:19 nwh00 kernel: : DLM (built Jun 19 2014 21:16:01)
>>>>>>>>>>> installed
>>>>>>>>>>> Jul 10 07:44:22 nwh00 pacemaker: Aborting startup of Pacemaker
>>>>>>>>>>> Cluster Manager
>>>>>>>>>>> ...
>>>>>>>>>>>
>>>>>>>>>>> and this is what display when I try to start pacemaker
>>>>>>>>>>>
>>>>>>>>>>> # /etc/init.d/pacemaker start
>>>>>>>>>>> Starting cluster:
>>>>>>>>>>>      Checking if cluster has been disabled at boot...        [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Checking Network Manager...                             [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Global setup...                                         [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Loading kernel modules...                               [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Mounting configfs...                                    [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Starting cman... Cannot find node name in cluster.conf
>>>>>>>>>>> Unable to get the configuration
>>>>>>>>>>> Cannot find node name in cluster.conf
>>>>>>>>>>> cman_tool: corosync daemon didn't start Check cluster logs for
>>>>>>>>>>> details
>>>>>>>>>>>
>>>>>>>>>>> [FAILED]
>>>>>>>>>>> Stopping cluster:
>>>>>>>>>>>      Leaving fence domain...                                 [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Stopping gfs_controld...                                [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Stopping dlm_controld...                                [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Stopping fenced...                                      [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Stopping cman...                                        [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Unloading kernel modules...                             [  OK
>>>>>>>>>>> ]
>>>>>>>>>>>      Unmounting configfs...                                  [  OK
>>>>>>>>>>> ]
>>>>>>>>>>> Aborting startup of Pacemaker Cluster Manager
>>>>>>>>>>>
>>>>>>>>>>> One more thing: because of this problem, I have removed the
>>>>>>>>>>> AAAA records from DNS for now and mapped the names in the /etc/hosts
>>>>>>>>>>> file instead, as shown below.
>>>>>>>>>>>
>>>>>>>>>>> /etc/hosts
>>>>>>>>>>> ...
>>>>>>>>>>> 2001:db8:0:1::1   node0.example.com
>>>>>>>>>>> 2001:db8:0:1::2   node1.example.com
>>>>>>>>>>> ...
>>>>>>>>>>>
>>>>>>>>>>> Is there any configuration that would help me get more logs?
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jul 10, 2014 at 5:06 AM, Andrew Beekhof
>>>>>>>>>>> <andrew at beekhof.net>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 9 Jul 2014, at 9:15 pm, Teerapatr Kittiratanachai
>>>>>>>>>>>> <maillist.tk at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear All,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have implemented HA on dual-stack servers.
>>>>>>>>>>>>> At first I had not deployed IPv6 records in DNS yet, and CMAN and
>>>>>>>>>>>>> Pacemaker worked as normal.
>>>>>>>>>>>>> But after I created AAAA records on the DNS server, I found that
>>>>>>>>>>>>> CMAN could not start.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do CMAN and Pacemaker support IPv6?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I don't think pacemaker cares.
>>>>>>>>>>>> What errors did you get?
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>>>>>
>>>>>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>>>>>> Getting started:
>>>>>>>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>>>>>> Bugs: http://bugs.clusterlabs.org
>>>>>>>>>>>>



