[Pacemaker] CMAN and Pacemaker with IPv6

Wed Jul 16 21:06:01 EDT 2014

I removed it and let Corosync decide for itself which address to use.
It works well.

Teenigma

On Wed, Jul 16, 2014 at 7:05 PM, Jan Friesse <jfriesse at redhat.com> wrote:
> Teerapatr
>
>> Dear Honza,
>>
>> Sorry to say this, but I found a new error again. LOL
>>
>> This time I have installed 1.4.1-17, as you advised.
>> The nodename, without an altname, is mapped to IPv6 using the hosts file.
>> Everything is fine, but the 2 nodes can't communicate with each other.
>> So I added the multicast address manually, using the command `ccs -f
>> /etc/cluster/cluster.conf --setmulticast ff::597` on both nodes.
>> After that, CMAN cannot start.
>
> ff::597 is not a valid IPv6 multicast address (multicast addresses are in
> ff00::/8). Use something like ff3e::597.
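
A candidate address can be checked before it goes into cluster.conf. The following is only a minimal sketch using Python's standard `ipaddress` module; the helper name `is_ipv6_multicast` is made up for illustration and is not part of any cluster tool.

```python
import ipaddress


def is_ipv6_multicast(addr: str) -> bool:
    """Return True if addr is an IPv6 address inside ff00::/8."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 6 and ip.is_multicast


# Compare the address from the failing setup with the suggested one.
for candidate in ("ff::597", "ff3e::597"):
    print(candidate, is_ipv6_multicast(candidate))
```

The reason `ff::597` fails the check is that it expands to `00ff:0:0:0:0:0:0:597`, so its first byte is 0x00 rather than 0xff.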
>
>
>>
>> Starting cluster:
>>    Checking if cluster has been disabled at boot...        [  OK  ]
>>    Checking Network Manager...                             [  OK  ]
>>    Global setup...                                         [  OK  ]
>>    Loading kernel modules...                               [  OK  ]
>>    Mounting configfs...                                    [  OK  ]
>>    Starting cman... Timed-out waiting for cluster Check cluster logs for details
>>                                                            [FAILED]
>>
>> I also found a lot of log messages, but I think this is where the problem occurs.
>>
>> Jul 15 13:36:14 corosync [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
>> Jul 15 13:36:14 corosync [MAIN  ] Corosync built-in features: nss dbus rdma snmp
>> Jul 15 13:36:14 corosync [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
>> Jul 15 13:36:14 corosync [MAIN  ] Successfully parsed cman config
>> Jul 15 13:36:14 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
>> Jul 15 13:36:14 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>> Jul 15 13:36:14 corosync [TOTEM ] Unable to bind the socket to receive multicast packets: Cannot assign requested address (99)
>> Jul 15 13:36:14 corosync [TOTEM ] Could not set traffic priority: Socket operation on non-socket (88)
>> Jul 15 13:36:14 corosync [TOTEM ] The network interface [2001:db8::151] is now up.
>> Jul 15 13:36:14 corosync [QUORUM] Using quorum provider quorum_cman
>> Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
>> Jul 15 13:36:14 corosync [CMAN  ] CMAN 3.0.12.1 (built Apr 14 2014 09:36:10) started
>> Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync CMAN membership service 2.90
>> Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: openais checkpoint service B.01.01
>> Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync extended virtual synchrony service
>> Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync configuration service
>> Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
>> Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync cluster config database access v1.01
>> Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync profile loading service
>> Jul 15 13:36:14 corosync [QUORUM] Using quorum provider quorum_cman
>> Jul 15 13:36:14 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
>> Jul 15 13:36:14 corosync [MAIN  ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
>> Jul 15 13:36:17 corosync [MAIN  ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
>> Jul 15 13:36:19 corosync [MAIN  ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
>> Jul 15 13:36:20 corosync [MAIN  ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
>>
>> I cannot find a solution on the Internet for "[TOTEM ] Unable to bind
>> the socket to receive multicast packets: Cannot assign requested
>> address (99)".
>> Do you have any idea?
>>
>> Teenigma
>>
>> On Tue, Jul 15, 2014 at 10:02 AM, Teerapatr Kittiratanachai
>> <maillist.tk at gmail.com> wrote:
>>> Honza
>>>
>>> Great, thank you very much.
>>>
>>> But the terrible thing for me is that I'm using the package from the OpenSUSE repo.
>>> When I switch back to the CentOS repo, which has a lower version, a
>>> dependency problem occurs.
>>>
>>> Anyway, thank you for your help.
>>>
>>> Teenigma
>>>
>>> On Mon, Jul 14, 2014 at 8:51 PM, Jan Friesse <jfriesse at redhat.com> wrote:
>>>>> Honza,
>>>>>
>>>>> How do I include the patch with my CentOS package?
>>>>> Do I need to compile them manually?
>>>>
>>>>
>>>> Yes. Also, the official CentOS version was never 1.4.5. If you are using CentOS,
>>>> just use the stock 1.4.1-17.1. The patch is included there.
>>>>
>>>> Honza
>>>>
>>>>
>>>>>
>>>>> TeEniGMa
>>>>>
>>>>> On Mon, Jul 14, 2014 at 3:21 PM, Jan Friesse <jfriesse at redhat.com> wrote:
>>>>>>
>>>>>> Teerapatr,
>>>>>>
>>>>>>
>>>>>>> For more information,
>>>>>>>
>>>>>>>
>>>>>>> these are the logs from /var/log/messages
>>>>>>> ...
>>>>>>> Jul 14 10:28:07 wh00 kernel: : DLM (built Mar 25 2014 20:01:13) installed
>>>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Corosync Cluster Engine ('1.4.5'): started and ready to provide service.
>>>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Corosync built-in features: nss
>>>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
>>>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Successfully parsed cman config
>>>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [TOTEM ] Initializing transport (UDP/IP Multicast).
>>>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>>>>>>> Jul 14 10:28:07 wh00 corosync[2716]:   [TOTEM ] The network interface is down.
>>>>>>
>>>>>>
>>>>>> ^^^ This line is important. It means corosync was unable to find an
>>>>>> interface with the given IPv6 address. There was a regression in v1.4.5 causing
>>>>>> this behavior. It's fixed in v1.4.6 (patch is
>>>>>>
>>>>>> https://github.com/corosync/corosync/commit/d76759ec26ecaeb9cc01f49e9eb0749b61454d27).
>>>>>> So you can either apply the patch or (recommended) upgrade to 1.4.7.
>>>>>>
>>>>>> Regards,
>>>>>>    Honza
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Jul 14 10:28:10 wh00 pacemaker: Aborting startup of Pacemaker Cluster Manager
>>>>>>> ...
>>>>>>>
>>>>>>> Te
>>>>>>>
>>>>>>> On Mon, Jul 14, 2014 at 10:07 AM, Teerapatr Kittiratanachai
>>>>>>> <maillist.tk at gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Dear Honza,
>>>>>>>>
>>>>>>>> Sorry for the late reply.
>>>>>>>> I have now tested with an all-new configuration,
>>>>>>>> on IPv6 only and with no altname.
>>>>>>>>
>>>>>>>> I get the error below:
>>>>>>>>
>>>>>>>> Starting cluster:
>>>>>>>>      Checking if cluster has been disabled at boot...        [  OK  ]
>>>>>>>>      Checking Network Manager...                             [  OK  ]
>>>>>>>>      Global setup...                                         [  OK  ]
>>>>>>>>      Loading kernel modules...                               [  OK  ]
>>>>>>>>      Mounting configfs...                                    [  OK  ]
>>>>>>>>      Starting cman... corosync died with signal: 6 Check cluster logs for details
>>>>>>>>                                                              [FAILED]
>>>>>>>>
>>>>>>>> And there is definitely no firewall enabled; I also configured the
>>>>>>>> multicast address manually.
>>>>>>>> Could you advise me on a solution?
>>>>>>>>
>>>>>>>> Many thanks in advance.
>>>>>>>> Te
>>>>>>>>
>>>>>>>> On Thu, Jul 10, 2014 at 6:14 PM, Jan Friesse <jfriesse at redhat.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Teerapatr,
>>>>>>>>>
>>>>>>>>>> Hi Honza,
>>>>>>>>>>
>>>>>>>>>> As you said, I use a nodename identified by hostname (which is
>>>>>>>>>> reached via IPv6), and the node also has an altname (which is an IPv4
>>>>>>>>>> address).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This doesn't work. Both hostname and altname have to be the same IP
>>>>>>>>> version.
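
The same-version rule can be checked on the resolved addresses before starting CMAN. This is a rough sketch with Python's standard `ipaddress` module; the function name `same_ip_version` and the sample addresses (taken from documentation ranges) are assumptions for illustration only.

```python
import ipaddress


def same_ip_version(*addrs: str) -> bool:
    """Return True if all given address literals share one IP version."""
    versions = {ipaddress.ip_address(a).version for a in addrs}
    return len(versions) == 1


# Hypothetical nodename/altname addresses: IPv6 + IPv4 is the failing mix.
print(same_ip_version("2001:db8:0:1::1", "192.0.2.10"))
print(same_ip_version("2001:db8:0:1::1", "2001:db8:0:2::1"))
```

The function takes address literals rather than hostnames so that the check does not depend on how the resolver orders IPv4 and IPv6 results.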
>>>>>>>>>
>>>>>>>>>> Now I have configured the mcast address for both nodename and altname
>>>>>>>>>> manually. CMAN and Pacemaker start fine, but they don't
>>>>>>>>>> communicate with the other node.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Please make sure (as I wrote in my previous email) that your firewall
>>>>>>>>> doesn't block mcast and corosync traffic (just disable it) and that the
>>>>>>>>> switch doesn't block multicast (this is very often the case). If these
>>>>>>>>> are VMs, make sure to properly configure the bridge (just disable the
>>>>>>>>> firewall) and allow mcast_querier.
>>>>>>>>>
>>>>>>>>> Honza
>>>>>>>>>
>>>>>>>>>> On node0, crm_mon shows node1 offline. In the same way, node1 shows
>>>>>>>>>> node0 as down. So a split-brain problem occurs here.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Te
>>>>>>>>>>
>>>>>>>>>> On Thu, Jul 10, 2014 at 2:50 PM, Jan Friesse <jfriesse at redhat.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Teerapatr,
>>>>>>>>>>>
>>>>>>>>>>>> OK, some problems are solved.
>>>>>>>>>>>> I was using the incorrect hostname.
>>>>>>>>>>>>
>>>>>>>>>>>> Now a new problem has occurred.
>>>>>>>>>>>>
>>>>>>>>>>>>     Starting cman... Node address family does not match multicast address family
>>>>>>>>>>>> Unable to get the configuration
>>>>>>>>>>>> Node address family does not match multicast address family
>>>>>>>>>>>> cman_tool: corosync daemon didn't start Check cluster logs for details
>>>>>>>>>>>>
>>>>>>>>>>>> [FAILED]
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This looks like one of your nodes is also reachable via IPv4 and IPv4
>>>>>>>>>>> resolving is preferred. Please make sure to set only an IPv6 address
>>>>>>>>>>> and try again. Of course, setting the mcast addr by hand may be helpful
>>>>>>>>>>> (even though I don't believe it will solve the problem you are hitting).
>>>>>>>>>>>
>>>>>>>>>>> Also make sure ip6tables is properly configured and your switch is
>>>>>>>>>>> able to pass IPv6 mcast traffic.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>     Honza
>>>>>>>>>>>
>>>>>>>>>>>> How can I fix it? Or should I just assign the multicast address in the
>>>>>>>>>>>> configuration?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Te
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jul 10, 2014 at 7:52 AM, Teerapatr Kittiratanachai
>>>>>>>>>>>> <maillist.tk at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I did not find any log message
>>>>>>>>>>>>>
>>>>>>>>>>>>> /var/log/messages
>>>>>>>>>>>>> ...
>>>>>>>>>>>>> Jul 10 07:44:19 nwh00 kernel: : DLM (built Jun 19 2014 21:16:01) installed
>>>>>>>>>>>>> Jul 10 07:44:22 nwh00 pacemaker: Aborting startup of Pacemaker Cluster Manager
>>>>>>>>>>>>> ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> and this is what is displayed when I try to start pacemaker
>>>>>>>>>>>>>
>>>>>>>>>>>>> # /etc/init.d/pacemaker start
>>>>>>>>>>>>> Starting cluster:
>>>>>>>>>>>>>      Checking if cluster has been disabled at boot...        [  OK  ]
>>>>>>>>>>>>>      Checking Network Manager...                             [  OK  ]
>>>>>>>>>>>>>      Global setup...                                         [  OK  ]
>>>>>>>>>>>>>      Loading kernel modules...                               [  OK  ]
>>>>>>>>>>>>>      Mounting configfs...                                    [  OK  ]
>>>>>>>>>>>>>      Starting cman... Cannot find node name in cluster.conf
>>>>>>>>>>>>> Unable to get the configuration
>>>>>>>>>>>>> Cannot find node name in cluster.conf
>>>>>>>>>>>>> cman_tool: corosync daemon didn't start Check cluster logs for details
>>>>>>>>>>>>>
>>>>>>>>>>>>> [FAILED]
>>>>>>>>>>>>> Stopping cluster:
>>>>>>>>>>>>>      Leaving fence domain...                                 [  OK  ]
>>>>>>>>>>>>>      Stopping gfs_controld...                                [  OK  ]
>>>>>>>>>>>>>      Stopping dlm_controld...                                [  OK  ]
>>>>>>>>>>>>>      Stopping fenced...                                      [  OK  ]
>>>>>>>>>>>>>      Stopping cman...                                        [  OK  ]
>>>>>>>>>>>>>      Unloading kernel modules...                             [  OK  ]
>>>>>>>>>>>>>      Unmounting configfs...                                  [  OK  ]
>>>>>>>>>>>>> Aborting startup of Pacemaker Cluster Manager
>>>>>>>>>>>>>
>>>>>>>>>>>>> One more thing: because of this problem, I have removed the
>>>>>>>>>>>>> AAAA record from DNS for now and mapped it in the /etc/hosts file
>>>>>>>>>>>>> instead, as shown below.
>>>>>>>>>>>>>
>>>>>>>>>>>>> /etc/hosts
>>>>>>>>>>>>> ...
>>>>>>>>>>>>> 2001:db8:0:1::1   node0.example.com
>>>>>>>>>>>>> 2001:db8:0:1::2   node1.example.com
>>>>>>>>>>>>> ...
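
To confirm that each node name in a hosts file maps to only one IP version, something like the following sketch could be used. It works on the file's text with Python's standard library; the function name `ip_versions_by_name` is hypothetical, and the sample text simply repeats the two entries shown above.

```python
import ipaddress
from collections import defaultdict


def ip_versions_by_name(hosts_text: str) -> dict:
    """Map each hostname in hosts-file text to the set of IP versions it has."""
    versions = defaultdict(set)
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            version = ipaddress.ip_address(fields[0]).version
        except ValueError:
            continue  # first field is not an address literal; skip the line
        for name in fields[1:]:
            versions[name].add(version)
    return dict(versions)


sample = """
2001:db8:0:1::1   node0.example.com
2001:db8:0:1::2   node1.example.com
"""
print(ip_versions_by_name(sample))
```

A name whose set contains both 4 and 6 is reachable over both address families, which is the situation that can lead to the address family mismatch discussed in this thread.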
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there any configuration that would help me get more logs?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jul 10, 2014 at 5:06 AM, Andrew Beekhof
>>>>>>>>>>>>> <andrew at beekhof.net>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 9 Jul 2014, at 9:15 pm, Teerapatr Kittiratanachai
>>>>>>>>>>>>>> <maillist.tk at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dear All,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have implemented HA on dual-stack servers.
>>>>>>>>>>>>>>> At first, I had not yet deployed IPv6 records in DNS, and CMAN and
>>>>>>>>>>>>>>> Pacemaker worked normally.
>>>>>>>>>>>>>>> But after I created an AAAA record on the DNS server, I got an error
>>>>>>>>>>>>>>> and CMAN could not start.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do CMAN and Pacemaker support IPv6?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't think pacemaker cares.
>>>>>>>>>>>>>> What errors did you get?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>>>>>>>> Getting started:
>>>>>>>>>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>>>>>>>> Bugs: http://bugs.clusterlabs.org
>>>>>>>>>>>>>>