[Pacemaker] CMAN and Pacemaker with IPv6

Mon Jul 14 09:51:56 EDT 2014

> Honza,
>
> How do I include the patch with my CentOS package?
> Do I need to compile them manually?

Yes. Also, the official CentOS version was never 1.4.5. If you are using
CentOS, just use the stock 1.4.1-17.1. The patch is included there.

Honza

>
> TeEniGMa
>
> On Mon, Jul 14, 2014 at 3:21 PM, Jan Friesse <jfriesse at redhat.com> wrote:
>> Teerapatr,
>>
>>
>>> For more information, this is the log from /var/log/messages:
>>> ...
>>> Jul 14 10:28:07 wh00 kernel: : DLM (built Mar 25 2014 20:01:13) installed
>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Corosync Cluster
>>> Engine ('1.4.5'): started and ready to provide service.
>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Corosync built-in
>>> features: nss
>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Successfully read
>>> config from /etc/cluster/cluster.conf
>>> Jul 14 10:28:07 wh00 corosync[2716]:   [MAIN  ] Successfully parsed cman
>>> config
>>> Jul 14 10:28:07 wh00 corosync[2716]:   [TOTEM ] Initializing transport
>>> (UDP/IP Multicast).
>>> Jul 14 10:28:07 wh00 corosync[2716]:   [TOTEM ] Initializing
>>> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>>> Jul 14 10:28:07 wh00 corosync[2716]:   [TOTEM ] The network interface is
>>> down.
>>
>> ^^^ This line is important. It means corosync was unable to find an
>> interface with the given IPv6 address. There was a regression in v1.4.5
>> causing this behavior. It's fixed in v1.4.6 (the patch is
>> https://github.com/corosync/corosync/commit/d76759ec26ecaeb9cc01f49e9eb0749b61454d27).
>> So you can either apply the patch or (recommended) upgrade to 1.4.7.
>>
>> Regards,
>>    Honza
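
To illustrate the diagnosis above: corosync reports the interface as down
when it cannot find a local interface carrying the configured address. A
minimal Python sketch of a comparable check follows, assuming that a
successful UDP bind means the address is assigned to this host. The function
name is_local_address is hypothetical, and this is not how corosync itself
performs the lookup.

```python
import socket


def is_local_address(addr: str) -> bool:
    """Return True if a UDP socket can be bound to addr on this host.

    Assumption: binding succeeds only for addresses assigned to a local
    interface (the default unless non-local bind is enabled in the kernel).
    Link-local addresses (fe80::/10) need a scope ID and are not handled.
    """
    family = socket.AF_INET6 if ":" in addr else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as sock:
            sock.bind((addr, 0))  # port 0 lets the kernel pick a free port
        return True
    except OSError:
        # Address not assigned here, no IPv6 support, or unparsable address.
        return False


if __name__ == "__main__":
    # 2001:db8:0:1::1 is the example node address used in this thread.
    for candidate in ("2001:db8:0:1::1", "::1", "127.0.0.1"):
        print(candidate, is_local_address(candidate))
```

If the node address from cluster.conf (or /etc/hosts) yields False here, the
address is probably not configured on any interface, which would match the
"network interface is down" message regardless of the corosync version.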
>>
>>
>>
>>> Jul 14 10:28:10 wh00 pacemaker: Aborting startup of Pacemaker Cluster
>>> Manager
>>> ...
>>>
>>> Te
>>>
>>> On Mon, Jul 14, 2014 at 10:07 AM, Teerapatr Kittiratanachai
>>> <maillist.tk at gmail.com> wrote:
>>>>
>>>> Dear Honza,
>>>>
>>>> Sorry for the late reply.
>>>> I have now tested with an entirely new configuration,
>>>> on IPv6 only and with no altname.
>>>>
>>>> I get the error below:
>>>>
>>>> Starting cluster:
>>>>      Checking if cluster has been disabled at boot...        [  OK  ]
>>>>      Checking Network Manager...                             [  OK  ]
>>>>      Global setup...                                         [  OK  ]
>>>>      Loading kernel modules...                               [  OK  ]
>>>>      Mounting configfs...                                    [  OK  ]
>>>>      Starting cman... corosync died with signal: 6 Check cluster logs for
>>>> details
>>>>                                                              [FAILED]
>>>>
>>>> And, to be precise, no firewall is enabled, and I also configured the
>>>> multicast address manually.
>>>> Could you advise me on a solution?
>>>>
>>>> Many thanks in advance.
>>>> Te
>>>>
>>>> On Thu, Jul 10, 2014 at 6:14 PM, Jan Friesse <jfriesse at redhat.com> wrote:
>>>>>
>>>>> Teerapatr,
>>>>>
>>>>>> Hi Honza,
>>>>>>
>>>>>> As you said, I use a nodename identified by hostname (which is reached
>>>>>> via IPv6), and the node also has an altname (which is an IPv4 address).
>>>>>>
>>>>>
>>>>> This doesn't work. Both hostname and altname have to be the same IP version.
>>>>>
>>>>>> Now I have configured the mcast address for both nodename and altname
>>>>>> manually. CMAN and Pacemaker both start, but they don't
>>>>>> communicate with the other node.
>>>>>
>>>>>
>>>>> Please make sure (as I wrote in my previous email) that your firewall
>>>>> doesn't block mcast and corosync traffic (just disable it) and that your
>>>>> switch doesn't block multicast (this is very often the case). If these
>>>>> are VMs, make sure to properly configure the bridge (just disable the
>>>>> firewall) and allow mcast_querier.
>>>>>
>>>>> Honza
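
The rule stated above (hostname and altname must be the same IP version) and
the related requirement that the multicast address match the node address
family can be checked before starting cman. This is only a sketch using
Python's standard ipaddress module. The helper name check_families and the
addresses 192.0.2.10, ff15::1 and 239.192.0.1 are illustrative assumptions,
not values from this thread.

```python
import ipaddress


def check_families(node_addr, alt_addr=None, mcast_addr=None):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    node_version = ipaddress.ip_address(node_addr).version  # 4 or 6

    if alt_addr is not None:
        alt_version = ipaddress.ip_address(alt_addr).version
        if alt_version != node_version:
            problems.append(
                f"altname is IPv{alt_version} but nodename is IPv{node_version}"
            )

    if mcast_addr is not None:
        mcast = ipaddress.ip_address(mcast_addr)
        if not mcast.is_multicast:
            problems.append(f"{mcast_addr} is not a multicast address")
        elif mcast.version != node_version:
            problems.append(
                f"multicast is IPv{mcast.version} but nodename is IPv{node_version}"
            )

    return problems


if __name__ == "__main__":
    # IPv6 nodename with an IPv4 altname: the combination that does not work.
    print(check_families("2001:db8:0:1::1", alt_addr="192.0.2.10"))
    # IPv6 nodename with an IPv6 multicast address: consistent.
    print(check_families("2001:db8:0:1::1", mcast_addr="ff15::1"))
```

The check works on literal addresses only. A hostname would first have to be
resolved, and the result depends on which address family the resolver
returns first.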
>>>>>
>>>>>> On node0, crm_mon shows node1 offline. In the same way, node1 shows
>>>>>> node0 as down. So a split-brain problem occurs here.
>>>>>>
>>>>>> Regards,
>>>>>> Te
>>>>>>
>>>>>> On Thu, Jul 10, 2014 at 2:50 PM, Jan Friesse <jfriesse at redhat.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Teerapatr,
>>>>>>>
>>>>>>>> OK, some problems are solved.
>>>>>>>> I was using the incorrect hostname.
>>>>>>>>
>>>>>>>> Now a new problem has occurred.
>>>>>>>>
>>>>>>>>     Starting cman... Node address family does not match multicast
>>>>>>>> address family
>>>>>>>> Unable to get the configuration
>>>>>>>> Node address family does not match multicast address family
>>>>>>>> cman_tool: corosync daemon didn't start Check cluster logs for
>>>>>>>> details
>>>>>>>>                                                              [FAILED]
>>>>>>>>
>>>>>>>
>>>>>>> This looks like one of your nodes is also reachable via IPv4 and IPv4
>>>>>>> resolving is preferred. Please make sure to set only the IPv6 address
>>>>>>> and try again. Of course, setting the mcast addr by hand may be helpful
>>>>>>> (even though I don't believe it will solve the problem you are hitting).
>>>>>>>
>>>>>>> Also make sure ip6tables is properly configured and your switch is
>>>>>>> able to pass IPv6 mcast traffic.
>>>>>>>
>>>>>>> Regards,
>>>>>>>     Honza
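
One way to see whether a node name also resolves to IPv4, as suspected above,
is to list the address families the system resolver returns for it. The
sketch below uses Python's socket.getaddrinfo. The helper name
resolved_families is hypothetical, and it is an assumption that cman sees
the same resolver ordering as this call.

```python
import socket

_FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def resolved_families(name):
    """Return the address families for name, in resolver order, without duplicates."""
    seen = []
    for family, _type, _proto, _canon, _sockaddr in socket.getaddrinfo(name, None):
        label = _FAMILY_NAMES.get(family)
        if label is not None and label not in seen:
            seen.append(label)
    return seen


if __name__ == "__main__":
    # Replace "localhost" with the node name from cluster.conf.
    try:
        print(resolved_families("localhost"))
    except socket.gaierror as exc:
        print("resolution failed:", exc)
```

A result that lists IPv4 first, or at all, for a node that should be
IPv6-only suggests a leftover A record or an IPv4 entry in /etc/hosts.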
>>>>>>>
>>>>>>>> How can I fix it? Or should I just assign the multicast address in
>>>>>>>> the configuration?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Te
>>>>>>>>
>>>>>>>> On Thu, Jul 10, 2014 at 7:52 AM, Teerapatr Kittiratanachai
>>>>>>>> <maillist.tk at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> I did not find any log messages.
>>>>>>>>>
>>>>>>>>> /var/log/messages
>>>>>>>>> ...
>>>>>>>>> Jul 10 07:44:19 nwh00 kernel: : DLM (built Jun 19 2014 21:16:01)
>>>>>>>>> installed
>>>>>>>>> Jul 10 07:44:22 nwh00 pacemaker: Aborting startup of Pacemaker
>>>>>>>>> Cluster Manager
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> and this is what is displayed when I try to start pacemaker
>>>>>>>>>
>>>>>>>>> # /etc/init.d/pacemaker start
>>>>>>>>> Starting cluster:
>>>>>>>>>      Checking if cluster has been disabled at boot...        [  OK  ]
>>>>>>>>>      Checking Network Manager...                             [  OK  ]
>>>>>>>>>      Global setup...                                         [  OK  ]
>>>>>>>>>      Loading kernel modules...                               [  OK  ]
>>>>>>>>>      Mounting configfs...                                    [  OK  ]
>>>>>>>>>      Starting cman... Cannot find node name in cluster.conf
>>>>>>>>> Unable to get the configuration
>>>>>>>>> Cannot find node name in cluster.conf
>>>>>>>>> cman_tool: corosync daemon didn't start Check cluster logs for
>>>>>>>>> details
>>>>>>>>>                                                              [FAILED]
>>>>>>>>> Stopping cluster:
>>>>>>>>>      Leaving fence domain...                                 [  OK  ]
>>>>>>>>>      Stopping gfs_controld...                                [  OK  ]
>>>>>>>>>      Stopping dlm_controld...                                [  OK  ]
>>>>>>>>>      Stopping fenced...                                      [  OK  ]
>>>>>>>>>      Stopping cman...                                        [  OK  ]
>>>>>>>>>      Unloading kernel modules...                             [  OK  ]
>>>>>>>>>      Unmounting configfs...                                  [  OK  ]
>>>>>>>>> Aborting startup of Pacemaker Cluster Manager
>>>>>>>>>
>>>>>>>>> One more thing: because of this problem, I have removed the
>>>>>>>>> AAAA record from DNS for now and mapped the names in /etc/hosts
>>>>>>>>> instead, as shown below.
>>>>>>>>>
>>>>>>>>> /etc/hosts
>>>>>>>>> ...
>>>>>>>>> 2001:db8:0:1::1   node0.example.com
>>>>>>>>> 2001:db8:0:1::2   node1.example.com
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> Is there any configuration that would help me get more log output?
>>>>>>>>>
>>>>>>>>> On Thu, Jul 10, 2014 at 5:06 AM, Andrew Beekhof <andrew at beekhof.net>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9 Jul 2014, at 9:15 pm, Teerapatr Kittiratanachai
>>>>>>>>>> <maillist.tk at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Dear All,
>>>>>>>>>>>
>>>>>>>>>>> I have implemented HA on dual-stack servers.
>>>>>>>>>>> At first, I had not deployed an IPv6 record in DNS yet, and CMAN and
>>>>>>>>>>> Pacemaker worked normally.
>>>>>>>>>>> But after I created an AAAA record on the DNS server, I got an error
>>>>>>>>>>> and CMAN could not start.
>>>>>>>>>>>
>>>>>>>>>>> Do CMAN and Pacemaker support IPv6?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I don't think pacemaker cares.
>>>>>>>>>> What errors did you get?
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>>>
>>>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>>>> Getting started:
>>>>>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>>>> Bugs: http://bugs.clusterlabs.org
>>>>>>>>>>