[Pacemaker] unknown third node added to a 2 node cluster?
Andrew Beekhof
andrew at beekhof.net
Sun Oct 12 21:51:35 EDT 2014
On 11 Oct 2014, at 1:35 am, Brian J. Murrell (brian) <brian at interlinx.bc.ca> wrote:
> On Wed, 2014-10-08 at 12:39 +1100, Andrew Beekhof wrote:
>> On 8 Oct 2014, at 2:09 am, Brian J. Murrell (brian) <brian-SquOHqY54CVWr29BmMi2cA at public.gmane.org> wrote:
>>
>>> Given a 2 node pacemaker-1.1.10-14.el6_5.3 cluster with nodes "node5"
>>> and "node6" I saw an "unknown" third node being added to the cluster,
>>> but only on node5:
>>
>> Is either node using dhcp?
>
> Yes, they both are. The server is the ISC DHCP server (on EL6) and the
> address pool is much more plentiful than the node count. That is all
> just to say that the DHCP server serving these nodes abides by the DHCP
> RFC's recommendation to allow clients to continue to use addresses they
> have already been assigned when making a renewal request, and indeed to
> give them the same address they had previously after a lease expiry, as
> long as the pool is not constrained and the address is not needed to
> satisfy a request from a different machine.
>
>> I would guess node6 got a new IP address
>
> These nodes are using the ISC DHCP client. That DHCP client logs in the
> same log (/var/log/messages) as was posted in my prior message when it
> renews a lease with messages such as:
>
> Oct 10 05:56:19 node6 dhclient[1026]: DHCPREQUEST on eth0 to 10.14.80.6 port 67 (xid=0x4f11c576)
> Oct 10 05:56:19 node6 dhclient[1026]: DHCPACK from 10.14.80.6 (xid=0x4f11c576)
> Oct 10 05:56:20 node6 dhclient[1026]: bound to 10.14.82.141 -- renewal in 8546 seconds.
>
> In the logs I quoted in my previous message, no such messages appear,
> because the nodes are not left up long enough to even get to a lease
> expiry. These are test nodes and so are rebooted frequently.
>
> TL;DR: I am quite certain the node did not get a new/different address.
Even the same address can be a problem. That brief window where things were getting renewed can screw up corosync.
Never ever use dhcp for a cluster node. Ever. Really, never.
>
>> (or that corosync decided to bind to a different one)
>
> Bind to a different what? Address?
Yes. That is what nodeids are calculated from.
Different nodeid == different address
> As in binding to an address that
> was not even configured on the machine?
localhost is the most common one
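
To illustrate why a different bind address shows up as a different node, here is a rough sketch in Python. It assumes no nodeid is set explicitly in corosync.conf and that the auto-generated nodeid is simply the 32-bit integer value of the IPv4 address corosync bound to. The function name nodeid_from_ipv4 is made up for this example, and the real byte order and high-bit handling inside corosync may differ, so treat the numbers as illustrative only.

```python
import ipaddress


def nodeid_from_ipv4(address: str) -> int:
    """Return the IPv4 address as an unsigned 32-bit integer (big-endian).

    Assumption: this stands in for an auto-generated nodeid. corosync's
    actual byte order and high-bit handling may differ.
    """
    return int(ipaddress.IPv4Address(address))


if __name__ == "__main__":
    # The DHCP-assigned address from the quoted log, and loopback.
    for addr in ("10.14.82.141", "127.0.0.1"):
        print(addr, "->", nodeid_from_ipv4(addr))
```

Under that assumption, a node that briefly binds to 127.0.0.1 instead of its eth0 address announces itself with a different id, which its peer would see as a new, unknown member.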
>
> b.