[Pacemaker] IPaddr2 cloned address doesn't survive node standby

Fri May 17 16:07:11 EDT 2013

----- Original Message -----
> From: "Andreas Ntaflos" <daff at pseudoterminal.org>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Friday, May 17, 2013 3:25:32 PM
> Subject: [Pacemaker] IPaddr2 cloned address doesn't survive node standby
> 
> In a two-node cluster I am trying to use a cloned IP address with a
> cloned Bind 9 instance, in an active-active way. Why? Because simple IP
> failover does not work well with Bind, as it only answers queries on the
> addresses that are bound to the NIC when starting up (I know about
> Bind's "interface-interval" setting, but the minimum of one minute is
> far too long). Using Ubuntu 12.04.2, Corosync 1.4.2 and Pacemaker 1.1.6.
> 
> So my configuration sees to it that the cloned address is set on both
> nodes and Bind is started afterwards (op params omitted for
> readability):
> 
> node dns01
> node dns02
> primitive p_bind9 lsb:bind9
> primitive p_ip_service_ns ocf:heartbeat:IPaddr2 \
>    params ip="192.168.114.17" cidr_netmask="24" nic="eth0" \
>      clusterip_hash="sourceip-sourceport"

cidr_netmask should be 32 if that is supposed to be a single load-balanced IP.
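
For example, as an untested sketch, the same primitive with only the netmask changed (everything else is copied from your config):

    primitive p_ip_service_ns ocf:heartbeat:IPaddr2 \
       params ip="192.168.114.17" cidr_netmask="32" nic="eth0" \
         clusterip_hash="sourceip-sourceport"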

> clone cl_bind9 p_bind9 \
>    meta interleave="false"
> clone cl_ip_service_ns p_ip_service_ns \
>    meta globally-unique="true" clone-max="2" \
>      clone-node-max="2" interleave="true"
> order o_ip_before_bind9 inf: cl_ip_service_ns cl_bind9
> 
> (suggestions to improve or correct this configuration gladly
> accepted)
> 
> After Corosync starts up the first time everything seems correct, I can
> see the cluster/cloned/service IP address and the CLUSTERIP iptables
> rules on both nodes.
> 
> But after putting dns01 in standby and then bringing it online again the
> cloned address is no longer present on dns01, only on dns02. iptables
> rules are also gone from dns01.
> 
> Then, putting dns02 into standby the IP address is moved to dns01, and
> after going online again no longer present on dns01 (neither are
> iptables rules).
> 
> So the IP address is moved between the nodes, each move accompanied by a
> restart of the Bind service (cl_bind9/p_bind9).
> 
> All of this doesn't seem right to me. Shouldn't the cloned IP address
> always be present on *both* nodes when they are online?
> 
> Andreas

Without thinking too hard about it, these might help:

Don't you also need a colocation constraint between the clones, so that bind can only start on a node that has already started an IP instance?
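
Something like this, as an untested sketch (the constraint id c_bind9_with_ip is just a name I made up; the clone names are from your config):

    colocation c_bind9_with_ip inf: cl_bind9 cl_ip_service_ns

That reads as "place cl_bind9 where cl_ip_service_ns is running", and it would sit alongside your existing order constraint rather than replace it.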

The number of restarts is likely due to the interleave settings. Setting interleave=true on both clones would likely help with that, but it wouldn't work in your case. More here: http://www.hastexo.com/resources/hints-and-kinks/interleaving-pacemaker-clones

When you put dns01 in standby, does dns02 have both instances of the IP?
If not, it should (you are just load balancing a single IP, correct?). You need clone-node-max=2 on the ip clone for that.
If so, does one instance just not move back to dns01 when you bring it out of standby? I would look at resource-stickiness=0 for the ip clone resource only, so the cluster will redistribute when the node comes out of standby (I think that would work). Clones have a default stickiness of 1 if you don't have a default set for the cluster.
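
As an untested sketch, that would be your existing clone definition with one extra meta attribute (resource-stickiness is the only thing added here):

    clone cl_ip_service_ns p_ip_service_ns \
       meta globally-unique="true" clone-max="2" \
         clone-node-max="2" interleave="true" \
         resource-stickiness="0"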

And/or you can write location constraints for the individual ip clone instances so that each prefers one node over the other, which causes them to fail back when the node returns, i.e.:

location ip0_prefers_dns01 cl_ip_service_ns:0 200: dns01
location ip1_prefers_dns02 cl_ip_service_ns:1 200: dns02
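
To see whether the instances redistribute after a change, a rough test sequence might look like the following (assuming the node names and eth0 from your config; run the last two commands on each node):

    crm node standby dns01
    crm_mon -1                  # both ip instances should now be on dns02
    crm node online dns01
    crm_mon -1                  # one ip instance should be back on dns01
    ip addr show dev eth0       # is 192.168.114.17 present on this node?
    iptables -n -L INPUT        # is the CLUSTERIP rule present on this node?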

HTH

Jake

> 
> PS: In the end this configuration works since the Bind 9 service is
> always available to answer queries on the cluster address (as long as
> there is one node online) but it seems that the Bind 9 clones are
> restarted too often and too liberally when things change. This, however,
> may be a separate issue, possibly related to the order directive and the
> interleave meta params.