[Pacemaker] IPaddr2 cloned address doesn't survive node standby

Tue May 21 00:18:55 EDT 2013

On 20/05/2013, at 8:51 AM, Andreas Ntaflos <daff at pseudoterminal.org> wrote:

> On 2013-05-17 22:07, Jake Smith wrote:
>>> primitive p_ip_service_ns ocf:heartbeat:IPaddr2 \
>>>   params ip="192.168.114.17" cidr_netmask="24" nic="eth0" \
>>>     clusterip_hash="sourceip-sourceport"
>> 
>> netmask should be 32 if that's supposed to be a single IP load balanced.
> 
> I've been wondering about that, but I think 24 is correct. The address
> is recognized as "secondary" by Linux, as can be seen in this "ip addr"
> output:
> 
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>    inet 192.168.114.16/24 brd 192.168.114.255 scope global eth0
>    inet 192.168.114.17/24 brd 192.168.114.255 scope global secondary eth0
> 
> Setting it this way has been working fine for a long time now. *shrug*
> 
>> Don't you need colocation also between the clones so that bind can only start on a node that has already started an ip instance?
> 
> I thought since clones are started on all nodes anyway that a simple
> "order" directive would suffice.

Agreed

> But I've added a colocation constraint
> as well, to be sure. Thanks for the hint.
> 
>> The number of restarts is likely due to the interleave settings. Setting interleave to true for both clones would likely help with that, but wouldn't work in your case. More here: http://www.hastexo.com/resources/hints-and-kinks/interleaving-pacemaker-clones

One of these days I should remove the clone-node-max limitation.

> 
> Yes, there doesn't seem to be a way to interleave these cloned resources
> in a way that avoids restarting Bind on such cluster state changes.
> 
>> When you put dns01 in standby, does dns02 have both instances of the IP there?
>> If not, it should (you are just load balancing a single IP, correct?). You need clone-node-max=2 for the ip clone.
> 
> clone-node-max was always set to "2", yes.
> 
>> If so, one just doesn't move back to dns01 when you bring it out of standby? I would look at resource-stickiness=0 for the ip clone resource only, so the cluster will redistribute when the node comes out of standby (I think that would work). Clones have a default stickiness of 1 if you don't have a default set for the cluster.
> 
> Bingo, the resource stickiness was the problem! I've set it to 0 and now
> the IP resource gets started again when the node comes back online.
> 
> Thanks a lot, I would not have thought of that. As stated above,
> shouldn't cloned resources be (re-)started on all nodes by definition?
> 
>> And/or you can write location constraints for the clone instances of ip to prefer one node over the other, causing them to fail back if the node returns, i.e.
>>   location ip0_prefers_dns01 cl_ip_service_ns:0 200: dns01
>>   location ip1_prefers_dns02 cl_ip_service_ns:1 200: dns02
> 
> That doesn't seem necessary, now with resource-stickiness="0".
> 
> Thanks again!
> 
> Andreas
> 
> PS: Here's the complete configuration for the archives, in case someone
> might be interested in the future:
> 
> node dns01
> node dns02
> primitive p_bind9 lsb:bind9 \
>        op monitor interval="10s" timeout="15s" \
>        op start interval="0" timeout="15s" \
>        op stop interval="0" timeout="15s" \
>        meta target-role="Started"
> primitive p_ip_service_ns ocf:heartbeat:IPaddr2 \
>        params ip="192.168.114.17" cidr_netmask="24" nic="eth0" \
>          clusterip_hash="sourceip-sourceport" \
>        op monitor interval="10s" \
>        op start interval="0" timeout="20s" \
>        op stop interval="0" timeout="20s"
> clone cl_bind9 p_bind9 \
>        meta globally-unique="false" clone-max="2" clone-node-max="1" \
>          interleave="false" target-role="Started"
> clone cl_ip_service_ns p_ip_service_ns \
>        meta globally-unique="true" clone-max="2" clone-node-max="2" \
>          interleave="false" target-role="Started"
> colocation co_ip_before_bind9 inf: cl_ip_service_ns cl_bind9
> order o_ip_before_bind9 inf: cl_ip_service_ns cl_bind9
> property $id="cib-bootstrap-options" \
>        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>        cluster-infrastructure="openais" \
>        expected-quorum-votes="2" \
>        no-quorum-policy="ignore" \
>        stonith-enabled="no" \
>        last-lrm-refresh="1368814808"
> rsc_defaults $id="rsc-options" \
>        resource-stickiness="0"
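
A note on the narrower variant Jake suggested earlier in the thread: rather
than setting resource-stickiness="0" for every resource through rsc_defaults,
the stickiness could be set on the IP clone alone. The following is a minimal,
untested sketch. It reuses the resource names from the configuration above and
assumes that a resource-stickiness meta attribute set on the clone is applied
to its instances:

```
# Sketch only: stickiness 0 on the IP clone, nothing else changed
clone cl_ip_service_ns p_ip_service_ns \
       meta globally-unique="true" clone-max="2" clone-node-max="2" \
         interleave="false" target-role="Started" \
         resource-stickiness="0"
```

With the attribute on the clone, the rsc_defaults entry would no longer be
needed for this purpose, so other resources could keep a nonzero stickiness
if that is preferred.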