[ClusterLabs] IP clone issue

Ken Gaillot kgaillot at redhat.com
Fri Sep 15 13:53:21 EDT 2017


On Tue, 2017-09-05 at 21:28 +0300, Vladislav Bogdanov wrote:
> 05.09.2017 17:15, Octavian Ciobanu wrote:
> > Based on the ocf:heartbeat:IPaddr2 man page, it can be used without a
> > static IP address if the kernel has
> > net.ipv4.conf.all.promote_secondaries=1:
> >
> > "There must be at least one static IP address, which is not managed
> > by the cluster, assigned to the network interface. If you can not
> > assign any static IP address on the interface, modify this kernel
> > parameter: sysctl -w net.ipv4.conf.all.promote_secondaries=1 (or per
> > device)"
> >
> > This kernel parameter is set by default in CentOS 7.3.
> >
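(A quick check, assuming the CentOS 7.3 default mentioned above:

    sysctl net.ipv4.conf.all.promote_secondaries

should print "net.ipv4.conf.all.promote_secondaries = 1".)
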
> > With clone-node-max="1" it works as it should, but with
> > clone-node-max="2" both instances of the VIP are started on the same
> > node even if the other node is online.
> 
> That actually is not a new issue.
> 
> Try raising the resource priority
> (http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-resource-options.html#_resource_meta_attributes).
> That _may_ help. IIRC, it is currently the only method to spread
> globally-unique clones across all the nodes, at least at start-up
> (with a higher priority they are allocated first, so they land on
> nodes which have fewer resources).
> 
> But after a cluster state change (a rebooted/fenced node comes back
> online), pacemaker tries to preserve resource placement if several
> nodes have an equal 'score' for the given resource. That applies to
> globally-unique clones as well. Changing placement-strategy to
> utilization or balanced does not help either.
> 
> The only (IMHO) bullet-proof way to make them spread across the
> cluster after a node reboot is a 'synthetic' full-mesh
> anti-colocation between globally-unique clone instances.
> Unfortunately, that can probably be done only in the pacemaker source
> code. A possible hack would be to anti-colocate the clone with
> itself, but I didn't try that (although it is on my todo list) and
> honestly do not expect it to work. I will need the same functionality
> for an upcoming project (a many-node active-active cluster with
> clusterip), so I hope to find a way to achieve that goal within
> several months.
> 
> (I'm cc'ing Ken directly to draw his attention to this topic.)

Yes, unfortunately there is no reliable way at the moment. The priority
suggestion is a good one, though as you mentioned, if failover causes
the instances to land on the same node, they'll stay there even if the
other node comes back up.

There is already a bug report for allowing placement strategy to handle
this:

  https://bugs.clusterlabs.org/show_bug.cgi?id=5220

Unfortunately developer time is extremely limited, so there is no time
frame for dealing with it.
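
For reference, the untested self-anti-colocation hack suggested above
would look something like this (using the clone name from the original
post):

    pcs constraint colocation add ClusterIP-clone with ClusterIP-clone -INFINITY

As noted, there is no expectation yet that the policy engine honors a
constraint between a clone and itself.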

> > Pacemaker 1.1 Clusters from Scratch says that
> > "clone-node-max=2 says that one node can run up to 2 instances of
> > the clone. This should also equal the number of nodes that can host
> > the IP, so that if any node goes down, another node can take over
> > the failed node's "request bucket". Otherwise, requests intended for
> > the failed node would be discarded."
> >
> > To have this functionality, do I need to have a static IP set on the
> > interfaces?
> >
> >
> >
> > On Tue, Sep 5, 2017 at 4:54 PM, emmanuel segura <emi2fast at gmail.com> wrote:
> >
> >     I have never tried to set a virtual IP on an interface without an
> >     IP, because the VIP is a secondary IP that switches between nodes,

To clarify, cloning an IP does not switch it between nodes (a regular,
non-cloned IP resource would do that). Cloning an IP load-balances
requests across the clone instances (which may be spread out across one
or more nodes). Cloning an IP requires multicast Ethernet MAC
addresses, which not all switches support or have enabled.
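
(The multicast MAC shows up in the load-balancing rule that IPaddr2
installs on each node running an instance; something like

    iptables -n -L INPUT

should list a CLUSTERIP entry with the clustermac and hashmode in use.)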

> >     not a primary IP
> >
> >     2017-09-05 15:41 GMT+02:00 Octavian Ciobanu <coctavian1979 at gmail.com>:
> >
> >         Hello all,
> >
> >         I've encountered an issue with IP cloning.
> >
> >         Based on the "Pacemaker 1.1 Clusters from Scratch" guide,
> >         I've configured a test configuration with 2 nodes based on
> >         CentOS 7.3. The nodes have 2 Ethernet cards: one for cluster
> >         communication on a private IP network, and a second for
> >         public access to services. The public Ethernet has no IP
> >         assigned at boot.
> >
> >         I've created an IP resource with clone using the following
> >         command:
> >
> >         pcs resource create ClusterIP ocf:heartbeat:IPaddr2 params
> >         nic="ens192" ip="xxx.yyy.zzz.www" cidr_netmask="24"
> >         clusterip_hash="sourceip" op start interval="0" timeout="20"
> >         op stop interval="0" timeout="20" op monitor interval="10"
> >         timeout="20" meta resource-stickiness=0 clone meta
> >         clone-max="2" clone-node-max="2" interleave="true"
> >         globally-unique="true"
> >
> >         The xxx.yyy.zzz.www is a public IP, not a private one.
> >
> >         With the above command, the IP clone is created but it is
> >         started only on one node. This is the output of the pcs
> >         status command:
> >
> >         Clone Set: ClusterIP-clone [ClusterIP] (unique)
> >              ClusterIP:0    (ocf::heartbeat:IPaddr2):    Started node02
> >              ClusterIP:1    (ocf::heartbeat:IPaddr2):    Started node02

By default, pacemaker will spread out all resources (including unique
clone instances) evenly across nodes. So if the other node already has
more resources, the above can be the result.

The suggestion of raising the priority on ClusterIP would make the
cluster place it first, so it will be spread out first. Stickiness can
affect it, though.
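
A minimal sketch of that suggestion (using the resource name from the
command above; the value just needs to be higher than the default of 0):

    pcs resource meta ClusterIP priority=10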

> >         If I modify clone-node-max to 1, then the resource is
> >         started on both nodes, as seen in this pcs status output:
> >
> >         Clone Set: ClusterIP-clone [ClusterIP] (unique)
> >              ClusterIP:0    (ocf::heartbeat:IPaddr2):    Started node02
> >              ClusterIP:1    (ocf::heartbeat:IPaddr2):    Started node01
> >
> >         But if one node fails, the IP resource is not migrated to
> >         the active node as the documentation says:
> >
> >         Clone Set: ClusterIP-clone [ClusterIP] (unique)
> >              ClusterIP:0    (ocf::heartbeat:IPaddr2):    Started node02
> >              ClusterIP:1    (ocf::heartbeat:IPaddr2):    Stopped

This is surprising. I'd have to see the logs and/or pe-input to know
why both can't be started.
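
(The pe-inputs are saved on the DC node, typically under
/var/lib/pacemaker/pengine/. Replaying the transition from the failover
with something like

    crm_simulate -sx /var/lib/pacemaker/pengine/pe-input-NNN.bz2

shows the scores the cluster computed; NNN is a placeholder for
whichever input covers the failover.)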

> >
> >         When the IP is active on both nodes, the services are
> >         accessible, so there is no issue with the fact that the
> >         interface does not have an IP allocated at boot. The gateway
> >         is set with another pcs command, and it is working.
> >
> >         Thanks in advance for any info.
> >
> >         Best regards
> >         Octavian Ciobanu

-- 
Ken Gaillot <kgaillot at redhat.com>