[Pacemaker] Best setup for lots and lots of IPs

Fri Jan 20 08:30:34 UTC 2012

Hi,

On Thu, Jan 19, 2012 at 9:49 PM, Anton Melser <melser.anton at gmail.com> wrote:
> Hi,
> I want to set up a very simple NAT device for natting around 2000
> internal /24 networks to around 2000 external IPs (1 /24 = 1 public
> IP). That part works fine (and is *extremely* efficient, I have it on
> a pretty powerful machine but cpu is 0% with 2gbps going through!)
> with iproute2 and iptables. I want it to have some failover though...
> I am discovering everything here (including iproute2 and iptables),
> and someone suggested I look at corosync + pacemaker. I did the
> tutorial (btw if I end up using this I'll translate it into French if
> you would like) and things seemed to work fine for a few IPs...
> However, my
>
> crm configure primitive ClusterIP.ABC ocf:heartbeat:IPaddr2 params
> ip=10.A.B.C cidr_netmask=32 op monitor interval=120s
>
> commands started to slow down around 200 IPs and then to a crawl at
> 500-600 or so. It got to around 1000 before I stopped the VMs I was
> testing on to move them onto a much more powerful VM host. It is
> taking an absolute age to get back up again. This may be normal, and
> there may be no way around it with any decent solution - I simply have
> no idea.
> Am I trying to achieve something with the wrong tools here? I don't
> need any sort of connection tracking or anything - we can handle up to
> even maybe 5 minutes of downtime (as long as it's not regularly
> happening). The need is relatively simple but the numbers of
> networks/IPs may make this unwieldy using these tools.
> Any pointers?

A couple of earlier threads on Pacemaker performance are worth reading
for further reference:

http://www.gossamer-threads.com/lists/linuxha/pacemaker/77382?do=post_view_threaded
http://www.gossamer-threads.com/lists/linuxha/pacemaker/77384?do=post_view_threaded

That said, in your scenario I would take a different approach. Mind
you, this is just my opinion, nothing more: I would modify the IPaddr2
script, or create a new one based on it, so that it would either:

a) take 1000 parameters and loop over them internally, because I'd
rather have 1 script with 1000 parameters than 1000 scripts with 1
parameter each, or

b) take a start IP and an end IP as parameters and loop over the
resulting range. With 2000 IPs I'd guess you have at least a /21
public subnet available, or even larger, and that, following good
practice, the IPs are allocated from a contiguous range. The IP
definition then needs only 2 parameters; the other parameters I've
seen in your example were the netmask and the monitoring interval,
for a grand total of 4.
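
To make the range idea concrete, here is a minimal sketch of the loop
such a script would run. It is written in Python only for
illustration; a real agent would more likely be a shell script like
IPaddr2, and would also need the start/stop/monitor actions that an
OCF agent requires. The names expand_range and ip_commands are
hypothetical, and the addresses and the eth0 interface are example
values. The sketch only builds the "ip addr" command lines, it does
not execute them.

```python
import ipaddress


def expand_range(start_ip, end_ip):
    """Return every IPv4 address from start_ip to end_ip, inclusive."""
    start = ipaddress.IPv4Address(start_ip)
    end = ipaddress.IPv4Address(end_ip)
    if end < start:
        raise ValueError("end IP must not be lower than start IP")
    # IPv4Address converts to and from int, so the range is a plain count.
    return [str(ipaddress.IPv4Address(n)) for n in range(int(start), int(end) + 1)]


def ip_commands(action, addresses, cidr_netmask, nic):
    """Build one 'ip addr add|del' command line per address (not executed)."""
    if action not in ("add", "del"):
        raise ValueError("action must be 'add' or 'del'")
    return [
        "ip addr {} {}/{} dev {}".format(action, addr, cidr_netmask, nic)
        for addr in addresses
    ]


if __name__ == "__main__":
    # Example values only: 192.0.2.0/24 is a documentation range and
    # eth0 is an assumed interface name.
    addresses = expand_range("192.0.2.1", "192.0.2.4")
    for line in ip_commands("add", addresses, 32, "eth0"):
        print(line)
```

Option a) would use the same loop, with the list of addresses passed
in directly instead of being computed from a start and end IP.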

From my point of view, such a high number of resources in a Pacemaker
cluster for the sole purpose of adding/removing IP addresses is
overkill, and a solution such as the one I suggested makes more sense.
Of course, I assumed that all of these IPs are needed either all
together or not at all. Even if that is not the case, I doubt you need
individual rules per IP; more likely you need to control one large
range plus a few corner cases with individual assignments. The latter
can be handled with IPaddr2 as usual, while keeping the total number
of resources significantly lower.

The problem with 1000 resources is the monitoring part: you can only
monitor $LRMD_MAX_CHILDREN resources at a time (4 by default), so the
remaining monitor operations have to wait their turn. You can increase
this number and have n monitor operations run in parallel. You'll have
to see how the timeouts fit in with the increased parallelism, and
whether running more monitor operations at once has a negative effect
on performance.
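
As a rough back-of-the-envelope check, the sketch below estimates how
long one full round of monitor operations takes for a given
concurrency limit. It assumes every monitor takes the same time and
that exactly max_children of them run at once, which is a
simplification. The 0.5 second per monitor is an assumed figure, not a
measurement, and monitor_pass_seconds is a hypothetical helper, not
part of Pacemaker.

```python
import math


def monitor_pass_seconds(resources, max_children, seconds_per_monitor):
    """Estimate the time to run one monitor op for every resource.

    Assumes each monitor takes the same time and that exactly
    max_children of them run concurrently (a simplification).
    """
    if resources < 0 or max_children < 1 or seconds_per_monitor < 0:
        raise ValueError("invalid input")
    batches = math.ceil(resources / max_children)
    return batches * seconds_per_monitor


if __name__ == "__main__":
    interval = 120  # monitor interval from the example, in seconds
    for children in (4, 8, 16, 32):
        # 0.5 s per monitor is an assumed figure, not a measurement.
        total = monitor_pass_seconds(1000, children, 0.5)
        status = "fits within" if total <= interval else "exceeds"
        print("max_children={:>2}: {:>6.1f}s per pass, {} the {}s interval".format(
            children, total, status, interval))
```

Under these assumptions, 1000 resources at the default of 4 need 250
batches per pass, so even a quick monitor adds up; measure the real
duration of a monitor operation on your hardware before picking a
value.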

HTH,
Dan

> Thanks heaps,
> Anton
>
> --
> echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D4D465452snlbxq' | dc
> This will help you for 99.9% of your problems ...
>

-- 
Dan Frincu
CCNA, RHCE