[Pacemaker] load balancing in a 3-node cluster

Mark Smith mark at bumptechnologies.com
Tue Sep 27 18:52:15 EDT 2011


Hi all,

Here at Bump we currently have our handset traffic routed through a
single server.  For obvious reasons, we want to expand this to
multiple nodes for redundancy.  The load balancer does two jobs:
TLS termination and directing traffic to one of our internal
application servers.

We want to split the single load balancer into an HA cluster.  Our
chosen solution is to create one public-facing VIP for each machine
and float those VIPs between the load balancer machines.  Ideally
there is one public IP per machine, and we use DNS round robin to
spread traffic across the IPs.
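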
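Something along these lines is what I have in mind for the VIP
placement, i.e. one location preference per VIP (using the resource
and node names from the config further down; the constraint names are
just placeholders), so that each VIP normally sits on its own node but
can fail over to the others:

    # pin each VIP to a preferred node; score 100 is a preference, not mandatory
    location prefer-216 floating_216 100: patron
    location prefer-217 floating_217 100: oldfashioned
    location prefer-218 floating_218 100: nattylight

The DNS round robin part lives outside Pacemaker; the cluster only has
to keep the three VIPs up somewhere.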

We considered having two nodes and floating a single VIP between them,
the canonical heartbeat setup, but would prefer to avoid that because
we know we're going to run into the situation where our TLS
termination takes more CPU than we have available on a single node.
Balancing across N nodes seems the most obvious way to address that.

We have allocated three (3) nodes to our cluster.  I want to run our
design by this group, describe the problems we've hit, and see if
anybody has advice.

* no-quorum-policy set to ignore.  Ideally, we would like the cluster
to continue operating even if we lose a majority of nodes.  Even if
we end up CPU-limited, it would be better to serve slowly than to
drop 33% or 66% of our traffic on the floor because we lost quorum
and the floating VIPs weren't migrated to the remaining nodes.
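
For reference, that is just the cluster property shown in the config
dump below, set with something like:

    # same setting as in the cib-bootstrap-options section below
    crm configure property no-quorum-policy=ignore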

* STONITH disabled.  Originally I tried to enable it, but with
no-quorum-policy set to ignore it seemed to go on killing sprees,
fencing healthy nodes for no reason I could determine:

   - "node standby lb1"
        * resources properly migrate to lb2, lb3
        * everything looks stable and correct
   - "node online lb1"
        * resources start migrating back to lb1
        * lb2 gets fenced!  (why?  it was healthy)
        * resources migrating off of lb2

I have seen it double-fence, too, with lb1 being the only surviving
node and lb2 and lb3 being unceremoniously rebooted.  I'm not sure
why.  STONITH seems to be suboptimal (heh) in this particular setup.
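
For clarity, the standby/online steps above correspond to crm shell
commands along the lines of the following, with lb1 standing in for
one of the actual node names in the config below:

    # put lb1 into standby (resources migrate off), then bring it back
    crm node standby lb1
    crm node online lb1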

Anyway -- that means our configuration is very, very simple:

node $id="65c71911-737e-4848-b7d7-897d0ede172a" patron
node $id="b5f2fd18-acf1-4b25-a571-a0827e07188b" oldfashioned
node $id="ef11cced-0062-411b-93dd-d03c2b8b198c" nattylight
primitive cluster-monitor ocf:pacemaker:ClusterMon \
        params extra_options="--mail-to blah" htmlfile="blah" \
        meta target-role="Started"
primitive floating_216 ocf:heartbeat:IPaddr \
        params ip="173.192.13.216" cidr_netmask="255.255.255.252" nic="eth1" \
        op monitor interval="60s" timeout="30s" \
        meta target-role="Started"
primitive floating_217 ocf:heartbeat:IPaddr \
        params ip="173.192.13.217" cidr_netmask="255.255.255.252" nic="eth1" \
        op monitor interval="60s" timeout="30s" \
        meta target-role="Started"
primitive floating_218 ocf:heartbeat:IPaddr \
        params ip="173.192.13.218" cidr_netmask="255.255.255.252" nic="eth1" \
        op monitor interval="60s" timeout="30s" \
        meta target-role="Started"
property $id="cib-bootstrap-options" \
        dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
        cluster-infrastructure="Heartbeat" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        symmetric-cluster="true" \
        last-lrm-refresh="1317079926"

Am I on the right track with this?  Am I missing something obvious?
Am I misapplying this tool to our problem and should I go in a
different direction?

In the real world, I would use ECMP (or something like that) between
the router and my load balancers.  However, I'm living in the world of
managed server hosting (we're not quite big enough to colo) so I don't
have that option.  :-)


-- 
Mark Smith // Operations Lead
mark at bumptechnologies.com



