[Pacemaker] large cluster design questions

Andrew Beekhof andrew at beekhof.net
Mon Jan 16 01:47:57 EST 2012

On Fri, Jan 6, 2012 at 10:10 PM, Christian Parpart <trapni at gentoo.org> wrote:
> Hey all,
> I am also about to evaluate whether or not Pacemaker+Corosync is the
> way to go for our infrastructure.
> We currently have about 45 physical nodes (plus about 60 more virtual
> containers) with a static, historically grown setup of services.

You should be able to get totem (corosync's membership algorithm) to
scale to 32 nodes, but it will need some tweaking of the timing settings.
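
As a rough illustration of what such timing tweaks might look like, here
is a small sketch that renders a totem section for corosync.conf. The
helper name render_totem and the values (a 10 second token timeout, 10
retransmits) are assumptions chosen for the example, not tested
recommendations for a cluster of this size. The only rule it relies on is
the one documented in corosync.conf(5): consensus must be at least
1.2 * token.

```python
# Hypothetical helper: renders a corosync.conf "totem" section with
# relaxed timeouts. All numbers passed in are illustrative assumptions.
def render_totem(token_ms, retransmits=10):
    # corosync.conf(5) requires consensus >= 1.2 * token; use that minimum,
    # rounded up to a whole millisecond.
    consensus_ms = (token_ms * 12 + 9) // 10
    lines = [
        "totem {",
        "    version: 2",
        "    token: {}".format(token_ms),
        "    token_retransmits_before_loss_const: {}".format(retransmits),
        "    consensus: {}".format(consensus_ms),
        "}",
    ]
    return "\n".join(lines)


print(render_totem(10000))
```

Larger token values make membership more tolerant of slow nodes, at the
price of slower failure detection, so any real values would need to be
found by testing on the actual hardware.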

> I now have to restructure this historically grown system into
> something clean and maintainable, with HA and scalability in mind
> (there is no hurry, we have some time to design it).
> So here is what we mainly have or will have:
> -> HAproxy (tcp/80, tcp/443, master + (hot) failover)
> -> http frontend server(s) (doing SSL and static files, in case of
> performance issues -> clone resource).
> -> Varnish (backend accelerator)
> -> HAproxy (load-balancing backend app)
> -> Rails (app nodes, clones)
> ----------------------------------------------------------------
> - sharded memcache cluster (5 nodes), no failover currently (memcache
> cannot replicate :( )
> - redis nodes
> - mysql (3 nodes: active master, master, slave)
> - Solr (1 master, 2 slaves)
> - resque (many nodes)
> - NFS file storage pool (master/slave DRBD + ext3 fs currently, want
> to use GFS2/OCFS2 however)
> Now, I have read a lot of people saying a pacemaker cluster should not
> exceed 16 nodes, and many others saying this statement is bullsh**.
> While I now lean towards the latter, I still want to know:
>    is it still wise to build a single pacemaker/corosync driven
> cluster out of all the services above?
> Another question I have: when pacemaker is managing your resources and
> migrates a resource from one host (because it went down) to another,
> then the service should actually be able to access all of its data on
> the new node, too.
> Which leads to the assumption that you have to install *everything* on
> every node, to actually be able to start anything anywhere (depending
> on where pacemaker is about to put it and the scores the admin has
> defined).

Well, you can tell us not to put the service on a particular node (or
set of nodes).
Just make sure you are running something recent, and we should gracefully
detect that the RA/software isn't available and move on to somewhere else.
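
One way to express "not on these nodes" is a location constraint with a
score of -inf. The sketch below only generates crm shell configuration
lines of that form. The helper name ban_from_nodes, the constraint ids,
and the resource and node names are all made up for illustration.

```python
# Hypothetical helper: builds crm shell "location" constraints that keep
# a resource off the listed nodes by giving each node a -inf score.
def ban_from_nodes(resource, nodes):
    return [
        "location ban-{0}-{1} {0} -inf: {1}".format(resource, node)
        for node in nodes
    ]


# Example with invented names: keep a "varnish" resource off two db nodes.
for line in ban_from_nodes("varnish", ["db1", "db2"]):
    print(line)
```

The generated lines could then be loaded through the crm shell's
configure mode. With roughly 45 nodes it may be less work to invert the
logic and list only the nodes where a resource is allowed to run.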

> Many thanks for your thoughts on this,
> Christian.
