[Pacemaker] Help with N+1 configuration
Phil Frost
phil at macprofessionals.com
Thu Jul 26 14:35:09 EDT 2012
On 07/26/2012 02:16 PM, Cal Heldenbrand wrote:
> That seems very handy -- and I don't need to specify 3 clones? Once
> my memcached OCF script reports a downed service, one of them will
> automatically transition to the current failover node?
The clone has options that control how many instances of the cloned
resource to create, but by default the instance count is the number of
nodes in the cluster. See:
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch10s02s02.html
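As a rough sketch in crm shell syntax (the resource names and the
ocf:custom:memcached agent are placeholders I made up; substitute your
own), setting the instance count explicitly would look something like
this:

```
# Hypothetical primitive; replace the agent with your own OCF script.
primitive memcached ocf:custom:memcached \
    op monitor interval=30s
# clone-max: total instances in the cluster (defaults to the node count).
# clone-node-max: instances allowed on a single node (defaults to 1).
clone memcached_clone memcached \
    meta clone-max=3 clone-node-max=1
```

Leaving out the meta attributes gives you one instance per node.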
> Is there any reason you specified just a single memcache_clone,
> instead of both the memcache primitive and memcached_clone? I might
> not be understanding exactly how a clone works. Is it like... maybe a
> "symbolic link" to a primitive, with the ability to specify different
> metadata and parameters?
Once you make a clone, the underlying primitive isn't referenced
anywhere else (that I can think of). If you want to stop memcache, you
don't stop the primitive; you add a location constraint forbidding the
clone from running on the node where you want to stop memcache ("crm
resource migrate" is easiest). I can't find the relevant documentation,
but this is just how they work. The same is true for groups -- the
member primitives are never referenced except by the group. I believe in
most cases if you try to reference the primitive, you will get an error.
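For illustration, a constraint like this (crm shell syntax; the
constraint id and node name are made up) keeps the clone off one node
while leaving the other instances alone:

```
# -inf: means the clone may never run on node1.
location memcached-not-on-node1 memcached_clone -inf: node1
```

Deleting the constraint lets the instance start there again.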
> Despite the advertisement of consistent hashing with memcache clients,
> I've found that they still have long timeouts waiting on connecting to
> an IP. So, keeping the clustered IPs up at all times is more
> important than having a seasoned cache behind them.
I don't know a whole lot about memcache, but it sounds like you might
even want to lower the score of the colocation of the IPs with memcache
from infinity to a large finite number. This way, if memcache is broken
everywhere, the IPs are still permitted to run. This might also cover
you if a bug in your resource agent makes it report that memcache has
failed everywhere when it's actually still running fine. The decision
depends on which failure the memcache clients handle better: the IP
being down, or the IP being up but not having a working memcache server
behind it.
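Something along these lines, again in crm shell syntax with invented
resource names and an arbitrary score of 1000:

```
# A finite score is a preference, not a requirement: the IP follows a
# running memcache instance when it can, but may still start without one.
colocation ip-with-memcache 1000: cluster_ip memcached_clone
```

With inf: in place of 1000:, the IP would have to stop whenever no
memcache instance is available.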