[ClusterLabs] Running several instances of a Corosync/Pacemaker cluster on a node

Ken Gaillot kgaillot at redhat.com
Mon May 2 17:26:13 EDT 2016


On 04/26/2016 03:33 AM, Bogdan Dobrelya wrote:
> Is it possible to run several instances of a Corosync/Pacemaker cluster
> on a node? Can a node be a member of several clusters, so they could put
> resources there? I'm sure it's doable with separate nodes or containers,
> but that's not the case here.
> 
> My use case is separating data-critical resources, like storage or VIPs,
> from complex resources like DB or MQ clusters.
> 
> The latter should run with no-quorum-policy=ignore, as they know how to
> deal with network partitions/split-brain, use their own techniques to
> protect data, and don't want external fencing from Pacemaker (which is
> what no-quorum-policy/STONITH provides).
> 
> The former must use STONITH (or a stop policy, if it's only a VIP), as
> they don't know how to deal with split-brain on their own.

I don't think it's possible, though I could be wrong: it might work if
separate IPs/ports, chroots and node names are used for each instance
(at which point you're just shy of a container anyway ...).
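
For the record, if someone did want to experiment with that approach, the
settings that would have to differ per instance live in corosync.conf's
totem section. A rough sketch, with made-up names and addresses (each
instance would also need its own config path and runtime directories,
which is where the chroot comes in):

    totem {
        version: 2
        cluster_name: storage-cluster   # unique name per instance
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.10.0   # separate network/IP per instance
            mcastaddr: 239.255.10.1     # separate multicast group per instance
            mcastport: 5405             # corosync also uses mcastport - 1
        }
    }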

However, I suspect it would not meet your goal in any case. DB and MQ
software generally does NOT have sufficient techniques to deal with a
split-brain situation -- either you lose high availability or you
corrupt data. Using no-quorum-policy=stop is fine for handling network
splits, but it does not help if a node becomes unresponsive.
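
For reference, with pcs that policy is set as a cluster property (valid
values are ignore, freeze, stop and suicide):

    pcs property set no-quorum-policy=stop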

Also note that Pacemaker does provide the ability to treat different
resources differently with respect to quorum and fencing, without
needing to run separate clusters. See the "requires" meta-attribute:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_resource_meta_attributes
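
For example, something like this (resource names are hypothetical) would
let a DB resource start with quorum alone while a VIP still requires
fencing; valid values are nothing, quorum, fencing and unfencing:

    pcs resource update galera-db meta requires=quorum
    pcs resource update cluster-vip meta requires=fencing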

I suspect your motive for this is to be able to run a cluster without
fencing. There are certain failure scenarios that simply are not
recoverable without fencing, regardless of what the application software
can do. There is really only one case in which doing without fencing is
reasonable: when you're willing to lose your data and/or have downtime
when a situation arises that requires fencing.
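
If the barrier is just getting started, a single stonith device per node
is enough to enable fencing; a sketch using fence_ipmilan, with made-up
addresses and credentials:

    pcs stonith create node1-ipmi fence_ipmilan pcmk_host_list=node1 \
        ipaddr=10.0.0.11 login=admin passwd=secret
    pcs property set stonith-enabled=true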



