[ClusterLabs] One cluster with two groups of nodes

Ken Gaillot kgaillot at redhat.com
Thu Nov 9 16:39:00 CET 2017


On Wed, 2017-11-08 at 23:04 -0400, Alberto Mijares wrote:
> Hi guys, nice to say hello here.
> 
> I've been assigned a very particular task: there's a
> pacemaker-based cluster with 6 nodes. A system runs on three nodes
> (group A), while the other three are hot-standby spares (group B).
> 
> Resources from group A are never supposed to be relocated
> individually onto nodes from group B. However, if any of the
> resources from group A fails, all resources must be relocated to
> group B. It's an "all or nothing" failover.
> 
> Ideally, you would split the cluster into two clusters and implement
> Cluster-Sites and Tickets Management; however, that's not possible
> here.
> 
> Taking all this into account, can you kindly suggest a strategy for
> achieving the goal? I have some ideas, but I'd like to hear from
> those who have a lot more experience than me.
> 
> Thanks in advance,
> 
> 
> Alberto Mijares

The first thing I'd mention is that a 6-node cluster can only survive
the loss of two nodes, as 3 nodes don't have quorum. You can tweak that
behavior with corosync quorum options, or you could add a quorum-only
node, or use corosync's new qdevice capability to have an arbiter node.
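
For example, with corosync 2.x votequorum you might set options like
these in corosync.conf (a sketch only; the right choice and its
trade-offs depend on your environment):

    quorum {
        provider: corosync_votequorum
        # When the cluster splits exactly in half, deterministically
        # keep quorum in the partition containing the lowest node ID:
        auto_tie_breaker: 1
        # Recalculate expected_votes as nodes leave cleanly, so the
        # cluster can shrink below half of the original membership:
        last_man_standing: 1
        last_man_standing_window: 10000
    }

Or, for the qdevice route, something along these lines with pcs
(the host name here is a placeholder):

    pcs quorum device add model net host=qnetd-host algorithm=ffsplit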

Coincidentally, I recently stumbled across a long-standing Pacemaker
feature I wasn't aware of that can handle this type of situation.
It's not documented yet, but it will be when 1.1.18 is released soon.

Colocation constraints may take a "node-attribute" parameter, which
basically means, "Put this resource on a node of the same class as
the one running resource X."
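
In the CIB XML, such a constraint looks roughly like this (the
resource names are made up for illustration):

    <rsc_colocation id="colocate-app-with-base" rsc="rsc-app"
                    with-rsc="rsc-base" score="INFINITY"
                    node-attribute="group"/>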

In this case, you might set a "group" node attribute on all nodes:
"1" on the three primary nodes and "2" on the three failover nodes.
Pick one resource as the base resource that everything else should go
along with. Configure colocation constraints between each of the
other resources and that one, using "node-attribute=group". That
means all the other resources must be on a node with the same "group"
attribute value as the node the base resource is running on.
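
With pcs, the setup might look like this (node and resource names are
placeholders; if your pcs version doesn't pass the node-attribute
option through, you can edit the constraint XML directly instead):

    # Tag each node with its group:
    pcs node attribute node1 group=1
    pcs node attribute node2 group=1
    pcs node attribute node3 group=1
    pcs node attribute node4 group=2
    pcs node attribute node5 group=2
    pcs node attribute node6 group=2

    # Colocate every other resource with the base resource by the
    # "group" node attribute instead of by node name:
    pcs constraint colocation add rsc-app with rsc-base INFINITY node-attribute=group
    pcs constraint colocation add rsc-db with rsc-base INFINITY node-attribute=group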

"node-attribute" defaults to "#uname" (node name), this giving the
usual behavior of colocation constraints: put the resource only on a
node with the same name, i.e. the same node.

The remaining question is, how do you want the base resource to fail
over? If the base resource can fail over to any other node, whether
in the same group or not, then you're done. If the base resource can
only run on one node in each group, ban it from the other nodes using
-INFINITY location constraints. If the base resource should only fail
over to the opposite group, that's trickier, but something roughly
similar can be done by preferring one node in each group with
equal-score positive location constraints and setting
migration-threshold=1.
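
A sketch of those last two variants with pcs (again, all node and
resource names are placeholders):

    # Restrict the base resource to one node per group:
    pcs constraint location rsc-base avoids node2 node3 node5 node6

    # Prefer one node in each group equally, and move away after
    # the first failure:
    pcs constraint location rsc-base prefers node1=100 node4=100
    pcs resource meta rsc-base migration-threshold=1
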
-- 
Ken Gaillot <kgaillot at redhat.com>


