[ClusterLabs] Sub-clusters / super-clusters?
Antony Stone
Antony.Stone at ha.open.source.it
Tue Aug 3 04:40:28 EDT 2021
On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote:
> Here is the example I had promised:
>
> pcs node attribute server1 city=LA
> pcs node attribute server2 city=NY
>
> # Don't run on any node that is not in LA
> pcs constraint location DummyRes1 rule score=-INFINITY city ne LA
>
> # Don't run on any node that is not in NY
> pcs constraint location DummyRes2 rule score=-INFINITY city ne NY
>
> The idea is that if you add a node and forget to set the 'city' attribute
> on it, DummyRes1 & DummyRes2 won't be started there.
>
> For resources that do not have a constraint based on the city -> they will
> run everywhere unless you specify a colocation constraint between the
> resources.
Excellent - thanks. I happen to use crmsh rather than pcs, but I've adapted
the above and got it working.
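For anyone else on crmsh, the adaptation looks roughly like the sketch below.
It is an illustration only, not the exact configuration in use here: the
constraint IDs loc-DummyRes1 and loc-DummyRes2 are made-up names, and the node
and resource names are simply carried over from the pcs example.

```sh
# Tag each node with the city it lives in
crm node attribute server1 set city LA
crm node attribute server2 set city NY

# Don't run DummyRes1 on any node that is not in LA
# (constraint ID "loc-DummyRes1" is a hypothetical name)
crm configure location loc-DummyRes1 DummyRes1 rule -inf: city ne LA

# Don't run DummyRes2 on any node that is not in NY
# (constraint ID "loc-DummyRes2" is a hypothetical name)
crm configure location loc-DummyRes2 DummyRes2 rule -inf: city ne NY
```

The main difference from pcs is that crmsh requires an explicit ID for each
location constraint and writes the score as "-inf:" in front of the rule
expression.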
Unfortunately, there is a problem.
My current setup is:
One 3-machine cluster in city A running a bunch of resources between them, the
most important of which for this discussion is Asterisk telephony.
One 3-machine cluster in city B doing exactly the same thing.
The two clusters have no knowledge of each other.
I have high-availability routing between my clusters and my upstream telephony
provider, such that a call can be handled by Cluster A or Cluster B, and if
one is unavailable, the call gets routed to the other.
Thus, a total failure of Cluster A means I still get phone calls, via Cluster
B.
To implement the above "one resource which can run anywhere, but only a single
instance", I joined together clusters A and B, and placed the corresponding
location constraints on the resources I want only at A and the ones I want
only at B. I then added the resource with no location constraint, and it runs
anywhere, just once.
So far, so good.
The problem is:
With the two independent clusters, if two machines in city A fail, then
Cluster A fails completely (no quorum), and Cluster B continues working. That
means I still get phone calls.
With the new setup, if two machines in city A fail, then _both_ clusters stop
working and I have no functional resources anywhere.
So, my question now is:
How can I have a 3-machine Cluster A running local resources, and a 3-machine
Cluster B running local resources, plus one resource running on either Cluster
A or Cluster B, but without a failure of one cluster causing _everything_ to
stop?
Thanks,
Antony.
--
One tequila, two tequila, three tequila, floor.
Please reply to the list;
please *don't* CC me.