[ClusterLabs] Antw: [EXT] Re: Sub-clusters / super-clusters - working :)
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Thu Aug 5 09:44:18 EDT 2021
Hi!
Nice to hear. What could be "interesting" is how stable the WAN-type
corosync communication turns out to be.
If it's not that stable, the cluster could try to fence nodes rather
frequently. OK, you disabled fencing; maybe it works without it.
Did you tune the corosync parameters?
Regards,
Ulrich
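(The parameters in question are typically corosync's totem timeouts; on higher-latency WAN links they are usually raised. A minimal corosync.conf sketch follows, with purely illustrative values that are not taken from this thread and would need adjusting to the measured link latency and loss:)

--------
totem {
    # Illustrative WAN tuning only; defaults assume LAN latency.
    token: 10000                            # ms without the token before a node is suspected
    token_retransmits_before_loss_const: 10 # retransmits before declaring token loss
    consensus: 12000                        # ms; must be larger than token
    join: 100                               # ms; membership join timeout
}
--------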
>>> Antony Stone <Antony.Stone at ha.open.source.it> wrote on 05.08.2021 at 14:44
in message <202108051444.39919.Antony.Stone at ha.open.source.it>:
> On Thursday 05 August 2021 at 10:51:37, Antony Stone wrote:
>
>> On Thursday 05 August 2021 at 07:48:37, Ulrich Windl wrote:
>> >
>> > Have you ever tried to find out why this happens? (Talking about logs)
>>
>> Not in detail, no, but just in case there's a chance of getting this
>> working as suggested simply using location constraints, I shall look
>> further.
>
> I now have a working solution - thank you to everyone who has helped.
>
> The answer to the problem above was simple - with a 6-node cluster, 3 votes
> is not quorum.
>
> I added a 7th node (in "city C") and adjusted the location constraints to
> ensure that cluster A resources run in city A, cluster B resources run in
> city B, and the "anywhere" resource runs in either city A or city B.
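The arithmetic behind that fix, assuming standard majority quorum with one vote per node and no special votequorum options:

--------
quorum = floor(expected_votes / 2) + 1
6 nodes: floor(6/2) + 1 = 4   -> a surviving 3-node city is one vote short
7 nodes: floor(7/2) + 1 = 4   -> 3 nodes in one city plus the city-C node reach quorum
--------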
>
> I've even added a colocation constraint to ensure that the "anywhere"
> resource runs on the same machine, in either city A or city B, that is
> running the local resources there (which wasn't a strict requirement, but is
> very useful).
>
> For anyone interested in the detail of how to do this (without needing
> booth), here is my cluster.conf file, as in "crm configure load replace
> cluster.conf":
>
> --------
> node tom attributes site=cityA
> node dick attributes site=cityA
> node harry attributes site=cityA
>
> node fred attributes site=cityB
> node george attributes site=cityB
> node ron attributes site=cityB
>
> primitive A-float IPaddr2 params ip=192.168.32.250 cidr_netmask=24 meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
> primitive B-float IPaddr2 params ip=192.168.42.250 cidr_netmask=24 meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
> primitive Asterisk asterisk meta migration-threshold=3 failure-timeout=60 op monitor interval=5 timeout=20 on-fail=restart
>
> group GroupA A-float meta resource-stickiness=100
> group GroupB B-float meta resource-stickiness=100
> group Anywhere Asterisk meta resource-stickiness=100
>
> location pref_A GroupA rule -inf: site ne cityA
> location pref_B GroupB rule -inf: site ne cityB
> location no_pref Anywhere rule -inf: site ne cityA and site ne cityB
>
> colocation Ast 100: Anywhere [ GroupA GroupB ]
>
> property cib-bootstrap-options: stonith-enabled=no no-quorum-policy=stop start-failure-is-fatal=false cluster-recheck-interval=60s
> --------
>
> Of course, the group definitions are not needed for single resources, but I
> shall in practice be using multiple resources which do need groups, so I
> wanted to ensure I was creating something which would work with that.
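As a side note, a configuration loaded this way can be sanity-checked with the standard crmsh/pacemaker tools; the commands below are the usual ones and only a suggestion, not part of the original setup:

--------
# syntax/semantic check of the configuration
crm configure verify

# show where the scheduler would place resources, with scores, against the live CIB
crm_simulate -sL

# one-shot view of cluster and resource status
crm_mon -1
--------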
>
> I have tested it by:
>
> - bringing up one node at a time: as soon as any 4 nodes are running, all
> possible resources are running
>
> - bringing up 5 or more nodes: all resources run
>
> - taking down one node at a time to a maximum of three nodes offline: if at
> least one node in a given city is running, the resources at that city are
> running
>
> - turning off (using "halt", so that corosync dies nicely) all three nodes in
> a city simultaneously: that city's resources stop running, the other city
> continues working, as well as the "anywhere" resource
>
> - causing a network failure at one city (so it simply disappears without
> stopping corosync neatly): the other city continues its resources (plus the
> "anywhere" resource), the isolated city stops
>
> For me, this is the solution I wanted, and in fact it's even slightly better
> than the previous two isolated 3-node clusters I had, because I can now have
> resources running on a single active node in cityA (provided it can see at
> least 3 other nodes in cityB or cityC), which wasn't possible before.
>
>
> Once again, thanks to everyone who has helped me to achieve this result :)
>
>
> Antony.
>
> --
> "The future is already here. It's just not evenly distributed yet."
>
> - William Gibson
>
> Please reply to the list; please *don't* CC me.
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/