[ClusterLabs] Antw: Re: Antw: [EXT] resource cloned group colocations

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Thu Mar 2 08:51:30 EST 2023


>>> Gerald Vogt <vogt at spamcop.net> wrote on 02.03.2023 at 14:43 in message
<9ba5cd78-7b3d-32ef-38cf-5c5632c46b9a at spamcop.net>:
> On 02.03.23 14:30, Ulrich Windl wrote:
>>>>> Gerald Vogt <vogt at spamcop.net> wrote on 02.03.2023 at 08:41 in message
>> <624d0b70-5983-4d21-6777-55be91688bbe at spamcop.net>:
>>> Hi,
>>>
>>> I am setting up a mail relay cluster whose main purpose is to maintain
>>> the service IPs via IPaddr2 and move them between cluster nodes when
>>> necessary.
>>>
>>> The service IPs should only be active on nodes which are running all
>>> necessary mail (systemd) services.
>>>
>>> So I have set up a resource for each of those services, put them into a
>>> group in the order they should start, and cloned the group, as the services
>>> are supposed to run on all nodes at all times.
>>>
>>> Then I added an order constraint
>>>     start mail-services-clone then start mail1-ip
>>>     start mail-services-clone then start mail2-ip
>>>
>>> and colocations to prefer running the IPs on different nodes, but only
>>> with the clone running:
>>>
>>>     colocation add mail2-ip with mail1-ip -1000
>>>     colocation mail1-ip with mail-services-clone
>>>     colocation mail2-ip with mail-services-clone
>>>
>>> as well as a location constraint to prefer running the first ip on the
>>> first node and the second on the second
>>>
>>>     location mail1-ip prefers ha1=2000
>>>     location mail2-ip prefers ha2=2000
>>>
>>> Now if I stop Pacemaker on one of those nodes, e.g. on node ha2, it's
>>> fine: mail2-ip is moved to ha3 immediately. Good.
>>>
>>> However, if Pacemaker on ha2 starts up again, it will immediately remove
>>> mail2-ip from ha3 and keep it offline while the services in the group are
>>> starting on ha2. As the services unfortunately take some time to come
>>> up, mail2-ip is offline for more than a minute.
>> 
>> That is because you wanted "mail2-ip prefers ha2=2000", so if the cluster
>> _can_ run it there, then it will, even if it's running elsewhere.
>> 
>> Maybe explain what you really want.
> 
> As I wrote before: (and I have "fixed" my copy&paste error above to use 
> consistent resource names now)
> 
> 1. I want to run all required services on all running nodes at all times.
> 
> 2. I want two service IPs mail1-ip (ip1) and mail2-ip (ip2) running on 
> the cluster, but only on nodes where all required services are already 
> running (and not just starting).
> 
> 3. Both IPs should be running on two different nodes if possible.
> 
> 4. Preferably mail1-ip should be on node ha1 if ha1 is running with all 
> required services.
> 
> 5. Preferably mail2-ip should be on node ha2 if ha2 is running with all 
> required services.
> 
> So most importantly: I want the IP resources mail1-ip and mail2-ip to be 
> active only on nodes which are already running all services. They should 
> only be moved to nodes on which all services are already running.

Hi!

Usually I prefer simple solutions over highly complex ones.
Would it work to use a negative colocation between the two IPs, together with a stickiness of maybe 500, and then reduce the "prefers" score to something small such as 5 or 10?
Then the IP will stay where it is as long as the "basement services" run there.
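
A minimal sketch of that idea, assuming the pcs syntax used in the commands quoted above (the stickiness value and scores are only examples to tune, and the exact "resource defaults" syntax differs a bit between pcs versions):

    # cluster-wide default: keep resources where they are once placed
    pcs resource defaults resource-stickiness=500

    # soft anti-colocation so the two IPs prefer different nodes
    pcs constraint colocation add mail2-ip with mail1-ip -1000

    # weak node preferences only, so that stickiness outweighs them after a
    # failover (the existing "prefers ...=2000" constraints would be removed first)
    pcs constraint location mail1-ip prefers ha1=10
    pcs constraint location mail2-ip prefers ha2=10

The order constraints and the mandatory colocations with mail-services-clone stay as you have them. With stickiness 500 against a node preference of only 10, mail2-ip then stays on ha3 when ha2 rejoins and only moves back if it has to be restarted anyway.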

This approach does not change the order of resource operations; it simply minimizes them.
In my experience most people overspecify what the cluster should do.

Kind regards,
Ulrich Windl

> 
> Thanks,
> 
> Gerald

