[ClusterLabs] Re: [EXT] resource cloned group colocations

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Fri Mar 3 02:39:22 EST 2023


>>> Gerald Vogt <vogt at spamcop.net> wrote on 02.03.2023 at 17:27 in message
<b3336d78-225d-2077-9300-8528efb06017 at spamcop.net>:
> On 02.03.23 14:51, Ulrich Windl wrote:
>>>>> Gerald Vogt <vogt at spamcop.net> wrote on 02.03.2023 at 14:43 in message
>> <9ba5cd78-7b3d-32ef-38cf-5c5632c46b9a at spamcop.net>:
>>> On 02.03.23 14:30, Ulrich Windl wrote:
>>>>>>> Gerald Vogt <vogt at spamcop.net> wrote on 02.03.2023 at 08:41 in message
>>>> <624d0b70-5983-4d21-6777-55be91688bbe at spamcop.net>:
>>>>> Hi,
>>>>>
>>>>> I am setting up a mail relay cluster whose main purpose is to maintain
>>>>> the service IPs via IPaddr2 and move them between cluster nodes when
>>>>> necessary.
>>>>>
>>>>> The service IPs should only be active on nodes which are running all
>>>>> necessary mail (systemd) services.
>>>>>
>>>>> So I have set up a resource for each of those services, put them into a
>>>>> group in the order in which they should start, and cloned the group, as
>>>>> they are normally supposed to run on all nodes at all times.
>>>>>
>>>>> Then I added order constraints
>>>>>      start mail-services-clone then start mail1-ip
>>>>>      start mail-services-clone then start mail2-ip
>>>>>
>>>>> and colocations to prefer running the IPs on different nodes, but only
>>>>> with the clone running:
>>>>>
>>>>>      colocation add mail2-ip with mail1-ip -1000
>>>>>      colocation mail1-ip with mail-services-clone
>>>>>      colocation mail2-ip with mail-services-clone
>>>>>
>>>>> as well as location constraints to prefer running the first IP on the
>>>>> first node and the second on the second:
>>>>>
>>>>>      location mail1-ip prefers ha1=2000
>>>>>      location mail2-ip prefers ha2=2000
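>>>>>
>>>>> For reference, the whole setup looks roughly like this in crm shell
>>>>> syntax (postfix/opendkim and the addresses are just placeholders for my
>>>>> actual services and IPs, and I used INFINITY for the clone colocations):
>>>>>
>>>>>      primitive postfix-svc systemd:postfix op monitor interval=30s
>>>>>      primitive opendkim-svc systemd:opendkim op monitor interval=30s
>>>>>      primitive mail1-ip IPaddr2 params ip=192.0.2.11
>>>>>      primitive mail2-ip IPaddr2 params ip=192.0.2.12
>>>>>      group mail-services postfix-svc opendkim-svc
>>>>>      clone mail-services-clone mail-services
>>>>>      order ip1-order Mandatory: mail-services-clone mail1-ip
>>>>>      order ip2-order Mandatory: mail-services-clone mail2-ip
>>>>>      colocation ip-apart -1000: mail2-ip mail1-ip
>>>>>      colocation ip1-with-svcs inf: mail1-ip mail-services-clone
>>>>>      colocation ip2-with-svcs inf: mail2-ip mail-services-clone
>>>>>      location loc-ip1 mail1-ip 2000: ha1
>>>>>      location loc-ip2 mail2-ip 2000: ha2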
>>>>>
>>>>> Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
>>>>> fine. mail2-ip will be moved immediately to ha3. Good.
>>>>>
>>>>> However, if pacemaker on ha2 starts up again, it will immediately remove
>>>>> mail2-ip from ha3 and keep it offline, while the services in the group are
>>>>> starting on ha2. As the services unfortunately take some time to come
>>>>> up, mail2-ip is offline for more than a minute.
>>>>
>>>> That is because you wanted "mail2-ip prefers ha2=2000", so if the cluster
>>>> _can_ run it there, then it will, even if it's running elsewhere.
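>>>>
>>>> To spell out the arithmetic: with the default resource-stickiness of 0,
>>>> mail2-ip scores 2000 on ha2 versus 0 on ha3 as soon as ha2 rejoins, so
>>>> the cluster reassigns it immediately; the order constraint only delays
>>>> the start on ha2, it does not keep the IP on ha3. You can inspect the
>>>> allocation scores with e.g.
>>>>
>>>>      crm_simulate -sL | grep mail2-ip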
>>>>
>>>> Maybe explain what you really want.
>>>
>>> As I wrote before (and I have "fixed" my copy&paste error above to use
>>> consistent resource names now):
>>>
>>> 1. I want to run all required services on all running nodes at all times.
>>>
>>> 2. I want two service IPs, mail1-ip (ip1) and mail2-ip (ip2), running on
>>> the cluster, but only on nodes where all required services are already
>>> running (and not just starting).
>>>
>>> 3. Both IPs should be running on two different nodes if possible.
>>>
>>> 4. Preferably mail1-ip should be on node ha1 if ha1 is running with all
>>> required services.
>>>
>>> 5. Preferably mail2-ip should be on node ha2 if ha2 is running with all
>>> required services.
>>>
>>> So most importantly: I want the IP resources mail1-ip and mail2-ip to be
>>> active only on nodes which are already running all services. They should
>>> only be moved to nodes on which all services are already running.
>> 
>> Hi!
>> 
>> Usually I prefer simple solutions over highly complex ones.
>> Would it work to use a negative colocation for both IPs, as well as a
>> stickiness of maybe 500, then reducing the "prefer" value to something
>> small like 5 or 10?
>> Then the IP will stay elsewhere as long as the "basement services" run
>> there.
>>
>> This approach does not change the order of resource operations; instead
>> it kind of minimizes them.
>> In my experience most people overspecify what the cluster should do.
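>>
>> Concretely, something like this (a sketch in crm shell syntax, with
>> made-up constraint IDs):
>>
>>      rsc_defaults resource-stickiness=500
>>      colocation ip-apart -1000: mail2-ip mail1-ip
>>      location loc-ip1 mail1-ip 10: ha1
>>      location loc-ip2 mail2-ip 10: ha2
>>
>> With stickiness 500 against a preference of only 10, a running mail2-ip
>> scores 500 on ha3 versus 10 on a rejoining ha2, so it stays where it is
>> until it has to move anyway.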
> 
> Well, I guess it's not possible using a group. A group seems to satisfy a
> colocation constraint the moment the first resource in the group has been
> started (or even: is starting?). For a group which takes a long time to
> start completely, that just doesn't work.
> 
> So I suppose the only two options would be to ungroup everything and
> create colocation constraints between each individual service and the IP
> address, although I'm not sure if that would just have the same issue,
> just on a smaller scale.
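>
> For the first option, the constraints would presumably look something
> like this per IP (crm shell syntax, made-up service names):
>
>      colocation ip1-svc1 inf: mail1-ip postfix-clone
>      colocation ip1-svc2 inf: mail1-ip opendkim-clone
>
> though each colocation is probably satisfied as soon as that particular
> clone instance is active, so the window might only shrink, not go away.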
> 
> The other alternative would be to start the services through systemd and
> make pacemaker depend on them, so that it starts only after all services
> are running. Pacemaker would then only handle the IP addresses...
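>
> That could be a drop-in for pacemaker.service, something like (service
> names made up again):
>
>      # /etc/systemd/system/pacemaker.service.d/mail-deps.conf
>      [Unit]
>      Wants=postfix.service opendkim.service
>      After=postfix.service opendkim.service
>
> so that pacemaker, and with it the node's eligibility for the IPs, only
> comes up after the mail services are running.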

Well,

actually I was wondering why you wouldn't run a load balancer on top.
The load balancer will find out which nodes are running the software stack
(that could be controlled by systemd, but controlling it via the cluster is
probably easier to monitor).
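
For example, an haproxy sketch with an SMTP health check (the addresses
and the EHLO name are made up):

    listen smtp-relay
        bind 192.0.2.10:25
        mode tcp
        balance roundrobin
        option smtpchk EHLO lb.example.org
        server ha1 10.0.0.1:25 check
        server ha2 10.0.0.2:25 check
        server ha3 10.0.0.3:25 check

The check takes a node out of rotation while its mail stack is down, which
gives you the "only where everything runs" behaviour without any constraint
tuning.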

Regards,
Ulrich

> 
> Thanks,
> 
> Gerald
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> ClusterLabs home: https://www.clusterlabs.org/ 




