[ClusterLabs] Re: [EXT] resource cloned group colocations

Vladislav Bogdanov bubble at hoster-ok.com
Thu Mar 2 08:43:03 EST 2023


On Thu, 2023-03-02 at 14:30 +0100, Ulrich Windl wrote:
> > > > Gerald Vogt <vogt at spamcop.net> wrote on 02.03.2023 at 08:41
> > > > in message
> <624d0b70-5983-4d21-6777-55be91688bbe at spamcop.net>:
> > Hi,
> > 
> > I am setting up a mail relay cluster whose main purpose is to
> > maintain the service IPs via IPaddr2 and move them between
> > cluster nodes when necessary.
> > 
> > The service IPs should only be active on nodes which are running
> > all necessary mail (systemd) services.
> > 
> > So I have set up a resource for each of those services, put them
> > into a group in the order they should start in, and cloned the
> > group, as the services are normally supposed to run on all nodes
> > at all times.
> > 
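> > Roughly, the setup looks like this (a minimal sketch with pcs; the
> > service names postfix and amavis here are just placeholders for the
> > real services):
> > 
> >    pcs resource create mail-postfix systemd:postfix
> >    pcs resource create mail-amavis systemd:amavis
> >    pcs resource group add mail-services mail-postfix mail-amavis
> >    pcs resource clone mail-services
> > 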
> > Then I added an order constraint
> >    start mail-services-clone then start mail1-ip
> >    start mail-services-clone then start mail2-ip
> > 
> > and colocations to prefer running the IPs on different nodes, but
> > only on nodes where the clone is running:
> > 
> >    colocation add mail2-ip with mail1-ip -1000
> >    colocation ip1 with mail-services-clone
> >    colocation ip2 with mail-services-clone
> > 
> > as well as location constraints to prefer running the first IP on
> > the first node and the second on the second:
> > 
> >    location ip1 prefers ha1=2000
> >    location ip2 prefers ha2=2000
> > 
> > Now if I stop pacemaker on one of those nodes, e.g. on node ha2,
> > it's fine: ip2 is moved immediately to ha3. Good.
> > 
> > However, if pacemaker on ha2 starts up again, it will immediately
> > remove ip2 from ha3 and keep it offline while the services in the
> > group are starting on ha2. As the services unfortunately take some
> > time to come up, ip2 is offline for more than a minute.
> 
> That is because you wanted "ip2 prefers ha2=2000", so if the cluster
> _can_ run it there, then it will, even if it's running elsewhere.
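>
> (If the idea is to keep ip2 on ha3 in that situation, a resource
> stickiness above the location score should do it; an untested sketch
> with pcs, the score chosen arbitrarily to exceed 2000:
>
>    pcs resource defaults resource-stickiness=3000
>
> but note that this also prevents automatic fail-back to ha2.)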
> 

Pacemaker sometimes places actions in the transition in a suboptimal
order (from the human's point of view).
So instead of

start group on nodeB
stop vip on nodeA
start vip on nodeB

it runs

stop vip on nodeA
start group on nodeB
start vip on nodeB

So, if the start of the group takes a lot of time, the vip is not
available on any node during that start.

One more technique to minimize the time during which the vip is
stopped would be to add resource migration support to IPaddr2.
That could help, but I'm not sure.
At least I know for sure that pacemaker behaves differently with
migratable resources and MAY decide to use the first order I listed
above.
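
For context: pacemaker only treats a resource as migratable if its
agent implements the migrate_to/migrate_from actions and the resource
has the allow-migrate meta attribute set. Stock IPaddr2 does not
implement those actions, so the following only becomes meaningful with
a patched agent (hypothetical sketch):

   pcs resource update mail2-ip meta allow-migrate=true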

> Maybe explain what you really want.
> 
> > 
> > It seems the colocations with the clone are already satisfied once
> > the cloned group begins to start services, and thus allow the ip to
> > be removed from the current node.
> > 
> > I was wondering: how can I define the colocation so that it is
> > satisfied only once all services in the clone have been started,
> > and not as soon as the first service in the clone is starting?
> > 
> > Thanks,
> > 
> > Gerald
