<html><head></head><body><div>On Thu, 2023-03-02 at 08:41 +0100, Gerald Vogt wrote:</div><blockquote type="cite" style="margin:0 0 0 .8ex; border-left:2px #729fcf solid;padding-left:1ex"><div>Hi,<br></div><div><br></div><div>I am setting up a mail relay cluster which main purpose is to maintain <br></div><div>the service ips via IPaddr2 and move them between cluster nodes when <br></div><div>necessary.<br></div><div><br></div><div>The service ips should only be active on nodes which are running all <br></div><div>necessary mail (systemd) services.<br></div><div><br></div><div>So I have set up a resource for each of those services, put them into a <br></div><div>group in order they should start, cloned the group as they are normally <br></div><div>supposed to run on the nodes at all times.<br></div><div><br></div><div>Then I added an order constraint<br></div><div>   start mail-services-clone then start mail1-ip<br></div><div>   start mail-services-clone then start mail2-ip<br></div><div><br></div><div>and colocations to prefer running the ips on different nodes but only <br></div><div>with the clone running:<br></div><div><br></div><div>   colocation add mail2-ip with mail1-ip -1000<br></div><div>   colocation ip1 with mail-services-clone<br></div><div>   colocation ip2 with mail-services-clone<br></div><div><br></div><div>as well as a location constraint to prefer running the first ip on the <br></div><div>first node and the second on the second<br></div><div><br></div><div>   location ip1 prefers ha1=2000<br></div><div>   location ip2 prefers ha2=2000<br></div><div><br></div><div>Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's <br></div><div>fine. ip2 will be moved immediately to ha3. Good.<br></div><div><br></div><div>However, if pacemaker on ha2 starts up again, it will immediately remove <br></div><div>ip2 from ha3 and keep it offline, while the services in the group are <br></div><div>starting on ha2. 
As the services unfortunately take some time to come <br></div><div>up, ip2 is offline for more than a minute.<br></div><div><br></div><div>It seems the colocations with the clone are already good once the clone <br></div><div>group begins to start services and thus allows the ip to be removed from <br></div><div>the current node.<br></div><div><br></div><div>I was wondering how can I define the colocation to be accepted only if <br></div><div>all services in the clone have been started? And not once the first <br></div><div>service in the clone is starting?<br></div><div><br></div><div>Thanks,<br></div><div><br></div><div>Gerald<br></div><div><br></div></blockquote><div><br></div><div>I noticed this behavior many years ago. It is especially visible with resources that take a long time to start, and one technique</div><div>to deal with it is to use transient node attributes instead of colocation/order constraints between the group and the VIP.</div><div>I'm not sure there is a suitable open-source resource agent that just manages a specified node attribute, but it should</div><div>not be hard to write one that implements a pseudo-resource handler together with attrd_updater calls.</div><div>You can probably strip everything Ethernet-related from the ethmonitor agent to get such an almost-dummy resource agent.</div><div><br></div><div>Once the RA is there, you can add it as the last resource in the group and then rely on the attribute it manages to start your VIP.</div><div>That is done with location constraints; just use score-attribute in their rules: <a href="https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#rule-properties">https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#rule-properties</a></div><div><br></div><div>So, the idea is: your custom RA sets the attribute 'mail-clone-started' to something like 1,</div><div>and you have a location constraint which prevents the cluster from starting your VIP resource on a node if the value of the <span 
style="font-size: 13.333333px;">'mail-clone-started' attribute on that node is less than 1 or not defined.</span></div><div>Once a node has that attribute set (which happens at the very end of the group's start sequence), then (and only then) the cluster decides to move your VIP</div><div>to that node (because of the other location constraints with preferences that you already have).</div><div><br></div><div>Just make sure the attributes are transient (not stored permanently in the CIB).</div><div><br></div><div><br></div><div><span></span></div></body></html>