Hi Reid,<div id="yMail_cursorElementTracker_1617396731933"><br></div><div id="yMail_cursorElementTracker_1617396732139">I will check it out on Monday, but I'm pretty sure I created an order set that stops the topology first and only then stops nfs-active.</div><div id="yMail_cursorElementTracker_1617397061968"><br></div><div id="yMail_cursorElementTracker_1617397062205">However, I made the stupid decision to prevent ocf:heartbeat:Filesystem from killing those 2 SAP processes (and set a huge timeout for the stop operation), which led to an 'I can't umount, giving up'-like notification and, of course, got the entire cluster fenced :D.</div><div id="yMail_cursorElementTracker_1617397102517"><br></div><div id="yMail_cursorElementTracker_1617397102722">Note taken: stonith now has different delays, and Filesystem can kill the processes.</div><div id="yMail_cursorElementTracker_1617397108253"><br></div><div id="yMail_cursorElementTracker_1617397108436">As per the SAP note from Andrei, these could really be 'fast restart' mechanisms in HANA 2.0, and they look safe to kill (I will check with SAP about that).</div><div id="yMail_cursorElementTracker_1617397305438"><br></div><div id="yMail_cursorElementTracker_1617397410929"><br></div><div id="yMail_cursorElementTracker_1617397411097">P.S.: Is there a way to remove a whole set in pcs? It's really irritating when the stupid command wipes the resource from multiple order constraints.</div><div id="yMail_cursorElementTracker_1617397365528"><br></div><div id="yMail_cursorElementTracker_1617397365722">Best Regards,</div><div id="yMail_cursorElementTracker_1617397372224">Strahil Nikolov</div><div id="yMail_cursorElementTracker_1617397191544"><br></div><div id="yMail_cursorElementTracker_1617397191750"><br> <br> <blockquote style="margin: 0 0 20px 0;"> <div style="font-family:Roboto, sans-serif; color:#6D00F6;"> <div>On Fri, Apr 2, 2021 at 23:44, Reid Wahl</div><div>&lt;nwahl@redhat.com&gt; wrote:</div> </div> <div style="padding: 
10px 0 0 20px; margin: 10px 0 0 0; border-left: 1px solid #6D00F6;"> <div id="yiv9894828817"><div><div dir="ltr"><div>Hi, Strahil.</div><div><br clear="none"></div><div>Based on the constraints documented in the article you're following (RH KB solution 5423971), I think I see what's happening.</div><div><br clear="none"></div><div>The SAPHanaTopology resource requires the appropriate nfs-active attribute in order to run. That means that if the nfs-active attribute is set to false, the SAPHanaTopology resource must stop.</div><div><br clear="none"></div><div>However, there's no rule saying SAPHanaTopology must finish stopping before the nfs-active attribute resource stops. In fact, it's quite the opposite: the SAPHanaTopology resource stops only after the nfs-active resource stops.</div><div><br clear="none"></div><div>At the same time, the NFS resources are allowed to stop after the nfs-active attribute resource has stopped. So the NFS resources are stopping while the SAPHana* resources are likely still active.</div><div><br clear="none"></div><div>Try something like this:</div><div> # pcs constraint order hana_nfs1_active-clone then SAPHanaTopology_&lt;SID&gt;_&lt;instance_num&gt;-clone kind=Optional<br clear="none"></div><div> # pcs constraint order hana_nfs2_active-clone then SAPHanaTopology_&lt;SID&gt;_&lt;instance_num&gt;-clone kind=Optional<br clear="none"></div><div><br clear="none"></div><div>This says "if both hana_nfs1_active and SAPHanaTopology are scheduled to start, then make hana_nfs1_active start first. If both are scheduled to stop, then make SAPHanaTopology stop first."</div><div><br clear="none"></div><div>"kind=Optional" means there's no order dependency unless both resources are already going to be scheduled for the action. 
I'm using kind=Optional here even though kind=Mandatory (the default) would make sense, because IIRC there were some unexpected interactions with ordering constraints for clones, where events on one node had unwanted effects on other nodes.</div><div><br clear="none"></div><div>I'm not able to test right now since setting up an environment for this even with dummy resources is non-trivial -- but you're welcome to try this both with and without kind=Optional if you'd like.</div><div><br clear="none"></div><div>Please let us know how this goes.<br clear="none"></div></div><br clear="none"><div class="yiv9894828817gmail_quote"><div class="yiv9894828817gmail_attr" dir="ltr">On Fri, Apr 2, 2021 at 2:20 AM Strahil Nikolov &lt;<a rel="nofollow noopener noreferrer" shape="rect" ymailto="mailto:hunter86_bg@yahoo.com" target="_blank" href="mailto:hunter86_bg@yahoo.com">hunter86_bg@yahoo.com</a>&gt; wrote:<br clear="none"></div><blockquote class="yiv9894828817gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex;">Hello All,<div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617354778410"><br clear="none"></div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617354778660">I am testing the newly built HANA (Scale-out) cluster and it seems that:</div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617354821725">Neither SAPHanaController nor SAPHanaTopology stops HANA when I put the nodes (same DC = same HANA) in standby. 
This of course leads to a situation where the NFS share cannot be unmounted, which, despite the stop timeout, leads to fencing (on-fail=fence).</div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617354872797"><br clear="none"></div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617354873002">I thought that the Controller resource agent stops HANA and that the slave role should not be 'stopped' before that.</div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617355125304"><br clear="none"></div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617355125510">Maybe my expectations are wrong?</div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617355148982"><br clear="none"></div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617355149213">Best Regards,</div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617355153489">Strahil Nikolov</div><div id="yiv9894828817gmail-m_7372687452019563461yMail_cursorElementTracker_1617354902362"><br clear="none"></div>_______________________________________________<br clear="none">
<div class="yiv9894828817yqt2560296716" id="yiv9894828817yqtfd57270"><br clear="none">
</div></blockquote></div><div class="yiv9894828817yqt2560296716" id="yiv9894828817yqtfd02234"><br clear="all"><br clear="none">-- <br clear="none"></div><div class="yiv9894828817gmail_signature" dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div><div class="yiv9894828817yqt2560296716" id="yiv9894828817yqtfd91867">Regards,</div><br clear="none"><br clear="none"></div>Reid Wahl, RHCA<br clear="none"></div><div>Senior Software Maintenance Engineer, Red Hat<br clear="none"></div>CEE - Platform Support Delivery - ClusterHA</div></div></div></div></div></div></div></div></div></div></div></div></div></div><div class="yiv9894828817yqt2560296716" id="yiv9894828817yqtfd82566">
</div></div></div> </div> </blockquote></div>