<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Mar 25, 2021 at 10:20 PM Andrei Borzenkov <<a href="mailto:arvidjaar@gmail.com">arvidjaar@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 25.03.2021 21:45, Reid Wahl wrote:<br>

> FWIW we have this KB article (I seem to remember Strahil is a Red Hat<br>

> customer):<br>

>   - How do I configure SAP HANA Scale-Up System Replication in a Pacemaker<br>

> cluster when the HANA filesystems are on NFS shares? (<br>

> <a href="https://access.redhat.com/solutions/5156571" rel="noreferrer" target="_blank">https://access.redhat.com/solutions/5156571</a>)<br>

> <br>

<br>

"How do I make the cluster resources recover when one node loses access<br>

to the NFS server?"<br>

<br>

If a node loses access to the NFS server, then monitor operations for resources<br>

that depend on NFS availability will fail or time out, and Pacemaker will<br>

recover (likely by rebooting this node). That's how similar<br>

configurations have been handled for the past 20 years in other HA<br>

managers. I am genuinely interested, have you encountered the case where<br>

it was not enough?<br></blockquote><div><br></div><div>Yes, and I was perplexed by this at the time too.</div><div><br></div><div>I just went back and checked the notes from the support case that led to this article, since it's been nearly a year now. Apparently there were situations in which the SAPHana resource wasn't failing over when connectivity was lost with the NFS share that contained the hdb* binaries and the HANA data. I don't remember the exact details (whether demotion was failing, or whether it wasn't even trying to demote on the primary and promote on the secondary, or what). Either way, I was surprised that this procedure was necessary, but it seemed to be.</div><div><br></div><div>The whole situation is a bit of a corner case in the first place. IIRC this procedure only makes a difference if the primary loses contact with the NFS server but the secondary can still access the NFS server. I expect that to be relatively rare. If neither node can access the NFS server, then we're stuck.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

> I can't remember if there was some valid reason why we had to use an<br>

> attribute resource, or if we simply didn't think about the sequential=false<br>

> require-all=false constraint set approach when planning this out.<br>

> <br>

<br>

Because, as I already replied, this has different semantics: it will<br>

start HANA on both nodes if NFS comes up on any one node.<br></blockquote><div><br></div><div>Ah yes, that sounds right.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

But thank you for the pointer; it demonstrates a really interesting<br>

technique. It also confirms that Pacemaker has no native means to<br>

express such ordering dependencies/constraints. Maybe it should.<br></blockquote><div><br></div><div>I occasionally find that I have to use hacks like this to achieve certain complex constraint behavior, especially when it comes to colocation. I don't know how many of these complex cases it would be feasible to support natively via RFE. Sometimes the way colocation is currently implemented is incompatible with what users want to do. Changing it would probably require considerable effort, though such requests are worth documenting in RFEs.</div><div><br></div><div>/me makes a note to do that and annoy Ken<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> On Thu, Mar 25, 2021 at 3:39 AM Strahil Nikolov <<a href="mailto:hunter86_bg@yahoo.com" target="_blank">hunter86_bg@yahoo.com</a>><br>

> wrote:<br>

> <br>

>> OCF_CHECK_LEVEL 20<br>

>> NFS sometimes fails to start (systemd race condition with dnsmasq)<br>

>><br>

>> Best Regards,<br>

>> Strahil Nikolov<br>

>><br>

>> On Thu, Mar 25, 2021 at 12:18, Andrei Borzenkov<br>

>> <<a href="mailto:arvidjaar@gmail.com" target="_blank">arvidjaar@gmail.com</a>> wrote:<br>

>> On Thu, Mar 25, 2021 at 10:31 AM Strahil Nikolov <<a href="mailto:hunter86_bg@yahoo.com" target="_blank">hunter86_bg@yahoo.com</a>><br>

>> wrote:<br>

>>><br>

>>> Use Case:<br>

>>><br>

>>> nfsA is shared filesystem for HANA running in site A<br>

>>> nfsB is shared filesystem for HANA running  in site B<br>

>>><br>

>>> A clustered resource of type SAPHanaTopology must run on all systems if<br>

>> the FS for HANA is running<br>

>>><br>

>><br>

>> And the reason you put NFS under pacemaker control in the first place?<br>

>> It is not going to switch over, just put it in /etc/fstab.<br>

>><br>

>>> Yet, if siteA dies for some reason, I want SAPHanaTopology to<br>

>> still start on the nodes in site B.<br>

>>><br>

>>> I think that it's a valid use case.<br>

>>><br>

>>> Best Regards,<br>

>>> Strahil Nikolov<br>

>>><br>

>>> On Thu, Mar 25, 2021 at 8:59, Ulrich Windl<br>

>>> <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br>

>>>>>> Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>> schrieb am 24.03.2021 um 18:56 in<br>

>>> Nachricht<br>

>>> <<a href="mailto:5bffded9c6e614919981dcc7d0b2903220bae19d.camel@redhat.com" target="_blank">5bffded9c6e614919981dcc7d0b2903220bae19d.camel@redhat.com</a>>:<br>

>>>> On Wed, 2021-03-24 at 09:27 +0000, Strahil Nikolov wrote:<br>

>>>>> Hello All,<br>

>>>>><br>

>>>>> I am having trouble creating an order set.<br>

>>>>> The end goal is to create a 2-node cluster where nodeA will mount<br>

>>>>> nfsA, while nodeB will mount nfsB. On top of that, a dependent cloned<br>

>>>>> resource should start on a node only if nfsA or nfsB has started<br>

>>>>> locally.<br>

>>><br>

>>> This looks like an odd design to me, and I wonder: What is the use case?<br>

>>> (We have been using "NFS loop-mounts" for many years, where the cluster needs<br>

>> the<br>

>>> NFS service it provides, but that's a different design)<br>

>>><br>

>>> Regards,<br>

>>> Ulrich<br>

>>><br>

>>><br>

>>><br>

>>>>><br>

>>>>> A prototype would be something like:<br>

>>>>> pcs constraint order start (nfsA or nfsB) then start resource-clone<br>

>>>>><br>

>>>>> I tried to create a set like this, but it works only on nodeB:<br>

>>>>> pcs constraint order set nfsA nfsB resource-clone<br>

>>>>><br>

>>>>> Any idea how to implement that order constraint ?<br>

>>>>> Thanks in advance.<br>

>>>>><br>

>>>>> Best Regards,<br>

>>>>> Strahil Nikolov<br>

>>>><br>

>>>> Basically you want two sets, one with nfsA and nfsB with no ordering<br>

>>>> between them, and a second set with just resource-clone, ordered after<br>

>>>> the first set.<br>

>>>><br>

>>>> I believe the pcs syntax is:<br>

>>>><br>

>>>> pcs constraint order set nfsA nfsB sequential=false require-all=false<br>

>>>> set resource-clone<br>

>>>><br>

>>>> sequential=false says nfsA and nfsB have no ordering between them, and<br>

>>>> require-all=false says that resource-clone only needs one of them.<br>
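</blockquote><div>A toy sketch of what those two options imply, assuming (as noted elsewhere in this thread) that the ordering is evaluated cluster-wide rather than per node. This is plain Python, not Pacemaker code; the function names and the node/resource layout are hypothetical and exist only for illustration.</div><div><br></div><pre>
```python
# Toy model only: this is NOT Pacemaker's scheduler. All names are hypothetical.


def may_start_clone(active, required, require_all):
    """Cluster-wide check: may the dependent set start at all?

    active maps a resource name to the node it is active on (missing = stopped).
    The check never looks at which node a clone instance would run on.
    """
    started = [name in active for name in required]
    return all(started) if require_all else any(started)


def per_node_wish(active, required, node):
    """What the original poster asked for: a required resource active on this node."""
    return any(active.get(name) == node for name in required)


required = ["nfsA", "nfsB"]
active = {"nfsB": "nodeB"}  # assumed scenario: nfsA failed to start on nodeA

for node in ("nodeA", "nodeB"):
    print(node,
          may_start_clone(active, required, require_all=False),
          per_node_wish(active, required, node))
```
</pre><div>In this model, with only nfsB active on nodeB, the cluster-wide check allows the clone on both nodes, while the per-node wish is met only on nodeB. That gap is why the constraint set alone does not express "start the clone only where its NFS mount is local".</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">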

>>>><br>

>>>> (I don't remember for sure the order of the sets in the command, i.e.<br>

>>>> whether it's the primary set first or the dependent set first, but I<br>

>>>> think that's right.)<br>

>>>> --<br>

>>>> Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>><br>

>>>><br>

>>>><br>

>>>> _______________________________________________<br>

>>>> Manage your subscription:<br>

>>>> <a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>

>>>><br>

>>>> ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer" target="_blank">https://www.clusterlabs.org/</a><br>

>><br>

>>><br>

>>><br>

>>><br>

>><br>

> <br>

> <br>

<br>

</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div>Regards,<br><br></div>Reid Wahl, RHCA<br></div><div>Senior Software Maintenance Engineer, Red Hat<br></div>CEE - Platform Support Delivery - ClusterHA</div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>