<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Mar 26, 2021 at 12:17 AM Ulrich Windl <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">>>> Andrei Borzenkov <<a href="mailto:arvidjaar@gmail.com" target="_blank">arvidjaar@gmail.com</a>> wrote on 26.03.2021 at 06:19 in<br>

message <<a href="mailto:534274b3-a6de-5fac-0ae4-d02c305f1a3f@gmail.com" target="_blank">534274b3-a6de-5fac-0ae4-d02c305f1a3f@gmail.com</a>>:<br>

> On 25.03.2021 21:45, Reid Wahl wrote:<br>

>> FWIW we have this KB article (I seem to remember Strahil is a Red Hat<br>

>> customer):<br>

>>   - How do I configure SAP HANA Scale-Up System Replication in a Pacemaker<br>

>> cluster when the HANA filesystems are on NFS shares?(<br>

>> <a href="https://access.redhat.com/solutions/5156571" rel="noreferrer" target="_blank">https://access.redhat.com/solutions/5156571</a>)<br>

>> <br>

> <br>

> "How do I make the cluster resources recover when one node loses access<br>

> to the NFS server?"<br>

> <br>

> If a node loses access to the NFS server, then monitor operations for resources<br>

> that depend on NFS availability will fail or time out, and pacemaker will<br>

> recover (likely by rebooting this node). That's how similar<br>

> configurations have been handled for the past 20 years in other HA<br>

> managers. I am genuinely interested: have you encountered a case where<br>

> it was not enough?<br>

<br>

That's a big problem with the SAP design (basically it's just too complex).<br></blockquote><div><br></div><div>+1000 to this.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

In the past I had written a kind of resource agent that worked without that<br>

overly complex overhead, but since those days SAP has added much more<br>

complexity.<br>

If the NFS server is external, pacemaker could fence your nodes when the NFS<br>

server is down: first the monitor operation will fail (hanging on NFS), then<br>

the recovery (stop/start) will fail (also hanging on NFS). Even fencing the<br>

node would not help (resources cannot start) if the NFS server is still<br>

down. So you may end up with all your nodes being fenced and the fail counts<br>

disabling any automatic resource restart.<br>

<br>

> <br>

>> I can't remember if there was some valid reason why we had to use an<br>

>> attribute resource, or if we simply didn't think about the sequential=false<br>

>> require-all=false constraint set approach when planning this out.<br>

>> <br>

> <br>

> Because, as I already replied, this has different semantics: it will<br>

> start HANA on both nodes if NFS comes up on any one node.<br>
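
<br>
A minimal sketch of that difference, written as a toy Python model rather than Pacemaker code. The function names and the node-to-NFS mapping are assumptions made purely for illustration; the set-based behaviour follows the description given in this thread.<br>
<pre>
```python
# Toy model, not Pacemaker code: compares two readings of
# "start resource-clone after nfsA or nfsB".

def set_based(nfs_started):
    """Ordered set with require-all=false, as described in the thread:
    the dependency is satisfied cluster-wide once any NFS resource is up."""
    satisfied = any(nfs_started.values())
    return {node: satisfied for node in nfs_started}


def per_node(nfs_started):
    """Requested behaviour: each clone instance waits for its local NFS."""
    return dict(nfs_started)


# nodeA has mounted nfsA; nodeB has not mounted nfsB yet.
state = {"nodeA": True, "nodeB": False}
print(set_based(state))  # {'nodeA': True, 'nodeB': True}
print(per_node(state))   # {'nodeA': True, 'nodeB': False}
```
</pre>
With nodeA mounted and nodeB not, the set-based reading lets the clone start on both nodes, while the per-node reading allows it only on nodeA.<br>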

> <br>

> But thank you for the pointer, it demonstrates a really interesting<br>

> technique. It also confirms that pacemaker does not have native means to<br>

> express such an ordering dependency/constraint. Maybe it should.<br>

> <br>

>> On Thu, Mar 25, 2021 at 3:39 AM Strahil Nikolov <<a href="mailto:hunter86_bg@yahoo.com" target="_blank">hunter86_bg@yahoo.com</a>><br>

>> wrote:<br>

>> <br>

>>> OCF_CHECK_LEVEL 20<br>

>>> NFS sometimes fails to start (systemd race condition with dnsmasq)<br>

>>><br>

>>> Best Regards,<br>

>>> Strahil Nikolov<br>

>>><br>

>>> On Thu, Mar 25, 2021 at 12:18, Andrei Borzenkov<br>

>>> <<a href="mailto:arvidjaar@gmail.com" target="_blank">arvidjaar@gmail.com</a>> wrote:<br>

>>> On Thu, Mar 25, 2021 at 10:31 AM Strahil Nikolov <<a href="mailto:hunter86_bg@yahoo.com" target="_blank">hunter86_bg@yahoo.com</a>><br>

>>> wrote:<br>

>>>><br>

>>>> Use Case:<br>

>>>><br>

>>>> nfsA is the shared filesystem for HANA running in site A<br>

>>>> nfsB is the shared filesystem for HANA running in site B<br>

>>>><br>

>>>> A clustered resource of type SAPHanaTopology must run on all systems if<br>

>>>> the FS for HANA is running<br>

>>>><br>

>>><br>

>>> And the reason you put NFS under pacemaker control in the first place?<br>

>>> It is not going to switch over, just put it in /etc/fstab.<br>

>>><br>

>>>> Yet, if site A dies for some reason, I want SAPHanaTopology to<br>

>>>> still start on the nodes in site B.<br>

>>>><br>

>>>> I think that it's a valid use case.<br>

>>>><br>

>>>> Best Regards,<br>

>>>> Strahil Nikolov<br>

>>>><br>

>>>> On Thu, Mar 25, 2021 at 8:59, Ulrich Windl<br>

>>>> <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br>

>>>>>>> Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>> wrote on 24.03.2021 at 18:56 in<br>

>>>> message<br>

>>>> <<a href="mailto:5bffded9c6e614919981dcc7d0b2903220bae19d.camel@redhat.com" target="_blank">5bffded9c6e614919981dcc7d0b2903220bae19d.camel@redhat.com</a>>:<br>

>>>>> On Wed, 2021-03-24 at 09:27 +0000, Strahil Nikolov wrote:<br>

>>>>>> Hello All,<br>

>>>>>><br>

>>>>>> I am having trouble creating an order set.<br>

>>>>>> The end goal is to create a 2-node cluster where nodeA will mount<br>

>>>>>> nfsA, while nodeB will mount nfsB. On top of that, a dependent cloned<br>

>>>>>> resource should start on a node only if nfsA or nfsB has started<br>

>>>>>> locally.<br>

>>>><br>

>>>> This looks like an odd design to me, and I wonder: What is the use case?<br>

>>>> (We have been using "NFS loop-mounts" for many years, where the cluster needs<br>

>>>> the NFS service it provides, but that's a different design)<br>

>>>><br>

>>>> Regards,<br>

>>>> Ulrich<br>

>>>><br>

>>>><br>

>>>><br>

>>>>>><br>

>>>>>> Prototype code would be something like:<br>

>>>>>> pcs constraint order start (nfsA or nfsB) then start resource-clone<br>

>>>>>><br>

>>>>>> I tried to create a set like this, but it works only on nodeB:<br>

>>>>>> pcs constraint order set nfsA nfsB resource-clone<br>

>>>>>><br>

>>>>>> Any idea how to implement that order constraint?<br>

>>>>>> Thanks in advance.<br>

>>>>>><br>

>>>>>> Best Regards,<br>

>>>>>> Strahil Nikolov<br>

>>>>><br>

>>>>> Basically you want two sets, one with nfsA and nfsB with no ordering<br>

>>>>> between them, and a second set with just resource-clone, ordered after<br>

>>>>> the first set.<br>

>>>>><br>

>>>>> I believe the pcs syntax is:<br>

>>>>><br>

>>>>> pcs constraint order set nfsA nfsB sequential=false require-all=false<br>

>>>>> set resource-clone<br>

>>>>><br>

>>>>> sequential=false says nfsA and nfsB have no ordering between them, and<br>

>>>>> require-all=false says that resource-clone only needs one of them.<br>

>>>>><br>

>>>>> (I don't remember for sure the order of the sets in the command, i.e.<br>

>>>>> whether it's the primary set first or the dependent set first, but I<br>

>>>>> think that's right.)<br>
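
<br>
The effect of require-all on the first set can be sketched as a toy Python check. This is not Pacemaker code; the function name and the dictionary of started resources are assumptions for illustration only.<br>
<pre>
```python
# Toy model of the require-all option on the first set; not Pacemaker code.

def dependent_may_start(first_set_started, require_all):
    """Return True if the set ordered after the first one may start.

    first_set_started maps a resource name to whether it has started.
    """
    check = all if require_all else any
    return check(first_set_started.values())


started = {"nfsA": True, "nfsB": False}
print(dependent_may_start(started, require_all=True))   # False
print(dependent_may_start(started, require_all=False))  # True
```
</pre>
With only nfsA started, the dependent set is held back when require-all is true and released when it is false.<br>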

>>>>> --<br>

>>>>> Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>><br>

>>>>><br>

>>>>><br>

>>>>> _______________________________________________<br>

>>>>> Manage your subscription:<br>

>>>>> <a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a> <br>

>>>>><br>

>>>>> ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer" target="_blank">https://www.clusterlabs.org/</a> <br>


</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div>Regards,<br><br></div>Reid Wahl, RHCA<br></div><div>Senior Software Maintenance Engineer, Red Hat<br></div>CEE - Platform Support Delivery - ClusterHA</div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>