[ClusterLabs] Antw: Re: Antw: [EXT] Re: Order set troubles

Reid Wahl nwahl at redhat.com
Fri Mar 26 03:19:40 EDT 2021


On Fri, Mar 26, 2021 at 12:17 AM Ulrich Windl <
Ulrich.Windl at rz.uni-regensburg.de> wrote:

> >>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 26.03.2021 at 06:19
> in
> message <534274b3-a6de-5fac-0ae4-d02c305f1a3f at gmail.com>:
> > On 25.03.2021 21:45, Reid Wahl wrote:
> >> FWIW we have this KB article (I seem to remember Strahil is a Red Hat
> >> customer):
> >>   - How do I configure SAP HANA Scale-Up System Replication in a
> >> Pacemaker cluster when the HANA filesystems are on NFS shares?
> >> (https://access.redhat.com/solutions/5156571)
> >>
> >
> > "How do I make the cluster resources recover when one node loses access
> > to the NFS server?"
> >
> > If a node loses access to the NFS server, then monitor operations for
> > resources that depend on NFS availability will fail or time out, and
> > pacemaker will recover (likely by rebooting that node). That's how similar
> > configurations have been handled for the past 20 years in other HA
> > managers. I am genuinely interested: have you encountered a case where
> > that was not enough?
>
> That's a big problem with the SAP design (basically it's just too complex).
>

+1000 to this.

> In the past I had written a kind of resource agent that worked without that
> overly complex overhead, but since those days SAP has added much more
> complexity.
> If the NFS server is external, pacemaker could fence your nodes when the
> NFS server is down: first the monitor operation will fail (hanging on NFS),
> then the recovery (stop/start) will fail (also hanging on NFS). Even
> fencing the node would not help (resources cannot start) if the NFS server
> is still down. So you may end up with all your nodes being fenced and the
> fail counts disabling any automatic resource restart.
>
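
(Side note: even once the NFS server is reachable again, Pacemaker will not
retry the failed resources until the accumulated fail counts are cleared or
expire. A minimal sketch with pcs, using "nfsA" purely as a placeholder
resource name:

  # clear the fail count so Pacemaker tries the resource again
  pcs resource cleanup nfsA

  # or let fail counts expire on their own, e.g. after 10 minutes
  pcs resource meta nfsA failure-timeout=600

Nothing here is specific to the SAP setup; it is standard Pacemaker failure
handling.)
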
> >
> >> I can't remember if there was some valid reason why we had to use an
> >> attribute resource, or if we simply didn't think about the
> sequential=false
> >> require-all=false constraint set approach when planning this out.
> >>
> >
> > Because, as I already replied, this has different semantics - it will
> > start HANA on both nodes if NFS comes up on any one node.
> >
> > But thank you for the pointer, it demonstrates a really interesting
> > technique. It also confirms that pacemaker does not have native means to
> > express such an ordering dependency/constraint. Maybe it should.
> >
> >> On Thu, Mar 25, 2021 at 3:39 AM Strahil Nikolov <hunter86_bg at yahoo.com>
> >> wrote:
> >>
> >>> OCF_CHECK_LEVEL 20
> >>> NFS sometimes fails to start (systemd race condition with dnsmasq)
> >>>
> >>> Best Regards,
> >>> Strahil Nikolov
> >>>
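
(For reference, the OCF_CHECK_LEVEL=20 mentioned above selects the
Filesystem agent's deep monitor: it writes a small status file on the mount
and reads it back, so a hung or stale NFS mount surfaces as a monitor
failure instead of going unnoticed. A sketch of how that is typically set
with pcs; "nfsA" and the interval/timeout values are placeholders:

  pcs resource op add nfsA monitor interval=60s timeout=40s OCF_CHECK_LEVEL=20

That is why the NFS mounts are cluster resources at all rather than plain
fstab entries.)
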
> >>> On Thu, Mar 25, 2021 at 12:18, Andrei Borzenkov
> >>> <arvidjaar at gmail.com> wrote:
> >>> On Thu, Mar 25, 2021 at 10:31 AM Strahil Nikolov <
> hunter86_bg at yahoo.com>
> >>> wrote:
> >>>>
> >>>> Use Case:
> >>>>
> >>>> nfsA is the shared filesystem for HANA running in site A
> >>>> nfsB is the shared filesystem for HANA running in site B
> >>>>
> >>>> A clusterized resource of type SAPHanaTopology must run on all systems
> >>>> if the FS for HANA is running.
> >>>>
> >>>
> >>> And what is the reason you put NFS under pacemaker control in the first
> >>> place? It is not going to switch over; just put it in /etc/fstab.
> >>>
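
(For completeness, the /etc/fstab alternative Andrei suggests would be a
static NFS mount outside cluster control, along these lines; the server name
and paths are only placeholders:

  # /etc/fstab entry: mounted at boot, not managed by Pacemaker
  nfs-siteA.example.com:/export/hana  /hana/shared  nfs  defaults,_netdev  0 0

The trade-off is that there is then no NFS resource for the cluster to
monitor or to order SAPHanaTopology after, which is exactly what Strahil
wants below.)
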
> >>>> Yet, if siteA dies for some reason, I want SAPHanaTopology to still
> >>>> start on the nodes in site B.
> >>>>
> >>>> I think that it's a valid use case.
> >>>>
> >>>> Best Regards,
> >>>> Strahil Nikolov
> >>>>
> >>>> On Thu, Mar 25, 2021 at 8:59, Ulrich Windl
> >>>> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >>>>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 24.03.2021 at 18:56
> >>>> in message
> >>>> <5bffded9c6e614919981dcc7d0b2903220bae19d.camel at redhat.com>:
> >>>>> On Wed, 2021-03-24 at 09:27 +0000, Strahil Nikolov wrote:
> >>>>>> Hello All,
> >>>>>>
> >>>>>> I am having trouble creating an order set.
> >>>>>> The end goal is to create a 2-node cluster where nodeA will mount
> >>>>>> nfsA, while nodeB will mount nfsB. On top of that, a dependent cloned
> >>>>>> resource should start on a node only if nfsA or nfsB has started
> >>>>>> locally.
> >>>>
> >>>> This looks like an odd design to me, and I wonder: what is the use
> >>>> case? (We have been using "NFS loop-mounts" for many years, where the
> >>>> cluster needs the NFS service it provides, but that's a different
> >>>> design.)
> >>>>
> >>>> Regards,
> >>>> Ulrich
> >>>>
> >>>>
> >>>>
> >>>>>>
> >>>>>> A prototype of the code would be something like:
> >>>>>> pcs constraint order start (nfsA or nfsB) then start resource-clone
> >>>>>>
> >>>>>> I tried to create a set like this, but it works only on nodeB:
> >>>>>> pcs constraint order set nfsA nfsB resource-clone
> >>>>>>
> >>>>>> Any idea how to implement that order constraint?
> >>>>>> Thanks in advance.
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Strahil Nikolov
> >>>>>
> >>>>> Basically you want two sets, one with nfsA and nfsB with no ordering
> >>>>> between them, and a second set with just resource-clone, ordered after
> >>>>> the first set.
> >>>>>
> >>>>> I believe the pcs syntax is:
> >>>>>
> >>>>> pcs constraint order set nfsA nfsB sequential=false require-all=false
> >>>>> set resource-clone
> >>>>>
> >>>>> sequential=false says nfsA and nfsB have no ordering between them, and
> >>>>> require-all=false says that resource-clone only needs one of them.
> >>>>>
> >>>>> (I don't remember for sure the order of the sets in the command, i.e.
> >>>>> whether it's the primary set first or the dependent set first, but I
> >>>>> think that's right.)
> >>>>> --
> >>>>> Ken Gaillot <kgaillot at redhat.com>
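
For what it's worth, that pcs command should end up in the CIB as an
ordering constraint with two resource sets, roughly like the sketch below
(the IDs are whatever pcs generates; everything here is only illustrative):

  <rsc_order id="order-nfs-then-clone">
    <resource_set id="order-nfs-then-clone-set1" sequential="false"
                  require-all="false">
      <resource_ref id="nfsA"/>
      <resource_ref id="nfsB"/>
    </resource_set>
    <resource_set id="order-nfs-then-clone-set2">
      <resource_ref id="resource-clone"/>
    </resource_set>
  </rsc_order>

That matches Ken's description: no ordering between nfsA and nfsB, only one
of them required, and resource-clone ordered after that set.
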


-- 
Regards,

Reid Wahl, RHCA
Senior Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA