[ClusterLabs] Antw: Re: Antw: Re: Antw: Re: Antw: [EXT] Re: Order set troubles

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Tue Mar 30 02:17:24 EDT 2021


>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 29.03.2021 at 18:02 in
message <4e638e03-936e-40c9-3f4e-8e641a5ed65a at gmail.com>:
> On 29.03.2021 11:11, Ulrich Windl wrote:
>>>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 27.03.2021 at 06:37 in
>> message <7c294034-56c3-baab-73c6-7909ab554555 at gmail.com>:
>>> On 26.03.2021 22:18, Reid Wahl wrote:
>>>> On Fri, Mar 26, 2021 at 6:27 AM Andrei Borzenkov <arvidjaar at gmail.com>
>>>> wrote:
>>>>
>>>>> On Fri, Mar 26, 2021 at 10:17 AM Ulrich Windl
>>>>> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>>>>>>
>>>>>>>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 26.03.2021 at
>>>>>> 06:19 in message <534274b3-a6de-5fac-0ae4-d02c305f1a3f at gmail.com>:
>>>>>>> On 25.03.2021 21:45, Reid Wahl wrote:
>>>>>>>> FWIW we have this KB article (I seem to remember Strahil is a Red Hat
>>>>>>>> customer):
>>>>>>>>   - How do I configure SAP HANA Scale-Up System Replication in a
>>>>>>>> Pacemaker cluster when the HANA filesystems are on NFS shares?
>>>>>>>> (https://access.redhat.com/solutions/5156571)
>>>>>>>>
>>>>>>>
>>>>>>> "How do I make the cluster resources recover when one node loses
access
>>>>>>> to the NFS server?"
>>>>>>>
>>>>>>> If node loses access to NFS server then monitor operations for
>>>>>>> resources that depend on NFS availability will fail or timeout and
>>>>>>> pacemaker will recover (likely by rebooting this node). That's how
>>>>>>> similar configurations have been handled for the past 20 years in other
>>>>>>> HA managers. I am genuinely interested, have you encountered the case
>>>>>>> where it was not enough?
>>>>>>
>>>>>> That's a big problem with the SAP design (basically it's just too
>>>>>> complex).
>>>>>> In the past I had written a kind of resource agent that worked without
>>>>>> that overly complex overhead, but since those days SAP has added much
>>>>>> more complexity.
>>>>>> If the NFS server is external, pacemaker could fence your nodes when the
>>>>>> NFS server is down, as first the monitor operation will fail (hanging on
>>>>>> NFS), then the recover (stop/start) will fail (also hanging on NFS).
>>>>>
>>>>> And how exactly placing NFS resource under pacemaker control is going
>>>>> to change it?
>>>>>
>>>>
>>>> I noted earlier based on the old case notes:
>>>>
>>>> "Apparently there were situations in which the SAPHana resource wasn't
>>>> failing over when connectivity was lost with the NFS share that
contained
>>>> the hdb* binaries and the HANA data. I don't remember the exact details
>>>> (whether demotion was failing, or whether it wasn't even trying to
demote
>>>> on the primary and promote on the secondary, or what). Either way, I was
>>>> surprised that this procedure was necessary, but it seemed to be."
>>>>
>>>> Strahil may be dealing with a similar situation, not sure. I get where
>>>> you're coming from -- I too would expect the application that depends on
>>>> NFS to simply fail when NFS connectivity is lost, which in turn leads to
>>>> failover and recovery. For whatever reason, due to some weirdness of the
>>>> SAPHana resource agent, that didn't happen.
>>>>
>>>
>>> Yes. The only reason to use this workaround would be if the resource agent
>>> monitor still believes that the application is up when the required NFS is
>>> down. That is a bug in the resource agent or possibly in the application
>>> itself.
>> 
>> I think it's getting philosophical now:
>> For example, a web server using documents from an NFS server:
>> Is the web server down when access to NFS hangs?
> 
> From the end user's point of view, the web server is down when it cannot
> complete a user request. From the HA manager's point of view, the web server
> is down when the agent says it is down. Whether the agent just checks for the
> web server PID or actually attempts to fetch something from the web server is
> up to the agent.
> 
> I know that the SAP HANA agent uses SAP HANA binaries to query the SAP HANA
> database, so I /expect/ that in this case this attempt ends up as a failure
> from the HA manager's point of view.
> 
>> Would restarting ("recover")
>> the web server help in that situation?
> 
> No. But it is irrelevant here. If the web server depends on an NFS mount and
> the NFS mount is reported failed, the HA manager will attempt to recover by
> first stopping the web server. Whether the error indication comes from the
> web server or the NFS mount is irrelevant. It is very unlikely that the HA
> manager will ever reach the "starting" step, because stopping will either
> fail or time out and the node will be fenced.

Still, it wouldn't make it any better, as you will usually have an ordering
constraint saying the dependent resource needs NFS (started after NFS, stopped
before NFS).
So when NFS is restarted, Pacemaker will also try to stop the dependent
resource first. That typically results in a node fence, because the dependent
resource hangs on NFS and its stop fails or times out.
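
To make that concrete, here is a minimal crm shell sketch of such a pairing.
The resource names (fs_nfs, websrv), the NFS export and the Apache config path
are made up for illustration, not taken from anyone's actual configuration:

  # Hypothetical NFS mount and a web server that serves documents from it.
  primitive fs_nfs ocf:heartbeat:Filesystem \
      params device="nfsserver:/export/docs" directory="/srv/www" fstype="nfs" \
      op monitor interval=20s timeout=40s \
      op stop timeout=60s
  primitive websrv ocf:heartbeat:apache \
      params configfile="/etc/apache2/httpd.conf"
  # Start websrv only after fs_nfs is up; stop websrv before fs_nfs.
  order ord_nfs_before_web Mandatory: fs_nfs websrv
  # Run websrv on the node that holds the NFS mount.
  colocation col_web_with_nfs inf: websrv fs_nfs

With such constraints, any restart of fs_nfs first schedules a stop of websrv;
if that stop hangs on the dead mount, it times out and the node gets fenced,
which is exactly the behaviour described above.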

> 
>> Maybe the OCF_CHECK_LEVEL could be used: higher levels could query not only
>> whether that resource is "running", but also whether the resource is
>> responding, etc.
>> 
>>>
>>> While using this workaround in this case is perfectly reasonable, none
>>> of the reasons listed in the message I was replying to are applicable.
>>>
>>> So far the only reason the OP wanted to do it was some obscure race
>>> condition on startup outside of pacemaker. In that case this workaround
>>> simply delays the NFS mount, sidestepping the race.
>>>
>>> I also remember something about racing with dnsmasq, at which point I'd
>>> say that making the cluster depend on the availability of DNS is e-h-h-h
>>> unwise.
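
Coming back to the OCF_CHECK_LEVEL idea above: a minimal sketch of a deeper
monitor, again in crm shell syntax with the same hypothetical fs_nfs resource,
and assuming crmsh accepts OCF_CHECK_LEVEL as an operation attribute (it ends
up as an instance attribute of the monitor op in the CIB). If I remember the
ocf:heartbeat:Filesystem agent correctly, OCF_CHECK_LEVEL=10 adds a read test
and OCF_CHECK_LEVEL=20 a read/write test of the mounted filesystem, so a hung
NFS mount should make the monitor time out instead of happily reporting
"running":

  # Hypothetical deeper monitor; OCF_CHECK_LEVEL is stored as an
  # instance attribute of the monitor operation.
  primitive fs_nfs ocf:heartbeat:Filesystem \
      params device="nfsserver:/export/docs" directory="/srv/www" fstype="nfs" \
      op monitor interval=20s timeout=40s OCF_CHECK_LEVEL=20

Of course this only helps with agents that actually implement the higher check
levels; an agent that ignores OCF_CHECK_LEVEL will behave exactly as before.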