[ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Re: Order set troubles

Andrei Borzenkov arvidjaar at gmail.com
Mon Mar 29 12:02:50 EDT 2021


On 29.03.2021 11:11, Ulrich Windl wrote:
>>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 27.03.2021 at 06:37 in
> message <7c294034-56c3-baab-73c6-7909ab554555 at gmail.com>:
>> On 26.03.2021 22:18, Reid Wahl wrote:
>>> On Fri, Mar 26, 2021 at 6:27 AM Andrei Borzenkov <arvidjaar at gmail.com>
>>> wrote:
>>>
>>>> On Fri, Mar 26, 2021 at 10:17 AM Ulrich Windl
>>>> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>>>>>
>>>>>>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 26.03.2021 at
>>>> 06:19 in
>>>>> message <534274b3-a6de-5fac-0ae4-d02c305f1a3f at gmail.com>:
>>>>>> On 25.03.2021 21:45, Reid Wahl wrote:
>>>>>>> FWIW we have this KB article (I seem to remember Strahil is a Red Hat
>>>>>>> customer):
>>>>>>>   - How do I configure SAP HANA Scale-Up System Replication in a
>>>> Pacemaker
>>>>>>> cluster when the HANA filesystems are on NFS shares?(
>>>>>>> https://access.redhat.com/solutions/5156571)
>>>>>>>
>>>>>>
>>>>>> "How do I make the cluster resources recover when one node loses access
>>>>>> to the NFS server?"
>>>>>>
>>>>>> If a node loses access to the NFS server, then monitor operations for
>>>> resources
>>>>>> that depend on NFS availability will fail or time out, and pacemaker will
>>>>>> recover (likely by rebooting this node). That's how similar
>>>>>> configurations have been handled for the past 20 years in other HA
>>>>>> managers. I am genuinely interested, have you encountered the case
>>>> where
>>>>>> it was not enough?
>>>>>
>>>>> That's a big problem with the SAP design (basically it's just too
>>>> complex).
>>>>> In the past I had written a kind of resource agent that worked without
>>>> that
>>>>> overly complex overhead, but since those days SAP has added much more
>>>>> complexity.
>>>>> If the NFS server is external, pacemaker could fence your nodes when the
>>>> NFS
>>>>> server is down as first the monitor operation will fail (hanging on
>>>> NFS), then
>>>>> the recovery (stop/start) will fail (also hanging on NFS).
>>>>
>>>> And how exactly placing NFS resource under pacemaker control is going
>>>> to change it?
>>>>
>>>
>>> I noted earlier based on the old case notes:
>>>
>>> "Apparently there were situations in which the SAPHana resource wasn't
>>> failing over when connectivity was lost with the NFS share that contained
>>> the hdb* binaries and the HANA data. I don't remember the exact details
>>> (whether demotion was failing, or whether it wasn't even trying to demote
>>> on the primary and promote on the secondary, or what). Either way, I was
>>> surprised that this procedure was necessary, but it seemed to be."
>>>
>>> Strahil may be dealing with a similar situation, not sure. I get where
>>> you're coming from -- I too would expect the application that depends on
>>> NFS to simply fail when NFS connectivity is lost, which in turn leads to
>>> failover and recovery. For whatever reason, due to some weirdness of the
>>> SAPHana resource agent, that didn't happen.
>>>
>>
>> Yes. The only reason to use this workaround would be if the resource
>> agent's monitor still believes that the application is up when the
>> required NFS is down. That is a bug in the resource agent, or possibly
>> in the application itself.
> 
> I think it's getting philosophical now:
> For example, a web server serving documents from an NFS server:
> Is the web server down when access to NFS hangs?

From the end user's point of view, the web server is down when it cannot
complete a user request. From the HA manager's point of view, the web
server is down when the agent says it is down. Whether the agent just
checks for the web server's PID or actually attempts to fetch something
from the web server is up to the agent.
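The difference can be sketched in resource-agent-style shell. This is an illustration only: the function names, PIDFILE, and URL are made up, not taken from any shipped agent.

```shell
#!/bin/sh
# Illustrative sketch: two depths of "is the web server up?".
# PIDFILE and URL are hypothetical values, not from a real agent.
PIDFILE=/run/httpd.pid
URL=http://localhost/index.html

monitor_pid_only() {
    # Shallow check: only proves a process with that PID exists.
    # Succeeds even when every request would hang on a dead NFS mount.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

monitor_fetch() {
    # Deep check: actually fetch a document. This fails or times out
    # when the content on the NFS share is unreachable.
    curl -sf --max-time 10 "$URL" >/dev/null
}
```

The shallow check is what lets an agent report "running" while the application is useless; the deep one turns an NFS outage into a monitor failure the cluster can act on.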

I know that the SAP HANA agent uses SAP HANA binaries to query the SAP
HANA database, so I /expect/ that in this case the attempt ends up as a
failure from the HA manager's point of view.

> Would restarting ("recover")
> the web server help in that situation?

No. But that is irrelevant here. If the web server depends on an NFS
mount and the NFS mount is reported failed, the HA manager will attempt
recovery by first stopping the web server. Whether the error indication
comes from the web server or from the NFS mount does not matter. It is
very unlikely that the HA manager will ever reach the "starting" step,
because stopping will either fail or time out and the node will be
fenced.
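That dependency is what an ordinary ordering plus colocation chain expresses. With pcs it might look like the following; the resource names are made up for illustration:

```
# Hypothetical resource names: mount the share, then run the web server
# on the same node. On recovery pacemaker stops in reverse order, so the
# web server's stop runs (and may hang) before the mount is touched.
pcs resource create share ocf:heartbeat:Filesystem \
    device=nfssrv:/export directory=/srv/www fstype=nfs \
    op monitor interval=20s
pcs resource create web ocf:heartbeat:apache op monitor interval=30s
pcs constraint order start share then web
pcs constraint colocation add web with share
```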

> Maybe OCF_CHECK_LEVEL could be used: higher levels could check not only
> that the resource is "running", but also that it is responding, etc.
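That is what OCF_CHECK_LEVEL is for. A monitor action honoring it might look roughly like this; the 0/10/20 depths follow the OCF convention, while WEB_PROC, the URL, and the checked file are hypothetical:

```shell
#!/bin/sh
# Illustrative OCF-style monitor. WEB_PROC and the paths are made up.
# Return codes follow OCF: 0=OCF_SUCCESS, 1=OCF_ERR_GENERIC,
# 7=OCF_NOT_RUNNING.
WEB_PROC=httpd

monitor() {
    level="${OCF_CHECK_LEVEL:-0}"
    # Level 0: is the process alive at all?
    pgrep -x "$WEB_PROC" >/dev/null || return 7
    if [ "$level" -ge 10 ]; then
        # Level 10: is the service actually answering requests?
        curl -sf --max-time 10 http://localhost/ >/dev/null || return 1
    fi
    if [ "$level" -ge 20 ]; then
        # Level 20: can we still read a file that lives on the NFS share?
        timeout 10 head -c 1 /srv/www/index.html >/dev/null || return 1
    fi
    return 0
}
```

The cluster selects the depth per monitor operation via the op's OCF_CHECK_LEVEL, so a cheap level-0 check can run frequently and a deep NFS-touching check less often.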
> 
>>
>> While using this workaround in this case is perfectly reasonable, none
>> of the reasons listed in the message I was replying to are applicable.
>>
>> So far the only reason the OP wanted to do it was some obscure race
>> condition on startup outside of pacemaker, in which case this workaround
>> simply delays the NFS mount, sidestepping the race.
>>
>> I also remember something about racing with dnsmasq, at which point I'd
>> say that making the cluster depend on the availability of DNS is e-h-h-h unwise.
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users 
>>
>> ClusterLabs home: https://www.clusterlabs.org/ 



More information about the Users mailing list