[ClusterLabs] Antw: Re: Antw: [EXT] Re: Order set troubles
Andrei Borzenkov
arvidjaar at gmail.com
Sat Mar 27 01:37:19 EDT 2021
On 26.03.2021 22:18, Reid Wahl wrote:
> On Fri, Mar 26, 2021 at 6:27 AM Andrei Borzenkov <arvidjaar at gmail.com>
> wrote:
>
>> On Fri, Mar 26, 2021 at 10:17 AM Ulrich Windl
>> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>>>
>>>>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 26.03.2021 at 06:19
>>> in message <534274b3-a6de-5fac-0ae4-d02c305f1a3f at gmail.com>:
>>>> On 25.03.2021 21:45, Reid Wahl wrote:
>>>>> FWIW we have this KB article (I seem to remember Strahil is a Red Hat
>>>>> customer):
>>>>> - How do I configure SAP HANA Scale-Up System Replication in a
>>>>>   Pacemaker cluster when the HANA filesystems are on NFS shares?
>>>>>   (https://access.redhat.com/solutions/5156571)
>>>>>
>>>>
>>>> "How do I make the cluster resources recover when one node loses access
>>>> to the NFS server?"
>>>>
>>>> If a node loses access to the NFS server, then monitor operations for
>>>> resources that depend on NFS availability will fail or time out, and
>>>> pacemaker will recover (likely by rebooting this node). That's how
>>>> similar configurations have been handled for the past 20 years in other
>>>> HA managers. I am genuinely interested: have you encountered a case
>>>> where it was not enough?
>>>
>>> That's a big problem with the SAP design (basically it's just too
>>> complex). In the past I had written a kind of resource agent that worked
>>> without that overly complex overhead, but since those days SAP has added
>>> much more complexity.
>>> If the NFS server is external, pacemaker could fence your nodes when the
>>> NFS server is down, as first the monitor operation will fail (hanging on
>>> NFS), then the recovery (stop/start) will fail (also hanging on NFS).
>>
>> And how exactly is placing the NFS resource under pacemaker control
>> going to change that?
>>
>
> I noted earlier based on the old case notes:
>
> "Apparently there were situations in which the SAPHana resource wasn't
> failing over when connectivity was lost with the NFS share that contained
> the hdb* binaries and the HANA data. I don't remember the exact details
> (whether demotion was failing, or whether it wasn't even trying to demote
> on the primary and promote on the secondary, or what). Either way, I was
> surprised that this procedure was necessary, but it seemed to be."
>
> Strahil may be dealing with a similar situation, not sure. I get where
> you're coming from -- I too would expect the application that depends on
> NFS to simply fail when NFS connectivity is lost, which in turn leads to
> failover and recovery. For whatever reason, due to some weirdness of the
> SAPHana resource agent, that didn't happen.
>
Yes. The only reason to use this workaround would be if the resource agent's
monitor still believes the application is up when the NFS share it requires
is down, which is a bug in the resource agent or possibly in the application
itself. While using the workaround in that case is perfectly reasonable, none
of the reasons listed in the message I was replying to apply here.

So far the only reason the OP wanted to do it was some obscure race condition
on startup, outside of pacemaker, in which case the workaround simply delays
the NFS mount and sidesteps the race.
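To make that concrete: the workaround under discussion boils down to putting
the NFS mount under pacemaker control as a Filesystem clone and ordering the
HANA resources after it. A rough sketch only (resource IDs, server name and
paths below are made up; the exact parameters are whatever the referenced KB
article and the local setup call for):

  # Manage the NFS mount on every node as a cloned Filesystem resource
  pcs resource create hana_nfs ocf:heartbeat:Filesystem \
      device="nfs-server:/export/hana/shared" directory="/hana/shared" \
      fstype="nfs" op monitor interval=20s timeout=40s clone

  # Start the HANA resources only after the mount is up, and keep them
  # on nodes where the mount is present
  pcs constraint order start hana_nfs-clone then SAPHanaTopology_HDB_00-clone
  pcs constraint colocation add SAPHana_HDB_00-clone with hana_nfs-clone

With something like that in place the mount happens inside the cluster's own
startup sequence instead of racing whatever runs before pacemaker.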
I also remember something about racing with dnsmasq, at which point I'd say
that making the cluster depend on the availability of DNS is, e-h-h-h, unwise.