[ClusterLabs] Failing over NFSv4/TCP exports

Wed Aug 17 21:42:43 UTC 2016

On Wed, Aug 17, 2016 at 3:16 PM, Andreas Kurz <andreas.kurz at gmail.com>
wrote:

>
>
> On Wed, Aug 17, 2016 at 3:44 PM, Patrick Zwahlen <paz at navixia.com> wrote:
>
>>
>>
>> The problem I see is what a lot of people have already mentioned:
>> Failover works nicely but failback takes a very long time.
>
> This is a known problem ... have a look at the portblock RA - it has
> the feature to send out TCP tickle ACKs to reset such hanging sessions.
> So you can configure a portblock resource that blocks the TCP port before
> starting the VIP and another portblock resource that unblocks the port
> afterwards and sends out those tickle ACKs.
>

I have noticed the same thing when the Pacemaker cluster is on the
client side. The storage device is a NexentaStor system that has its own
HA setup with two controller nodes. Things work fine when the Pacemaker
cluster fails over, because both Pacemaker nodes already have the NFS
mounts in place. Failing the NexentaStor over to its other node once
also works fine. But if we then fail back, all the NFS mounts on the
Pacemaker nodes hang for as long as 15 minutes, then suddenly recover.
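
A possible explanation for the 15-minute figure, offered as a guess and
not something confirmed on these systems: if the clients are Linux and the
stale TCP connection just keeps retransmitting, the kernel gives up only
after net.ipv4.tcp_retries2 retransmissions (default 15). The sketch below
assumes the retransmission timeout starts at the 200 ms minimum, doubles
each time, and is capped at 120 s. The function name is made up for
illustration, and the real starting RTO depends on the measured round-trip
time.

```python
def hypothetical_tcp_timeout(retries=15, rto_min=0.2, rto_max=120.0):
    """Estimate how long a stalled TCP connection keeps retrying.

    Assumes Linux-style defaults: the RTO starts at rto_min seconds,
    doubles after every unanswered transmission, and is capped at rto_max.
    The original send plus `retries` retransmissions give retries + 1 waits.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)
    return total


seconds = hypothetical_tcp_timeout()
print(f"{seconds:.1f} s = {seconds / 60:.1f} min")  # 924.6 s = 15.4 min
```

If that assumption holds, the mounts recover once the old connection
finally times out and the client reconnects, which would match the
"hang, then suddenly recover" behaviour.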

I wonder if there is a way to address this when the Pacemaker cluster is
the client instead of the server?

--Greg