[Pacemaker] nfsv4 grace period
Dennis Jacobfeuerborn
dennisml at conversis.de
Sun Feb 9 19:40:16 EST 2014
On 09.02.2014 20:51, Dennis Jacobfeuerborn wrote:
> On 09.02.2014 13:58, Dennis Jacobfeuerborn wrote:
>> On 09.02.2014 08:33, Michael Schwartzkopff wrote:
>>> Am Sonntag, 9. Februar 2014, 02:56:55 schrieb Dennis Jacobfeuerborn:
>>>> Hi,
>>>> I have set up an NFSv3 HA cluster before and that works fine, but now I'm
>>>> trying to move to v4 and am running into problems with the lease grace period.
>>>> The grace period on CentOS 6 is 90 seconds and that limits how quickly
>>>> the fail-over can happen. The file /etc/sysconfig/nfs contains a
>>>> variable NFSD_V4_GRACE to control this but it doesn't get applied. The
>>>> reason is that the file /proc/fs/nfsd/nfsv4gracetime is not writable:
>>>>
>>>> [root@nfs1 init.d]# echo 10 > /proc/fs/nfsd/nfsv4gracetime
>>>> -bash: echo: write error: Device or resource busy
>
> Apparently the nfs daemon is the culprit here: if it is running, you get
> this error. Why the value is not set even right after the kernel module
> is loaded seems to have a different cause (see below).
>
>>>
>>> You can write the parameter only after the nfsd kernel module is loaded.
>>> As far as I can remember, the init script sets the lease time according
>>> to the config. You have to add a config option and patch your init script
>>> to set the grace time at the same point in the script.
>>>
>>>> Does anyone know what the proper way is to reduce this value?
>>>
>>> In RHEL 6.5 it is an option in the config file.
>>
>> Yes, this is the NFSD_V4_GRACE option in /etc/sysconfig/nfs that I
>> mentioned above. In the init script it gets set right after the kernel
>> module is loaded, but this doesn't seem to work. When I uncomment this
>> option and set it to 10 seconds, /proc/fs/nfsd/nfsv4gracetime still
>> shows 90 after the service is started.
>> I've also looked for a parameter for the kernel module itself to set
>> this when the module is loaded but that doesn't exist and now I'm
>> wondering how to set this value at all.
>>
>> Do you know what the criterion is that determines when this value is
>> settable? The module is loaded the whole time so that alone can't be the
>> only factor here and there must be some additional constraint that now
>> prevents me from updating this value.
>
> So I investigated this further and there seem to be two distinct issues:
>
> 1) NFSD_V4_GRACE only gets applied the second time the nfs service is
> started
>
> Check out the following sequence that I executed right after booting the
> system (pacemaker is not running at this point):
>
> [root@nfs2 ~]# cat /etc/sysconfig/nfs |grep GRACE
> NFSD_V4_GRACE=10
> [root@nfs2 ~]# service nfs start
> Starting NFS services: [ OK ]
> Starting NFS quotas: [ OK ]
> Starting NFS mountd: [ OK ]
> Starting NFS daemon: [ OK ]
> Starting RPC idmapd: [ OK ]
> [root@nfs2 ~]# cat /proc/fs/nfsd/nfsv4gracetime
> 90
> [root@nfs2 ~]# service nfs stop
> Shutting down NFS daemon: [ OK ]
> Shutting down NFS mountd: [ OK ]
> Shutting down NFS quotas: [ OK ]
> Shutting down RPC idmapd: [ OK ]
> [root@nfs2 ~]# cat /proc/fs/nfsd/nfsv4gracetime
> 90
> [root@nfs2 ~]# service nfs start
> Starting NFS services: [ OK ]
> Starting NFS quotas: [ OK ]
> Starting NFS mountd: [ OK ]
> Starting NFS daemon: [ OK ]
> Starting RPC idmapd: [ OK ]
> [root@nfs2 ~]# cat /proc/fs/nfsd/nfsv4gracetime
> 10
>
> As you can see the grace time is only applied after I stopped and
> started the service the second time. Not sure why this is the case.
>
> 2) When started via Pacemaker the value of /proc/fs/nfsd/nfsv4gracetime
> is always set to the default value of 90.
>
> It appears that starting the nfs service from the shell vs. starting it
> using Pacemaker makes a difference. When started with Pacemaker, the
> value from /etc/sysconfig/nfs is not applied, and even when the value has
> been set manually to e.g. 20 after Pacemaker starts the service it will
> be reset to the default 90.
> What is strange about this is that the init script simply sources the
> sysconfig file so I'm not sure how the environment could prevent the
> value from getting applied.
Finally found the culprit: the nfs daemon modifies the grace time value
on its own, without informing the admin about it.
Apparently nfsd checks every few seconds whether grace time < lease time
and, if that is the case, silently sets grace time = lease time. So if
you set the grace time using the sysconfig file, the value is applied
correctly, but after about 5-10 seconds nfsd notices that
grace time < lease time and overrides it.
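
To see the mismatch described above, the two values can be compared
directly. The following is only an illustrative sketch: the function name
check_nfsd_v4_times and its optional directory argument are made up for
this example, and it assumes the lease time is exposed as nfsv4leasetime
next to nfsv4gracetime under /proc/fs/nfsd.

```shell
# Hypothetical helper (not part of any init script): report whether the
# configured grace time is below the lease time.
# $1 (optional): directory holding the two files; defaults to /proc/fs/nfsd.
check_nfsd_v4_times() {
    nfsd_dir="${1:-/proc/fs/nfsd}"
    nfsd_grace=$(cat "$nfsd_dir/nfsv4gracetime") || return 2
    nfsd_lease=$(cat "$nfsd_dir/nfsv4leasetime") || return 2
    if [ "$nfsd_grace" -lt "$nfsd_lease" ]; then
        # This is the situation in which nfsd was observed to raise the grace time.
        echo "grace time ($nfsd_grace) < lease time ($nfsd_lease)"
        return 1
    fi
    echo "grace time ($nfsd_grace) >= lease time ($nfsd_lease)"
    return 0
}
```

Running check_nfsd_v4_times without arguments right after "service nfs
start" should show whether the value from NFSD_V4_GRACE is at risk of
being overridden.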
The fix is to add a line to the nfs init script that also sets the lease
time to the value of NFSD_V4_GRACE. After that change everything works
as expected.
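
The init script change could look roughly like the sketch below. This is
not the actual patch: the function name set_nfsd_v4_times and its optional
directory argument are hypothetical, and it assumes the lease time is
writable via /proc/fs/nfsd/nfsv4leasetime. It would have to be called
after the nfsd module is loaded and before the daemon is started, since
writes fail with "Device or resource busy" while nfsd is running.

```shell
# Hypothetical helper: set lease time and grace time to the same value so
# that nfsd never sees grace time < lease time.
# $1: desired time in seconds (e.g. "$NFSD_V4_GRACE"); empty means "do nothing".
# $2 (optional): directory holding the two files; defaults to /proc/fs/nfsd.
set_nfsd_v4_times() {
    nfsd_secs="$1"
    nfsd_dir="${2:-/proc/fs/nfsd}"
    # Leave the kernel defaults alone when no value is configured.
    [ -n "$nfsd_secs" ] || return 0
    # Write the lease time first, then the grace time.
    echo "$nfsd_secs" > "$nfsd_dir/nfsv4leasetime" || return 1
    echo "$nfsd_secs" > "$nfsd_dir/nfsv4gracetime" || return 1
}
```

In the init script this might be invoked as
set_nfsd_v4_times "$NFSD_V4_GRACE" at the point where the grace time is
currently written. Note that this also shortens the lease time for
clients, which may or may not be acceptable in a given setup.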
I filed two bugs for this: one for the rather ugly magic modification of
the grace time value by nfsd, and another for the modification of the
init script, as in its current incarnation the NFSD_V4_GRACE variable is
rather useless and deceptive:
https://bugzilla.redhat.com/show_bug.cgi?id=1063087
https://bugzilla.redhat.com/show_bug.cgi?id=1063088
Regards,
Dennis