[Pacemaker] NFSv4 90 sec grace period

Michael Schwartzkopff misch at clusterbau.com
Fri Jul 15 04:17:46 EDT 2011


> On 07/14/2011 09:46 PM, Michael Schwartzkopff wrote:
> >> On 2011-07-13 23:08, Michael Schwartzkopff wrote:
> >>> Hi,
> >>> 
> >>> I am trying to set up an NFSv4 server cluster. After failover my client
> >>> has to wait 90 sec to be able to access the data again.
> >>> 
> >>> I already set /proc/fs/nfsd/nfsv4leasetime to 10 but the result is the
> >>> same. The server now logs:
> >>> 
> >>> Jul 13 22:44:27 debian1 kernel: [ 3512.102296] NFSD: starting 10-second
> >>> grace period
> >>> 
> >>> but the client waits exactly for 90 sec.
> >>> 
> >>> Any idea what might be wrong?
> >> 
> >> Cosmic rays.
> >> 
> >> More seriously, post the configuration and explain _precisely_ how you
> >> are testing this. May well be a bug, but no way to tell for sure (and
> >> hence, no way to fix) with the scarce information given. Thanks.
> > 
> > </long description>
> > 
> > I tried to set up an NFSv4 server following your SLES11 tutorial.
> 
> Can you please post your source? I'll assume for a moment that you are
> referring to the NFS tech guide on our web site, found here:
> 
> http://www.linbit.com/en/education/tech-guides/highly-available-nfs-with-drbd-and-pacemaker/

http://www.novell.com/documentation/sle_ha/book_sleha_techguides/?page=/documentation/sle_ha/book_sleha_techguides/data/sec_ha_quick_nfs_failover.html


> > But I use Debian Squeeze and tried to keep it simple, without the clone
> > resources.
> 
> Yeah, bad idea. The clones are in the example config for a reason.

Ok. will try.

> > So here is my config:
> > 
> > primitive resDRBD ocf:linbit:drbd \
> > 	params drbd_resource="r0"
> > primitive resExportHome ocf:heartbeat:exportfs \
> > 	params clientspec="192.168.87.0/24" directory="/srv/nfs/export/home" \
> > 	fsid="1001" options="no_root_squash,rw"
> > primitive resExportRoot ocf:heartbeat:exportfs \
> > 	params clientspec="192.168.87.0/24" \
> > 	options="rw,crossmnt,no_root_squash" fsid="0" \
> > 	directory="/srv/nfs/export"
> > primitive resFilesystem ocf:heartbeat:Filesystem \
> > 	params device="/dev/drbd0" fstype="ext4" directory="/srv/nfs"
> > primitive resIP ocf:heartbeat:IPaddr2 \
> > 	params ip="192.168.87.33" nic="eth0" cidr_netmask="24"
> > primitive resNFScommon lsb:nfs-common \
> > 	op monitor interval="60s"
> > primitive resNFSserver ocf:heartbeat:nfsserver \
> > 	params nfs_init_script="/etc/init.d/nfs-kernel-server" \
> > 	nfs_notify_cmd="/sbin/sm-notify" \
> > 	nfs_shared_infodir="/srv/nfs/infodir" \
> > 	nfs_ip="192.168.87.33"
> 
> exportfs is designed such that you don't need to use nfsserver.
> 
> > group groupNFS resFilesystem resNFScommon resNFSserver \
> > 	resExportRoot resExportHome resIP
> > ms msDRBD resDRBD \
> > 	meta notify="true"
> > 
> > colocation col_FS_DRBD inf: groupNFS:Started msDRBD:Master
> > order ord_DRBD_FS inf: msDRBD:promote groupNFS:start
> > 
> > I can mount the NFS on a client
> > 
> > mount -t nfs4 -o udp 192.168.87.33:/home /mnt/home
> 
> NFSv4 over UDP? Why?
> 
> > accessing the NFS share every second during a failover results in:
> > 
> > while : ; do date; cat /mnt/home/testfile ; sleep 1 ; done
> > 
> > Thu Jul 14 20:59:11 CEST 2011
> > Hello world
> > Thu Jul 14 20:59:12 CEST 2011
> > 
> > cat: /mnt/home/testfile: Socket operation on non-socket
> > Thu Jul 14 20:59:15 CEST 2011
> > cat: /mnt/home/testfile: Socket operation on non-socket
> > Thu Jul 14 20:59:17 CEST 2011
> > cat: /mnt/home/testfile: Socket operation on non-socket
> > Thu Jul 14 20:59:20 CEST 2011
> > cat: /mnt/home/testfile: Socket operation on non-socket
> > Thu Jul 14 20:59:22 CEST 2011
> > (...)
> > Hello world
> > Thu Jul 14 21:01:03 CEST 2011
> > Hello world
> > 
> > That means there is a 90 sec gap in accessing the file. I already changed
> > nfsv4leasetime to 10 sec, but it did not help.
> 
> The NFS lease time grace period is only relevant to open file handles.
> Assuming your "cat" takes a few millisecs and you are sleeping 1s, I'd
> guess there's a very high chance your failover occurs while you have no
> handles open on that filesystem at all. Thus the grace period shouldn't
> be a factor here.
> 
> > Any other ideas? Are the LVS as in your tutorial necessary?
> 
> I suppose you mean LVs as in Logical Volumes, not LVS as in Linux
> Virtual Server. No, the LVM based approach is not strictly necessary,
> it's just practical to set things up that way.
> 
> As for other ideas, please post your "mount" output on the NFSv4 client
> so we can get an idea of the default NFS mount options in effect on your
> system.

192.168.10.16:/home on /mnt/home type nfs4 
(rw,udp,clientaddr=192.168.10.133,addr=192.168.10.16)
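
To put a number on the outage instead of reading it off the date output, the
test loop quoted above could be sketched roughly as follows. This is only a
sketch under assumptions: the function names probe and longest_gap are made up
for illustration, and it relies on "date +%s" printing epoch seconds, which
GNU date does.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any NFS or Pacemaker tooling.

# probe FILE COUNT
# Try to read FILE once per second, COUNT times, and print one line per
# attempt in the form "<epoch seconds> ok" or "<epoch seconds> fail".
# The timestamp is taken after the read returns, so a read that blocks
# during failover still shows up as a long gap.
probe() {
    probe_i=0
    while [ "$probe_i" -lt "$2" ]; do
        if cat "$1" >/dev/null 2>&1; then
            probe_status=ok
        else
            probe_status=fail
        fi
        echo "$(date +%s) $probe_status"
        probe_i=$((probe_i + 1))
        sleep 1
    done
}

# longest_gap
# Read "<epoch seconds> ok|fail" lines on stdin and print the longest
# interval, in seconds, between two consecutive successful reads.
# Prints 0 if there are fewer than two successful reads.
longest_gap() {
    awk '
        $2 == "ok" {
            if (have_ok && $1 - last_ok > max) max = $1 - last_ok
            last_ok = $1
            have_ok = 1
        }
        END { print max + 0 }
    '
}
```

With the file from my test, "probe /mnt/home/testfile 300 | longest_gap"
started shortly before the failover would print the length of the gap in
seconds.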

-- 
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 München

Tel: (0163) 172 50 98