[ClusterLabs] Antw: [EXT] Multiple nfsserver resource groups

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Mar 9 07:39:31 EDT 2020


>>> "Christoforos Christoforou" <christoforos at globalreach.com> wrote on
09.03.2020 at 12:05 in message
<030101d5f602$9c9e3880$d5daa980$@globalreach.com>:
>> Start: RAID1, LVM, fs, nfsserver, exportfs, IP address (stop is the other
>> way 'round)
> If your IP resource stops before the nfsserver, I would think it's possible
> that connections and file handles are left hanging.

That's intended: it should look as if the server crashed. The clients would
continue to try to reach the server. Then, when the server comes up on a
different node and sets the IP address, the clients would think the original
server has recovered. I also think clients that did not crash themselves
should then try to recover their locks (we use NFS hard mounts).

> If you don't have an nfsnotify service, then this could be your problem.

OK, I thought the server would use nfsnotify to inform clients once it's up,
but it looks like it does not; I see:
 /usr/sbin/rpc.statd --no-notify
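For reference, nfs-utils ships sm-notify, which can send that reboot
notification by hand; a rough sketch (IPADDRESS standing for the cluster's
floating IP):

```shell
# Force-send SM_NOTIFY to all clients recorded in the statd state
# directory (-f), presenting the floating IP as the server's name (-v):
sm-notify -f -v IPADDRESS
```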

> Does your nfsserver resource also have nfs_no_notify=true?

The parameter isn't set, so it should have the default value.

> 
> I'd say test the following:
> Test1: Set nfs_no_notify=true on your nfsserver resource and create an 
> nfsnotify resource that starts after the IP and stops first.
> Something like this: pcs resource create r_nfsnotify nfsnotify 
> source_host=IPADDRESS
> Test2: Set the IP to stop after the nfsserver resource and have 
> nfs_no_notify=false (default)
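In pcs syntax the two suggested tests would be roughly the following (group
and resource names are made up here; IPADDRESS as in the suggestion above):

```shell
# Test 1: suppress the server's own notification and append a dedicated
# nfsnotify resource to the group (last to start, first to stop):
pcs resource update r_nfsserver nfs_no_notify=true
pcs resource create r_nfsnotify ocf:heartbeat:nfsnotify source_host=IPADDRESS
pcs resource group add g_nfs r_nfsnotify --after r_ip

# Test 2: keep the default notification, but place the IP before the
# nfsserver in the group, so it starts before it and stops after it:
pcs resource update r_nfsserver nfs_no_notify=false
pcs resource group add g_nfs r_ip --before r_nfsserver
```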

The interesting thing is that I've set up a test NFS server, and that one
works perfectly. Only the real production one has the problem. It's possible
that some user (including systemd) does something stupid that I'm not aware of.
One indicator is that a filesystem unmounted during the stop sequence seems
to be mounted again before the stop sequence is complete.
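To catch that remount in the act, one could poll the mounts table with a
small helper like this (the mount point below is a made-up example, not our
real one):

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT appears in the mounts table; the table file is a
# parameter so the check can also be exercised against a saved copy.
is_mounted() {
    grep -q " $1 " "${2:-/proc/self/mounts}"
}

# Log once a second whether the filesystem is mounted while the stop
# sequence runs:
#   while sleep 1; do
#       is_mounted /srv/nfs/share1 && echo "$(date +%T) mounted again"
#   done
```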

Regards,
Ulrich

> 
> Christoforos Christoforou
> Senior Systems Administrator
> Global Reach Internet Productions
> Twitter | Facebook | LinkedIn
> p (515) 996-0996 | globalreach.com
> 
> -----Original Message-----
> From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de> 
> Sent: Monday, March 9, 2020 12:37 PM
> To: users at clusterlabs.org; christoforos at globalreach.com 
> Subject: RE: Antw: [EXT] [ClusterLabs] Multiple nfsserver resource groups
> 
>>>> "Christoforos Christoforou" <christoforos at globalreach.com> wrote
>>>> on 09.03.2020 at 10:38 in message
> <02ff01d5f5f6$7024af70$506e0e50$@globalreach.com>:
>> Thanks for the advice.
>> We haven't had any issues with the time it takes to prepare the
>> exportfs resources so far, and we've been running this setup for 2
>> years now, but I will keep it in mind as we increase the number of
>> exportfs resources. I have already implemented the solution discussed
>> and merged all filesystems and exports into one resource group, and
>> everything looks good.
>> 
>> For your problem, what is the order in which your resources 
>> startup/shutdown?
> 
> Start: RAID1, LVM, fs, nfsserver, exportfs, IP address (stop is the other
> way 'round)
> 
>> Is your nfs info dir a filesystem resource or an LVM resource?
> 
> We have everything as LVs (see above)
> 
>> Do you have an nfsnotify resource in place?
> 
> Not explicitly. We are using NFSv3 only.
> 
>> We have found that the order in which resources startup/shutdown
>> without any problems is to have the nfsnotify resource stop first,
>> then stop the exportfs resources, nfsserver after that, filesystems
>> (one of which is the nfs shared info dir), and finally the virtual
>> IP resource. Startup is the reverse of that.
> 
> Regards,
> Ulrich
> 
> 
>> 
>> -----Original Message-----
>> From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
>> Sent: Monday, March 9, 2020 9:26 AM
>> To: users at clusterlabs.org; christoforos at globalreach.com 
>> Subject: Antw: [EXT] [ClusterLabs] Multiple nfsserver resource groups
>> 
>>>>> "Christoforos Christoforou" <christoforos at globalreach.com> wrote
>>>>> on 06.03.2020 at 18:56 in message
>> <25205_1583517421_5E628EE5_25205_75_1_01e001d5f3e0$804a6240$80df26c0$@globalreac.com>:
>>> Hello,
>>> 
>>>  
>>> 
>>> We have a PCS cluster running on 2 CentOS 7 nodes, exposing 2 NFSv3 
>>> volumes which are then mounted to multiple servers (around 8).
>>> 
>>> We want to have 2 more sets of additional shared NFS volumes, for a
>>> total of 6.
>>> 
>>>  
>>> 
>>> I have successfully configured 3 resource groups, with each group
>>> having the following resources:
>>> 
>>> *	1x ocf_heartbeat_IPaddr2 resource for the Virtual IP that exposes
>>> the NFS share assigned to its own NIC.
>>> *	3x ocf_heartbeat_Filesystem resources (1 is for the
>>> nfs_shared_infodir and the other 2 are the ones exposed via the
>>> NFS server)
>>> *	1x ocf_heartbeat_nfsserver resource that uses the aforementioned
>>> nfs_shared_infodir.
>>> *	2x ocf_heartbeat_exportfs resources that expose the other 2
>>> filesystems as NFS shares.
>>> *	1x ocf_heartbeat_nfsnotify resource that has the Virtual IP set as
>>> its own source_host.
>>> 
>>>  
>>> 
>>> All 9 filesystem volumes are mounted via iSCSI to the PCS nodes in 
>>> /dev/mapper/mpathX
>>> 
>>> So the structure is like so:
>>> 
>>> Resource group 1:
>>> 
>>> *	/dev/mapper/mpatha ‑ shared volume 1
>>> *	/dev/mapper/mpathb ‑ shared volume 2
>>> *	/dev/mapper/mpathc ‑ nfs_shared_infodir for resource group 1
>>> 
>>> Resource group 2:
>>> 
>>> *	/dev/mapper/mpathd ‑ shared volume 3
>>> *	/dev/mapper/mpathe ‑ shared volume 4
>>> *	/dev/mapper/mpathf ‑ nfs_shared_infodir for resource group 2
>>> 
>>> Resource group 3:
>>> 
>>> *	/dev/mapper/mpathg ‑ shared volume 5
>>> *	/dev/mapper/mpathh ‑ shared volume 6
>>> *	/dev/mapper/mpathi ‑ nfs_shared_infodir for resource group 3
>>> 
>>>  
>>> 
>>> My concern is that when I run a df command on the active node, the
>>> last ocf_heartbeat_nfsserver volume (/dev/mapper/mpathi) is mounted
>>> to /var/lib/nfs. I understand that I cannot change this, but I can
>>> change the location of the rpc_pipefs folder.
>>> 
>>>  
>>> 
>>> I have had this setup running with 2 resource groups in our 
>>> development environment, and have not noticed any issues, but since 
>>> we're planning to move to production and add a 3rd resource group, I 
>>> want to make sure that this setup will not cause any issues. I am by 
>>> no means an expert on NFS, so some insight is appreciated.
>>> 
>>>  
>>> 
>>> If this kind of setup is not supported or recommended, I have 2 
>>> alternate plans in mind:
>>> 
>>> 1.	Have all resources in the same resource group, in a setup that will
>>> look like this:
>>> 
>>> a.	1x ocf_heartbeat_IPaddr2 resource for the Virtual IP that exposes
>>> the NFS share.
>>> b.	7x ocf_heartbeat_Filesystem resources (1 is for the
>>> nfs_shared_infodir and 6 exposed via the NFS server)
>>> c.	1x ocf_heartbeat_nfsserver resource that uses the aforementioned
>>> nfs_shared_infodir.
>>> d.	6x ocf_heartbeat_exportfs resources that expose the other 6
>>> filesystems as NFS shares. Use the clientspec option to restrict to 
>>> IPs and prevent unwanted mounts.
>>> e.	1x ocf_heartbeat_nfsnotify resource that has the Virtual IP set as
>>> its own source_host.
>>> 
>>> 2.	Set up 2 more clusters to accommodate our needs
>>> 
>>>  
>>> 
>>> I really want to avoid #2, since it would be overkill for our
>>> case.
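Spelled out in pcs, plan 1 would be a single group along these lines
(resource names hypothetical; group members start in the listed order and
stop in reverse):

```shell
pcs resource group add g_nfs \
    r_fs_info r_fs1 r_fs2 r_fs3 r_fs4 r_fs5 r_fs6 \
    r_nfsserver \
    r_export1 r_export2 r_export3 r_export4 r_export5 r_export6 \
    r_ip r_nfsnotify
```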
>> 
>> Things you might consider is to get rid of the groups and use
>> explicit colocation and orderings. The advantage will be that you can
>> execute several agents in parallel (e.g. prepare all filesystems in
>> parallel). In the past we found that exportfs resources can take
>> quite some time, and if you have 20 or more of them, that delays the
>> shutdown/startup significantly.
>> So we moved to using netgroups provided by LDAP instead, and we could
>> reduce the number of exportfs statements drastically.
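As an illustration of "explicit colocation and orderings" (resource names
hypothetical): with constraints like these, the filesystems are free to
start in parallel instead of one after the other:

```shell
# Everything runs on the node that holds the IP:
pcs constraint colocation add r_fs1 with r_ip INFINITY
pcs constraint colocation add r_fs2 with r_ip INFINITY
pcs constraint colocation add r_nfsserver with r_ip INFINITY
# Order only the real dependencies; r_fs1 and r_fs2 may start at once:
pcs constraint order start r_fs1 then start r_nfsserver
pcs constraint order start r_fs2 then start r_nfsserver
pcs constraint order start r_nfsserver then start r_ip
# Symmetrical orderings reverse automatically on stop.
```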
>> However we have one odd problem (SLES12 SP4): the NFS resource using
>> systemd does not shut down cleanly, due to some unmount issue related
>> to the shared info dir.
>> 
>>> 
>>> Thanks
>>> 
>>>  
>>> 





More information about the Users mailing list