[ClusterLabs] CentOS 7 nfs exports erroring
Steve Dainard
sdainard at spd1.com
Thu May 28 20:32:38 UTC 2015
Solved this by using the nfsserver resource agent instead of exportfs, with a
Ceph block device backing the NFS shared info dir.
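
Roughly, the resulting nfsserver resource looks something like this (the
resource name, shared-info mount point, and floating IP below are
illustrative; the group name is the one from the config further down):

# the shared info dir sits on an rbd-backed Filesystem resource started
# earlier in the same group
pcs resource create nfs-daemon ocf:heartbeat:nfsserver \
    nfs_shared_infodir=/mnt/nfsinfo nfs_ip=10.0.231.50 \
    op monitor interval=10s --group group_rbd_fs_nfs_vip
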
On Thu, May 28, 2015 at 9:51 AM, Steve Dainard <sdainard at spd1.com> wrote:
> Hello,
>
> I'm configuring a cluster which maps and mounts Ceph RBDs, then exports
> each mount over NFS, mostly following Sebastien's post here:
> http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/ but there are
> some differences using pacemaker 1.1.12 with systemd.
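>
> As a rough sketch, each volume's map/mount pair is created along these lines
> (vol1 shown; the pool, ceph user, image name and fstype are placeholders,
> the mount point and group name are the real ones):
>
> pcs resource create rbd_vol1 ocf:ceph:rbd.in user=admin pool=rbd name=vol1 \
>     cephconf=/etc/ceph/ceph.conf op monitor interval=10s --group group_rbd_fs_nfs_vip
> pcs resource create fs_vol1 ocf:heartbeat:Filesystem device=/dev/rbd/rbd/vol1 \
>     directory=/mnt/vol1 fstype=xfs op monitor interval=10s --group group_rbd_fs_nfs_vip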
>
> The issue I'm running into is that Ceph itself works without issue, but the
> NFS exports fail with:
>
> # pcs status
> Cluster name: nfs
> Last updated: Thu May 28 09:21:13 2015
> Last change: Wed May 27 17:47:06 2015
> Stack: corosync
> Current DC: node1 (1) - partition with quorum
> Version: 1.1.12-a14efad
> 3 Nodes configured
> 24 Resources configured
>
>
> Online: [ node1 node2 node3 ]
>
> Full list of resources:
>
> Resource Group: group_rbd_fs_nfs_vip
> rbd_vol1 (ocf::ceph:rbd.in): Started node1
> ...
> rbd_vol8 (ocf::ceph:rbd.in): Started node1
> fs_vol1 (ocf::heartbeat:Filesystem): Started node1
> ...
> fs_vol8 (ocf::heartbeat:Filesystem): Started node1
> export_vol1 (ocf::heartbeat:exportfs): Stopped
> ...
> export_vol8 (ocf::heartbeat:exportfs): Stopped
>
> Failed actions:
>     export_vol1_start_0 on node1 'unknown error' (1): call=262,
> status=complete, exit-reason='none', last-rc-change='Wed May 27 17:42:37
> 2015', queued=0ms, exec=56ms
>     export_vol1_start_0 on node2 'unknown error' (1): call=196,
> status=complete, exit-reason='none', last-rc-change='Wed May 27 17:43:04
> 2015', queued=0ms, exec=63ms
>     export_vol1_start_0 on node3 'unknown error' (1): call=196,
> status=complete, exit-reason='none', last-rc-change='Wed May 27 17:43:27
> 2015', queued=0ms, exec=69ms
>
>
> PCSD Status:
> node1: Online
> node2: Online
> node3: Online
>
> Daemon Status:
> corosync: active/disabled
> pacemaker: active/disabled
> pcsd: active/enabled
>
> I thought this was an issue with the nfsd kernel module not loading, so I
> manually loaded it on each host, but no change.
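>
> For reference, this is all I ran on each node (assuming the module is simply
> nfsd):
>
> modprobe nfsd        # load the NFS server kernel module
> lsmod | grep nfsd    # confirm it is present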
>
> I'm also wondering if there's an error with my export resource config (a
> pcs re-creation of one of them is sketched after the listing):
> Resource: export_vol1 (class=ocf provider=heartbeat type=exportfs)
> Attributes: directory=/mnt/vol1 clientspec=10.0.231.0/255.255.255.0
> options=rw,no_subtree_check,no_root_squash fsid=1
> Operations: stop interval=0s timeout=120 (export_vol1-stop-timeout-120)
> monitor interval=10s timeout=20s
> (export_vol1-monitor-interval-10s)
> start interval=0 timeout=40s (export_vol1-start-interval-0)
> Resource: export_vol2 (class=ocf provider=heartbeat type=exportfs)
> Attributes: directory=/mnt/vol2 clientspec=10.0.231.0/255.255.255.0
> options=rw,no_subtree_check,no_root_squash fsid=2
> Operations: stop interval=0s timeout=120 (export_vol2-stop-timeout-120)
> monitor interval=10s timeout=20s
> (export_vol2-monitor-interval-10s)
> start interval=0 timeout=40s (export_vol2-start-interval-0)
> ... (8 in total)
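>
> Re-creating one of those from scratch would look roughly like this (vol1
> shown; the others differ only in directory and fsid, and only the monitor
> op is spelled out):
>
> pcs resource create export_vol1 ocf:heartbeat:exportfs \
>     directory=/mnt/vol1 clientspec=10.0.231.0/255.255.255.0 \
>     options=rw,no_subtree_check,no_root_squash fsid=1 \
>     op monitor interval=10s timeout=20s --group group_rbd_fs_nfs_vip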
>
>
> exportfs is throwing an error, and even more odd, it's breaking the NFS
> subnet auth into individual IPs (10.0.231.103, 10.0.231.100), and I have no
> idea where it's getting those IP addresses from.
> Logs: http://pastebin.com/xEX2L7m1
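>
> To see where those per-IP entries come from, the export table can be
> inspected directly on a node (the unexport line uses an IP from the log and
> is otherwise illustrative):
>
> exportfs -v                          # exports as currently published
> cat /var/lib/nfs/etab                # on-disk table maintained by exportfs
> exportfs -u 10.0.231.103:/mnt/vol1   # drop a stale per-client entry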
>
> /etc/exports:
> /mnt/vol1 10.0.231.0/24(rw,no_subtree_check,no_root_squash)
> ...
> /mnt/vol8 10.0.231.0/24(rw,no_subtree_check,no_root_squash)
>
> # getenforce
> Permissive
>
> # systemctl status nfs
> nfs-server.service - NFS server and services
> Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled)
> Active: inactive (dead)
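>
> For a quick manual test the server can be started outside the cluster with:
>
> systemctl start nfs-server
> systemctl status nfs-server
>
> though ultimately pacemaker should be managing it.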
>
> Thanks,
> Steve
>