[ClusterLabs] NFS4 share not working

Andrei Borzenkov arvidjaar at gmail.com
Sat Feb 23 00:20:52 EST 2019


23.02.2019 2:57, solarflow99 wrote:
> I'm trying to have my NFS share exported via pacemaker, but it doesn't
> seem to be working, and it also kills off nfs-mountd.  The rbd device
> could have something to do with it: the nfsroot doesn't get exported,
> but there's no indication why:
> 
> 
> 
> Feb 22 15:36:23 cephmgr101.corp.mydomain.com crmd[32600]:   notice: The
> local CRM is operational
> Feb 22 15:36:23 cephmgr101.corp.mydomain.com crmd[32600]:   notice: State
> transition S_STARTING -> S_PENDING
> Feb 22 15:36:24 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Could
> not obtain a node name for corosync nodeid 3
> Feb 22 15:36:24 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Could
> not obtain a node name for corosync nodeid 2
> Feb 22 15:36:24 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Could
> not obtain a node name for corosync nodeid 3
> Feb 22 15:36:44 cephmgr101.corp.mydomain.com crmd[32600]:   notice: State
> transition S_PENDING -> S_NOT_DC

For any question about "why did pacemaker decide (not) to start my
resource", logs from the DC at the time of the event are needed, because
that is where pacemaker records the reasons for its decisions. The node
quoted here ended up in S_NOT_DC, so its log does not contain them.
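
To find out which node was the DC, the "Current DC:" line of `crm_mon -1`
can be used. A minimal sketch follows; the helper name `dc_from_status` is
made up for illustration, and the journalctl line in the comments assumes
the nodes log to the systemd journal, as the quoted log suggests.

```shell
# Print the DC's node name, reading `crm_mon -1` output on stdin.
# Prints nothing if no "Current DC:" line is present.
dc_from_status() {
    sed -n 's/^[[:space:]]*Current DC:[[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# Usage on any cluster node:
#   crm_mon -1 | dc_from_status
#
# Then, on the node it names, look at the same time window
# (assumes journald; adjust if the cluster logs to a file instead):
#   journalctl --since "2019-02-22 15:36" --until "2019-02-22 15:38" \
#       | grep -E 'pengine|crmd'
```

The DC can change over time, so the node named now is not necessarily the
one that was DC when the problem occurred.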

> Feb 22 15:36:45 cephmgr101.corp.mydomain.com
> nfsserver(p_nfs_server)[32648]: INFO: Status: rpcbind
> Feb 22 15:36:45 cephmgr101.corp.mydomain.com
> nfsserver(p_nfs_server)[32655]: INFO: Status: nfs-mountd
> Feb 22 15:36:45 cephmgr101.corp.mydomain.com
> nfsserver(p_nfs_server)[32662]: INFO: Status: nfs-idmapd
> Feb 22 15:36:45 cephmgr101.corp.mydomain.com
> nfsserver(p_nfs_server)[32674]: INFO: Status: rpc-statd
> Feb 22 15:36:45 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of probe operation for p_nfs_server on cephmgr101.corp.mydomain.com: 0 (ok)
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of probe operation for p_rbd_map_1 on cephmgr101.corp.mydomain.com: 7 (not
> running)
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com
> nfsserver(p_nfs_server)[32720]: INFO: Stopping NFS server ...
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com systemd[1]: Stopping NFS
> server and services...
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com systemd[1]: Stopped NFS server
> and services.
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com systemd[1]: Stopping NFSv4
> ID-name mapping service...
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com rpc.mountd[32525]: Caught
> signal 15, un-registering and exiting.
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com systemd[1]: Stopping NFS Mount
> Daemon...
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com systemd[1]: Stopped NFSv4
> ID-name mapping service.
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com systemd[1]: Stopped NFS Mount
> Daemon.
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com kernel: nfsd: last server has
> exited, flushing export cache
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com
> nfsserver(p_nfs_server)[32744]: INFO: Stop: threads
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com systemd[1]: Stopping NFS
> status monitor for NFSv2/3 locking....
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com systemd[1]: Stopped NFS status
> monitor for NFSv2/3 locking..
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com
> nfsserver(p_nfs_server)[32755]: INFO: Stop: rpc-statd
> Feb 22 15:36:50 cephmgr101.corp.mydomain.com
> nfsserver(p_nfs_server)[32765]: INFO: Stop: nfs-idmapd
> Feb 22 15:36:51 cephmgr101.corp.mydomain.com nfsserver(p_nfs_server)[313]:
> INFO: Stop: nfs-mountd
> Feb 22 15:36:51 cephmgr101.corp.mydomain.com nfsserver(p_nfs_server)[324]:
> INFO: Stop: rpc-gssd
> Feb 22 15:36:51 cephmgr101.corp.mydomain.com nfsserver(p_nfs_server)[333]:
> INFO: Stop: umount (1/10 attempts)
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com systemd[1]: Stopped target
> rpc_pipefs.target.
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com systemd[1]: Stopping
> rpc_pipefs.target.
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com nfsserver(p_nfs_server)[349]:
> INFO: NFS server stopped
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of stop operation for p_nfs_server on cephmgr101.corp.mydomain.com: 0 (ok)
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com
> exportfs(p_nfs_export_root_1)[411]: INFO: Directory /mnt/nfsroot is not
> exported to 172.20.3.0/255.255.255.0 (stopped).
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of probe operation for p_nfs_export_root_1 on cephmgr101.corp.mydomain.com:
> 7 (not running)
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com Filesystem(p_fs_rbd_1)[431]:
> WARNING: Couldn't find device [/dev/rbd0]. Expected /dev/??? to exist
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of probe operation for p_fs_rbd_1 on cephmgr101.corp.mydomain.com: 7 (not
> running)
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of probe operation for p_ip_nfs_1 on cephmgr101.corp.mydomain.com: 7 (not
> running)
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com
> exportfs(p_exportfs_cart_sigtrack)[547]: ERROR:
> /mnt/nfsroot/rbd0/cart_sigtrack does not exist or is not a directory
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com
> exportfs(p_exportfs_cart_sigtrack)[553]: INFO: environment is invalid,
> resource considered stopped
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com lrmd[32597]:   notice:
> p_exportfs_cart_sigtrack_monitor_0:522:stderr [
> ocf-exit-reason:/mnt/nfsroot/rbd0/cart_sigtrack does not exist or is not a
> directory ]
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of probe operation for p_exportfs_cart_sigtrack on
> cephmgr101.corp.mydomain.com: 7 (not running)
> Feb 22 15:36:52 cephmgr101.corp.mydomain.com crmd[32600]:   notice:
> cephmgr101.corp.mydomain.com-p_exportfs_cart_sigtrack_monitor_0:27 [
> ocf-exit-reason:/mnt/nfsroot/rbd0/cart_sigtrack does not exist or is not a
> directory\n ]
> Feb 22 15:36:55 cephmgr101.corp.mydomain.com kernel: libceph: mon1
> 172.20.3.22:6789 session established
> Feb 22 15:36:55 cephmgr101.corp.mydomain.com kernel: libceph: client94252
> fsid d36fd17c-174e-40d6-95b9-86bdd196b7d2
> Feb 22 15:36:55 cephmgr101.corp.mydomain.com kernel: rbd: rbd0: capacity
> 8589934592 features 0x1
> Feb 22 15:36:55 cephmgr101.corp.mydomain.com rbd.in(p_rbd_map_1)[587]:
> INFO: /dev/rbd0
> Feb 22 15:36:55 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of start operation for p_rbd_map_1 on cephmgr101.corp.mydomain.com: 0 (ok)
> Feb 22 15:36:55 cephmgr101.corp.mydomain.com Filesystem(p_fs_rbd_1)[655]:
> INFO: Running start for /dev/rbd0 on /mnt/nfsroot/rbd0
> Feb 22 15:36:55 cephmgr101.corp.mydomain.com kernel: XFS (rbd0): Mounting
> V5 Filesystem
> Feb 22 15:36:56 cephmgr101.corp.mydomain.com kernel: XFS (rbd0): Ending
> clean mount
> Feb 22 15:36:56 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of start operation for p_fs_rbd_1 on cephmgr101.corp.mydomain.com: 0 (ok)
> Feb 22 15:36:56 cephmgr101.corp.mydomain.com IPaddr2(p_ip_nfs_1)[791]:
> INFO: Adding inet address 172.20.3.52/24 with broadcast address
> 172.20.3.255 to device ens160
> Feb 22 15:36:56 cephmgr101.corp.mydomain.com IPaddr2(p_ip_nfs_1)[800]:
> INFO: Bringing device ens160 up
> Feb 22 15:36:56 cephmgr101.corp.mydomain.com IPaddr2(p_ip_nfs_1)[809]:
> INFO: /usr/libexec/heartbeat/send_arp -i 200 -c 5 -p
> /var/run/resource-agents/send_arp-172.20.3.52 -I ens160 -m auto 172.20.3.52
> Feb 22 15:37:00 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of start operation for p_ip_nfs_1 on cephmgr101.corp.mydomain.com: 0 (ok)
> Feb 22 15:37:00 cephmgr101.corp.mydomain.com
> exportfs(p_exportfs_cart_sigtrack)[872]: INFO: Directory
> /mnt/nfsroot/rbd0/cart_sigtrack is not exported to 172.20.3.0/255.255.255.0
> (stopped).
> Feb 22 15:37:00 cephmgr101.corp.mydomain.com
> exportfs(p_exportfs_cart_sigtrack)[887]: INFO: Exporting file system ...
> Feb 22 15:37:00 cephmgr101.corp.mydomain.com
> exportfs(p_exportfs_cart_sigtrack)[911]: INFO: exportfs: No file systems
> exported! exporting 172.20.3.0/255.255.255.0:/mnt/nfsroot/rbd0/cart_sigtrack
> Feb 22 15:37:00 cephmgr101.corp.mydomain.com
> exportfs(p_exportfs_cart_sigtrack)[922]: INFO: File system exported
> Feb 22 15:37:00 cephmgr101.corp.mydomain.com crmd[32600]:   notice: Result
> of start operation for p_exportfs_cart_sigtrack on
> cephmgr101.corp.mydomain.com: 0 (ok)
> 
> 
> # ls -la /dev/rbd0
> brw-rw---- 1 root disk 252, 0 Feb 22 15:36 /dev/rbd0
> 
> 
> # mount | grep rbd0
> /dev/rbd0 on /mnt/nfsroot/rbd0 type xfs
> (rw,relatime,attr2,inode64,sunit=8192,swidth=8192,noquota)
> 
> 
> 
> 
> Once I restart nfs-mountd, I see it's not exporting the nfsroot:
> 
> # showmount -e
> Export list for cephmgr101.corp.mydomain.com:
> /mnt/nfsroot/rbd0/cart_sigtrack 172.20.3.0/255.255.255.0
> 
> 
> 
> 
> # pcs resource show p_nfs_export_root_1
>  Resource: p_nfs_export_root_1 (class=ocf provider=heartbeat type=exportfs)
>   Attributes: clientspec=172.20.3.0/255.255.255.0 directory=/mnt/nfsroot
> fsid=0 options=rw,crossmnt,async,no_root_squash
>   Operations: monitor interval=30s
> (p_nfs_export_root_1-monitor-interval-30s)
>               start interval=0s timeout=40
> (p_nfs_export_root_1-start-interval-0s)
>               stop interval=0s timeout=120
> (p_nfs_export_root_1-stop-interval-0s)
> 
> 
> 
> 
> I've set the constraints the way I think is right; changing them doesn't
> seem to help:
> # pcs constraint order
> Ordering Constraints:
>   start p_rbd_map_1 then start p_nfs_export_root_1 (kind:Mandatory)
>   start p_nfs_export_root_1 then start g_nfs_1 (kind:Mandatory)
> 


Ordering constraints only control the sequence of actions. Unless you
also have (co)location constraints, any resource may be started on any
node in the cluster.
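
As an illustration, colocation constraints mirroring the two ordering
constraints above could be generated like this. It is only a sketch: the
helper `colocation_cmds` is made up, it prints the pcs commands instead of
running them so they can be reviewed first, and whether these three
resources really belong on the same node is an assumption based on the
ordering constraints shown.

```shell
# Print (do not run) one pcs colocation command per adjacent pair.
# Arguments: resource ids, each one placed with the one before it.
colocation_cmds() {
    prev=$1
    shift
    for res in "$@"; do
        echo "pcs constraint colocation add $res with $prev INFINITY"
        prev=$res
    done
}

# Assumed chain, taken from the ordering constraints in the question.
colocation_cmds p_rbd_map_1 p_nfs_export_root_1 g_nfs_1
```

After applying whichever commands fit the setup, `pcs constraint colocation`
lists the result.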


