[ClusterLabs] nfs-daemon will not start

Jan Pokorný jpokorny at redhat.com
Thu Sep 19 13:35:13 EDT 2019


On 19/09/19 16:43 +0200, Oyvind Albrigtsen wrote:
> Try upgrading resource-agents and maybe nfs-utils (if there's newer
> version for CentOS 7).
> 
> I recall some issue with how the nfs config was generated, which might
> be causing this issue.

This is what I'd start with; otherwise, see below.

> On 18/09/19 20:49 +0000, Jones, Keven wrote:
>> I have 2 centos7.6  VM’s setup. Was able to successfully create
>> cluster, setup LVM, NFSHARE but not able to get the nfs-daemon
>> (ocf::heartbeat:nfsserver): to start successfully.
>> 
>> 
>> [root at cs-nfs1 ~]# pcs status
>> 
>> [...]
>> 
>> Failed Actions:
>> * nfs-daemon_start_0 on cs-nfs1.saas.local 'unknown error' (1): call=25, status=Timed Out, exitreason='',
>>   last-rc-change='Wed Sep 18 16:31:48 2019', queued=0ms, exec=40001ms
>> * nfs-daemon_start_0 on cs-nfs2.saas.local 'unknown error' (1): call=22, status=Timed Out, exitreason='',
>>   last-rc-change='Wed Sep 18 16:31:06 2019', queued=0ms, exec=40002ms
>> 
>> [...]
>> 
>> [root at cs-nfs1 ~]# pcs resource debug-start nfs-daemon
>> Operation start for nfs-daemon (ocf:heartbeat:nfsserver) failed: 'Timed Out' (2)
>>> stdout: STATDARG="--no-notify"
>>> stdout: * rpc-statd.service - NFS status monitor for NFSv2/3 locking.
>>> stdout:    Loaded: loaded (/usr/lib/systemd/system/rpc-statd.service; static; vendor preset: disabled)
>>> stdout:    Active: inactive (dead) since Wed 2019-09-18 16:32:28 EDT; 13min ago
>>> stdout:   Process: 7054 ExecStart=/usr/sbin/rpc.statd $STATDARGS (code=exited, status=0/SUCCESS)
>>> stdout:  Main PID: 7055 (code=exited, status=0/SUCCESS)
>>> stdout:
>>> stdout: Sep 18 16:31:48 cs-nfs1 systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
>>> stdout: Sep 18 16:31:48 cs-nfs1 rpc.statd[7055]: Version 1.3.0 starting
>>> stdout: Sep 18 16:31:48 cs-nfs1 rpc.statd[7055]: Flags: TI-RPC
>>> stdout: Sep 18 16:31:48 cs-nfs1 systemd[1]: Started NFS status monitor for NFSv2/3 locking..
>>> stdout: Sep 18 16:32:28 cs-nfs1 systemd[1]: Stopping NFS status monitor for NFSv2/3 locking....
>>> stdout: Sep 18 16:32:28 cs-nfs1 systemd[1]: Stopped NFS status monitor for NFSv2/3 locking..
>>> stderr: Sep 18 16:46:08 INFO: Starting NFS server ...
>>> stderr: Sep 18 16:46:08 INFO: Start: rpcbind i: 1
>>> stderr: Sep 18 16:46:08 INFO: Start: nfs-mountd i: 1
>>> stderr: Job for nfs-idmapd.service failed because the control
>>>         process exited with error code. See "systemctl status
>>>         nfs-idmapd.service" and "journalctl -xe" for details.
>>> 
>>> [...]
>>> 
>> 
>> [root at cs-nfs1 ~]# systemctl status nfs-idmapd.service
>> ● nfs-idmapd.service - NFSv4 ID-name mapping service
>>  Loaded: loaded (/usr/lib/systemd/system/nfs-idmapd.service; static; vendor preset: disabled)
>>  Active: failed (Result: exit-code) since Wed 2019-09-18 16:46:08 EDT; 1min 25s ago
>> Process: 8699 ExecStart=/usr/sbin/rpc.idmapd $RPCIDMAPDARGS (code=exited, status=1/FAILURE)
>> Main PID: 5334 (code=killed, signal=TERM)
>> 
>> Sep 18 16:46:08 cs-nfs1 systemd[1]: Starting NFSv4 ID-name mapping service...
>> Sep 18 16:46:08 cs-nfs1 systemd[1]: nfs-idmapd.service: control process exited, code=exited status=1
>> Sep 18 16:46:08 cs-nfs1 systemd[1]: Failed to start NFSv4 ID-name mapping service.
>> Sep 18 16:46:08 cs-nfs1 systemd[1]: Unit nfs-idmapd.service entered failed state.
>> Sep 18 16:46:08 cs-nfs1 systemd[1]: nfs-idmapd.service failed
>> 
>> [root at cs-nfs1 ~]# journalctl -xe
>> -- The start-up result is done.
>> Sep 18 16:46:08 cs-nfs1 rpc.idmapd[8711]: main: open(/var/lib/nfs/rpc_pipefs//nfs): No such file or directory

^ this is apparently the bottom-most diagnostic provided

Why the "rpc_pipefs on /var/lib/nfs/rpc_pipefs" mount (of type
rpc_pipefs, or perhaps sunrpc -- at least that's what I observe on
my system) would be missing at the moment it is expected to be
present, I have no idea.
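As an untested sketch (assuming the default rpcpipefs_dir of
/var/lib/nfs/rpc_pipefs), you could check by hand whether that mount
exists, and recreate it if not:

```shell
# check whether the rpc_pipefs mount is currently present
# (path assumes the agent's default rpcpipefs_dir)
findmnt /var/lib/nfs/rpc_pipefs

# if missing, it can be mounted by hand; on my system the
# filesystem type is rpc_pipefs and the source is sunrpc
mkdir -p /var/lib/nfs/rpc_pipefs
mount -t rpc_pipefs sunrpc /var/lib/nfs/rpc_pipefs
```

If mounting it by hand makes rpc.idmapd start, that would confirm the
missing mount is the actual culprit rather than a symptom.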

But then, you haven't shown your exact nfs-daemon resource
configuration -- did you tamper with the "rpcpipefs_dir" parameter,
for instance?
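For reference, the effective resource parameters can be dumped like
this (subcommand names as on CentOS 7 pcs; adjust as needed):

```shell
# show the full configuration of the nfs-daemon resource,
# including any non-default rpcpipefs_dir setting
pcs resource show nfs-daemon

# the agent's parameter documentation, for comparison
pcs resource describe ocf:heartbeat:nfsserver
```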

Also, that would smell like a neglected integration with systemd:
when this parameter is changed, the change is not propagated to the
actual systemd unit files, which are then blindly managed under the
naive assumption of coherency, AFAICT...  If so, the agent should be
fixed to refuse such deliberate modifications in the systemd
scenarios, or something along those lines.

-- 
Jan (Poki)

