[Pacemaker] nfs4 cluster fail-over stops working once I introduce ipaddr2 resource
Dennis Jacobfeuerborn
dennisml at conversis.de
Fri Feb 14 01:50:10 UTC 2014
Hi,
I'm still working on my NFSv4 cluster, and things are working as
expected... as long as I don't add an IPaddr2 resource.
The DRBD, filesystem and exportfs resources work fine, and when I put the
active node into standby, everything fails over as expected.
Once I add a VIP as an IPaddr2 resource, however, I get monitor
failures on the p_exportfs_root resource.
I've attached the configuration, status and a log file.
The transition status is a snapshot taken a moment after I put nfs1
(192.168.100.41) into standby. It looks like stopping p_ip_nfs does
something to the p_exportfs_root resource, although I have no idea what
that could be.
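In case it helps anyone looking at the attached log, the interleaving of the
p_ip_nfs stop and the p_exportfs_root monitor failure can be pulled out with
something along these lines (path assuming the default cman/corosync log
location; the attached corosync.log is a tail of the same log):

  # show only the lines for the two resources in question, in order
  grep -E 'p_ip_nfs|p_exportfs_root' /var/log/cluster/corosync.log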
The final status is the state after the cluster has settled. The
fail-over finished, but the failed action is still present and cannot be
cleared with "crm resource cleanup p_exportfs_root".
The log is the result of a "tail -f" on corosync.log from just before I
issued "crm node standby nfs1" until the cluster had settled.
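For completeness, the sequence I run to reproduce this is roughly:

  crm node standby nfs1                  # trigger the fail-over
  crm_mon -1                             # status snapshots (or "crm status"); attached below
  crm resource cleanup p_exportfs_root   # does not clear the failed action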
Does anybody know what the issue could be here? At first I thought that
using a VIP from the same network as the cluster nodes might be the
problem, but when I change it to an IP in a different network
(192.168.101.43/24), the same thing happens.
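The VIP primitive itself is the one in the attached config; the second attempt
only swaps the address, roughly:

  primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
          params ip="192.168.101.43" cidr_netmask="24" \
          op monitor interval="30s"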
The moment I remove p_ip_nfs from the configuration again, fail-over back
and forth works without a hitch.
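"Removing" here means roughly the following (the exact crm shell invocation
may vary):

  crm resource stop p_ip_nfs
  crm configure edit g_nfs        # take p_ip_nfs out of the group
  crm configure delete p_ip_nfs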
Regards,
Dennis
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosync.log
Type: text/x-log
Size: 65132 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20140214/82187b08/attachment-0003.bin>
-------------- next part --------------
node nfs1 \
        attributes standby="off"
node nfs2 \
        attributes standby="off"
primitive p_drbd_nfs ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="15" role="Master" \
        op monitor interval="30" role="Slave"
primitive p_exportfs_data ocf:heartbeat:exportfs \
        params fsid="1" directory="/srv/nfs/data" options="rw,mountpoint,no_root_squash" clientspec="192.168.100.0/255.255.255.0" wait_for_leasetime_on_stop="true" \
        op monitor interval="30s" \
        op stop interval="0" timeout="20s"
primitive p_exportfs_root ocf:heartbeat:exportfs \
        params fsid="0" directory="/srv/nfs" options="rw,crossmnt" clientspec="192.168.100.0/255.255.255.0" \
        op monitor interval="10s" \
        op stop interval="0" timeout="20s"
primitive p_fs_data ocf:heartbeat:Filesystem \
        params device="/dev/drbd1" directory="/srv/nfs/data" fstype="ext4" \
        op monitor interval="10s" \
        op stop interval="0" timeout="20s"
primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
        params ip="192.168.100.43" cidr_netmask="24" \
        op monitor interval="30s"
primitive p_lsb_nfsserver lsb:nfs \
        op monitor interval="30s"
group g_nfs p_fs_data p_exportfs_root p_exportfs_data p_ip_nfs
ms ms_drbd_nfs p_drbd_nfs \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone cl_lsb_nfsserver p_lsb_nfsserver
colocation c_nfs_on_drbd inf: g_nfs ms_drbd_nfs:Master
order o_drbd_before_nfs inf: ms_drbd_nfs:promote g_nfs:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.10-14.el6_5.2-368c726" \
        cluster-infrastructure="cman" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1392341228" \
        maintenance-mode="false"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
-------------- next part --------------
Last updated: Fri Feb 14 01:26:36 2014
Last change: Fri Feb 14 01:22:53 2014 via crm_attribute on nfs2
Stack: cman
Current DC: nfs1 - partition with quorum
Version: 1.1.10-14.el6_5.2-368c726
2 Nodes configured
8 Resources configured
Node nfs1: standby
Online: [ nfs2 ]
Master/Slave Set: ms_drbd_nfs [p_drbd_nfs]
     Masters: [ nfs2 ]
     Stopped: [ nfs1 ]
Clone Set: cl_lsb_nfsserver [p_lsb_nfsserver]
     Started: [ nfs2 ]
     Stopped: [ nfs1 ]
Resource Group: g_nfs
     p_fs_data        (ocf::heartbeat:Filesystem):  Started nfs2
     p_exportfs_root  (ocf::heartbeat:exportfs):    Started nfs2
     p_exportfs_data  (ocf::heartbeat:exportfs):    Started nfs2
     p_ip_nfs         (ocf::heartbeat:IPaddr2):     Started nfs2
Failed actions:
    p_exportfs_root_monitor_10000 on nfs1 'not running' (7): call=337, status=complete, last-rc-change='Fri Feb 14 01:23:02 2014', queued=0ms, exec=0ms
-------------- next part --------------
Last updated: Fri Feb 14 01:44:07 2014
Last change: Fri Feb 14 01:43:56 2014 via crm_attribute on nfs2
Stack: cman
Current DC: nfs1 - partition with quorum
Version: 1.1.10-14.el6_5.2-368c726
2 Nodes configured
8 Resources configured
Node nfs1: standby
Online: [ nfs2 ]
Master/Slave Set: ms_drbd_nfs [p_drbd_nfs]
     Masters: [ nfs1 ]
     Slaves: [ nfs2 ]
Clone Set: cl_lsb_nfsserver [p_lsb_nfsserver]
     Started: [ nfs2 ]
     Stopped: [ nfs1 ]
Resource Group: g_nfs
     p_fs_data        (ocf::heartbeat:Filesystem):  Started nfs1
     p_exportfs_root  (ocf::heartbeat:exportfs):    FAILED nfs1
     p_exportfs_data  (ocf::heartbeat:exportfs):    Started nfs1
     p_ip_nfs         (ocf::heartbeat:IPaddr2):     Stopped
Failed actions:
    p_exportfs_root_monitor_10000 on nfs1 'not running' (7): call=485, status=complete, last-rc-change='Fri Feb 14 01:43:58 2014', queued=0ms, exec=0ms