[ClusterLabs] Two Node NFS doesn't failover with hardware glitches
David Vossel
dvossel at redhat.com
Tue Apr 7 18:15:51 UTC 2015
----- Original Message -----
> Hello All,
>
> First post to the list and a newbie to HA - please be patient with me!
>
> We have a very simple 2 node NFS HA setup using Linbit's recipe. It's been in
> use since September of 2014. Initial testing showed successful failover to
> the other node and it's been left alone since that time.
>
> Two weeks ago, the primary node was hit by two issues at once. First, the
> primary RAID 1 HDD for the OS began to fail, and this masked a second
> issue: the Adaptec controller card (a RAID 10 array for the NFS mounts)
> started to throw errors as well, giving spurious access to files.
>
> Amazingly, the entire node continued to operate, but that's really not the
> behavior I was expecting. The logs show (I'll post the relevant logs by
> Friday if they are needed - the machine is currently down) several attempts
> to cut over to the other node, but the NFS mounts would not release. It
> eventually led to a split-brain condition.
>
> Another undesirable behavior was the almost 20-minute delay in shutting
> down the corosync+pacemaker services on the primary node to force a
> failover. This left the NFS clients with stale connections that could only
> be cleared by restarting the client machines (web servers).
Yep, I've seen this. This is typically a result of the floating IP address
becoming available before the exports after a failover. The startup order
should be:
1. mount shared storage
2. nfs server
3. exports
4. floating IP
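A minimal crm shell sketch of that ordering, reusing the resource names from
the configuration quoted below (the group name g_nfs_stack is just an
illustration, and this is a sketch rather than a drop-in replacement for the
existing g_fs/g_nfs groups and constraints):

    # a group starts its members left to right and stops them in reverse,
    # so one group captures both the startup and the shutdown ordering
    group g_nfs_stack p_fs_user_assets p_lsb_nfsserver \
            p_exportfs_root p_exportfs_user_assets p_ip_nfs
    colocation c_nfs_with_drbd inf: g_nfs_stack ms_drbdr0_nfs:Master
    order o_drbd_before_nfs inf: ms_drbdr0_nfs:promote g_nfs_stack:start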
Here are some slides that outline how I'd recommend deploying NFS
active/passive. It is a little different from what you have deployed.
https://github.com/davidvossel/phd/blob/master/doc/presentations/nfs-ap-overview.pdf
-- David
> Restarting the rpcbind, autofs, and nfs services wasn't enough to clear
> the problem.
>
> I've done quite a bit of digging to understand more of the issue. One item is
> adjusting the /proc/nfsv4leasetime to 10 seconds and the nfs.conf
> nfsv4gracetime setting to 10 seconds.
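For reference, a sketch of that tuning on the server side, assuming the usual
nfsd proc interface (the paths are illustrative, not taken from the setup
above - verify what a CentOS 6 kernel actually exposes):

    # run as root on the NFS server node(s); values are in seconds and the
    # new lease generally applies from the next nfsd start
    echo 10 > /proc/fs/nfsd/nfsv4leasetime
    # a matching nfsv4gracetime file only exists on some kernels, so check
    # for it before relying on it
    [ -w /proc/fs/nfsd/nfsv4gracetime ] && echo 10 > /proc/fs/nfsd/nfsv4gracetime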
>
> Still, this doesn't solve the problem of a resource hanging on the primary
> node. Everything I'm reading indicates fencing is required, yet the
> boilerplate configuration from Linbit has stonith disabled.
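For what it's worth, a hedged sketch of what enabling fencing could look like
in crm shell, assuming IPMI-capable management controllers and the
fence_ipmilan agent; the addresses and credentials are placeholders, not
details from this cluster:

    primitive p_fence_store1 stonith:fence_ipmilan \
            params pcmk_host_list=store-1.usync.us ipaddr=10.0.2.201 \
                   login=admin passwd=changeme lanplus=1 \
            op monitor interval=60s
    primitive p_fence_store2 stonith:fence_ipmilan \
            params pcmk_host_list=store-2.usync.us ipaddr=10.0.2.202 \
                   login=admin passwd=changeme lanplus=1 \
            op monitor interval=60s
    # keep each fence device off the node it is meant to kill
    location l_fence_store1 p_fence_store1 -inf: store-1.usync.us
    location l_fence_store2 p_fence_store2 -inf: store-2.usync.us
    property stonith-enabled=true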
>
> These units are running CentOS 6.5
> corosync 1.4.1
> pacemaker 1.1.10
> drbd
>
> Two questions then:
>
> 1. How do we handle cranky hardware issues to ensure a smooth failover?
> 2. What additional steps are needed to ensure the NFS mounts don't go stale
> on the clients?
>
>
>
> Below is the current output of crm configure show
>
> node store-1.usync.us \
> attributes standby=off maintenance=off
> node store-2.usync.us \
> attributes standby=off maintenance=on
> primitive p_drbdr0_nfs ocf:linbit:drbd \
> params drbd_resource=r0 \
> op monitor interval=31s role=Master \
> op monitor interval=29s role=Slave \
> op start interval=0 timeout=240s \
> op stop interval=0 timeout=120s
> primitive p_exportfs_root exportfs \
> params fsid=0 directory="/export" options="rw,sync,crossmnt" clientspec="10.0.2.0/255.255.255.0" wait_for_leasetime_on_stop=false \
> op start interval=0 timeout=240s \
> op stop interval=0 timeout=100s \
> meta target-role=Started
> primitive p_exportfs_user_assets exportfs \
> params fsid=1 directory="/export/user_assets" options="rw,sync,no_root_squash,mountpoint" clientspec="10.0.2.0/255.255.255.0" wait_for_leasetime_on_stop=false \
> op monitor interval=30s \
> op start interval=0 timeout=240s \
> op stop interval=0 timeout=100s \
> meta is-managed=true target-role=Started
> primitive p_fs_user_assets Filesystem \
> params device="/dev/drbd0" directory="/export/user_assets" fstype=ext4 \
> op monitor interval=10s \
> meta target-role=Started is-managed=true
> primitive p_ip_nfs IPaddr2 \
> params ip=10.0.2.200 cidr_netmask=24 \
> op monitor interval=30s \
> meta target-role=Started
> primitive p_lsb_nfsserver lsb:nfs \
> op monitor interval=30s \
> meta target-role=Started
> primitive p_ping ocf:pacemaker:ping \
> params host_list=10.0.2.100 multiplier=1000 name=p_ping \
> op monitor interval=30 timeout=60
> group g_fs p_fs_user_assets \
> meta target-role=Started
> group g_nfs p_ip_nfs p_exportfs_user_assets \
> meta target-role=Started
> ms ms_drbdr0_nfs p_drbdr0_nfs \
> meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true target-role=Started is-managed=true
> clone cl_ping p_ping
> location g_fs_on_connected_node g_fs \
> rule -inf: not_defined p_ping or p_ping lte 0
> colocation c_filesystem_with_drbdr0master inf: g_fs ms_drbdr0_nfs:Master
> colocation c_rootexport_with_nfsserver inf: p_exportfs_root p_lsb_nfsserver
> order o_drbdr0_before_filesystems inf: ms_drbdr0_nfs:promote g_fs:start
> order o_filesystem_before_nfsserver inf: g_fs p_lsb_nfsserver
> order o_nfsserver_before_rootexport inf: p_lsb_nfsserver p_exportfs_root
> order o_rootexport_before_exports inf: p_exportfs_root g_nfs
> property cib-bootstrap-options: \
> stonith-enabled=false \
> dc-version=1.1.10-14.el6_5.3-368c726 \
> cluster-infrastructure="classic openais (with plugin)" \
> expected-quorum-votes=2 \
> no-quorum-policy=ignore \
> maintenance-mode=false
> rsc_defaults rsc-options: \
> resource-stickiness=200
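Note that g_nfs above starts p_ip_nfs before p_exportfs_user_assets, so the
floating IP comes up ahead of the export. A sketch of the group reordered to
match the startup order recommended earlier (illustrative, keeping the
existing resource names):

    # group members start left to right, so list the export before the IP
    group g_nfs p_exportfs_user_assets p_ip_nfs \
            meta target-role=Started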
>
>
> #########################################
>
>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>