[ClusterLabs] Re: Two Node NFS doesn't failover with hardware glitches

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed Apr 8 06:09:13 UTC 2015


Hi!

If you can detect that "something's insane" on a node, you could play with node health attributes (I think these are the "node colors": green, yellow, red) to trigger migration of resources away from an unhealthy node; a rough sketch follows below. I can't say much about the timing issues without seeing the logs.
But when using NFS cross-mounts, make sure you only access the filesystems through NFS, even if they are "local".
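
A minimal, untested sketch of the idea (the SMART agent and the attribute
name are only examples; use whatever check matches your hardware problem):

    # move all resources off a node whose health attribute turns "red"
    crm configure property node-health-strategy=migrate-on-red

    # let an agent maintain a "#health-*" node attribute, e.g. from SMART status
    crm configure primitive p_health_smart ocf:pacemaker:HealthSMART \
            op monitor interval=60s
    crm configure clone cl_health_smart p_health_smart

    # or set a health attribute from your own hardware check script
    attrd_updater -n "#health-hwcheck" -U red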

Regards,
Ulrich

>>> Erich Prinz <erich at live2sync.com> wrote on 07.04.2015 at 18:51 in message
<7EE5F00D-3C93-4E4F-A1FD-C65DA98F909C at live2sync.com>:
> Hello All,
> 
> First post to the list and a newbie to HA - please be patient with me!
> 
> We have a very simple 2 node NFS HA setup using Linbit's recipe. It's been 
> in use since September of 2014. Initial testing showed successful failover to 
> the other node and it's been left alone since that time. 
> 
> Two weeks ago, the primary node was hit by two issues at once. First, the 
> primary RAID 1 HDD for the OS began to fail, and this masked a second issue: 
> the Adaptec controller card (a RAID 10 setup for the NFS mounts) started to 
> throw errors as well, giving spurious access to files.
> 
> Amazingly, the entire node continued to operate, but that's really not the 
> behavior I was expecting. The logs show (I'll post the relevant logs by 
> Friday if they are needed - the machine is currently down) several attempts 
> to cut over to the other node, but the NFS mounts would not release. It 
> eventually led to a split-brain condition.
> 
> Another undesirable behavior was the almost 20-minute delay in shutting 
> down the corosync+pacemaker services on the primary node to force a 
> failover. This left the NFS clients with stale connections that could only 
> be cleared by restarting the client machines (web servers). 
> Restarting the rpcbind, autofs, and nfs services wasn't enough to clear 
> the problem.
> 
> I've done quite a bit of digging to understand more of the issue. One item 
> is lowering the NFSv4 lease time (/proc/fs/nfsd/nfsv4leasetime) to 10 
> seconds and the nfsv4gracetime setting to 10 seconds.
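> 
> Roughly (from memory, untested as written; the exact paths depend on the 
> kernel, and nfsd has to be restarted for the change to take effect):
> 
> echo 10 > /proc/fs/nfsd/nfsv4leasetime
> # grace time analogously, if the kernel exposes it:
> echo 10 > /proc/fs/nfsd/nfsv4gracetime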
> 
> Still, this doesn't solve the problem of a resource hanging on the 
> primary node. Everything I'm reading indicates fencing is required, yet the 
> boilerplate configuration from Linbit has stonith disabled.
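> 
> From what I've read, enabling stonith would look roughly like the sketch 
> below (IPMI assumed; the device name, address and credentials are just 
> placeholders):
> 
> crm configure primitive p_fence_store1 stonith:fence_ipmilan \
> 	params pcmk_host_list=store-1.usync.us ipaddr=<ipmi-address> \
> 	      login=<user> passwd=<secret> lanplus=1 \
> 	op monitor interval=60s
> # keep the fence device off the node it is meant to kill
> crm configure location l_fence_store1 p_fence_store1 -inf: store-1.usync.us
> # ...a second device for store-2.usync.us the same way, then:
> crm configure property stonith-enabled=true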
> 
> These units are running CentOS 6.5
> corosync 1.4.1
> pacemaker 1.1.10
> drbd
> 
> Two questions then:
> 
> 1. how do we handle cranky hardware issues to ensure a smooth failover?
> 2. what additional steps are needed to ensure the NFS mounts don't go stale 
> on the clients?
> 
> 
> 
> Below is the current output of crm configure show
> 
> node store-1.usync.us \
> 	attributes standby=off maintenance=off
> node store-2.usync.us \
> 	attributes standby=off maintenance=on
> primitive p_drbdr0_nfs ocf:linbit:drbd \
> 	params drbd_resource=r0 \
> 	op monitor interval=31s role=Master \
> 	op monitor interval=29s role=Slave \
> 	op start interval=0 timeout=240s \
> 	op stop interval=0 timeout=120s
> primitive p_exportfs_root exportfs \
> 	params fsid=0 directory="/export" options="rw,sync,crossmnt" clientspec="10.0.2.0/255.255.255.0" wait_for_leasetime_on_stop=false \
> 	op start interval=0 timeout=240s \
> 	op stop interval=0 timeout=100s \
> 	meta target-role=Started
> primitive p_exportfs_user_assets exportfs \
> 	params fsid=1 directory="/export/user_assets" options="rw,sync,no_root_squash,mountpoint" clientspec="10.0.2.0/255.255.255.0" wait_for_leasetime_on_stop=false \
> 	op monitor interval=30s \
> 	op start interval=0 timeout=240s \
> 	op stop interval=0 timeout=100s \
> 	meta is-managed=true target-role=Started
> primitive p_fs_user_assets Filesystem \
> 	params device="/dev/drbd0" directory="/export/user_assets" fstype=ext4 \
> 	op monitor interval=10s \
> 	meta target-role=Started is-managed=true
> primitive p_ip_nfs IPaddr2 \
> 	params ip=10.0.2.200 cidr_netmask=24 \
> 	op monitor interval=30s \
> 	meta target-role=Started
> primitive p_lsb_nfsserver lsb:nfs \
> 	op monitor interval=30s \
> 	meta target-role=Started
> primitive p_ping ocf:pacemaker:ping \
> 	params host_list=10.0.2.100 multiplier=1000 name=p_ping \
> 	op monitor interval=30 timeout=60
> group g_fs p_fs_user_assets \
> 	meta target-role=Started
> group g_nfs p_ip_nfs p_exportfs_user_assets \
> 	meta target-role=Started
> ms ms_drbdr0_nfs p_drbdr0_nfs \
> 	meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true target-role=Started is-managed=true
> clone cl_ping p_ping
> location g_fs_on_connected_node g_fs \
> 	rule -inf: not_defined p_ping or p_ping lte 0
> colocation c_filesystem_with_drbdr0master inf: g_fs ms_drbdr0_nfs:Master
> colocation c_rootexport_with_nfsserver inf: p_exportfs_root p_lsb_nfsserver
> order o_drbdr0_before_filesystems inf: ms_drbdr0_nfs:promote g_fs:start
> order o_filesystem_before_nfsserver inf: g_fs p_lsb_nfsserver
> order o_nfsserver_before_rootexport inf: p_lsb_nfsserver p_exportfs_root
> order o_rootexport_before_exports inf: p_exportfs_root g_nfs
> property cib-bootstrap-options: \
> 	stonith-enabled=false \
> 	dc-version=1.1.10-14.el6_5.3-368c726 \
> 	cluster-infrastructure="classic openais (with plugin)" \
> 	expected-quorum-votes=2 \
> 	no-quorum-policy=ignore \
> 	maintenance-mode=false
> rsc_defaults rsc-options: \
> 	resource-stickiness=200
> 
> 
> #########################################
> 
> 
> 