[ClusterLabs] Two Node NFS doesn't failover with hardware glitches
kgronlund at suse.com
Tue Apr 7 14:01:00 EDT 2015
Erich Prinz <erich at live2sync.com> writes:
> Still, this doesn't solve for the problem of a resource hanging on the primary node. Everything I'm reading indicates fencing is required, yet the boilerplate configuration from Linbit has stonith disabled.
> These units are running CentOS 6.5
> corosync 1.4.1
> pacemaker 1.1.10
> Two questions then:
> 1. how do we handle cranky hardware issues to ensure a smooth failover?
> 2. what additional steps are needed to ensure the NFS mounts don't go stale on the clients?
As you might have guessed, you have answered your question already -
what you need to solve this situation is stonith. When a node refuses to
die gracefully, you really do need stonith to force it into a known state.
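
As a rough illustration only, a stonith setup for a two-node cluster might look something like the crm configure fragment below. This is a sketch under assumptions, not a tested configuration: it assumes both machines have IPMI-capable management boards reachable by the fence_ipmilan agent, and the node names (nfs1, nfs2), addresses, and credentials are placeholders invented for the example. Each fencing device is kept off the node it is meant to kill, and stonith is then enabled cluster-wide.

```
# Hypothetical node names, IPs and credentials; adjust to your hardware.
# Assumes IPMI management boards and the fence_ipmilan agent are available.
primitive st-nfs1 stonith:fence_ipmilan \
    params pcmk_host_list="nfs1" ipaddr="192.0.2.11" login="admin" passwd="secret" lanplus="true" \
    op monitor interval="60s"
primitive st-nfs2 stonith:fence_ipmilan \
    params pcmk_host_list="nfs2" ipaddr="192.0.2.12" login="admin" passwd="secret" lanplus="true" \
    op monitor interval="60s"
# Do not run a fencing device on the node it is supposed to fence
location l-st-nfs1 st-nfs1 -inf: nfs1
location l-st-nfs2 st-nfs2 -inf: nfs2
# Turn fencing on (the boilerplate configuration has this disabled)
property stonith-enabled="true"
```

If the hardware has no IPMI, a different fence agent (a switched PDU, for example) would take the place of fence_ipmilan; the overall structure stays the same.
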
These days most documentation tries to emphasize this more than in the
past. I can recommend Tim's cartoon explanation of how and why stonith
// Kristoffer Grönlund
// kgronlund at suse.com