[ClusterLabs] Fencing with a 3-node (1 for quorum only) cluster

Thu Aug 4 19:03:51 EDT 2016

On 04/08/16 06:56 PM, Dan Swartzendruber wrote:
> I'm setting up an HA NFS server to serve up storage to a couple of
> vsphere hosts.  I have a virtual IP, and it depends on a ZFS resource
> agent which imports or exports a pool.  So far, with stonith disabled,
> it all works perfectly.  I was dubious about a 2-node solution, so I
> created a 3rd node which runs as a virtual machine on one of the hosts. 
> All it is for is quorum.  So, looking at fencing next.  The primary
> server is a poweredge R905, which has DRAC for fencing.  The backup
> storage node is a Supermicro X9-SCL-F (with IPMI).  So I would be using
> the DRAC agent for the former and the ipmilan for the latter?  I was
> reading about location constraints, where you tell each instance of the
> fencing agent not to run on the node that would be getting fenced.  So,
> my first thought was to configure the drac agent and tell it not to
> fence node 1, and configure the ipmilan agent and tell it not to fence
> node 2.  The thing is, there is no agent available for the quorum node. 
> Would it make more sense instead to tell the drac agent to only run on
> node 2, and the ipmilan agent to only run on node 1?  Thanks!

This is a common mistake.

Fencing and quorum solve different problems and are not interchangeable.

In short:

Fencing is a tool when things go wrong.

Quorum is a tool when things are working.

The only impact that having quorum has with regard to fencing is that it
avoids a scenario where both nodes try to fence each other and the faster
one wins (which is itself OK). Even then, you can add 'delay=15' to the
fence device of the node you want to win and it will win in such a case.
In the old days, quorum would also prevent a fence loop if you started
the cluster on boot and comms were down. Now though, you set
'wait_for_all' and you won't get a fence loop, so that solves that.
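
To make the 'delay' behaviour concrete, here is a toy Python model of a
two-node fence race. It is only a sketch, not how the cluster stack is
implemented: the function name, the node names and the 2-second action
latency are made up for illustration, and it assumes the delay is set on
the fence device that targets a given node, so that node's peer has to
wait before shooting it.

```python
def fence_race_survivor(delay_on_device_for, action_latency=2.0):
    """Toy model of two nodes trying to fence each other at the same moment.

    delay_on_device_for maps each node name to the 'delay' (in seconds)
    configured on the fence device that powers off *that* node.
    action_latency is an assumed time for the fence action itself.
    """
    if len(delay_on_device_for) != 2:
        raise ValueError("this sketch only models a two-node race")

    # Time at which each node would be powered off by its peer.
    killed_at = {
        node: delay + action_latency
        for node, delay in delay_on_device_for.items()
    }

    # The node that would be shot last is still alive when its own fence
    # call against the peer completes, so it is the survivor.
    # With equal delays max() simply returns the first node listed; in a
    # real cluster that case is the "faster one wins" race.
    return max(killed_at, key=killed_at.get)


# node1 is protected by a 15 second delay, node2 has none.
print(fence_race_survivor({"node1": 15, "node2": 0}))  # prints: node1
```

In this model node2 is powered off after about 2 seconds, long before
node1's 15 second delay expires, so node1 wins every time.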

Said another way: quorum is optional, fencing is not (people often get
that backwards).

As for DRAC vs IPMI, no, they are not two different things. In fact, I am
pretty certain that fence_drac is a symlink to fence_ipmilan. All DRAC is
(same with iRMC, iLO, RSA, etc) is "IPMI + features". Fundamentally, the
fence action (rebooting the node) works via the basic IPMI standard using
the DRAC's BMC.

To do proper redundant fencing, which is a great idea, you want
something like switched PDUs. This is how we do it (with two-node
clusters): IPMI first, and if that fails, a pair of PDUs (one for each
PSU, each PDU going to independent UPSes) as backup.
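
The "IPMI first, PDUs as backup" ordering can be sketched in a few lines
of Python. This is an illustrative model only: fence_node, ipmi, pdu_a
and pdu_b are hypothetical names, and the stand-in functions just return
fixed results instead of talking to real hardware. It assumes that every
device in a level must succeed (both PDUs, since each feeds one PSU)
before the node counts as fenced.

```python
def fence_node(node, levels):
    """Try each fencing level in order.

    levels is a list of lists of callables; each callable takes the node
    name and returns True on success. Returns the 1-based number of the
    level that fenced the node, or None if every level failed.
    """
    for number, level in enumerate(levels, start=1):
        # all() stops at the first failing device, so the rest of a
        # failed level is skipped and the next level is tried.
        if all(method(node) for method in level):
            return number
    return None


calls = []  # records which stand-in devices were actually used


def ipmi(node):
    calls.append("ipmi")
    return False  # pretend the BMC lost power along with the node


def pdu_a(node):
    calls.append("pdu_a")
    return True


def pdu_b(node):
    calls.append("pdu_b")
    return True


# Level 1: IPMI alone. Level 2: both PDUs together.
level_used = fence_node("node1", [[ipmi], [pdu_a, pdu_b]])
print(level_used, calls)  # prints: 2 ['ipmi', 'pdu_a', 'pdu_b']
```

The point of the second level is the case shown here: if the node loses
all power, its BMC goes with it, so IPMI alone cannot confirm the fence.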

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?