[ClusterLabs] Problem using IPMI for fencing

Digimer lists at alteeve.ca
Tue Mar 3 18:51:47 UTC 2015


On 03/03/15 01:42 PM, Ken Gaillot wrote:
> On 03/03/2015 01:14 PM, Jose Manuel Martínez wrote:
>> Hello everybody.
>> 
>> I'm trying to build an active/passive cluster for the Lustre
>> filesystem. Pacemaker is working fine in most situations except
>> one: if a node loses power in a 2-node cluster, and I am using
>> fence_ipmilan as the fencing resource (for HP iLO2), the surviving
>> node is not able to take over the resources of the failed node. It
>> tries to use the fencing device to reboot the failed node, but as
>> the node is dead (no power), the IPMI interface does not answer.
> 
> Correct, IPMI that shares power with its host should not be used as
> the sole fencing device for this very reason. There is no way for
> the cluster to be certain that the host is down and not just the
> IPMI.
> 
> IPMI is fine as the first-attempt fencing device, but there should
> be a fallback fencing device that is independent of the host (such
> as a remotely controllable power switch).

I agree 100%, and do this myself.

IPMI fencing, when it works, is best because when it returns "off", we
can be *very* sure the node is actually off. As you said though, it is
electrically and mechanically coupled to the host, so it's vulnerable
to certain failure cases (e.g. total loss of power or mechanical
destruction).

For this, I always use a pair of switched PDUs (APC and Raritan both
work well, TrippLite works but is slow). I use a pair because I also
have power redundancy (separate UPSes, PDUs, etc). To handle this,
you need a fairly complex stonith configuration.

In 'pcs', this is called 'STONITH Levels' and you can see how to
build this here:

http://clusterlabs.org/wiki/STONITH_Levels

In 'crm', this is called "Fencing Topology" and you can see how to
configure it here:

http://clusterlabs.org/wiki/Fencing_topology

In my opinion, this is the only proper configuration for fencing if
you want to remove all single points of failure.
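
To make the fallback logic concrete, here is a rough Python sketch of
how levels behave: they are tried in order, and every device in a
level must succeed before the node counts as fenced. This is only a
simplified model, not Pacemaker's code; the device names (ipmi_n1,
pdu1_n1, pdu2_n1) and the helper functions are hypothetical stand-ins
for real fence agents.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a fence agent call: returns True only if the
# device confirmed that the target node is off.
FenceDevice = Callable[[str], bool]


def fence_node(node: str,
               levels: List[List[str]],
               devices: Dict[str, FenceDevice]) -> int:
    """Try each level in order.

    Returns the 1-based number of the level that succeeded, or 0 if
    every level failed (the node must then be treated as NOT fenced).
    """
    for number, level in enumerate(levels, start=1):
        # Every device in a level must succeed, e.g. both PDUs have to
        # cut power before a dual-PSU node can be considered off.
        if all(devices[name](node) for name in level):
            return number
    return 0


def ipmi_unreachable(node: str) -> bool:
    # Assumed failure case: the BMC lost power together with its host,
    # so it cannot confirm anything.
    return False


def pdu_off(node: str) -> bool:
    # Assumed: the switched PDU is independent of the host and answers.
    return True


devices = {"ipmi_n1": ipmi_unreachable, "pdu1_n1": pdu_off, "pdu2_n1": pdu_off}
levels = [["ipmi_n1"], ["pdu1_n1", "pdu2_n1"]]

print(fence_node("node1", levels, devices))  # 2
```

With IPMI alone, the same call returns 0 when the node loses power,
which is exactly the stuck state described in the original question.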

- -- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?