[ClusterLabs] Problem using IPMI for fencing
Digimer
lists at alteeve.ca
Tue Mar 3 18:51:47 UTC 2015
On 03/03/15 01:42 PM, Ken Gaillot wrote:
> On 03/03/2015 01:14 PM, Jose Manuel Martínez wrote:
>> Hello everybody.
>>
>> I'm trying to build an active/passive cluster for the Lustre
>> filesystem. Pacemaker is working fine in most situations except
>> one: If a node goes out of power in a 2-node cluster, and I am
>> using fence_ipmilan as fencing resource (for HP iLO2), the alive
>> node is not able to takeover the resources of the failed node. It
>> tries to check the fencing device trying to reboot it, but as the
>> node is dead (no power), the IPMI interface does not answer.
>
> Correct, IPMI that shares power with its host should not be used as
> the sole fencing device for this very reason. There is no way for
> the cluster to be certain that the host is down and not just the
> IPMI.
>
> IPMI is fine as the first-attempt fencing device, but there should
> be a fallback fencing device that is independent of the host (such
> as a remotely controllable power switch).
I agree 100%, and do this myself.
IPMI fencing, when it works, is best because when it returns "off", we
can be *very* sure the node is actually off. As you said though, it is
electrically and mechanically coupled to the host, so it's vulnerable
to certain failure cases (e.g. total loss of power or mechanical
destruction).
For this, I always use a pair of switched PDUs (APC and Raritan both
work well, TrippLite works but is slow). I use a pair because I also
have power redundancy (separate UPSes, PDUs, etc), so both outlets
feeding a node must be switched off to fence it. To handle this, you
need a fairly complex stonith configuration.
In 'pcs', this is called 'STONITH Levels' and you can see how to
build this here:
http://clusterlabs.org/wiki/STONITH_Levels
In 'crm', this is called "Fencing Topology" and you can see how to
configure it here:
http://clusterlabs.org/wiki/Fencing_topology
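
To make the idea behind levels concrete, here is a rough Python sketch of
how a two-level setup behaves: levels are tried in order, and a level only
counts as successful if every device in it succeeds. This is a toy model,
not Pacemaker code. The names fence_node, make_ipmi and make_pdu, the node
names and the PDU names are all made up for illustration, and the devices
are simple stand-ins rather than real fence agents.

```python
from typing import Callable, Dict, List

# A fence device is modeled as a callable that returns True only when it
# has confirmed the target node is off. Hypothetical stand-ins, not agents.
FenceDevice = Callable[[str], bool]


def fence_node(node: str, levels: List[List[FenceDevice]]) -> int:
    """Try each level in order; return the 1-based level that worked, or 0."""
    for number, devices in enumerate(levels, start=1):
        # Every device in a level must succeed, e.g. both PDUs feeding a
        # node with redundant power supplies must cut their outlets.
        if all(device(node) for device in devices):
            return number
    return 0


def make_ipmi(bmc_has_power: Dict[str, bool]) -> FenceDevice:
    # A BMC that lost power along with its host cannot answer, so the
    # fence attempt fails and the next level has to be tried.
    return lambda node: bmc_has_power[node]


def make_pdu(name: str, log: List[str]) -> FenceDevice:
    def cut_outlet(node: str) -> bool:
        # The PDU is independent of the host, so it can still act.
        log.append(f"{name}: outlet for {node} off")
        return True
    return cut_outlet


log: List[str] = []
bmc_has_power = {"node1": True, "node2": False}  # node2 lost all power
levels = [
    [make_ipmi(bmc_has_power)],                      # level 1: IPMI only
    [make_pdu("pdu1", log), make_pdu("pdu2", log)],  # level 2: both PDUs
]

result_node1 = fence_node("node1", levels)  # IPMI answers, level 1 is enough
result_node2 = fence_node("node2", levels)  # IPMI is silent, falls to PDUs
print(result_node1, result_node2, log)
```

In this model the PDUs are only touched when IPMI cannot confirm the node
is off, which is the behaviour you want from the real configuration.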
In my opinion, this is the only proper configuration for fencing if
you want to remove all single points of failure.
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?