[Pacemaker] [Linux-HA] new doc about stonith/fencing

Ryan Steele ryans at aweber.com
Fri May 29 14:31:37 EDT 2009


Jan Kalcic wrote:
> Really interesting. I would have appreciated some more examples (they are
> always welcome) but still very interesting.
> 
> Thanks,
> Jan
> 
> Dejan Muhamedagic wrote:
>> Hi,
>>
>> Trying to make it a bit less mysterious, I wrote something about
>> fencing and stonith quite a while ago and then forgot to share
>> the link. Sorry about that.
>>
>> Here it is:
>>
>> http://www.clusterlabs.org/mediawiki/images/f/f2/Crm_fencing.pdf
>>
>> As usual, constructive criticism/suggestions/etc are welcome.
>> I won't be able to read your impressions for the next two weeks,
>> but will surely look forward to seeing them afterwards.
>>
>> Cheers,
>>
>> Dejan


I found this informative as well, Dejan - thanks for taking the time to write it.  However, I agree with Jan that some 
examples using recommended, non-testing STONITH devices would be great, since SSH, null, and other network-based test 
plugins are apparently frowned upon in production environments (based on comments by Andrew and the article he 
referenced: http://theclusterguy.clusterlabs.org/post/113230399/highly-available-data-corruption).  For example, I have 
Raritan 30A PDUs in my cabs, but I didn't see anything in the output of 'stonith -L' except an APC switched rack PDU.


Now I know that a document like this can't be expected to cover every single type of STONITH device in existence, but 
some instructions on writing custom STONITH plugins might be useful, so that folks can write them for their particular 
STONITH device (PDU or IPMI card or what have you) and contribute them back to the community, which will in turn help 
others.  I've looked at both the clusterlabs.org and linux-ha.org sites, but didn't see any documentation on rolling your 
own at either site, and the Novell docs on this topic are GUI-centric, which unfortunately isn't as helpful to those of 
us sticking with the CLI.


The other thing that might be helpful is to know what the goal is in terms of recovering from a STONITH action.  If one 
has a node that STONITH powers off at the PDU outlet because it has lost networking, and networking is subsequently 
restored, how are we to get the node back in action?


Thanks and Regards,
Ryan



