[Pacemaker] Howto write a STONITH agent

Dejan Muhamedagic dejanmm at fastmail.fm
Fri Jan 14 11:18:06 EST 2011


On Fri, Jan 14, 2011 at 05:10:17PM +0100, Christoph Herrmann wrote:
> -----Original Message-----
> From: Dejan Muhamedagic <dejanmm at fastmail.fm>
> Sent: Fri 14.01.2011 12:31
> To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>; 
> Subject: Re: [Pacemaker] Howto write a STONITH agent
> 
> > Hi,
> > 
> > On Thu, Jan 13, 2011 at 09:09:38PM +0100, Christoph Herrmann wrote:
> > > Hi,
> > > 
> > > I have some brand new HP Blades with iLO boards (iLO 2 Standard Blade Edition 1.81 ...)
> > > But I'm not able to connect to them via the external/riloe agent.
> > > When I try:
> > > 
> > > stonith -t external/riloe -p "hostlist=node1 ilo_hostname=ilo1  
> > ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 
> > ilo_powerdown_method=power" -S
> > 
> > Try this:
> > 
> > stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1  ilo_user=ilouser 
> > ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 
> > ilo_powerdown_method=power -S
> 
> That's much better (looks like PEBKAC ;-), thanks! But it is not reliable. I've tested it about 10 times
> and it hung 5 times. That's not what I want.

Did you try to find out why it hung?
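
One way to narrow this down is to repeat the call under a time limit and count how often it fails to return. The following is only a rough sketch: `count_hangs` is a hypothetical helper name, and it assumes the `timeout` utility from GNU coreutils is installed (it exits with 124 when the limit is hit).

```shell
#!/bin/sh
# count_hangs RUNS SECONDS COMMAND [ARGS...]
# Runs COMMAND up to RUNS times, each limited to SECONDS, and prints how
# many runs had to be killed by timeout(1), which reports that as rc 124.
count_hangs() {
    runs=$1
    limit=$2
    shift 2
    hangs=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        rc=0
        timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
        if [ "$rc" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
        i=$((i + 1))
    done
    echo "$hangs"
}
```

For example, `count_hangs 10 60 stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1 ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 ilo_powerdown_method=power -S` would print how many of ten status checks ran past 60 seconds. Once a hang is reproducible, the logs from that run are the place to look.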

> In the end I will use my own ssh-ilo agent. It's very simple (KISS) and reliable. The external/riloe agent did not
> look too simple.

Right. Let's all roll our own ;->

> So my questions still remain. Is there a HOWTO for writing stonith agents?

No.

> Is it useful to write (to run) a stonith agent as a cloned resource?

Sometimes. There are quite a few resources on this. You can take a look
at clusterlabs.org.

> What should the status check do with a cloned stonith resource? Is it useful in any way? (Given that I have 4 different nodes with 4 different iLO boards.)

The status operation should check the status of the device, not of the nodes.
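
To make that concrete, here is a minimal sketch of what an ssh-based agent might look like. It assumes the usual external plugin convention: the script is called with the action as its first argument, the configuration arrives in environment variables, and exit code 0 means success while anything else means failure. The names `ssh_ilo_main`, `ilo_run`, `ILO_SSH` and `ilo_sshkey` are made up for illustration, and using `version` as a harmless probe command is an assumption about the iLO command line. Only `reset system1` comes from this thread.

```shell
#!/bin/sh
# Sketch only, not a complete plugin. A real one also has to answer the
# getinfo-* queries and usually implements on/off as well.
# Assumed environment parameters: hostlist, ilo_hostname, ilo_user,
# ilo_sshkey (the last one is a hypothetical name).

# Hypothetical override so the ssh binary can be swapped out for testing.
ILO_SSH=${ILO_SSH:-ssh}

# Run one command on the iLO board over ssh.
ilo_run() {
    $ILO_SSH -i "$ilo_sshkey" "$ilo_user@$ilo_hostname" "$@"
}

ssh_ilo_main() {
    case "$1" in
    gethosts)
        # One node name per line: the nodes this device can fence.
        for h in $hostlist; do
            echo "$h"
        done
        ;;
    status)
        # Checks the device (the iLO board), not the node behind it.
        # "version" is an assumed harmless probe command.
        ilo_run version >/dev/null 2>&1
        ;;
    reset)
        # $2 is the node to fence; refuse nodes this instance does not manage.
        case " $hostlist " in
        *" $2 "*) ilo_run reset system1 ;;
        *) return 1 ;;
        esac
        ;;
    getconfignames)
        echo "hostlist ilo_hostname ilo_user ilo_sshkey"
        ;;
    *)
        # Unknown or unimplemented action.
        return 1
        ;;
    esac
}
```

A script built this way would end with `ssh_ilo_main "$@"`, so that the return value of the chosen action becomes the exit code. With one iLO board per node, each configured instance would name a single node in `hostlist` and its own board in `ilo_hostname`, and the status action then only says whether that one board is reachable.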

Thanks,

Dejan

> 
>  
> Cheers,
> 
> 
>   Christoph &:-)
> 
> 
> > Thanks,
> > 
> > Dejan
> > 
> > > 
> > > I get the following answer:
> > > 
> > > external/riloe[14317]: ERROR: unknown power method %s, setting to "power"
> > > external/riloe[14317]: ERROR: [Errno -2] Name or service not known, while 
> > talking to ilo_hostname=ilo1
> > > 
> > > ** (process:14315): CRITICAL **: external_run_cmd: Calling 
> > '/usr/lib64/stonith/plugins/external/riloe status' returned 1
> > > 
> > > ** (process:14315): CRITICAL **: external_status: 'riloe status' failed with 
> > rc 1
> > > stonith: external/riloe device not accessible.
> > > 
> > > 
> > > But I can access ilo1 with http, https and ssh. The easiest way to reset a 
> > node is to run:
> > > 
> > > ssh -i ilo-sshkey ilouser@ilo1 reset system1
> > > 
> > > I thought it would be easier to write a new ssh-ilo agent (I'm almost done :-) than
> > > to debug the existing one. But I'm looking for a short howto. I've read some
> > > STONITH agents, but they are not completely self-explanatory and I have some
> > > questions. Is there a short manual on how to write a stonith agent that Google and
> > > I were not able to find?
> > > Or should I post all questions to the list?
> > > here we go:
> > > 
> > > 1. (and most important): What does the status check do if you have an agent
> > > that runs as a cloned resource (my ssh-ilo agent should run as a cloned
> > > resource)? Does it check all nodes? Is it possible to check the status of a
> > > single node?
> > > 2. What are the expected return codes?
> > > 
> > > more to follow ;-)
> > > 
> > > 
> > > 
> > > 
> > > regards
> > > 
> > > 
> > >    Christoph &:-)