[Pacemaker] Shooting and diagnosis of stonith plugins

Fri Oct 17 04:36:38 EDT 2008

Hi Dejan

 >> B is mentioned in http://www.linux-ha.org/STONITH.
 >>   It says this:
 >>
 >>     3. When given a RESET or OFF command it must not return
 >>        control to its caller until the node is no longer running.
 >
 > stonithd retries forever.

What I cited is an article about stonith plugins, not stonithd.
Let me quote it at greater length.

=========================
There are a few properties a STONITH plugin must have for it
to be usable in Heartbeat:
   1. ...
   2. ...
   3. When given a RESET or OFF command it must not return
      control to its caller until the node is no longer running.
   4. ...
=========================
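
To show how I read property 3, here is a minimal Python sketch. It is
only an illustration: FakePowerDevice, DeviceUnreachable and off_and_wait
are hypothetical names of mine, not part of any real plugin or of the
RSA/iLO interfaces. The OFF operation returns success only after the
device itself reports the node as off, and returns failure if the device
cannot be reached or the wait times out.

```python
import time


class DeviceUnreachable(Exception):
    """Raised when the (hypothetical) stonith device cannot be contacted."""


class FakePowerDevice:
    """Stand-in for a lights-out device; not a real RSA or iLO API."""

    def __init__(self, reachable=True, polls_until_off=2):
        self.reachable = reachable
        self._polls_left = polls_until_off
        self._off_requested = False

    def power_off(self):
        if not self.reachable:
            raise DeviceUnreachable("cannot connect to device")
        self._off_requested = True

    def power_state(self):
        if not self.reachable:
            raise DeviceUnreachable("cannot connect to device")
        if not self._off_requested:
            return "on"
        # Pretend the node needs a few polls before it is really off.
        if self._polls_left > 0:
            self._polls_left -= 1
            return "on"
        return "off"


def off_and_wait(device, timeout=10.0, poll_interval=0.01):
    """Return True only after the device reports the node as off.

    Returns False if the device is unreachable or the timeout expires,
    so the caller never sees success for a node that may still be running.
    """
    deadline = time.monotonic() + timeout
    try:
        device.power_off()
        while time.monotonic() < deadline:
            if device.power_state() == "off":
                return True
            time.sleep(poll_interval)
    except DeviceUnreachable:
        return False
    return False
```

In this sketch an unreachable device leads to an immediate failure, which
is what 'ibmrsa-telnet' does (option C in my earlier mail).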

Or is there other documentation on stonith plugins for Pacemaker?

 > Sorry, but I really can't see where's the issue here.

I am looking for criteria other than those 'ssh' uses.
I want to know how to write correct stonith plugins for Pacemaker.
(This no longer matches the title, sorry.
  Should I start another thread for this issue?)

Are the articles about stonith plugins on http://www.linux-ha.org/
also valid for Pacemaker?
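
For reference, the delegation convention from my first mail (quoted
further below) can be sketched roughly like this in Python. This is a
sketch under assumptions, not how stonithd is implemented: the function
names are hypothetical, I assume the plugins are tried in order until one
reports success, and I read the 'pingAllAddr' rule as "success when none
of the target's addresses respond".

```python
def make_shoot_only(shoot):
    """Wrap a shoot action so that it never reports success by itself."""
    def plugin(target):
        try:
            shoot(target)
        except Exception:
            pass  # a failed shot is ignored; the diagnosis plugin decides
        return False
    return plugin


def make_ping_all_addr(addresses, is_alive):
    """Diagnose only: succeed when no address of the target responds."""
    def plugin(target):
        return not any(is_alive(addr) for addr in addresses[target])
    return plugin


def fence(target, plugins):
    """Try plugins in order; the first one that returns True ends the run."""
    for plugin in plugins:
        if plugin(target):
            return True
    return False
```

With a shoot-only plugin followed by a diagnosis plugin, fence() succeeds
only when the diagnosis confirms the target is silent. The diagnosis
could be replaced, for example by one that asks iSCSI target devices,
without touching the shooting plugin.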

Dejan Muhamedagic wrote:
> Hi,
> 
> On Thu, Oct 16, 2008 at 06:00:16PM +0900, Takenaka Kazuhiro wrote:
>> Hi Dejan.
>>
>>>> If 'ibmrsa-telnet' goes the right way, it means any stonith plugin
>>>> that can't shoot a host machine with a power fault must not
>>>> be used alone. It must be used with some other plugin which checks
>>>> whether its target machine is running or not.
>>> This is an inherent problem of the lights-out devices such as IBM
>>> RSA or HP iLO, i.e. that they share power source with the node
>>> they manage. Power failure renders this kind of stonith device
>>> useless. Unfortunately, there's nothing one can do about it.
>> But something must be done.
> 
> If only one had the means :) The only way to deal with this would be
> to fall back to meatware.
> 
>> In this case, what a plugin can do is one of the following:
>>
>>   A) Check the target by another way.
>>   B) Retry forever.
>>   C) Return failure to caller.
>>
>> A is what 'ssh' does.
>>   And you said 'ssh' isn't meant for production.
>>   Does it mean no other real stonith plugin may do A?
>>
>> B is mentioned in http://www.linux-ha.org/STONITH.
>>   It says this:
>>
>>     3. When given a RESET or OFF command it must not return
>>        control to its caller until the node is no longer running.
> 
> stonithd retries forever.
> 
>>   Any plugin that follows B keeps running until stonithd kills it
>>   on an error.
>>
>> C is what 'ibmrsa-telnet' does.
>>   Any plugin that follows C returns failure immediately on an error.
>>   But I don't know of any document which encourages C.
> 
> That's all fine, but remember that the said plugin can't reach
> the stonith device. Hence, all it can do is report an error.
> 
> Sorry, but I really can't see where's the issue here.
> 
> Thanks,
> 
> Dejan
> 
>> Which is the right choice for real stonith plugins?
>>
>> Dejan Muhamedagic wrote:
>>> Hi Takenaka-san,
>>>
>>> On Wed, Oct 15, 2008 at 02:09:17PM +0900, Takenaka Kazuhiro wrote:
>>>> Hi Dejan.
>>>>
>>>>> Hi Takenaka-san,
>>>>>
>>>>> On Fri, Oct 10, 2008 at 03:30:27PM +0900, Takenaka Kazuhiro wrote:
>>>>>> Hi all.
>>>>>>
>>>>>> So far as I know, every stonith plugin is expected to diagnose if
>>>>>> its target is fenced out from the other nodes before it returns
>>>>>> successful status on 'reset' or 'off'.
>>>>> It depends on the stonith device. Sometimes it is enough just to
>>>>> send the reset command and let the device deal with it. Sometimes
>>>>> it is necessary to check the current power state. However, it
>>>>> looks like this is not what you want to talk about.
>>>> You said "The point of a stonith operation is to ensure that a host
>>>> is down or rebooted." in the following thread.
>>>>
>>>> http://lists.community.tummy.com/pipermail/linux-ha/2008-August/034323.html
>>>>
>>>> So I thought any stonith plugin should make sure that its target
>>>> is down or rebooted before it returns. However, this isn't the main
>>>> issue, just as you understood.
>>>>
>>>>>> However, I think this diagnosis is a somewhat excessive burden for an
>>>>>> individual plugin.
>>>>> Actually, the stonith plugins are not required to know the state
>>>> ... snip ...
>>>>>>   <primitive type="external/ssh" class="stonith" task="shoot" ...>
>>>>>>
>>>>>> I hope some kind of agreement will be made about this problem.
>>>> Please let me put aside your comments above for now.
>>>> I have a question about your comments below and I'd like
>>>> you to answer it first.
>>>>
>>>>> This new concept does make sense with the ssh plugin. However,
>>>>> all other plugins function in a significantly different way and I
>>>>> don't see how this can apply to them.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dejan
>>>> Yes. 'ssh' is so different from 'ibmrsa-telnet'.
>>>>
>>>> 'ssh' shoots a target via a NIC.
>>>> 'ibmrsa-telnet' shoots a target via an RSA.
>>> Actually, I'd rather leave ssh out of this discussion. It was
>>> never meant for production, just for testing.
>>>
>>>> So, these devices must lose their power when power faults
>>>> occur on their host machines.
>>>>
>>>> In this case, neither 'ssh' nor 'ibmrsa-telnet' can deal with
>>>> their target devices. They get an explicit connection failure
>>>> in this situation.
>>>>
>>>> But what actually follows is so different.
>>>>
>>>> In the case where 'ssh' is used as a stonith plugin, it returns a
>>>> successful status and the suspended resources are resumed on the
>>>> other nodes.
>>>>
>>>> On the other hand, in the case where 'ibmrsa-telnet' is used,
>>>> it returns an error status and the suspended resources are not
>>>> resumed anywhere. (I think 'ibmrsa-telnet' isn't the only plugin
>>>> that works in this way. 'ibmrsa' and 'ipmi' should also work in
>>>> the same way.)
>>>>
>>>> 'ssh' and 'ibmrsa-telnet' measure the success and failure of
>>>> shooting targets in different ways, and that is what makes
>>>> these results differ.
>>>>
>>>> 'ssh' never checks whether it could deal with its target device.
>>>> Even if the attempt fails explicitly, 'ssh' ignores it.
>>>> Instead, 'ssh' always returns its status according to a subsequent
>>>> ping check.
>>>>
>>>> On the other hand, 'ibmrsa-telnet' returns its status according
>>>> to whether it could deal with the device. Whenever 'ibmrsa-telnet'
>>>> gets any explicit failure in dealing with it, it returns an error
>>>> status. 'ibmrsa-telnet' never checks the target's status in any way.
>>>>
>>>> Which is the correct implementation for a stonith plugin?
>>> Both. Note that ssh relies on the network, hence using ping to
>>> verify the host status is fine. However, for a "real" stonith
>>> device such as RSA doing that would be wrong.
>>>
>>>> In other words, when an explicit connection error occurs
>>>> during a stonith action, what should stonith plugins do?
>>>>
>>>> I believed 'ssh' goes the right way, because I thought
>>>> a stonith plugin which fails a failover on a power fault
>>>> is out of the question.
>>> If the stonith device cannot be reached then we don't know if the
>>> host is running or not. Hence we have to assume the worst case.
>>>
>>>> If 'ibmrsa-telnet' goes the right way, it means any stonith plugin
>>>> that can't shoot a host machine with a power fault must not
>>>> be used alone. It must be used with some other plugin which checks
>>>> whether its target machine is running or not.
>>> This is an inherent problem of the lights-out devices such as IBM
>>> RSA or HP iLO, i.e. that they share power source with the node
>>> they manage. Power failure renders this kind of stonith device
>>> useless. Unfortunately, there's nothing one can do about it.
>>>
>>> Thanks,
>>>
>>> Dejan
>>>
>>>> Dejan Muhamedagic wrote:
>>>>> Hi Takenaka-san,
>>>>>
>>>>> On Fri, Oct 10, 2008 at 03:30:27PM +0900, Takenaka Kazuhiro wrote:
>>>>>> Hi all.
>>>>>>
>>>>>> So far as I know, every stonith plugin is expected to diagnose if
>>>>>> its target is fenced out from the other nodes before it returns
>>>>>> successful status on 'reset' or 'off'.
>>>>> It depends on the stonith device. Sometimes it is enough just to
>>>>> send the reset command and let the device deal with it. Sometimes
>>>>> it is necessary to check the current power state. However, it
>>>>> looks like this is not what you want to talk about.
>>>>>
>>>>>> However, I think this diagnosis is a somewhat excessive burden for an
>>>>>> individual plugin.
>>>>> Actually, the stonith plugins are not required to know the state
>>>>> of the host. They just make sure that the host is in a certain
>>>>> state or that it is reset. This normally doesn't involve the host
>>>>> itself, just the device which can manage it. Put in other words:
>>>>> If you pull the power plug or press the reset button there's no
>>>>> need to try ping or ssh or whatever else to verify that the host
>>>>> really went down.
>>>>>
>>>>>> Authors of plugins know how to deal with the stonith devices
>>>>>> for which they write plugins, but they can't always anticipate the
>>>>>> structure of the clusters on which their plugins will work.
>>>>>>
>>>>>> When a cluster administrator tries to use some plugin but the diagnosis
>>>>>> of the plugin doesn't match the cluster, the administrator can't help
>>>>>> but directly alter the plugin.
>>>>>>
>>>>>> This reduces the plugins' adaptability and is not desirable.
>>>>>> One idea to avoid this problem is making schemes or conventions
>>>>>> which enable plugins to delegate the diagnosis to other plugins.
>>>>>>
>>>>>> The two attached plugins are a sample of this idea. They work
>>>>>> cooperatively through the attached cib.xml.
>>>>> It is an interesting idea. It seems like it would require that
>>>>> all existing stonith plugins return false so that the next, the
>>>>> "test status" plugin can report the state of the host.
>>>>>
>>>>>> 'sshAltered' only shoots its targets and 'pingAllAddr' only diagnoses
>>>>>> activity of its targets.
>>>>>>
>>>>>> The following is a slightly more detailed explanation:
>>>>>>
>>>>>>   When some accident makes it necessary for one node to shoot
>>>>>>   a corrupted node, the shooter node first uses 'sshAltered' to
>>>>>>   shoot the target node.
>>>>>>
>>>>>>   'sshAltered' shoots its targets but never exits with a successful
>>>>>>   status if the value of the attribute 'shoot_only' is "yes", as it
>>>>>>   is in the attached cib.xml. So the next plugin, if one is defined,
>>>>>>   will always be used.
>>>>>>
>>>>>>   'pingAllAddr' confirms activity of the IP addresses of its targets
>>>>>>   specified in cib.xml. If any of the IP addresses don't respond,
>>>>>>   'pingAllAddr' exits with a successful status, otherwise it
>>>>>>   exits with an error status.
>>>>>>
>>>>>> Once 'external/ssh' is rewritten into 'sshAltered', there
>>>>>> is no need to rewrite it again to use other conditions to
>>>>>> confirm the targets' death.
>>>>>>
>>>>>> For example, if a cluster uses iSCSI shared storage and
>>>>>> a failover action on this cluster must wait for the iSCSI target
>>>>>> devices to sweep connections to the corrupted node, that can be done
>>>>>> by another type of plugin instead of 'pingAllAddr'. Its task is to
>>>>>> ask the iSCSI target devices whether connection sweeping has completed.
>>>>>>
>>>>>> The reverse is also true. Any plugin which follows the explained
>>>>>> convention can work together with 'pingAllAddr'.
>>>>>>
>>>>>> It could also be made available through another tag attribute like this:
>>>>>>
>>>>>>   <primitive type="external/ssh" class="stonith" task="shoot" ...>
>>>>>>
>>>>>> I hope some kind of agreement will be made about this problem.
>>>>> This new concept does make sense with the ssh plugin. However,
>>>>> all other plugins function in a significantly different way and I
>>>>> don't see how this can apply to them.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dejan
>>>>>
>>>>>
>>>>>> Best regards.
>>>>>> -- 
>>>>>> Takenaka Kazuhiro <takenaka.kazuhiro at oss.ntt.co.jp>
>>>>
>>>>
>>>> _______________________________________________
>>>> Pacemaker mailing list
>>>> Pacemaker at clusterlabs.org
>>>> http://list.clusterlabs.org/mailman/listinfo/pacemaker
>> -- 
>> Takenaka Kazuhiro <takenaka.kazuhiro at oss.ntt.co.jp>
>> NTT OSS
>> TEL 03-5860-5135
-- 
Takenaka Kazuhiro <takenaka.kazuhiro at oss.ntt.co.jp>