[Pacemaker] Reboot node with stonith after killing a corosync-process?

Fri Apr 15 07:35:06 EDT 2011

OK, I understand. Many thanks for your help.

2011/4/15 Andrew Beekhof <andrew at beekhof.net>:
> On Fri, Apr 15, 2011 at 12:09 PM, Dominik Klein <dk at in-telegence.net> wrote:
>> Hi
>>
>> On 04/15/2011 09:05 AM, Tom Tux wrote:
>>> I can reproduce this behavior:
>>>
>>> - On node02, which had no resources online, I killed all corosync
>>> processes with "killall -9 corosync".
>>> - Node02 was rebooted through stonith
>>> - On node01, I can see the following lines in the message-log (line 6
>>> schedules the STONITH):
>>>
>>> It seems to me that node01 recognized that the cluster processes on
>>> node02 were not shut down properly. So the behavior in this case is
>>> to stonith the node. Can this behavior be disabled? With which setting?
>>
>> The cluster cannot distinguish between a node that has lost power, a
>> node with a broken network, and a node where someone killed corosync.
>>
>> To the surviving node, the other one is just dead and stonith makes sure
>> it really is.
>>
>> That's expected, and I guess it will not change.
>
> 100% correct
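
To illustrate the point made above, here is a minimal toy sketch in
Python. It is not Pacemaker or corosync code; the names Cause, observe
and decide are hypothetical and exist only for this illustration. It
assumes the surviving node sees nothing except that its peer stopped
responding, so every failure cause leads to the same fencing decision.

```python
from enum import Enum


class Cause(Enum):
    """Hypothetical failure causes on the remote node (taken from the thread)."""
    POWER_LOSS = "lost power"
    BROKEN_NETWORK = "broken network"
    COROSYNC_KILLED = "killall -9 corosync"


def observe(cause: Cause) -> str:
    """Return what the surviving node can see.

    Whatever the cause, the peer simply stops answering, so the
    observation carries no information about the cause itself.
    """
    return "peer unresponsive"


def decide(observation: str, clean_shutdown: bool = False) -> str:
    """Toy rule: fence any peer that vanished without a clean shutdown."""
    if clean_shutdown:
        return "no action"
    if observation == "peer unresponsive":
        return "fence"
    return "no action"


if __name__ == "__main__":
    # Every cause maps to the same observation, hence the same decision.
    for cause in Cause:
        print(f"{cause.value}: {decide(observe(cause))}")
```

Since observe() discards the cause, decide() has no way to treat a
killed corosync differently from a power failure, which is the reason
given above for fencing in all three cases.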