[ClusterLabs] Antw: Re: crmd[4893]: notice: peer_update_callback: Node return implies stonith of node1 (action 34) completed

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Mar 30 07:36:17 UTC 2015


>>> Andrew Beekhof <andrew at beekhof.net> wrote on 30.03.2015 at 02:30 in message
<C19EFA41-B4F5-4EAF-8CEC-590ACCF8FAE2 at beekhof.net>:

>> On 11 Mar 2015, at 5:48 pm, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
>> 
>> I was looking into why a node took over from another node and stumbled on
>> this message. The sequence was:
>> 
>> - loss of LAN connection for ~ 30 seconds
>> - split brain
>> - node initiated IPMI stonith
>> - *BEFORE* the IPMI stonith returned, the LAN connection was back and both
>> nodes saw each other
>> - and crmd assumed stonith worked
>> 
>> 
>> Is it intentional?
> 
> Pretty much.
> 
>> The node was actually rebooted by IPMI after this.
> 
> Not ideal but also nothing much we can do about it.

Actually, we have experienced similar situations that were handled suboptimally, IMHO:
The cluster decided to fence a node while it had no quorum. The node to be fenced was down anyway, but it rejoined the cluster after a reboot. Only THEN did the cluster have quorum, and it fenced the node that had just joined, causing a loss of quorum again...

I feel that a node freshly joining the cluster should cancel all fencing requests targeted at it.
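
As a rough illustration of that idea, here is a toy model in Python. It is not Pacemaker code: the class, the method names, and the `cancel_on_join` switch are all made up for this sketch, and "node1" is just the node name from the subject line. It only replays the scenario above (fencing requested without quorum, node rejoins, quorum returns) with and without the proposed policy.

```python
from dataclasses import dataclass, field


@dataclass
class FencingQueue:
    """Toy model of pending fencing requests (hypothetical, not Pacemaker)."""
    pending: set = field(default_factory=set)   # nodes scheduled for fencing
    fenced: list = field(default_factory=list)  # nodes that were actually fenced

    def request_fencing(self, node):
        # The cluster decides to fence a node, e.g. while it is unreachable.
        self.pending.add(node)

    def node_joined(self, node, cancel_on_join):
        # Proposed policy: a fresh join drops any request aimed at that node.
        if cancel_on_join:
            self.pending.discard(node)

    def quorum_gained(self):
        # Once quorum is available, every request still pending is executed.
        for node in sorted(self.pending):
            self.fenced.append(node)
        self.pending.clear()


def replay(cancel_on_join):
    """Replay the scenario: request without quorum, node rejoins, quorum returns."""
    queue = FencingQueue()
    queue.request_fencing("node1")
    queue.node_joined("node1", cancel_on_join)
    queue.quorum_gained()
    return queue.fenced


if __name__ == "__main__":
    print("without cancel-on-join:", replay(cancel_on_join=False))
    print("with cancel-on-join:   ", replay(cancel_on_join=True))
```

Without the policy, the freshly joined node is still fenced once quorum returns; with it, the stale request is dropped before it can be executed.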

Regards,
Ulrich
