[ClusterLabs] What is timeout for initial fencing after startup?

Andrew Beekhof andrew at beekhof.net
Mon Mar 30 00:58:03 UTC 2015


> On 3 Mar 2015, at 9:10 am, Andreas Kurz <andreas.kurz at gmail.com> wrote:
> 
> On 2015-02-28 07:07, Andrei Borzenkov wrote:
>> On Fri, 27 Feb 2015 22:45:56 +0100,
>> Andreas Kurz <andreas.kurz at gmail.com> wrote:
>> 
>>> On 2015-02-27 10:40, Andrei Borzenkov wrote:
>>>> I'm testing what happens in a 2-node cluster when one node is not
>>>> present at startup. It appears that pacemaker gives up trying to
>>>> stonith the other node after 10 minutes. Where do these 10 minutes
>>>> come from?
>>> 
>>> I'd say these 10 minutes come from the cluster property 'stonith-timeout',
>>> with its default of 60s, combined with the feature that stops trying to
>>> fence a node after the 10th failed fencing attempt.
>>> 
>> 
>> Do you happen to know if the "10th failed attempt" limit is documented
>> somewhere? I'm trying to find a cluster or stonith property that looks
>> related but cannot. Only stonith-timeout is present in the documentation.
> 
> I'm only aware of this git commit in pacemaker, which mentions the 10
> stonith attempts:
> 
> commit e29d2f9f627e492032ce64563ee3def0ff1b000d
> Author: Andrew Beekhof <andrew at beekhof.net>
> Date:   Fri Jul 6 16:18:19 2012 +1000
> 
>    High: crmd: Block after 10 failed fencing attempts for a node

I've made a note to document this.
You really want to be looking into why your fencing is so unreliable though.
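
For reference, the arithmetic behind the estimate quoted above can be sketched
as follows. This is only a rough illustration: it assumes every failed attempt
runs for the full stonith-timeout, and the names used here are made up for the
example, not pacemaker identifiers.

```python
# Rough estimate only. Assumption: each failed fencing attempt takes the
# full stonith-timeout before the next attempt starts.
STONITH_TIMEOUT_S = 60     # default value cited in this thread
MAX_FAILED_ATTEMPTS = 10   # limit named in the commit quoted above


def give_up_after_seconds(timeout_s=STONITH_TIMEOUT_S,
                          attempts=MAX_FAILED_ATTEMPTS):
    """Estimated time until fencing of a node is no longer retried."""
    return timeout_s * attempts


total = give_up_after_seconds()
print(f"{total} s = {total / 60:.0f} min")  # 600 s = 10 min
```

If stonith-timeout is changed from its default, the same multiplication would
give a different figure, e.g. give_up_after_seconds(timeout_s=30) for 30s.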


