[Pacemaker] Stonith: How to avoid deathmatch cluster partitioning

Andreas Kurz andreas at hastexo.com
Wed May 15 16:55:43 EDT 2013


On 2013-05-15 15:34, Klaus Darilion wrote:
> On 15.05.2013 14:51, Digimer wrote:
>> On 05/15/2013 08:37 AM, Klaus Darilion wrote:
>>> primitive st-pace1 stonith:external/xen0 \
>>>          params hostlist="pace1" dom0="xentest1" \
>>>          op start start-delay="15s" interval="0"
>>
>> Try:
>>
>> primitive st-pace1 stonith:external/xen0 \
>>          params hostlist="pace1" dom0="xentest1" delay="15" \
>>          op start start-delay="15s" interval="0"
>>
>> The idea here is that, when both nodes lose contact and initiate a
>> fence, 'pace1' will get a 15 second reprieve. That is, 'pace2' will
>> wait 15 seconds before trying to fence 'pace1'. If pace1 is still
>> alive, it will fence 'pace2' without delay, so pace2 will be dead
>> before its timer expires, preventing a dual-fence. However, if pace1
>> really is dead, pace2 will fence it and recover, just with a 15
>> second delay.
> 
> Sounds good, but pacemaker does not accept the parameter:
> 
>    ERROR: st-pace1: parameter delay does not exist

start-delay is really an option for the monitor operation. In effect it
means "don't trust that start was successful, wait a while longer before
running the initial monitor".

The problem is that this would only make sense for one single stonith
resource that can fence several nodes. In a split-brain, that would
delay the start on the node where the stonith resource was not running
before, and so give that node a "penalty".

In your example, with two stonith resources running all the time,
Digimer's suggestion is a good idea: use one of the Red Hat fencing
agents. Most of them have some sort of "stonith-delay" parameter that
you can set on just one instance.
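
As a rough, untested sketch of what that could look like: this assumes
the fence_xvm agent (from fence-virt) is installed on both nodes, that a
matching fence_virtd is reachable from the guests, and that the agent
exposes a parameter named "delay". The agent choice, the port values and
the monitor interval are my assumptions, not something taken from your
setup, so please check the real parameter list first with
"crm ra info stonith:fence_xvm".

```
# Hypothetical sketch, not a tested configuration.
# Only the resource that fences pace1 carries a delay, so in a
# split-brain pace2 has to wait 15 seconds before shooting pace1,
# while pace1 can shoot pace2 immediately.
primitive st-pace1 stonith:fence_xvm \
        params port="pace1" pcmk_host_list="pace1" delay="15" \
        op monitor interval="60s"
primitive st-pace2 stonith:fence_xvm \
        params port="pace2" pcmk_host_list="pace2" \
        op monitor interval="60s"
```

The delay is deliberately set on one instance only. If both resources
had the same delay, the two nodes would simply shoot each other 15
seconds later.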

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now


> 
> The syntax suggested by you assumes that "delay" is a parameter accepted
> by the stonith resource. But this is not the case. Also "grep delay
> /usr/lib/stonith/plugins/external/*" does not reveal a single stonith
> resource which accepts this parameter.
> 
> Further, it would make sense to have "delay" as a Pacemaker parameter.
> I also tried
>   primitive st-pace1 stonith:external/xen0 delay="15" \
>         params hostlist="pace1" dom0="xentest1" \
>         op start start-delay="15s" interval="0"
> but this also gives syntax errors.
> 
> Any other hints?
> 
> thanks
> Klaus
> 


