[Pacemaker] Stonith: How to avoid deathmatch cluster partitioning

Andreas Kurz andreas at hastexo.com
Fri May 17 16:40:32 EDT 2013

On 2013-05-16 11:01, Lars Marowsky-Bree wrote:
> On 2013-05-15T22:55:43, Andreas Kurz <andreas at hastexo.com> wrote:
>> start-delay is an option of the monitor operation ... it in fact means
>> "don't trust that start was successful, wait some more time for the
>> initial monitor"
> It can be used on start here though to avoid exactly this situation; and
> it works fine for that, effectively being equivalent to the "delay"
> option on stonith (since the start always precedes the fence).

Hmm ... looking at the configuration, there are two stonith resources,
each one locked to a node, and they are started all the time, so I can't
see how that would help here in case of a split-brain ... but please
correct me if I'm missing something here.

>> The problem is, this would only make sense for one single stonith
>> resource that can fence several nodes. In case of a split-brain that
>> would delay the start on the node where the stonith resource was not
>> running before and give that node a "penalty".
> Sure. In a split-brain scenario, one side will receive a penalty, that's
> the whole point of this exercise. In particular for the external/sbd
> agent.

So you are confirming my explanation, thanks ;-)
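
To make the "penalty" idea concrete, here is a toy model of the fence race
in a two-node split-brain. It is only a sketch under simplifying
assumptions, not Pacemaker code: the function name, the node names "a" and
"b", and the fixed fence_latency are all made up for illustration, and real
fencing timing is far less tidy.

```python
def fence_race(delay_a, delay_b, fence_latency=1.0):
    """Return the set of nodes that survive a two-node split-brain.

    Toy model (hypothetical, not Pacemaker logic): each node waits its own
    delay, then needs fence_latency seconds to complete the fence of its
    peer. A node that is shot first never completes its own fence.
    """
    done_a = delay_a + fence_latency  # time at which a's shot at b lands
    done_b = delay_b + fence_latency  # time at which b's shot at a lands
    if done_a < done_b:
        return {"a"}  # b carried the penalty and lost the race
    if done_b < done_a:
        return {"b"}  # a carried the penalty and lost the race
    return set()  # both shots land together: both nodes go down


if __name__ == "__main__":
    print(fence_race(0, 0))   # no delay on either side
    print(fence_race(0, 15))  # node b is penalized by 15 seconds
```

With equal delays the model has no winner, which is the deathmatch case;
giving exactly one side a delay is what breaks the tie.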

Best regards,

> Or by grouping all fencing resources to always run on one node; if you
> don't have access to RHT fence agents, for example.
> external/sbd also has code to avoid a death-match cycle in case of
> persistent split-brain scenarios now; after a reboot, the node that was
> fenced will not join unless the fence is cleared first.
> (The RHT world calls that "unfence", I believe.)
> That should be a win for the fence_sbd that I hope to get around to
> sometime in the next few months, too ;-)
>> In your example with two stonith resources running all the time,
>> Digimer's suggestion is a good idea: use one of the redhat fencing
>> agents, most of them have some sort of "stonith-delay" parameter that
>> you can use with one instance.
> It'd make sense to have logic for this embedded at a higher level,
> somehow; the problem is all too common.
> Of course, it is most relevant in scenarios where "split brain" is
> significantly more probable than "node down". That is true for most
> test scenarios (admins love yanking cables), but in practice, the
> node really is down most of the time.
> Regards,
>     Lars
