[Pacemaker] a situation where pacemaker refuses to stop

Mon Feb 25 19:36:03 EST 2013

On Tue, Feb 26, 2013 at 2:40 AM, Brian J. Murrell <brian at interlinx.bc.ca> wrote:
> On 13-02-24 07:56 PM, Andrew Beekhof wrote:
>>
>> Basically yes.
>> Stonith is the first stage of recovery and supposed to be at least
>> vaguely reliable.
>> Have you figured out why fencing is so broken?
>
> It wasn't really "broken" but was in the process of being configured
> when this situation arose.  The setup hadn't gotten to configuring the
> stonith resource yet.

So you purposefully tricked Pacemaker by defining a dummy fencing
resource (instead of setting stonith-enabled=false)... sounds like it's
pretty broken to me :)
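
For reference, a minimal sketch of the stonith-enabled=false route,
assuming the stock crm_attribute tool is on the node and you are
running it as root.  This only makes sense on a test cluster that is
still being set up; turn fencing back on once the real stonith
resource is defined.

```shell
# Assumption: run on a cluster node with the Pacemaker CLI tools installed.

# Tell the cluster not to expect fencing while the setup is unfinished.
crm_attribute --type crm_config --name stonith-enabled --update false

# Confirm what the cluster now has recorded for the option.
crm_attribute --type crm_config --name stonith-enabled --query

# Once the real fencing device is configured, re-enable it.
crm_attribute --type crm_config --name stonith-enabled --update true
```

If you use the crm shell, "crm configure property stonith-enabled=false"
sets the same option.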

What was your definition of st-fencing btw?

>
>> Part of the problem is that 2-node clusters have no concept of quorum,
>> so they can get a bit trigger-happy in the name of data integrity.
>> If Pacemaker were to shut down in this case, it would be leaving
>> things (as far as it can tell) in an inconsistent state which is
>> likely to result in bad things later on - there's not much point in
>> "highly available corrupted data".
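
To put rough numbers on the quorum point: assuming the usual
strict-majority rule (more than half of the votes), a 2-node cluster
needs both nodes to have quorum, so losing either one loses it.  The
sketch below is only an illustration of that arithmetic; the function
names are made up and nothing here talks to a real cluster.

```shell
#!/bin/sh
# votes_needed TOTAL: smallest strict majority of TOTAL votes.
# Assumed rule: floor(TOTAL / 2) + 1, one vote per node.
votes_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum TOTAL ALIVE: succeeds if ALIVE nodes are a strict majority of TOTAL.
has_quorum() {
    [ "$2" -ge "$(votes_needed "$1")" ]
}

# Compare a 2-node and a 3-node cluster after losing exactly one node.
for total in 2 3; do
    alive=$(( total - 1 ))
    if has_quorum "$total" "$alive"; then
        echo "$total nodes, 1 lost: quorum kept"
    else
        echo "$total nodes, 1 lost: quorum lost"
    fi
done
```

This prints "2 nodes, 1 lost: quorum lost" and "3 nodes, 1 lost:
quorum kept", which is why the surviving node of a pair has nothing
but fencing to tell it that acting alone is safe.
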
>
> Fair enough I suppose.  It's a corner case that one wants/needs to try
> to avoid then.  :-/
>
> Cheers,
> b.