[ClusterLabs] Pacemaker startup-fencing

Thu Mar 17 09:18:31 UTC 2016

Andrei Borzenkov <arvidjaar at gmail.com> writes:

> On Wed, Mar 16, 2016 at 4:18 PM, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
>
>> On Wed, Mar 16, 2016 at 01:47:52PM +0100, Ferenc Wágner wrote:
>>
>>>>> And some more about fencing:
>>>>>
>>>>> 3. What's the difference in cluster behavior between
>>>>>    - stonith-enabled=FALSE (9.3.2: how often will the stop operation be retried?)
>>>>>    - having no configured STONITH devices (resources won't be started, right?)
>>>>>    - failing to STONITH with some error (on every node)
>>>>>    - timing out the STONITH operation
>>>>>    - manual fencing
>>>>
>>>> I do not think there is much difference. Without fencing, Pacemaker
>>>> cannot decide to relocate resources, so the cluster will be stuck.
>>>
>>> Then I wonder why I hear the "must have working fencing if you value
>>> your data" mantra so often (and always without explanation).  After all,
>>> it does not risk the data, only the automatic cluster recovery, right?
>>
>> stonith-enabled=false
>> means:
>> if some node becomes unresponsive,
>> it is immediately *assumed* to be "clean" dead.
>> no fencing takes place,
>> resource takeover happens without further protection.
>
> Oh! Actually, that is not quite clear from the documentation; it
> does not explain what happens with stonith-enabled=false at all.

Yes, this is a crucially important piece of information, which should be
prominently announced in the documentation.  Thanks for spelling it out,
Lars.  Hope you don't mind that I turned your text into
https://github.com/ClusterLabs/pacemaker/pull/960.
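
To make the distinction discussed above concrete, here is a toy model of
the recovery decision for an unresponsive node.  It is only a sketch of
the behaviour as described in this thread, not Pacemaker code: the names
Outcome and recovery_decision are hypothetical, and real clusters have
more states (timeouts, retries, manual confirmation) than this covers.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical labels for what happens to the lost node's resources."""
    RECOVER_UNPROTECTED = "takeover without fencing (node assumed clean dead)"
    RECOVER_AFTER_FENCE = "takeover after fencing was confirmed"
    BLOCKED = "no takeover, recovery is stuck"


def recovery_decision(stonith_enabled: bool, fence_succeeded: bool) -> Outcome:
    """Toy model of the decision for a node that stopped responding.

    Assumption: with stonith-enabled=false no fencing is attempted at all,
    so fence_succeeded is ignored in that case.
    """
    if not stonith_enabled:
        # The node is simply assumed dead; nothing guards against it
        # still running its resources, so data may be at risk.
        return Outcome.RECOVER_UNPROTECTED
    if fence_succeeded:
        # The node is known to be down, so takeover is safe.
        return Outcome.RECOVER_AFTER_FENCE
    # Fencing is required but failed: the cluster cannot safely proceed.
    return Outcome.BLOCKED


if __name__ == "__main__":
    for enabled in (False, True):
        for fenced in (False, True):
            print(enabled, fenced, recovery_decision(enabled, fenced).value)
```

The point of the model: a failed fencing attempt only costs automatic
recovery, while stonith-enabled=false gives up the protection itself.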
-- 
Feri