[ClusterLabs] What triggers fencing?

Digimer lists at alteeve.ca
Mon Jul 9 11:33:44 EDT 2018

On 2018-07-09 09:56 AM, Klaus Wenninger wrote:
> On 07/09/2018 03:49 PM, Digimer wrote:
>> On 2018-07-09 08:31 AM, Klaus Wenninger wrote:
>>> On 07/09/2018 02:04 PM, Confidential Company wrote:
>>>> Hi,
>>>> Any ideas what triggers fencing script or stonith?
>>>> Given the setup below:
>>>> 1. I have two nodes
>>>> 2. Configured fencing on both nodes
>>>> 3. Configured delay=15 and delay=30 on fence1(for Node1) and
>>>> fence2(for Node2) respectively
>>>> *What does it mean to configure a delay in stonith? Wait 15 seconds
>>>> before it fences the node?
>>> Given that on a 2-node-cluster you don't have real quorum to make one
>>> partial cluster fence the rest of the nodes the different delays are meant
>>> to prevent a fencing-race.
>>> Without different delays that would lead to both nodes fencing each
>>> other at the same time - finally both being down.
>> Not true, the faster node will kill the slower node first. It is
>> possible that through misconfiguration, both could die, but it's rare
>> and easily avoided with a 'delay="15"' set on the fence config for the
>> node you want to win.
> What exactly is not true? Aren't we saying the same?
> Of course one of the delays can be 0 (most important is that
> they are different).

Perhaps I misunderstood your message. It seemed to me that the
implication was that fencing in 2-node without a delay always ends up
with both nodes being down, which isn't the case. It can happen if the
fence methods are not set up correctly (i.e., the node isn't set to
power off immediately on an ACPI power button event).

If the delay is set on both nodes, and the values are different, it will
work fine. The reason not to do this is that a delay of 0 is the default,
so there is no point in setting it explicitly, and any other value on the
second node causes avoidable fence delays.

>> Don't use a delay on the other node, just the node you want to live in
>> such a case.
>>>> *Given Node1 is active and Node2 goes down, does it mean fence1 will
>>>> execute first and shut down Node1 even though Node2 went down?
>>> If Node2 managed to sign off properly it will not.
>>> If network-connection is down so that Node2 can't inform Node1 that it
>>> is going down and finally has stopped all resources it will be fenced
>>> by Node1.
>>> Regards,
>>> Klaus
>> Fencing occurs in two cases:
>> 1. The node stops responding (meaning it's in an unknown state, so it is
>> fenced to force it into a known state).
>> 2. A resource / service fails to stop. In this case, the service is
>> in an unknown state, so the node is fenced to force the service into a
>> known state so that it can be safely recovered on the peer.
>> Graceful withdrawal of the node from the cluster, and graceful stopping
>> of services will not lead to a fence (because in both cases, the node /
>> service are in a known state - off).
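
The two cases above can be written down as a small decision function. This
is only a sketch of the rule as stated in this thread, not Pacemaker's
actual logic. The name needs_fencing and its three boolean parameters are
hypothetical.

```python
def needs_fencing(responding, left_gracefully=False, stop_failed=False):
    """Decide whether a node should be fenced, per the two cases above.

    responding:      the peer still answers cluster communication
    left_gracefully: the node withdrew from the cluster cleanly
    stop_failed:     a resource / service on the node failed to stop
    """
    # Case 2: the service is in an unknown state, so fence the node.
    if stop_failed:
        return True
    # Case 1: the node went silent without signing off (unknown state).
    if not responding and not left_gracefully:
        return True
    # Known state (running normally, or cleanly off): no fence.
    return False


print(needs_fencing(responding=False))                        # node vanished
print(needs_fencing(responding=False, left_gracefully=True))  # clean exit
print(needs_fencing(responding=True, stop_failed=True))       # stop failure
```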

Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
