[ClusterLabs] Fencing on 2-node cluster
Digimer
lists at alteeve.ca
Wed Jun 20 17:52:46 EDT 2018
Note: Please reply to the list, not me directly.
The stonith delay helps predict who will win in a comms break event
where both try to fence the other at the same time. If you disable
quorum and it still doesn't fence, something else is wrong (and it's not
related to the delay).
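
To make the race concrete, here is a toy model in Python of the delay
logic described in the quoted message below. It is only a sketch and not
Pacemaker code: the function name fence_race and the node labels are
made up for illustration, and it assumes a fence succeeds instantly once
its delay expires. A tie is modeled as both nodes being fenced, which
stands in for "unpredictable".

```python
from itertools import groupby


def fence_race(delay_on, alive):
    """Return the set of nodes still running after a comms break.

    delay_on: maps each node to the delay (in seconds) set on the
              stonith config used to fence THAT node.
    alive:    set of nodes actually running when comms break.
    """
    # Every live node tries to fence its peer, but first sleeps for the
    # delay attached to the victim's stonith config.
    attempts = sorted(
        (delay_on[victim], attacker, victim)
        for attacker in alive
        for victim in delay_on
        if victim != attacker
    )
    survivors = set(alive)
    for _, batch in groupby(attempts, key=lambda attempt: attempt[0]):
        # Attackers still up at this instant fire together; a node that
        # was fenced earlier never leaves its sleep.
        firing = [(a, v) for (_, a, v) in batch if a in survivors]
        for _attacker, victim in firing:
            survivors.discard(victim)
    return survivors


# Delay of 15 on node1's stonith config, none on node2's.
delays = {"node1": 15, "node2": 0}
print(fence_race(delays, {"node1", "node2"}))  # prints {'node1'}
print(fence_race(delays, {"node2"}))           # prints {'node2'}
```

With the delay on node1's config, node1 survives a comms break, and
node2 still recovers on its own if node1 is really dead. With equal
delays the model fences both nodes, which is the unpredictable case the
delay is meant to avoid.
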
Get the cluster healthy, tail the system logs from both nodes, trigger a
fault and wait for things to settle. Then share the logs please.
digimer
On 2018-06-20 05:43 PM, Casey & Gina wrote:
> I tested with just the quorum disabled and powering off one of the nodes. It still isn't getting fenced or rebooted.
>
> Is the different delay needed for this? Can you please tell me the syntax/command to add this attribute?
>
>> On 2018-06-20, at 3:40 PM, Digimer <lists at alteeve.ca> wrote:
>>
>> Yup, that will do it.
>>
>> The way you choose the node to "win" is to add a 'delay="15"' attribute
>> to the stonith configuration for the "primary" node. It works like
>> this:
>>
>> Assume node1 is running your services, so in a comms break, you want
>> node1 to fence node2 faster than node2 can fence node1. You add
>> the delay to node1's stonith config. So when a fence is needed, node2
>> looks up how to fence node1, sees the delay and sleeps for 15 seconds.
>> Node1 looks up how to fence node2, sees no delay, and fences it
>> immediately. This way, node2 is gone before it exits the delay,
>> ensuring node1 wins.
>>
>> If, however, node1 was really dead, then after the 15 second delay,
>> node2 proceeds to fence node1 and recover the lost services.
>>
>> digimer
>>
>> On 2018-06-20 05:30 PM, Casey & Gina wrote:
>>> Thank you, this is done with `pcs property set no-quorum-policy=ignore`?
>>>
>>> How would I set the fence delay, and what if either node can be "primary"?
>>>
>>>> On 2018-06-20, at 3:24 PM, Digimer <lists at alteeve.ca> wrote:
>>>>
>>>> Make sure quorum is disabled. Quorum doesn't work on 2-node clusters.
>>>> Also be sure to set a fence delay on the "primary" node (however you
>>>> define that) so that you have some predictability about which node will
>>>> live in a comms break event.
>>>>
>>>> digimer
>>>>
>>>> On 2018-06-20 05:22 PM, Casey & Gina wrote:
>>>>> I tried testing out a fencing configuration that I had working with a 3-node cluster, using a 2-node cluster. What I found is that when I power off one of the nodes forcibly, it does not get fenced and rebooted as it does on a 3-node cluster. I have verified that I can fence and reboot one node from the other using stonith_admin... Is there a difference in the configuration that is needed on a 2-node cluster for fencing to work?
>>>>>
>>>>> Thank you,
>>>>>
>>>>
>>>>
>>>> --
>>>> Digimer
>>>> Papers and Projects: https://alteeve.com/w/
>>>> "I am, somehow, less interested in the weight and convolutions of
>>>> Einstein’s brain than in the near certainty that people of equal talent
>>>> have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
>>>
>>
>>
>