[ClusterLabs] Two-node Pacemaker cluster with "fence_aws" fence agent
Klaus Wenninger
kwenning at redhat.com
Mon Sep 7 05:26:01 EDT 2020
On 9/4/20 11:24 PM, Digimer wrote:
> On 2020-09-04 5:15 p.m., Philippe M Stedman wrote:
>> Hi ClusterLabs development,
>>
>> I am in the process of deploying a two-node cluster on AWS and using the
>> fence_aws fence agent for fencing. I was reading through the following
>> article about common pitfalls in configuring two-node Pacemaker clusters:
>> https://www.thegeekdiary.com/most-common-two-node-pacemaker-cluster-issues-and-their-workarounds/
>>
>> and the only concern I have is regarding the fencing device. If I read
>> this correctly, there is no need to configure delayed fencing if the
>> fence device can guarantee serialized access. My question here is: does
>> the fence_aws agent guarantee serialized access? In the event of a loss
>> of communication between the two cluster nodes, can I guarantee that one
>> host will win the race to fence the other and I won't end up in a
>> situation where both hosts get fenced?
>>
>> Do I need to implement delayed fencing with the fence_aws agent or not?
>> I appreciate any feedback.
>>
>> Thanks,
>>
>> *Phil Stedman*
>> Db2 High Availability Development and Support
>> Email: pmstedma at us.ibm.com
> It would depend on AWS, and I don't believe it's a good idea to design a
> solution that depends on a third party's behaviour.
>
> There's another aspect of fence delays to consider as well: they also
> help ensure that the best node survives, not just that one of them does.
> So say your DB is running on node 1, you want to preferentially fence
> node 2. If, later, your DB moves to node 2, then you want to reconfigure
> your stonith devices to preferentially fence node 1.
>
> The delay parameter tells the agent to wait N seconds before fencing the
> associated node. So if your DB is on node 1, you would set the stonith
> device configuration that terminates node 1 to have, say, 'delay="15"'.
> This way, node 2 looks up how to fence node 1, sees the delay, and
> sleeps. Node 1 looks up how to fence node 2, sees no delay, and fences
> immediately. Node 2 is dead before the sleep exits, ensuring that in a
> comms break where both nodes are otherwise OK, node 1, the service
> host, lives.
>
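To make the race described above concrete, here is a minimal sketch in
Python. It is only a model of the reasoning, not how Pacemaker or any
fence agent is implemented; the function name and the node names are
made up for illustration, and it assumes both nodes start fencing at
the same moment.

```python
def fencing_race(delays):
    """Return the surviving node in a two-node fencing race.

    `delays` maps each node name to the delay (in seconds) configured on
    the stonith device that fences THAT node. The peer has to wait that
    long before it can kill the node, so the node whose own device has
    the larger delay gets to shoot first and survives.
    """
    (a, delay_a), (b, delay_b) = delays.items()
    if delay_a == delay_b:
        # No preference configured: the outcome depends on timing,
        # and in the worst case both nodes get fenced.
        return None
    return a if delay_a > delay_b else b


# Hypothetical setup: the DB runs on node1, so the device that fences
# node1 carries delay=15 and the device that fences node2 has no delay.
print(fencing_race({"node1": 15, "node2": 0}))  # node1
```

If the DB later moves to node 2, the delay has to move to the device
that fences node 2, which is what the two approaches below try to
automate.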
Just as a note to the above I wanted to mention 2 approaches
to automatically give some preference to the 'better' node
in these fencing-races:
- priority-fencing-delay - introduced by Yan Gao earlier this year
   Optionally derive the priority of a node from the
   resource priorities of the resources it is running.
   In a fencing-race the node with the highest priority
   has a certain advantage over the others as fencing requests
   for that node are executed with an additional delay.
- fence_heuristics_ping
   Not really a fencing agent by itself!
   Put on the same fencing level with the actual fencing agent for
   your node to make actual fencing depend on the result of (own)
   connectivity determined using ping heuristics.
   Btw. still waiting for feedback on the basic idea and
   contributions picking up the idea taking into account
   other aspects that might make a node the 'better' node ;-)
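
A rough Python sketch of both ideas, purely as a model and not taken
from the Pacemaker sources. Assumptions: the node priority is the sum
of the priorities of the resources running on it, no delay is applied
when all nodes have the same priority, and a fencing level succeeds
only if every device on it succeeds in order. All names below are
hypothetical.

```python
def priority_fencing_delays(running, resource_priority, delay):
    """Map each node to the extra delay applied to fencing requests
    that target it (assumed model of priority-fencing-delay)."""
    node_priority = {
        node: sum(resource_priority[r] for r in resources)
        for node, resources in running.items()
    }
    highest = max(node_priority.values())
    lowest = min(node_priority.values())
    return {
        # Only the highest-priority node is protected by the delay,
        # and only if the priorities actually differ.
        node: delay if highest > lowest and prio == highest else 0
        for node, prio in node_priority.items()
    }


def run_fencing_level(devices):
    """Run the devices of one fencing level in order; stop at the first
    failure. Putting a ping heuristic first means the real agent is
    only called if this node still has connectivity."""
    return all(device() for device in devices)


running = {"node1": ["db"], "node2": []}
print(priority_fencing_delays(running, {"db": 10}, 15))
# {'node1': 15, 'node2': 0}

calls = []
ping_fails = lambda: False                        # stand-in for the heuristic
real_fence = lambda: calls.append("fence") or True  # stand-in for the agent
print(run_fencing_level([ping_fails, real_fence]), calls)  # False []
```

In the second example the real agent is never invoked because the
heuristic on the same level failed first.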
Klaus