[ClusterLabs] Two-node Pacemaker cluster with "fence_aws" fence agent

Klaus Wenninger kwenning at redhat.com
Mon Sep 7 05:26:01 EDT 2020


On 9/4/20 11:24 PM, Digimer wrote:
> On 2020-09-04 5:15 p.m., Philippe M Stedman wrote:
>> Hi ClusterLabs development,
>>
>> I am in the process of deploying a two-node cluster on AWS and using the
>> fence_aws fence agent for fencing. I was reading through the following
>> article about common pitfalls in configuring two-node Pacemaker clusters:
>> https://www.thegeekdiary.com/most-common-two-node-pacemaker-cluster-issues-and-their-workarounds/
>>
>> and the only concern I have is regarding the fencing device. If I read
>> this correctly, there is no need to configure delayed fencing if the
>> fence device can guarantee serialized access. My question here is: does
>> the fence_aws agent guarantee serialized access? In the event of a loss
>> of communication between the two cluster nodes, can I guarantee that one
>> host will win the race to fence the other and I won't end up in a
>> situation where both hosts get fenced?
>>
>> Do I need to implement delayed fencing with the fence_aws agent or not?
>> I appreciate any feedback.
>>
>> Thanks,
>>
>> *Phil Stedman*
>> Db2 High Availability Development and Support
>> Email: pmstedma at us.ibm.com
> It would depend on AWS, and I don't believe it's a good idea to design a
> solution that depends on a third party's behaviour.
>
> There's another aspect of fence delays to consider as well: they also
> help ensure that the best node survives, not just that one of them does.
> So say your DB is running on node 1, you want to preferentially fence
> node 2. If, later, your DB moves to node 2, then you want to reconfigure
> your stonith devices to preferentially fence node 1.
>
> The delay parameter tells the agent to wait N seconds before fencing the
> associated node. So if your DB is on node 1, you would set the stonith
> device configuration that terminates node 1 to have, say, 'delay="15"'.
> This way, node 2 looks up how to fence node 1, sees the delay, and
> sleeps. Node 1 looks up how to fence node 2, sees no delay, and fences
> immediately. Node 2 is dead before the sleep exits, ensuring that, in a
> comms break where both nodes are otherwise OK, node 1, the service
> host, lives.
>
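
To make the quoted delay mechanism concrete, here is a minimal sketch
of the race in Python. It is only a toy model, not how Pacemaker or
fence_aws is implemented: the function name and the dictionary layout
are made up for illustration, and it assumes both nodes are healthy
and start fencing at the same moment.

```python
def fencing_race(delays):
    """Return the surviving node of a two-node fencing race.

    `delays` maps each node name to the delay (in seconds) configured
    on the stonith device that fences THAT node. Hypothetical model:
    a node is shot once the delay on the device targeting it expires.
    Returns None when the delays are equal, because then neither node
    has an advantage and both may end up fenced.
    """
    if len(delays) != 2:
        raise ValueError("this sketch only models a two-node cluster")

    # Sort nodes by the time at which they would be fenced.
    (first_victim, t_first), (survivor, t_second) = sorted(
        delays.items(), key=lambda item: item[1]
    )
    if t_first == t_second:
        return None  # no guaranteed winner
    # The node fenced first is dead before it can shoot the other one.
    return survivor


# DB runs on node1, so the device that fences node1 carries the delay.
print(fencing_race({"node1": 15, "node2": 0}))
```

With the delay on the device that fences node1, node1 survives; with
equal delays the sketch returns None, which is the "both get fenced"
case the delay is meant to avoid.
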
Just as a note to the above I wanted to mention 2 approaches
to automatically give some preference to the 'better' node
in these fencing-races:

- priority-fencing-delay - introduced by Yan Gao earlier this year
    Optionally derive the priority of a node from the
    resource-priorities of the resources it is running.
    In a fencing-race the node with the highest priority
    has a certain advantage over the others as fencing requests
    for that node are executed with an additional delay.

- fence_heuristics_ping
    Not really a fencing agent by itself!
    Put it on the same fencing level as the actual fencing agent for
    your node to make actual fencing depend on the result of (own)
    connectivity determined using ping heuristics.

    Btw. still waiting for feedback on the basic idea and
    contributions picking up the idea taking into account
    other aspects that might make a node the 'better' node ;-)
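
As a rough illustration of the priority-fencing-delay idea, the
following Python sketch derives a per-node delay from resource
priorities. It is a simplified model and not Pacemaker's code: the
function names and data layout are hypothetical, and it assumes that
a node's priority is simply the sum of the priorities of the
resources it runs and that no delay is applied when all nodes tie.
Real Pacemaker behaviour may differ in those details.

```python
def node_priorities(placement, resource_priority):
    """Assumption: node priority = sum of priorities of its resources."""
    return {
        node: sum(resource_priority.get(res, 0) for res in resources)
        for node, resources in placement.items()
    }


def fencing_delays(placement, resource_priority, priority_fencing_delay):
    """Return the extra delay applied to fencing requests per target node.

    Requests that target the highest-priority node are delayed, so that
    node gets a head start in the race. On a tie (assumption) nobody is
    preferred and every delay is 0.
    """
    prios = node_priorities(placement, resource_priority)
    top = max(prios.values())
    if all(p == top for p in prios.values()):
        return {node: 0 for node in prios}
    return {
        node: priority_fencing_delay if p == top else 0
        for node, p in prios.items()
    }


# Hypothetical cluster: the database and its IP run on node1.
placement = {"node1": ["db", "vip"], "node2": []}
resource_priority = {"db": 10, "vip": 1}
print(fencing_delays(placement, resource_priority, 15))
```

If the resources later move to node2, the same calculation puts the
delay on fencing node2 instead, without reconfiguring the stonith
devices by hand.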


Klaus


