[ClusterLabs] best practice fencing with ipmi in 2node-setups / cloneresource/monitor/timeout

Ken Gaillot kgaillot at redhat.com
Wed Sep 21 14:13:56 UTC 2016


On 09/21/2016 01:51 AM, Stefan Bauer wrote:
> Hi Ken,
> 
> let me sum it up:
> 
> Pacemaker in recent versions is smart enough to run (trigger, execute) the fence operation on a node that is not the target.
> 
> If I have an external stonith device that can fence multiple nodes, a single primitive is enough in pacemaker.
> 
> If with external/ipmi I can only address a single node, I need to have multiple primitives - one for each node.
> 
> In this case it's recommended to let the primitive always run on the opposite node - right?

Yes, exactly :-)

In terms of implementation, I'd use a +INFINITY location constraint to
tie the device to the opposite node. This approach (as opposed to a
-INFINITY constraint on the target node) allows the target node to run
the fence device when the opposite node is unavailable.
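For example, with crmsh and the external/ipmi agent you're using, the
configuration could look something like this (node names, addresses,
and credentials are made up for illustration):

  # one fence device per node, each pinned to the node it does NOT fence
  primitive st-node1 stonith:external/ipmi \
      params hostname=node1 ipaddr=192.168.100.1 userid=admin \
             passwd=secret interface=lanplus \
      op monitor interval=1800s timeout=60s
  primitive st-node2 stonith:external/ipmi \
      params hostname=node2 ipaddr=192.168.100.2 userid=admin \
             passwd=secret interface=lanplus \
      op monitor interval=1800s timeout=60s
  # +INFINITY on the opposite node, rather than -INFINITY on the target
  location l-st-node1 st-node1 inf: node2
  location l-st-node2 st-node2 inf: node1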

> thank you.
> 
> Stefan
>  
> -----Original Message-----
>> From: Ken Gaillot <kgaillot at redhat.com>
>> Sent: Tue, 20 September 2016 16:49
>> To: users at clusterlabs.org
>> Subject: Re: [ClusterLabs] best practice fencing with ipmi in 2node-setups / cloneresource/monitor/timeout
>>
>> On 09/20/2016 06:42 AM, Digimer wrote:
>>> On 20/09/16 06:59 AM, Stefan Bauer wrote:
>>>> Hi,
>>>>
>>>> I run a 2-node cluster and want to be safe in split-brain scenarios. For
>>>> this I set up external/ipmi to stonith the other node.
>>>
>>> Please use 'fence_ipmilan'. I believe that the older external/ipmi
>>> agents are deprecated (someone correct me if I am wrong on this).
>>
>> It's just an alternative. The "external/" agents come with the
>> cluster-glue package, which isn't provided by some distributions (such
>> as RHEL and its derivatives), so it's "deprecated" on those only.
>>
>>>> Some possible issues jumped to my mind and I would like to find the best
>>>> practice solution:
>>>>
>>>> - I have a primitive for each node to stonith. Many documents and guides
>>>> recommend never letting them run on the host they should fence. I would
>>>> set up clone resources to avoid dealing with location constraints that
>>>> would also influence scoring. Does that make sense?
>>>
>>> Since v1.1.10 of pacemaker, you don't have to worry about this.
>>> Pacemaker is smart enough to know where to run a fence call from in
>>> order to terminate a target.
>>
>> Right, fence devices can run anywhere now, and in fact they don't even
>> have to be "running" for pacemaker to use them -- as long as they are
>> configured and not intentionally disabled, pacemaker will use them.
>>
>> There is still a slight advantage to not running a fence device on a
>> node it can fence. "Running" a fence device in pacemaker really means
>> running the recurring monitor for it. Since the node that runs the
>> monitor has "verified" access to the device, pacemaker will prefer to
>> use it to execute that device. However, pacemaker will not use a node to
>> fence itself, except as a last resort if no other node is available. So,
>> running a fence device on a node it can fence means that the preference
>> is lost.
>>
>> That's a very minor detail, not worth worrying about. It's more a matter
>> of personal preference.
>>
>> In this particular case, a more relevant concern is that you need
>> different configurations for the different targets (the IPMI address is
>> different).
>>
>> One approach is to define two different fence devices, each with one
>> IPMI address. In that case, it makes sense to use the location
>> constraints to ensure the device prefers the node that's not its target.
>>
>> Another approach (if the fence agent supports it) is to use
>> pcmk_host_map to provide a different "port" (IPMI address) depending on
>> which host is being fenced. In this case, you need only one fence device
>> to be able to fence both hosts. You don't need a clone. (Remember, the
>> node "running" the device merely refers to its monitor, so the cluster
>> can still use the fence device, even if that node crashes.)
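>>
>> A sketch of that single-device approach in crmsh (node names and
>> addresses are made up, and this assumes a fence-agents build whose
>> fence_ipmilan supports port_as_ip, so the value mapped via
>> pcmk_host_map is used as the IPMI address):
>>
>>   primitive st-ipmi stonith:fence_ipmilan \
>>       params pcmk_host_map="node1:192.168.100.1;node2:192.168.100.2" \
>>              port_as_ip=1 login=admin passwd=secret lanplus=true \
>>       op monitor interval=1800s timeout=60s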
>>
>>>> - The monitor operation on the stonith primitive is dangerous. I read
>>>> that if monitor operations fail for the stonith device, the stonith action
>>>> is triggered. I think it's not clever to give the cluster the option to
>>>> fence a node just because it has an issue monitoring a fence device.
>>>> That should not be a reason to shut down a node. What is your opinion on
>>>> this? Can I just set the primitive monitor operation to disabled?
>>>
>>> Monitoring is how you will detect that, for example, the IPMI cable
>>> failed or was unplugged. I do not believe the node will get fenced when
>>> the fence agent's monitor fails... at least not by default.
>>
>> I am not aware of any situation in which a failing fence monitor
>> triggers a fence. Monitoring is good -- it verifies that the fence
>> device is still working.
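>>
>> If the default monitor interval feels too aggressive for the BMC, a
>> longer interval is a gentler alternative to disabling the monitor.
>> A sketch in crmsh (resource name and values are illustrative):
>>
>>   # add a monitor operation with a 30-minute interval and a 60s timeout
>>   crm configure monitor st-node1 1800s:60s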
>>
>> One concern particular to on-board IPMI devices is that they typically
>> share the same power supply as their host. So if the machine loses
>> power, the cluster can't contact the IPMI to fence it -- which means it
>> will be unable to recover any resources from the lost node. (It can't
>> assume the node lost power -- it's possible just network connectivity
>> between the two nodes was lost.)
>>
>> The only way around that is to have a second fence device (such as an
>> intelligent power switch). If the cluster can't reach the IPMI, it will
>> try the second device.
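>>
>> With two devices per target, pacemaker's fencing topology defines the
>> order in which to try them. A sketch in crmsh, assuming a hypothetical
>> st-pdu device for the power switch alongside the per-node IPMI devices:
>>
>>   # for each node, try IPMI first and fall back to the power switch
>>   fencing_topology \
>>       node1: st-node1 st-pdu \
>>       node2: st-node2 st-pdu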



