[ClusterLabs] Pacemaker and Stonith : passive node won't bring up resources

Digimer lists at alteeve.ca
Wed Jun 24 19:42:43 UTC 2015


On 24/06/15 01:00 PM, Mathieu Valois wrote:
> 
> Le 24/06/2015 18:29, Ken Gaillot a écrit :
>> On 06/24/2015 10:58 AM, Mathieu Valois wrote:
>>> Hi everybody,
>>> I'm working with Pacemaker and Stonith for high availability on a
>>> 2-node cluster (called A and B here). Each node has an IPMI interface
>>> as its fence device.
>>>
>>> The situation is:
>>>
>>>  * A is currently running resources
>>>  * B is in passive mode
>>>
>>> Then I unplug the power supply of node A, so all Ethernet interfaces
>>> AND the IPMI on A become unavailable. Here comes the trick: B tries
>>> unsuccessfully to fence A, because A's IPMI is unreachable. After N
>>> failed attempts, B gives up and puts itself into a "Block" state
>>> (called IDLE in the log file).
>> The behavior you describe is exactly what's intended. Since B can't
>> *confirm* that A is down, it can't run resources without risking a
>> split-brain situation.
>>
>>> Here is my question : how can I force B to bring back resources even if
>>> Stonith A fails ?
>> IPMI is not sufficient to be used as the only fence device. The
>> preferred solution is to create a fencing topology with the IPMI as the
>> first level, and a different fencing device (such as an intelligent
>> power strip) as the second level.
>>
>>> I understand the consequences (concurrent writes, etc.), but I would
>>> rather accept them than have the service be completely unavailable.
>>>
>>> Thanks for the help :)
>> And here you get into perhaps the biggest recurring controversy in high
>> availability. :) Depending on your resources, a split-brain situation
>> might corrupt or lose some or all of your data. Silent corruption can be
>> even worse: you might have bad data and not even know it.
> I can't afford another fencing device, so I'm forced to do it this way.
> I've heard about a quorum disk to manage the split-brain issue.
> Could it be used in such a case, with only one IPMI device per node?
> What does it involve?

A quorum disk is a tool to help determine which node(s) should be quorate
in a partition. It is not a substitute for fencing; quorum and fencing
serve different roles.

If you want to be able to survive the total loss of a node, you must
use secondary fencing. Switched PDUs, like the APC AP7900, can often be
found on the used market for ~$200 each.
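
For what it's worth, a two-level topology along the lines Ken described
might look something like the sketch below (pcs syntax; the node names,
IP addresses, credentials and PDU outlet numbers are placeholders, and
the exact agent options depend on your fence-agents version):

  # Level 1: per-node IPMI fencing
  pcs stonith create fence_A_ipmi fence_ipmilan ipaddr=10.0.0.11 \
      login=admin passwd=secret lanplus=1 pcmk_host_list=nodeA
  pcs stonith create fence_B_ipmi fence_ipmilan ipaddr=10.0.0.12 \
      login=admin passwd=secret lanplus=1 pcmk_host_list=nodeB

  # Level 2: switched PDU outlets, used when IPMI can't be reached
  pcs stonith create fence_A_pdu fence_apc_snmp ipaddr=10.0.0.20 \
      port=1 pcmk_host_list=nodeA
  pcs stonith create fence_B_pdu fence_apc_snmp ipaddr=10.0.0.20 \
      port=2 pcmk_host_list=nodeB

  # Try IPMI first; if that fails (e.g. the node lost power), use the PDU
  pcs stonith level add 1 nodeA fence_A_ipmi
  pcs stonith level add 2 nodeA fence_A_pdu
  pcs stonith level add 1 nodeB fence_B_ipmi
  pcs stonith level add 2 nodeB fence_B_pdu

With that in place, pulling the power on A kills its IPMI too, but the
PDU fence still succeeds, so B can confirm the fence and take over the
resources.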

The software cannot be configured to accept split-brain/data loss in
such a case. To the HA stack, "critical" means "critical", not "critical
most of the time".

>> The consensus of HA professionals is that your data is not "available"
>> if it is corrupted, so proper fencing is a necessity.
>>
>> That said, some people do drive without their seat belts on :) so it is
>> possible to do what you describe. Dummy/null fence agents can always
>> return success. It's playing Russian roulette with your data, though.

Don't do this. You're short-circuiting safety systems.

If the node fails, let it block. If you are certain the peer is dead,
clear the fence manually.
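
The manual confirmation looks something like this (the node name is a
placeholder; only run this after you have physically verified the node
is powered off):

  # Tell the cluster that nodeA is known to be safely down
  stonith_admin --confirm nodeA
  # or, with pcs:
  pcs stonith confirm nodeA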

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



