[ClusterLabs] Antw: Re: How is fencing and unfencing suppose to work?

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Oct 1 03:06:45 EDT 2018


>>> digimer <lists at alteeve.ca> wrote on 28.09.2018 at 19:11 in message
<968d00cd-fad5-8f17-edfd-7787a9964355 at alteeve.ca>:
> On 2018-09-04 8:49 p.m., Ken Gaillot wrote:
>> On Tue, 2018-08-21 at 10:23 -0500, Ryan Thomas wrote:
>>> I’m seeing unexpected behavior when using “unfencing” – I don’t think
>>> I’m understanding it correctly.  I configured a resource that
>>> “requires unfencing” and have a custom fencing agent which “provides
>>> unfencing”.   I perform a simple test where I set up the cluster and
>>> then run “pcs stonith fence node2”, and I see that node2 is
>>> successfully fenced by sending an “off” action to my fencing agent.
>>> But, immediately after this, I see an “on” action sent to my fencing
>>> agent.  My fence agent doesn’t implement the “reboot” action, so
>>> perhaps it’s trying to reboot by running an off action followed by an
>>> on action.  Prior to adding “provides unfencing” to the fencing
>>> agent, I didn’t see the on action. It seems unsafe to say “node2, you
>>> can’t run” and then immediately “you can run”.
>> I'm not as familiar with unfencing as I'd like, but I believe the basic
>> idea is:
>>
>> - the fence agent's off action cuts the machine off from something
>> essential needed to run resources (generally shared storage or network
>> access)
>>
>> - the fencing works such that a fenced host is not able to request
>> rejoining the cluster without manual intervention by a sysadmin
>>
>> - when the sysadmin allows the host back into the cluster, and it
>> contacts the other nodes to rejoin, the cluster will call the fence
>> agent's on action, which is expected to re-enable the host's access
>>
>> How that works in practice, I have only vague knowledge.
> 
> This is correct. Consider fabric fencing, where Fibre Channel ports are 
> disconnected: unfence restores the connection. It is similar to a pure 'off' 
> fence call to switched PDUs, as you mention above, where unfence powers the 
> outlets back up.

I doubt that successful fencing can be emulated by merely "pausing" I/O: once
the fabric is re-established, outstanding I/Os might still be performed (which
cannot happen after real fencing).
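
To illustrate the concern, here is a toy model in Python. It is only a sketch
under simplifying assumptions: the names (Node, fabric_off, fabric_on,
power_off, power_on, scenario) are hypothetical and do not come from any real
fence agent or from Pacemaker, and a real HBA or multipath layer may time out
or discard queued I/O rather than replay it.

```python
# Toy model only: all names here are hypothetical, not a real fence agent API.

class Node:
    """A host with a queue of writes waiting to reach shared storage."""

    def __init__(self):
        self.powered = True
        self.port_enabled = True
        self.queue = []  # writes issued by the host but not yet on disk

    def write(self, disk, data):
        if not self.powered:
            return  # a powered-off host cannot issue any I/O
        self.queue.append(data)
        self.flush(disk)

    def flush(self, disk):
        # Queued writes reach the disk only while the path is open.
        if self.powered and self.port_enabled:
            disk.extend(self.queue)
            self.queue.clear()


def fabric_off(node):
    node.port_enabled = False  # I/O is "paused", the host keeps running


def fabric_on(node, disk):
    node.port_enabled = True
    node.flush(disk)  # outstanding I/O is now performed


def power_off(node):
    node.powered = False
    node.queue.clear()  # in-memory queue is lost with the host


def power_on(node, disk):
    node.powered = True
    node.flush(disk)  # nothing to flush after a fresh boot


def scenario(fence, unfence):
    disk = []
    node = Node()
    node.write(disk, "before-fence")
    fence(node)
    node.write(disk, "stale-write")  # attempted while fenced
    disk.append("survivor-write")    # another node takes over the resource
    unfence(node, disk)
    return disk


print(scenario(fabric_off, fabric_on))
print(scenario(power_off, power_on))
```

In this model the fabric case ends with "stale-write" landing on the disk after
"survivor-write", while the power case never writes it at all.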

[...]

Regards,
Ulrich


