[ClusterLabs] Feedback wanted: Node reaction to fabric fencing

Ondrej ondrej-clusterlabs at famera.cz
Wed Jul 24 20:20:00 EDT 2019


On 7/25/19 2:33 AM, Ken Gaillot wrote:
> Hi all,
> 
> A recent bugfix (clbz#5386) brings up a question.
> 
> A node may receive notification of its own fencing when fencing is
> misconfigured (for example, an APC switch with the wrong plug number)
> or when fabric fencing is used that doesn't cut the cluster network
> (for example, fence_scsi).
> 
> Previously, the *intended* behavior was for the node to attempt to
> reboot itself in that situation, falling back to stopping pacemaker if
> that failed. However, due to the bug, the reboot always failed, so the
> behavior effectively was to stop pacemaker.
> 
> Now that the bug is fixed, the node will indeed reboot in that
> situation.
> 
> It occurred to me that some users configure fabric fencing specifically
> so that nodes aren't ever intentionally rebooted. Therefore, I intend
> to make this behavior configurable.
> 
> My question is, what do you think the default should be?
> 
> 1. Default to the correct behavior (reboot)
> 
> 2. Default to the current behavior (stop)
> 
> 3. Default to the current behavior for now, and change it to the
> correct behavior whenever pacemaker 2.1 is released (probably a few
> years from now)
> 

As long as there is an option to change it, I'm OK with changing the
default to 'reboot' starting with the next minor(?) version (2.0.3). But
this should be announced at the RC stage so that users can prepare for it.

Is there any plan to get this into the 1.1 branch as well?
If so, I would prefer that 1.1.x only introduce the configuration option,
with the default set to 'stop'.

--
Ondrej

