[ClusterLabs] Feedback wanted: Node reaction to fabric fencing
Roger Zhou
ZZhou at suse.com
Thu Jul 25 04:24:33 EDT 2019
On 7/25/19 1:33 AM, Ken Gaillot wrote:
> Hi all,
>
> A recent bugfix (clbz#5386) brings up a question.
>
> A node may receive notification of its own fencing when fencing is
> misconfigured (for example, an APC switch with the wrong plug number)
> or when fabric fencing is used that doesn't cut the cluster network
> (for example, fence_scsi).
>
> Previously, the *intended* behavior was for the node to attempt to
> reboot itself in that situation, falling back to stopping pacemaker if
> that failed. However, due to the bug, the reboot always failed, so the
> behavior effectively was to stop pacemaker.
>
> Now that the bug is fixed, the node will indeed reboot in that
> situation.
>
> It occurred to me that some users configure fabric fencing specifically
> so that nodes aren't ever intentionally rebooted. Therefore, I intend
> to make this behavior configurable.
>
> My question is, what do you think the default should be?
>
> 1. Default to the correct behavior (reboot)
>
> 2. Default to the current behavior (stop)
>
> 3. Default to the current behavior for now, and change it to the
> correct behavior whenever pacemaker 2.1 is released (probably a few
> years from now)
>
Option 3 sounds like the best choice.
Make it configurable, and keep the current behavior (stop) as the default
for backward compatibility within the current minor series, e.g. the
upcoming 2.0.z releases (z >= 3).
That said, the correct behavior (reboot) should eventually be enforced as
the default. This situation is just as critical as a resource stop
failure. Switching the default makes sense in the next minor version,
say 2.1.
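To make the proposal concrete, here is a minimal sketch of option 3 combined with the reboot-then-stop fallback Ken describes. Every name in it (FenceReaction, default_reaction, react_to_own_fencing, and the reboot/stop callbacks) is hypothetical and chosen only for illustration; none of it is Pacemaker's actual option name or code.

```python
from enum import Enum
from typing import Callable, Tuple


class FenceReaction(Enum):
    """Hypothetical setting name; not an actual Pacemaker option."""
    STOP = "stop"
    REBOOT = "reboot"


def default_reaction(version: Tuple[int, int]) -> FenceReaction:
    """Option 3: default to 'stop' through 2.0.z, 'reboot' from 2.1 on."""
    return FenceReaction.REBOOT if version >= (2, 1) else FenceReaction.STOP


def react_to_own_fencing(
    reaction: FenceReaction,
    reboot: Callable[[], bool],   # hypothetical hook: True if reboot started
    stop: Callable[[], None],     # hypothetical hook: stop the cluster stack
) -> str:
    """React to a notification that this node itself was fenced."""
    if reaction is FenceReaction.REBOOT and reboot():
        return "rebooted"
    # Either 'stop' was configured, or the reboot attempt failed:
    # fall back to stopping the cluster stack.
    stop()
    return "stopped"
```

A user who explicitly configures the reaction would bypass default_reaction entirely, so only clusters relying on the default would see a behavior change at 2.1.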
Thanks,
Roger