[ClusterLabs] How to reduce SBD watchdog timeout?
Lars Marowsky-Bree
lmb at suse.com
Mon Apr 8 06:38:05 EDT 2019
On 2019-04-07T12:06:40, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
> After reading sources and experimenting I still do not see how it can
> help in two node cluster. In this case SBD will assume both nodes are
> out of quorum and both nodes will commit suicide.
It helps by not making a single SBD device a single point of failure in
the cluster.
That's particularly relevant in 2-node clusters, which can more readily
lose quorum temporarily when connectivity between the nodes drops.
You're right that it's not a perfect choice. Multiple SBD devices or
more than two nodes are preferable.
> Pacemaker integration may be useful in two node cluster as backup to
> avoid suicide in case of temporary disk access issues, but we still need
> disk as primary channel with all associated considerations for proper
> timeouts.
Right. If a node loses access to the device while also being in a
non-quorate or dirty state, it will still suicide. So even if that node
can't read the poison pill written by the surviving node, it'll
self-fence, and the surviving node can trust this and proceed. That is,
at least, the idea.
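
To make the rule concrete, here is a rough sketch of the decision as I
described it above. It is a simplified model of a single-device setup
with Pacemaker integration enabled, not the actual sbd code; the
function name and its four boolean parameters are made up for
illustration.

```python
def should_self_fence(device_accessible: bool,
                      poison_pill_seen: bool,
                      quorate: bool,
                      clean: bool) -> bool:
    """Simplified model of the self-fence decision (hypothetical names).

    - While the SBD device is readable, only a poison pill triggers
      self-fencing.
    - Once the device is unreadable, the node may only stay up if
      Pacemaker reports it as both quorate and clean.
    """
    if device_accessible:
        return poison_pill_seen
    # Device lost: the Pacemaker state is the only remaining evidence
    # that this node is still allowed to run.
    return not (quorate and clean)


# Temporary disk access issue on an otherwise healthy node: it survives.
print(should_self_fence(False, False, quorate=True, clean=True))

# Device lost and quorum lost: the node cannot see a poison pill, so it
# self-fences, and the other node can rely on that.
print(should_self_fence(False, False, quorate=False, clean=True))
```

The second case is the one that matters for the surviving node: it
never needs confirmation that the poison pill was read, only the
guarantee that a node in that state will not keep running.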
--
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
"Architects should open possibilities and not determine everything." (Ueli Zbinden)