[ClusterLabs] Antw: SBD & Failed Peer
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Mon Sep 7 06:15:35 UTC 2015
>>> Jorge Fábregas <jorge.fabregas at gmail.com> wrote on 06.09.2015 at 22:23
in message <55ECA0B8.6090708 at gmail.com>:
> Hi,
>
> I was reading one of the latest posts [1] from Andrew Beekhof on SBD and
> got me into thinking...
>
> Assume an active/active cluster using OCFS2 and SBD with shared storage.
> Then one node explodes (the hardware watchdog is gone as well
> obviously). At this point my guess is that the remaining node will
> notice that its partner hasn't updated its mailbox slot on the SBD
> shared-storage.
>
> My question: Is this enough proof (confirmation) that the other node
> isn't capable of causing corruption? And so...will DLM/OCFS2 resume
> operation?
IMHO it will work differently: if the node goes down, the network layer
(corosync) will notice that (sooner or later, depending on some settings). The
remaining node will then try a fencing operation. After some time (also
configurable), the remaining nodes will assume the other node was fenced
successfully. It does not mean that anything actually happened, but that's the
way it's designed. You'll have to make sure things work as configured.
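
To make the sequence above concrete, here is a rough sketch of the timeline
as simple arithmetic. It is only an illustration: the names `detect_s` and
`fence_wait_s` are hypothetical stand-ins for whatever detection and fencing
timeouts you have configured, not real corosync or SBD option names, and the
numbers in the usage line are made up.

```python
from dataclasses import dataclass


@dataclass
class Timeouts:
    # Hypothetical: seconds until the membership layer declares the peer lost.
    detect_s: float
    # Hypothetical: seconds after the fencing request until the survivor
    # assumes the fencing succeeded.
    fence_wait_s: float


def recovery_timeline(failure_at_s: float, t: Timeouts) -> dict:
    """Return the points in time (seconds) of each step described above."""
    detected = failure_at_s + t.detect_s          # peer loss is noticed
    assumed_fenced = detected + t.fence_wait_s    # survivor assumes success
    return {
        "failure": failure_at_s,
        "detected": detected,
        "assumed_fenced": assumed_fenced,
    }


if __name__ == "__main__":
    # Made-up example values, not recommendations.
    print(recovery_timeline(0.0, Timeouts(detect_s=5.0, fence_wait_s=20.0)))
```

Note that "assumed_fenced" is exactly that, an assumption based on elapsed
time; the sketch does not (and cannot) verify that the failed node is really
down.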
>
> Thanks,
> Jorge
>
> [1]: http://blog.clusterlabs.org/blog/2015/sbd-fun-and-profit/
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org