[ClusterLabs] Antw: Re: Pacemaker startup-fencing

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed Mar 16 10:49:51 EDT 2016


>>> Ferenc Wágner <wferi at niif.hu> wrote on 16.03.2016 at 13:47 in message
<87k2l2zj0n.fsf at lant.ki.iif.hu>:
[...]
> Then I wonder why I hear the "must have working fencing if you value
> your data" mantra so often (and always without explanation).  After all,
> it does not risk the data, only the automatic cluster recovery, right?
[...]

Imagine this situation: You have an ext[234] filesystem on a shared disk that
is mounted on node n1 in a two-node cluster.

Then the network connection breaks. n1 thinks n2 crashed and continues to use
the shared disk and the filesystem (and maybe some application that modifies
it), after having waited for the fencing to succeed. n2 thinks n1 crashed and
starts to use the shared disk (and filesystem, and maybe application), also
after having waited for the fencing to succeed.

Now if you have no fencing, n1 and n2 will both write to the ext[234]
filesystem at the same time, corrupting it.
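
To make the difference concrete, here is a toy sketch in Python. It is not
Pacemaker code; SharedDisk and run_partition are made-up names, and the
assumption that the first node in the list wins the fencing race is purely
for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDisk:
    """Toy stand-in for a shared disk with a non-cluster filesystem."""
    writers: set = field(default_factory=set)

    @property
    def corrupted(self) -> bool:
        # ext[234] assumes that only one node writes at any time.
        return len(self.writers) > 1


def run_partition(nodes, fencing_enabled):
    """Simulate a network break: every node believes its peer crashed."""
    disk = SharedDisk()
    alive = list(nodes)
    for node in nodes:
        if node not in alive:
            continue  # already fenced (powered off) by its peer
        if fencing_enabled:
            # Wait for fencing to succeed: only this node stays alive.
            # Assumption: the first node in the list wins the race.
            alive = [node]
        disk.writers.add(node)  # node mounts and writes to the filesystem
    return disk


print(run_partition(["n1", "n2"], fencing_enabled=False).corrupted)
print(run_partition(["n1", "n2"], fencing_enabled=True).corrupted)
```

Without fencing both nodes end up in the writers set; with fencing only the
node that fenced its peer does.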

We had that once, and it is no fun. In particular, the repair tools are not
really designed for that type of corruption!

You can also trigger that by misusing "clearstate" while a resource is still
in use...

Clear now?

Regards,
Ulrich




