[ClusterLabs] Wtrlt: Antw: Re: Antw: Re: how important would you consider to have two independent fencing device for each node ?

Fri Apr 21 04:10:30 EDT 2017

Ken Gaillot <kgaillot at redhat.com> writes:

>>> I think it works differently: One task periodically reads its mailbox slot
>>> for commands, and once a command was read, it's executed immediately. Only
>>> if the read task does hang for a long time, the watchdog itself triggers a
>>> reset (as SBD seems dead). So the delay is actually made from the sum of
>>> "write delay", "read delay", "command execution".
>
> I think you're right when sbd uses shared-storage, but there is a
> watchdog-only configuration that I believe digimer was referring to.
>
> With watchdog-only, the cluster will wait for the value of the
> stonith-watchdog-timeout property before considering the fencing successful.

I think there are some important distinctions to make, to clarify what
SBD is and how it works:

* The original SBD model uses shared storage as its fencing mechanism
  (thus the name Shared-storage based death). Watchdog-only SBD is a
  new mode that was only introduced in a fork of the SBD project, so
  when talking about it, it would probably help avoid confusion to be
  explicit about that.

* Watchdog-only SBD relies on quorum to avoid split-brain or fence
  loops, and thus requires at least three nodes or an additional qdevice
  node. This is my understanding, correct me if I am wrong. Also, this
  disqualifies watchdog-only SBD from any of Digimer's setups since they
  are 2-node only, so that's probably something to be aware of in this
  discussion. ;)

* The watchdog fencing in SBD is not the primary fence mechanism when
  shared storage is available. In fact, it is an optional, although
  strongly recommended, component. [1]

[1]: We (as in SUSE) require use of a watchdog for supported
configurations, but technically it is optional.
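
To illustrate the quorum point above, here is a minimal sketch of the
majority arithmetic. It assumes a plain strict-majority rule with one
vote per node (plus one for a qdevice) and ignores corosync special
cases such as two_node mode. The function name has_quorum is made up
for this example and is not part of SBD, Pacemaker or corosync.

```python
def has_quorum(votes_seen: int, total_votes: int) -> bool:
    """Return True if a partition holds a strict majority of all votes.

    Assumption: simple majority voting, one vote per participant.
    """
    return votes_seen * 2 > total_votes


# Two nodes, split down the middle: neither side has a majority, so a
# quorum-based scheme cannot tell which side should survive.
print(has_quorum(1, 2))

# Three nodes: the side that still sees two votes keeps quorum,
# the isolated node does not.
print(has_quorum(2, 3))
print(has_quorum(1, 3))

# Two nodes plus a qdevice vote (three votes in total): the node that
# can still reach the qdevice holds two of three votes.
print(has_quorum(2, 3))
```

With only two votes in total, a 1/1 split leaves both sides without a
majority, which is why I would expect watchdog-only SBD to need a third
node or a qdevice.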

-- 
// Kristoffer Grönlund
// kgronlund at suse.com