[ClusterLabs] Gracefully stop nodes one by one with disk-less sbd
YGao at suse.com
Fri Aug 9 15:06:32 EDT 2019
On 8/9/19 6:40 PM, Andrei Borzenkov wrote:
> 09.08.2019 16:34, Yan Gao wrote:
>> With disk-less sbd, it's fine to stop cluster service from the cluster
>> nodes all at the same time.
>> But when stopping the nodes one by one, for example with a 3-node
>> cluster, after stopping the 2nd node, the only remaining node resets
>> itself with:
> That is sort of documented in SBD manual page:
> However, while the cluster is in such a degraded state, it can
> neither successfully fence nor be shutdown cleanly (as taking the
> cluster below the quorum threshold will immediately cause all remaining
> nodes to self-fence).
> SBD in shared-nothing mode is basically always in such a degraded state
> and cannot tolerate loss of quorum.
Well, the context here is that it loses quorum *expectedly*, since the
other nodes shut down gracefully.
>> Aug 09 14:30:20 opensuse150-1 sbd: pcmk: debug:
>> notify_parent: Not notifying parent: state transient (2)
>> Aug 09 14:30:20 opensuse150-1 sbd: cluster: debug:
>> notify_parent: Notifying parent: healthy
>> Aug 09 14:30:20 opensuse150-1 sbd: warning: inquisitor_child:
>> Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
>> I can think of a way to manipulate quorum with last_man_standing and
>> potentially also auto_tie_breaker, not to mention that
>> last_man_standing_window would also be a factor... But is there a better
> The lack of a cluster-wide shutdown mode has been mentioned more than
> once on this list. I guess the only workaround is to use higher-level
> tools, which basically just try to stop the cluster on all nodes at
> once. That is still susceptible to a race condition.
Gracefully stopping nodes one by one on purpose is still a reasonable
need though ...
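
For illustration, here is a minimal sketch of the quorum arithmetic behind
the 3-node case above. It is a simplified model, not corosync's actual
implementation. It assumes one vote per node, that the stops are spaced
further apart than last_man_standing_window so expected votes get
recalculated in between, and that the surviving node is the one
auto_tie_breaker favours (by default the lowest node ID). The function
names are made up for the example.

```python
def quorum_threshold(expected_votes):
    """Votes needed for quorum: a strict majority of expected votes."""
    return expected_votes // 2 + 1


def is_quorate(live, expected, auto_tie_breaker=False):
    """Return True if `live` votes out of `expected` keep quorum."""
    if live >= quorum_threshold(expected):
        return True
    # Exact 50/50 split: auto_tie_breaker keeps one side quorate.
    # ASSUMPTION: the surviving side is the favoured one.
    return auto_tie_breaker and live * 2 == expected


def stop_one_by_one(total, last_man_standing=False, auto_tie_breaker=False):
    """Stop nodes one at a time; return (live_nodes, quorate) per step."""
    expected = total
    history = []
    for live in range(total - 1, 0, -1):
        quorate = is_quorate(live, expected, auto_tie_breaker)
        history.append((live, quorate))
        if not quorate:
            break  # with disk-less sbd, the remaining nodes self-fence here
        if last_man_standing:
            # ASSUMPTION: the window has elapsed, so expected votes
            # are recalculated down to the number of live nodes.
            expected = live
    return history


if __name__ == "__main__":
    print(stop_one_by_one(3))
    print(stop_one_by_one(3, last_man_standing=True))
    print(stop_one_by_one(3, last_man_standing=True, auto_tie_breaker=True))
```

In this model, last_man_standing alone does not save the last node of a
3-node cluster: going from 2 nodes down to 1 still loses quorum unless
auto_tie_breaker is enabled as well.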