[ClusterLabs] Gracefully stop nodes one by one with disk-less sbd

Yan Gao YGao at suse.com
Mon Aug 12 11:00:05 EDT 2019



On 8/12/19 3:24 PM, Klaus Wenninger wrote:
> On 8/12/19 2:30 PM, Yan Gao wrote:
>> Hi Klaus,
>>
>> On 8/12/19 1:39 PM, Klaus Wenninger wrote:
>>> On 8/9/19 9:06 PM, Yan Gao wrote:
>>>> On 8/9/19 6:40 PM, Andrei Borzenkov wrote:
>>>>> 09.08.2019 16:34, Yan Gao wrote:
>>>>>> Hi,
>>>>>>
>>>>>> With disk-less sbd, it's fine to stop the cluster services on all
>>>>>> of the cluster nodes at the same time.
>>>>>>
>>>>>> But when stopping the nodes one by one, for example in a 3-node
>>>>>> cluster, after stopping the 2nd node, the only remaining node
>>>>>> resets itself with:
>>>>>>
>>>>> That is sort of documented in SBD manual page:
>>>>>
>>>>> --><--
>>>>> However, while the cluster is in such a degraded state, it can
>>>>> neither successfully fence nor be shutdown cleanly (as taking the
>>>>> cluster below the quorum threshold will immediately cause all remaining
>>>>> nodes to self-fence).
>>>>> --><--
>>>>>
>>>>> SBD in shared-nothing mode is basically always in such a degraded
>>>>> state and cannot tolerate loss of quorum.
>>>> Well, the context here is it loses quorum *expectedly* since the other
>>>> nodes gracefully shut down.
>>>>
>>>>>> Aug 09 14:30:20 opensuse150-1 sbd[1079]:       pcmk:    debug:
>>>>>> notify_parent: Not notifying parent: state transient (2)
>>>>>> Aug 09 14:30:20 opensuse150-1 sbd[1080]:    cluster:    debug:
>>>>>> notify_parent: Notifying parent: healthy
>>>>>> Aug 09 14:30:20 opensuse150-1 sbd[1078]:  warning: inquisitor_child:
>>>>>> Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
>>>>>>
>>>>>> I can think of manipulating quorum with last_man_standing and
>>>>>> potentially also auto_tie_breaker, not to mention that
>>>>>> last_man_standing_window would also be a factor ... But is there
>>>>>> a better solution?
>>>>>>
>>>>> The lack of a cluster-wide shutdown mode has been mentioned more
>>>>> than once on this list. I guess the only workaround is to use
>>>>> higher-level tools which basically try to stop the cluster on all
>>>>> nodes at once. That is still susceptible to race conditions.
>>>> Gracefully stopping nodes one by one on purpose is still a reasonable
>>>> need though ...
>>> If you do the teardown as e.g. pcs does it - first tear down the
>>> pacemaker instances and then corosync/sbd - it is at least possible
>>> to tear down the pacemaker instances one by one without risking a
>>> reboot due to quorum-loss.
>>> With a reasonably current sbd containing
>>> - https://github.com/ClusterLabs/sbd/commit/824fe834c67fb7bae7feb87607381f9fa8fa2945
>>> - https://github.com/ClusterLabs/sbd/commit/79b778debfee5b4ab2d099b2bfc7385f45597f70
>>> - https://github.com/ClusterLabs/sbd/commit/a716a8ddd3df615009bcff3bd96dd9ae64cb5f68
>>> this should be pretty robust, although we are still thinking of
>>> having a cleaner way to detect a graceful pacemaker shutdown
>>> (probably together with some heartbeat to pacemakerd that assures
>>> pacemakerd is checking the liveness of its sub-daemons properly).
>> These are all good improvements, thanks!
>>
>> But in this case the remaining node is not shutting down yet, or
>> rather it's intentionally not being shut down :-) The loss of quorum
>> is expected, and so is following no-quorum-policy, but a self-reset
>> is probably too much?
> Hmm ... not sure if I can follow ...
> If you shut down solely pacemaker, one by one on all nodes,
> and these shutdowns are considered graceful, then you are
> not gonna experience any reboots (e.g. in a 3-node cluster).
> Afterwards you can shut down corosync one by one as well
> without experiencing reboots, as without the cib-connection
> sbd isn't gonna check for quorum anymore (all resources are
> down, so there is no need to reboot in case of quorum-loss -
> extra care has to be taken with unmanaged resources, but
> that isn't particular to sbd).
I meant that if users would like to shut down only 2 out of the 3
nodes in the cluster and keep the last one online and alive, that's
simply not possible for now, even though the loss of quorum is
expected.
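
For reference, the quorum knobs I mentioned above would go into the
quorum section of corosync.conf, roughly like this (just a sketch, the
window value is only an example - see votequorum(5) for the details):

    quorum {
        provider: corosync_votequorum
        # recalculate expected_votes downwards when nodes leave
        # the cluster gracefully
        last_man_standing: 1
        # how long (ms) to wait before recalculating (example value)
        last_man_standing_window: 10000
        # on an even split, keep the partition that contains the
        # node with the lowest nodeid quorate
        auto_tie_breaker: 1
    }

Whether diskless sbd is then happy with the recalculated quorum is of
course exactly the open question here.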
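
And just to spell out the teardown order Klaus described, a rough
sketch assuming systemd units and hypothetical node names (pcs/crm
wrappers would do the equivalent):

    # phase 1: stop pacemaker on every node, one by one - these are
    # graceful shutdowns, so no quorum-related reboots are expected
    for node in node1 node2 node3; do
        ssh "$node" systemctl stop pacemaker
    done

    # phase 2: stop corosync one node at a time; on typical setups
    # sbd.service is bound to corosync.service and stops with it
    for node in node1 node2 node3; do
        ssh "$node" systemctl stop corosync
    done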

Regards,
   Yan

