[ClusterLabs] Antw: [EXT] Re: Stopping the last node with pcs

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Thu Apr 29 02:53:36 EDT 2021


>>> Klaus Wenninger <kwenning at redhat.com> wrote on 28.04.2021 at 16:26 in
message <65bb39be-918e-974c-82a4-522f49bb315b at redhat.com>:
> On 4/28/21 4:10 PM, Ken Gaillot wrote:
>> On Tue, 2021-04-27 at 23:23 -0400, Digimer wrote:
>>> Hi all,
>>>
>>>    I noticed something odd.
>>>
>>> ====
>>> [root at an-a02n01 ~]# pcs cluster status
>>> Cluster Status:
>>>   Cluster Summary:
>>>     * Stack: corosync
>>>     * Current DC: an-a02n01 (version 2.0.4-6.el8_3.2-2deceaa3ae) -
>>> partition with quorum
>>>     * Last updated: Tue Apr 27 23:20:45 2021
>>>     * Last change:  Tue Apr 27 23:12:40 2021 by root via cibadmin on
>>> an-a02n01
>>>     * 2 nodes configured
>>>     * 12 resource instances configured (4 DISABLED)
>>>   Node List:
>>>     * Online: [ an-a02n01 ]
>>>     * OFFLINE: [ an-a02n02 ]
>>>
>>> PCSD Status:
>>>    an-a02n01: Online
>>>    an-a02n02: Offline
>>> ====
>>> [root at an-a02n01 ~]# pcs cluster stop
>>> Error: Stopping the node will cause a loss of the quorum, use --force
>>> to override
>>> ====
>>>
>>>    Shouldn't pcs know it's the last node and shut down without
>>> complaint?
>> It knows, it's just not sure you know :)
>>
>> pcs's design philosophy is to hand-hold users by default and give
>> expert users --force.
>>
>> The idea in this case is that (especially in 3-to-5-node clusters)
>> someone might not realize that stopping one node could make all
>> resources stop cluster-wide.
> Guess what we're seeing is tailored for >=3 node clusters.
> There you don't need to warn before stopping the last node
> but it definitely makes more sense to warn before stopping
> the node that would lead to quorum loss.
> And the statement isn't really wrong; it is just a bit
> surprising.
> Having an adapted message for cases where we have just
> one node left that is still running resources might make
> things clearer (2-node, LMS, qdevice, ...).

An alternative could be (if desired) to make the command "conditionally
interactive", like this:
Stopping the node will cause a loss of quorum, thus stopping all resources;
proceed (y/n)?

Of course that would only apply if the cluster is not in maintenance mode and
the loss of quorum still triggers stopping of all resources...
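
To make the idea concrete, here is a rough sketch of the decision logic in
Python. This is not pcs code: the function name, its parameters and the
injectable "ask" callback are all hypothetical, and treating only
no-quorum-policy=stop as "stops all resources" is a simplifying assumption.

```python
def confirm_stop(causes_quorum_loss, maintenance_mode,
                 no_quorum_policy="stop", force=False, ask=input):
    """Return True if stopping the node should proceed.

    Hypothetical helper for illustration only; not part of pcs.
    """
    # Assumption: resources are only stopped when quorum is lost, the
    # cluster is not in maintenance mode, and the policy is "stop".
    resources_would_stop = (
        causes_quorum_loss
        and not maintenance_mode
        and no_quorum_policy == "stop"
    )
    # Nothing to warn about, or the user already passed --force.
    if force or not resources_would_stop:
        return True
    answer = ask(
        "Stopping the node will cause a loss of quorum, "
        "thus stopping all resources; proceed (y/n)? "
    )
    return answer.strip().lower() in ("y", "yes")
```

Passing a different "ask" callable keeps the prompt testable and would let
non-interactive callers (scripts, the web UI) fall back to the current
"use --force" behaviour instead of blocking on input.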

Or maybe even simpler for the advanced users:
Do you know what you are doing (y/n)?

;-)

Regards,
Ulrich


> 
> Klaus
> 
