[ClusterLabs] Antw: Re: questions about startup fencing

Mon Dec 4 12:31:06 CET 2017

On 4.12.2017 at 10:36, Jehan-Guillaume de Rorthais wrote:
> On Fri, 01 Dec 2017 16:34:08 -0600
> Ken Gaillot <kgaillot at redhat.com> wrote:
> 
>> On Thu, 2017-11-30 at 07:55 +0100, Ulrich Windl wrote:
>>>
>>>    
>>>> Kristoffer Gronlund <kgronlund at suse.com> wrote:
>>>>> Adam Spiers <aspiers at suse.com> writes:
>>>>>    
>>>>>> - The whole cluster is shut down cleanly.
>>>>>>
>>>>>> - The whole cluster is then started up again.  (Side question: what
>>>>>>    happens if the last node to shut down is not the first to start up?
>>>>>>    How will the cluster ensure it has the most recent version of the
>>>>>>    CIB?  Without that, how would it know whether the last man standing
>>>>>>    was shut down cleanly or not?)
>>>>>
>>>>> This is my opinion, I don't really know what the "official" pacemaker
>>>>> stance is: There is no such thing as shutting down a cluster cleanly. A
>>>>> cluster is a process stretching over multiple nodes - if they all shut
>>>>> down, the process is gone. When you start up again, you effectively have
>>>>> a completely new cluster.
>>>>
>>>> Sorry, I don't follow you at all here.  When you start the cluster up
>>>> again, the cluster config from before the shutdown is still there.
>>>> That's very far from being a completely new cluster :-)
>>>
>>> The problem is you cannot "start the cluster" in pacemaker; you can
>>> only "start nodes". The nodes will come up one by one. As opposed (as
>>> I had said) to HP Service Guard, where there is a "cluster formation
>>> timeout". That is, the nodes wait for the specified time for the
>>> cluster to "form". Then the cluster starts as a whole. Of course that
>>> only applies if the whole cluster was down, not if a single node was
>>> down.
>>
>> I'm not sure what that would specifically entail, but I'm guessing we
>> have some of the pieces already:
>>
>> - Corosync has a wait_for_all option if you want the cluster to be
>> unable to have quorum at start-up until every node has joined. I don't
>> think you can set a timeout that cancels it, though.
>>
>> - Pacemaker will wait dc-deadtime for the first DC election to
>> complete. (if I understand it correctly ...)
>>
>> - Higher-level tools can start or stop all nodes together (e.g. pcs has
>> pcs cluster start/stop --all).
> 
> Based on this discussion, I have some questions about pcs:
> 
> * how is it shutting down the cluster when issuing "pcs cluster stop --all"?

First, it sends a request to each node to stop pacemaker. The requests 
are sent in parallel, which prevents resources from being moved from node 
to node. Once pacemaker has stopped on all nodes, corosync is stopped on 
all nodes in the same manner.
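
To illustrate the ordering only, here is a minimal Python sketch of that
two-phase stop. It is not the actual pcs code: the function names and the
stop_on_node callback are hypothetical, and how a stop request really
reaches a node is left to that callback.

```python
from concurrent.futures import ThreadPoolExecutor


def stop_service_everywhere(nodes, service, stop_on_node):
    """Ask every node to stop `service` at the same time, then wait for all."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        # list() consumes the results, so we only return once every
        # request has finished (or one of them has raised).
        list(pool.map(lambda node: stop_on_node(node, service), nodes))


def stop_cluster(nodes, stop_on_node):
    """Hypothetical two-phase stop: pacemaker everywhere, then corosync."""
    # Phase 1: all nodes are told to stop pacemaker in parallel, so no
    # node is left running long enough to take over resources.
    stop_service_everywhere(nodes, "pacemaker", stop_on_node)
    # Phase 2: starts only after pacemaker is down on every node.
    stop_service_everywhere(nodes, "corosync", stop_on_node)
```

The point of the barrier between the two phases is that corosync is still
running everywhere while pacemaker shuts down.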

> * any race condition possible where the cib will record only one node up before
>    the last one shut down?
> * will the cluster start safely?
> 
> IIRC, crmsh does not implement the full cluster shutdown, only one node shut
> down at a time. Is it because Pacemaker has no way to shut down the whole
> cluster by stopping all resources everywhere forbidding failovers in the
> process?
> 
> Is it required to include a bunch of "pcs resource disable <rid>" before
> shutting down the cluster?

No.

Regards,
Tomas

> 
> Thanks,
>