[ClusterLabs] Antw: Re: Antw: Re: questions about startup fencing

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Tue Dec 5 08:42:09 EST 2017



>>> Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote on 04.12.2017 at 14:21 in
message <20171204142148.446ec356 at firost>:
> On Mon, 4 Dec 2017 12:31:06 +0100
> Tomas Jelinek <tojeline at redhat.com> wrote:
> 
>> On 04.12.2017 at 10:36, Jehan-Guillaume de Rorthais wrote:
>> > On Fri, 01 Dec 2017 16:34:08 -0600
>> > Ken Gaillot <kgaillot at redhat.com> wrote:
>> >   
>> >> On Thu, 2017-11-30 at 07:55 +0100, Ulrich Windl wrote:  
>> >>>
>> >>>      
>> >>>> Kristoffer Gronlund <kgronlund at suse.com> wrote:  
>> >>>>> Adam Spiers <aspiers at suse.com> writes:
>> >>>>>      
>> >>>>>> - The whole cluster is shut down cleanly.
>> >>>>>>
>> >>>>>> - The whole cluster is then started up again.  (Side question:
>> >>>>>>    what happens if the last node to shut down is not the first to
>> >>>>>>    start up?  How will the cluster ensure it has the most recent
>> >>>>>>    version of the CIB?  Without that, how would it know whether
>> >>>>>>    the last man standing was shut down cleanly or not?)  
>> >>>>>
>> >>>>> This is my opinion, I don't really know what the "official"
>> >>>>> pacemaker stance is: There is no such thing as shutting down a
>> >>>>> cluster cleanly. A cluster is a process stretching over multiple
>> >>>>> nodes - if they all shut down, the process is gone. When you start
>> >>>>> up again, you effectively have a completely new cluster.  
>> >>>>
>> >>>> Sorry, I don't follow you at all here.  When you start the cluster
>> >>>> up again, the cluster config from before the shutdown is still
>> >>>> there.  That's very far from being a completely new cluster :-)  
>> >>>
>> >>> The problem is you cannot "start the cluster" in pacemaker; you can
>> >>> only "start nodes". The nodes will come up one by one. As opposed (as
>> >>> I had said) to HP Serviceguard, where there is a "cluster formation
>> >>> timeout". That is, the nodes wait for the specified time for the
>> >>> cluster to "form". Then the cluster starts as a whole. Of course that
>> >>> only applies if the whole cluster was down, not if a single node was
>> >>> down.  
>> >>
>> >> I'm not sure what that would specifically entail, but I'm guessing we
>> >> have some of the pieces already:
>> >>
>> >> - Corosync has a wait_for_all option if you want the cluster to be
>> >> unable to have quorum at start-up until every node has joined. I don't
>> >> think you can set a timeout that cancels it, though.
>> >>
>> >> - Pacemaker will wait dc-deadtime for the first DC election to
>> >> complete. (if I understand it correctly ...)
>> >>
>> >> - Higher-level tools can start or stop all nodes together (e.g. pcs has
>> >> pcs cluster start/stop --all).  
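
For reference, wait_for_all lives in the quorum section of corosync.conf,
and dc-deadtime is an ordinary cluster property. A minimal sketch, assuming
corosync 2.x with votequorum (the values are illustrative, not recommendations):

    quorum {
        provider: corosync_votequorum
        # after a full-cluster stop, don't grant quorum until every
        # node has been seen at least once
        wait_for_all: 1
    }

    # dc-deadtime can be set like any other cluster property, e.g.:
    pcs property set dc-deadtime=2min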
>> > 
>> > Based on this discussion, I have some questions about pcs:
>> > 
>> > * how does it shut down the cluster when issuing "pcs cluster stop
>> > --all"?  
>> 
>> First, it sends a request to each node to stop pacemaker. The requests 
>> are sent in parallel, which prevents resources from being moved from 
>> node to node. Once pacemaker has stopped on all nodes, corosync is 
>> stopped on all nodes in the same manner.
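
To illustrate the two phases described above, here is a rough hand-rolled
equivalent. This is only a sketch: pcs uses its own node-to-node protocol
rather than ssh, and node1..node3 are placeholder names:

    # phase 1: stop pacemaker on all nodes in parallel, so resources
    # are not migrated to nodes that are still up
    for n in node1 node2 node3; do
        ssh "$n" 'systemctl stop pacemaker' &
    done
    wait

    # phase 2: only after pacemaker is down everywhere, stop the
    # membership/messaging layer
    for n in node1 node2 node3; do
        ssh "$n" 'systemctl stop corosync' &
    done
    wait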
> 
> What if, for some external reason (load, network, whatever), one node
> is slower than the others and starts reacting later? Sending the
> requests in parallel doesn't feel safe enough with regard to all the
> race conditions that can occur at the same time.
> 
> Am I missing something?

I can only agree that this type of "cluster shutdown" is unclean, most likely leaving each node with a different CIB (and many aborted transitions).
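
The CIB's root <cib> element carries the admin_epoch/epoch/num_updates
version tuple that the cluster compares (highest wins) when nodes rejoin.
A quick way to compare the copies after a shutdown, assuming the default
on-disk location and placeholder node names:

    # compare on-disk CIB versions across nodes; the first line of the
    # file is the <cib> element with admin_epoch/epoch/num_updates
    for n in node1 node2 node3; do
        echo -n "$n: "
        ssh "$n" 'head -n 1 /var/lib/pacemaker/cib/cib.xml'
    done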
