[ClusterLabs] Antw: Re: questions about startup fencing

Fri Dec 1 17:34:08 EST 2017

On Thu, 2017-11-30 at 07:55 +0100, Ulrich Windl wrote:
> 
> 
> > Kristoffer Gronlund <kgronlund at suse.com> wrote:
> > > Adam Spiers <aspiers at suse.com> writes:
> > > 
> > > > - The whole cluster is shut down cleanly.
> > > > 
> > > > - The whole cluster is then started up again.  (Side question: what
> > > >   happens if the last node to shut down is not the first to start up?
> > > >   How will the cluster ensure it has the most recent version of the
> > > >   CIB?  Without that, how would it know whether the last man standing
> > > >   was shut down cleanly or not?)
> > > 
> > > This is my opinion, I don't really know what the "official" pacemaker
> > > stance is: There is no such thing as shutting down a cluster cleanly. A
> > > cluster is a process stretching over multiple nodes - if they all shut
> > > down, the process is gone. When you start up again, you effectively have
> > > a completely new cluster.
> > 
> > Sorry, I don't follow you at all here.  When you start the cluster up
> > again, the cluster config from before the shutdown is still there.
> > That's very far from being a completely new cluster :-)
> 
> The problem is you cannot "start the cluster" in pacemaker; you can
> only "start nodes". The nodes will come up one by one. As opposed (as
> I had said) to HP Service Guard, where there is a "cluster formation
> timeout". That is, the nodes wait for the specified time for the
> cluster to "form". Then the cluster starts as a whole. Of course that
> only applies if the whole cluster was down, not if a single node was
> down.

I'm not sure what that would specifically entail, but I'm guessing we
have some of the pieces already:

- Corosync has a wait_for_all option if you want the cluster to be
unable to have quorum at start-up until every node has joined. I don't
think you can set a timeout that cancels it, though.

- Pacemaker will wait dc-deadtime for the first DC election to
complete. (if I understand it correctly ...)

- Higher-level tools can start or stop all nodes together (e.g. pcs has
pcs cluster start/stop --all).
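
As a rough sketch of the first piece (untested here, and assuming a
votequorum-based corosync setup), the quorum section of corosync.conf
would look something like:

```
quorum {
    provider: corosync_votequorum
    # Do not grant quorum after a full cluster start until every
    # node has been seen at least once.
    wait_for_all: 1
}
```

dc-deadtime is an ordinary Pacemaker cluster property, so it can be
changed with whatever tool you normally use to set properties.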

> > 
> > > When starting up, how is the cluster, at any point, to know if the
> > > cluster it has knowledge of is the "latest" cluster?
> > 
> > That was exactly my question.
> > 
> > > The next node could have a newer version of the CIB which adds yet
> > > more nodes to the cluster.
> > 
> > Yes, exactly.  If the first node to start up was not the last man
> > standing, the CIB history is effectively being forked.  So how is this
> > issue avoided?
> 
> Quorum? "Cluster formation delay"?
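
For illustration only, here is a sketch of how two copies of the CIB
could be ranked once more than one node is up. It assumes the comparison
uses the admin_epoch, epoch and num_updates attributes of the cib
element, in that order; the function names and sample values are made up
for the example and are not Pacemaker code.

```python
from typing import Dict, Mapping, Tuple


def cib_version(attrs: Mapping[str, str]) -> Tuple[int, int, int]:
    """Return (admin_epoch, epoch, num_updates) as a comparable tuple.

    Missing attributes are treated as 0 (an assumption of this sketch).
    """
    return (
        int(attrs.get("admin_epoch", 0)),
        int(attrs.get("epoch", 0)),
        int(attrs.get("num_updates", 0)),
    )


def newest_cib(copies: Dict[str, Mapping[str, str]]) -> str:
    """Return the name of the node whose CIB copy has the highest version."""
    return max(copies, key=lambda name: cib_version(copies[name]))


# Hypothetical values: node2 was the last one standing and saw one more
# configuration change (epoch 42) than node1 did.
copies = {
    "node1": {"admin_epoch": "0", "epoch": "41", "num_updates": "7"},
    "node2": {"admin_epoch": "0", "epoch": "42", "num_updates": "0"},
}
print(newest_cib(copies))
```

Tuples compare left to right, so a higher epoch wins regardless of
num_updates. Note this only helps once the node with the newer copy has
actually joined; it does nothing for a node that starts alone.
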
> 
> > 
> > > The only way to bring up a cluster from being completely stopped is to
> > > treat it as creating a completely new cluster. The first node to start
> > > "creates" the cluster and later nodes join that cluster.
> > 
> > That's ignoring the cluster config, which persists even when the
> > cluster's down.
> > 
> > But to be clear, you picked a small side question from my original
> > post and answered that.  The main questions I had were about startup
> > fencing :-)
-- 
Ken Gaillot <kgaillot at redhat.com>