[ClusterLabs] Antw: Re: questions about startup fencing
Jehan-Guillaume de Rorthais
jgdr at dalibo.com
Mon Dec 4 10:36:41 CET 2017
On Fri, 01 Dec 2017 16:34:08 -0600
Ken Gaillot <kgaillot at redhat.com> wrote:
> On Thu, 2017-11-30 at 07:55 +0100, Ulrich Windl wrote:
> >
> >
> > > Kristoffer Gronlund <kgronlund at suse.com> wrote:
> > > > Adam Spiers <aspiers at suse.com> writes:
> > > >
> > > > > - The whole cluster is shut down cleanly.
> > > > >
> > > > > - The whole cluster is then started up again. (Side question:
> > > > >   what happens if the last node to shut down is not the first
> > > > >   to start up? How will the cluster ensure it has the most
> > > > >   recent version of the CIB? Without that, how would it know
> > > > >   whether the last man standing was shut down cleanly or not?)
> > > >
> > > > This is my opinion, I don't really know what the "official"
> > > > pacemaker stance is: There is no such thing as shutting down a
> > > > cluster cleanly. A cluster is a process stretching over multiple
> > > > nodes - if they all shut down, the process is gone. When you
> > > > start up again, you effectively have a completely new cluster.
> > >
> > > Sorry, I don't follow you at all here. When you start the cluster
> > > up again, the cluster config from before the shutdown is still
> > > there. That's very far from being a completely new cluster :-)
> >
> > The problem is you cannot "start the cluster" in pacemaker; you can
> > only "start nodes". The nodes will come up one by one. As opposed (as
> > I had said) to HP Serviceguard, where there is a "cluster formation
> > timeout". That is, the nodes wait for the specified time for the
> > cluster to "form". Then the cluster starts as a whole. Of course that
> > only applies if the whole cluster was down, not if a single node was
> > down.
>
> I'm not sure what that would specifically entail, but I'm guessing we
> have some of the pieces already:
>
> - Corosync has a wait_for_all option if you want the cluster to be
> unable to have quorum at start-up until every node has joined. I don't
> think you can set a timeout that cancels it, though.
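For reference, wait_for_all lives in the quorum section of
corosync.conf; a minimal votequorum sketch (and, per votequorum(5),
"two_node: 1" enables it implicitly):

    quorum {
        provider: corosync_votequorum
        wait_for_all: 1
    }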
>
> - Pacemaker will wait dc-deadtime for the first DC election to
> complete. (if I understand it correctly ...)
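dc-deadtime is a regular cluster property, so it should be tunable
with something like the following (20s is the documented default; the
value here is only an example):

    pcs property set dc-deadtime=2min

or the crmsh equivalent "crm configure property dc-deadtime=2min".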
>
> - Higher-level tools can start or stop all nodes together (e.g. pcs has
> pcs cluster start/stop --all).
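For example, with a recent pcs:

    # stop pacemaker and corosync on every node, then start them again
    pcs cluster stop --all
    pcs cluster start --all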
Based on this discussion, I have some questions about pcs:
* how does it shut down the cluster when issuing "pcs cluster stop --all"?
* is any race condition possible where the CIB records only one node up
  before the last one shuts down?
* will the cluster then start safely?
IIRC, crmsh does not implement a full-cluster shutdown, only shutting
down one node at a time. Is that because Pacemaker has no way to shut
down the whole cluster by stopping all resources everywhere, forbidding
failovers in the process?
Is it required to issue a bunch of "pcs resource disable <rid>" commands
before shutting down the cluster?
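Something like the following sketch is what I have in mind; the
resource ids (pgsql-ha, vip) are hypothetical, and whether this dance
is actually needed is precisely my question:

    # disable every resource first so nothing tries to fail over
    # while the nodes leave one by one (ids are examples)
    pcs resource disable pgsql-ha
    pcs resource disable vip
    # then stop the whole cluster
    pcs cluster stop --all
    # on next startup, re-enable once all nodes have joined
    pcs cluster start --all
    pcs resource enable pgsql-ha
    pcs resource enable vip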
Thanks,
--
Jehan-Guillaume de Rorthais
Dalibo