[ClusterLabs] Antw: Re: questions about startup fencing

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Thu Nov 30 01:55:14 EST 2017

> Kristoffer Gronlund <kgronlund at suse.com> wrote:
>>Adam Spiers <aspiers at suse.com> writes:
>>> - The whole cluster is shut down cleanly.
>>> - The whole cluster is then started up again.  (Side question: what
>>>   happens if the last node to shut down is not the first to start up?
>>>   How will the cluster ensure it has the most recent version of the
>>>   CIB?  Without that, how would it know whether the last man standing
>>>   was shut down cleanly or not?)
>>This is my opinion, I don't really know what the "official" pacemaker
>>stance is: There is no such thing as shutting down a cluster cleanly. A
>>cluster is a process stretching over multiple nodes - if they all shut
>>down, the process is gone. When you start up again, you effectively have
>>a completely new cluster.
> Sorry, I don't follow you at all here.  When you start the cluster up
> again, the cluster config from before the shutdown is still there.
> That's very far from being a completely new cluster :-)

The problem is that you cannot "start the cluster" in Pacemaker; you can only "start nodes", and the nodes come up one by one. Contrast that (as I said before) with HP Serviceguard, which has a "cluster formation timeout": the nodes wait for the specified time for the cluster to "form", and then the cluster starts as a whole. Of course, that only applies if the whole cluster was down, not if a single node was down.
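
To illustrate the "cluster formation timeout" idea, here is a minimal sketch in Python. It is neither Serviceguard nor Pacemaker code; the function name `wait_for_formation` and the `poll_members` callback are hypothetical, and it only shows the general pattern of waiting a bounded time for all expected nodes before starting as a whole.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_formation(expected: Iterable[str],
                       poll_members: Callable[[], Set[str]],
                       timeout_s: float,
                       interval_s: float = 1.0,
                       clock: Callable[[], float] = time.monotonic,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once every expected node has been seen, or False if
    the formation timeout expires first.

    poll_members is a hypothetical callback that returns the set of node
    names currently visible; clock and sleep are injectable for testing.
    """
    expected_set = set(expected)
    deadline = clock() + timeout_s
    while True:
        # Start "as a whole" only when all expected nodes are present.
        if expected_set <= poll_members():
            return True
        # Give up once the formation timeout has passed.
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

What to do when this returns False (refuse to start, or fall back to a quorum decision) is a policy choice that the sketch deliberately leaves open.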

>>When starting up, how is the cluster, at any point, to know if the
>>cluster it has knowledge of is the "latest" cluster?
> That was exactly my question.
>>The next node could have a newer version of the CIB which adds yet
>>more nodes to the cluster.
> Yes, exactly.  If the first node to start up was not the last man
> standing, the CIB history is effectively being forked.  So how is this
> issue avoided?

Quorum? "Cluster formation delay"?
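
A rough sketch of how those two ideas could combine, not a description of what Pacemaker actually does at startup: wait until a majority of the known nodes is present, then take the configuration with the highest version. The CIB carries the version attributes admin_epoch, epoch and num_updates; comparing them in that order as a tuple is my assumption for this illustration, and the helper names `has_quorum` and `newest_cib` are hypothetical.

```python
from typing import Dict, Tuple

# Assumed ordering of the CIB version fields: (admin_epoch, epoch, num_updates)
Version = Tuple[int, int, int]


def has_quorum(nodes_present: int, nodes_total: int) -> bool:
    """Strict majority: more than half of the known nodes are present."""
    return nodes_present * 2 > nodes_total


def newest_cib(versions: Dict[str, Version]) -> str:
    """Return the name of the node holding the highest CIB version.

    Tuples compare field by field, so admin_epoch is checked first,
    then epoch, then num_updates.
    """
    return max(versions, key=lambda node: versions[node])
```

Any two majorities of the same node set share at least one node, so as long as configuration changes were only accepted with quorum, a quorate startup includes at least one node that saw the latest change. That argument breaks down if the node list itself changed in between.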

>>The only way to bring up a cluster from being completely stopped is to
>>treat it as creating a completely new cluster. The first node to start
>>"creates" the cluster and later nodes join that cluster.
> That's ignoring the cluster config, which persists even when the
> cluster's down.
> But to be clear, you picked a small side question from my original
> post and answered that.  The main questions I had were about startup
> fencing :-)
