[ClusterLabs] Antw: Re: questions about startup fencing

Mon Dec 4 15:47:18 UTC 2017

On 4.12.2017 at 16:02, Kristoffer Grönlund wrote:
> Tomas Jelinek <tojeline at redhat.com> writes:
> 
>>>
>>> * how is it shutting down the cluster when issuing "pcs cluster stop --all"?
>>
>> First, it sends a request to each node to stop pacemaker. The requests
>> are sent in parallel which prevents resources from being moved from node
>> to node. Once pacemaker stops on all nodes, corosync is stopped on all
>> nodes in the same manner.
>>
>>> * any race condition possible where the cib will record only one node up before
>>>     the last one shut down?
>>> * will the cluster start safely?
> 
> That definitely sounds racy to me. The best idea I can think of would be
> to set all nodes except one in standby, and then shutdown pacemaker
> everywhere...
> 

What issue would that solve? And which node should be the one left out 
of standby?

How do you get the nodes out of standby mode on startup? Sure, 'pcs 
cluster start --all' could do that, provided it is what actually starts 
the cluster. But what if you start the cluster by restarting the nodes? 
Or by starting corosync and pacemaker via systemd without using pcs? Or 
by any other method?

There is no reliable way to get nodes out of standby/maintenance mode on 
start, so we must stick to a plain pacemaker shutdown.

Moreover, even if pcs is used, how do we know a node was put into 
standby because the whole cluster was stopped and not because a user set 
it to standby manually for whatever reason?
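
For reference, the stop sequence described earlier in this thread 
(pacemaker stopped on all nodes in parallel, then corosync in the same 
manner) could be sketched roughly as below. This is not pcs code, only 
an illustration of the ordering. The names stop_on_all and stop_cluster 
are made up for this example, and stop_service is a hypothetical 
callable (node, service) that is assumed to block until the service has 
stopped on that node.

```python
from concurrent.futures import ThreadPoolExecutor


def stop_on_all(nodes, service, stop_service):
    """Ask every node to stop `service` at once and wait for all of them.

    `stop_service` is a hypothetical callable (node, service) supplied by
    the caller; it is assumed to block until the service is stopped.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        # list() waits for every request and re-raises the first failure.
        return list(pool.map(lambda node: stop_service(node, service), nodes))


def stop_cluster(nodes, stop_service):
    # Phase 1: stop pacemaker everywhere in parallel, so no node is left
    # running long enough to take over resources from a stopping peer.
    stop_on_all(nodes, "pacemaker", stop_service)
    # Phase 2: stop corosync only after pacemaker is down on all nodes.
    stop_on_all(nodes, "corosync", stop_service)
```

The point of the barrier between the two phases is that corosync is not 
touched on any node until pacemaker has finished stopping on all of 
them.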