[ClusterLabs] Reply: Re: reboot node / cluster standby
kgaillot at redhat.com
Mon Jul 3 12:30:12 EDT 2017
On 07/03/2017 08:30 AM, philipp.achmueller at arz.at wrote:
> Ken Gaillot <kgaillot at redhat.com> wrote on 29.06.2017 21:15:59:
>> From: Ken Gaillot <kgaillot at redhat.com>
>> To: Ludovic Vaugeois-Pepin <ludovicvp at gmail.com>, Cluster Labs - All
>> topics related to open-source clustering welcomed <users at clusterlabs.org>
>> Date: 29.06.2017 21:19
>> Subject: Re: [ClusterLabs] reboot node / cluster standby
>> On 06/29/2017 01:38 PM, Ludovic Vaugeois-Pepin wrote:
>> > On Thu, Jun 29, 2017 at 7:27 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
>> >> On 06/29/2017 04:42 AM, philipp.achmueller at arz.at wrote:
>> >>> Hi,
>> >>> In order to reboot a cluster node, I would like to set the node to
>> >>> standby first, so that a clean takeover of running resources can take place.
>> >>> Is there a default way I can set this in Pacemaker, or do I have to set up my
>> >>> own systemd implementation?
>> >>> thank you!
>> >>> regards
>> >>> ------------------------
>> >>> env:
>> >>> Pacemaker 1.1.15
>> >>> SLES 12.2
>> >> If a node cleanly shuts down or reboots, pacemaker will move all
>> >> resources off it before it exits, so that should happen as you're
>> >> describing, without needing an explicit standby.
> How does this work when evacuating, e.g., 5 nodes out of a 10-node cluster
> at the same time?
A clean shutdown works the same regardless of the situation:
- the OS (systemd or whatever) sends a signal to pacemakerd to exit
- a pacemaker daemon on the local node sends a shutdown request to the
  DC node
- the DC node moves all resources off the node
- the DC sends an "ok to shutdown" message to the node
- the node's pacemaker daemons exit
- the OS proceeds with system shutdown
The only wrinkle with shutting down 5 out of 10 nodes at once is that most
likely (depending on your configuration) you are losing quorum, and the
cluster will stop all resources on all nodes.
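
To make the quorum arithmetic concrete, here is a minimal sketch. It assumes one vote per node and a plain strict-majority rule, with no special quorum options configured; the function names are made up for illustration and are not part of any Pacemaker or Corosync API.

```python
def has_quorum(total_votes: int, votes_present: int) -> bool:
    """Strict majority: more than half of all expected votes must be present.

    Assumption: one vote per node and no special quorum options.
    """
    return 2 * votes_present > total_votes


def max_nodes_removable(total_votes: int) -> int:
    """Largest number of one-vote nodes that can leave with quorum retained."""
    return (total_votes - 1) // 2


total = 10
for leaving in (4, 5):
    remaining = total - leaving
    print(f"{leaving} of {total} nodes leaving, quorum kept: "
          f"{has_quorum(total, remaining)}")
```

Under these assumptions, a 10-node cluster keeps quorum with 4 nodes gone but not with 5, since 5 remaining votes are exactly half and not a majority.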
>> > This makes me wonder about timeouts. Specifically OS/systemd timeouts.
>> > Say the node being shut down or rebooted holds a resource as a master,
>> > and it takes a while for the demote to complete, say 100 seconds (less
>> > than the demote timeout of 120s in this hypothetical scenario). Will
>> > the OS/systemd wait until pacemaker exits cleanly on a regular CentOS
>> > or Debian?
>> Yes. The pacemaker systemd unit file uses TimeoutStopSec=30min.
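
As a rough sanity check of the numbers in this exchange, the sketch below compares operation timeouts against the unit's stop timeout. The helper names are hypothetical, the parser handles only a small subset of systemd time spans, and summing the timeouts assumes the operations run one after another, which is a pessimistic simplification.

```python
import re

# Subset of systemd time-span units, in seconds (assumption: this is
# enough for values such as "30min" or "120s").
_UNIT_SECONDS = {"s": 1, "sec": 1, "min": 60, "h": 3600}


def parse_timespan(value: str) -> int:
    """Convert a simple span such as '30min' or '120s' to seconds."""
    match = re.fullmatch(r"(\d+)\s*([a-z]+)", value.strip())
    if not match or match.group(2) not in _UNIT_SECONDS:
        raise ValueError(f"unsupported time span: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def stop_fits(operation_timeouts_s, stop_timeout: str) -> bool:
    """True if the summed operation timeouts stay below the stop timeout."""
    return sum(operation_timeouts_s) < parse_timespan(stop_timeout)


# The hypothetical demote timeout of 120s against TimeoutStopSec=30min.
print(stop_fits([120], "30min"))
```

With TimeoutStopSec=30min, a single 120s demote timeout leaves plenty of headroom; the check only becomes interesting when many slow stop or demote operations pile up on one node.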
>> >> Explicitly doing standby first would be useful mainly if you want to
>> >> manually check the results of the takeover before proceeding with the
>> >> reboot, and/or if you want the node to come back in standby mode next
>> >> time it joins.
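
For the explicit standby route, a minimal sketch of the manual sequence could look like the following. It assumes the crmsh shell (the `crm` command, as shipped with SLES) is installed; the node name "node1" and the helper names are placeholders. By default nothing is executed, the command is only returned as a string.

```python
import subprocess
from typing import List


def standby_command(node: str) -> List[str]:
    """crmsh command that puts a node into standby."""
    return ["crm", "node", "standby", node]


def online_command(node: str) -> List[str]:
    """crmsh command that brings a node back out of standby."""
    return ["crm", "node", "online", node]


def run(cmd: List[str], dry_run: bool = True) -> str:
    """Return the command line; execute it only when dry_run is False."""
    if not dry_run:
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


# "node1" is a placeholder node name.
print(run(standby_command("node1")))
# ... check the takeover manually, reboot the node, then:
print(run(online_command("node1")))
```

The gap between the two commands is where the manual check of the takeover and the reboot itself would happen.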