[ClusterLabs] Coming in Pacemaker 2.0.4: shutdown locks

Ken Gaillot kgaillot at redhat.com
Wed Feb 26 10:18:46 EST 2020


On Wed, 2020-02-26 at 14:45 +0900, Ondrej wrote:
> Hi Ken,
> 
> On 2/26/20 7:30 AM, Ken Gaillot wrote:
> > The use case is a large organization with few cluster experts and
> > many junior system administrators who reboot hosts for OS updates
> > during planned maintenance windows, without any knowledge of what
> > the host does. The cluster runs services that have a preferred node
> > and take a very long time to start.
> > 
> > In this scenario, pacemaker's default behavior of moving the service
> > to a failover node when the node shuts down, and moving it back when
> > the node comes back up, results in needless downtime compared to
> > just leaving the service down for the few minutes needed for a
> > reboot.
> 
> 1. Do I understand correctly that this covers both the case where the
> system gracefully reboots (the pacemaker service is stopped by the
> system shutting down) and the case where a user manually stops the
> cluster without rebooting the node - something like `pcs cluster
> stop`?

Exactly. The idea is that the user wants HA for node or resource
failures, but not for clean cluster stops.
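
For reference, enabling the behavior is just a matter of setting the
new shutdown-lock cluster property, something like this (pcs shown
here as a sketch; the equivalent crm_attribute call also works):

  # Enable shutdown locks cluster-wide (Pacemaker 2.0.4 or later):
  pcs property set shutdown-lock=true

  # The same thing with Pacemaker's own tools:
  crm_attribute --type crm_config --name shutdown-lock --update true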

> > If you decide while the node is down that you need the resource to
> > be recovered, you can manually clear a lock with "crm_resource
> > --refresh", specifying both --node and --resource.
> 
> 2. I'm interested in how this situation will look in the 'crm_mon'
> output or in 'crm_simulate'. Will there be some indication of why the
> resources are not moving, like 'blocked-shutdown-lock', or will they
> just appear as not moving (Stopped)?

Yes, resources will be shown as "Stopped (LOCKED)".
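
For example, roughly (resource and node names here are placeholders):

  # While the node is down, locked resources show up in the status
  # output as "Stopped (LOCKED)":
  crm_mon -1

  # To recover a resource elsewhere before the node returns, clear the
  # lock manually:
  crm_resource --refresh --resource my-rsc --node locked-node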

> Will this look different from a situation where, for example, the
> resource is simply not allowed by a constraint to run on other nodes?

Only in logs and cluster status; internally it is implemented as
implicit constraints banning the resources from every other node.

Another point I should clarify is that the lock/constraint remains in
place until the node rejoins the cluster *and* the resource starts
again on that node. That ensures that the node is preferred even if
stickiness was the only thing holding the resource to the node
previously.

However, once the resource starts on the node, the lock/constraint is
lifted, and the resource could theoretically move to another node
immediately. One example would be a cluster with no stickiness where
new resources were added to the configuration while the node was down,
so the load-balancing calculations come out differently; another would
be a time-based rule that kicked in while the node was down. In
practice, though, this feature is only expected to be used in clusters
with preferred nodes, enforced by stickiness and/or location
constraints, so this shouldn't be significant.
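
As a rough illustration, a cluster using this feature would typically
already pin resources along these lines (names are placeholders, pcs
syntax):

  # Make resources stick to the node they are running on:
  pcs resource defaults resource-stickiness=100

  # And/or give a particular resource a preferred node:
  pcs constraint location my-rsc prefers node1=100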

Special care was taken in a number of corner cases:

* If the resource fails to start on the rejoined node, the lock is
lifted.

* If the node is fenced (e.g. manually via stonith_admin, as sketched
below) while it is down, the lock is lifted.

* If the resource somehow started on another node while the node was
down (which shouldn't be possible, but just as a fail-safe), the lock
is ignored when the node rejoins.

* Maintenance mode, unmanaged resources, etc., work the same with
shutdown locks as they would with any other constraint.
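
To make the fencing case concrete, manually fencing the down node would
look something like this (node name is a placeholder), after which the
lock no longer applies:

  # Manually fence the already-down node; its shutdown lock is lifted:
  stonith_admin --fence locked-node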

> Thanks for the heads up
> 
> --
> Ondrej Famera
-- 
Ken Gaillot <kgaillot at redhat.com>


