[ClusterLabs] Single-node automated startup question

Wed Apr 14 12:35:33 EDT 2021

On Wed, 2021-04-14 at 18:00 +0300, Andrei Borzenkov wrote:
> On 14.04.2021 17:50, Digimer wrote:
> > Hi all,
> > 
> >   As we get close to finishing our Anvil! switch to pacemaker, I'm
> > trying to tie up loose ends. One that I want feedback on is the
> > pacemaker version of cman's old 'post_join_delay' feature.
> > 
> > Use case example:
> > 
> >   A common use for the Anvil! is remote deployments where there are
> > no (IT) humans available. Think cargo ships, field data collection,
> > etc. So it's entirely possible that a node could fail and not be
> > repaired for weeks or even months. With this in mind, it's also
> > feasible that a solo node later loses power, and then reboots. In
> > such a case, 'pcs cluster start' would never go quorate as the peer
> > is dead.
> > 
> >   In cman, during startup, if there was no reply from the peer after
> > post_join_delay seconds, the peer would get fenced and then the
> > cluster would finish coming up. Being two_node, it would also become
> > quorate and start hosting services. Of course, this opens the risk
> > of a fence loop, but we have other protections in place to prevent
> > that, so a fence loop is not a concern.
> > 
> >   My question then is two-fold:
> > 
> > 1. Is there a pacemaker equivalent to 'post_join_delay'? (Fence the
> > peer and, if successful, become quorate?)
> > 
> 
> Startup fencing is the pacemaker default (see the startup-fencing
> cluster option).

Start-up fencing will have the desired effect in a cluster with more
than two nodes, but in a two-node cluster the corosync wait_for_all
option is key.

If wait_for_all is true (which is the default when two_node is set),
then a node that comes up alone will wait until it sees the other node
at least once before becoming quorate. This prevents an isolated node
from coming up and fencing a node that's happily running.
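
For reference, a minimal sketch of the corosync.conf quorum section
this describes, assuming the votequorum provider. wait_for_all is
written out explicitly only for illustration, since two_node already
implies it; the rest of your corosync.conf will differ:

```
quorum {
    provider: corosync_votequorum
    two_node: 1
    # Implied by two_node; shown here only to make the default visible
    wait_for_all: 1
}
```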

Setting wait_for_all to false will make an isolated node immediately
become quorate. It will do what you want, which is fence the other node
and take over resources, but the danger is that this node is the one
that's having trouble (e.g. can't see the other node due to a network
card issue). The healthy node could fence the unhealthy node, which
might then reboot and come up and shoot the healthy node.

There's no direct equivalent of a delay before becoming quorate, but I
don't think a delay would help: the boot time already acts as a sort of
random delay, and a delay does nothing to stop an unhealthy node from
shooting a healthy one.

My recommendation would be to set wait_for_all to true as long as both
nodes are known to be healthy. Once an unhealthy node is down and
expected to stay down, set wait_for_all to false on the healthy node so
it can reboot and bring the cluster up. (The unhealthy node will still
have wait_for_all=true, so it won't cause any trouble even if it comes
up.) 
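
As a rough sketch of that second step, the quorum section on the
healthy node might look something like this once the peer is expected
to stay down (again assuming the votequorum provider; the comment marks
the part that is my assumption about how you would apply it):

```
quorum {
    provider: corosync_votequorum
    two_node: 1
    # Assumption: changed on the surviving healthy node only; the
    # unhealthy node keeps the default wait_for_all behaviour
    wait_for_all: 0
}
```

When the failed node is repaired, this should go back to the default so
that both nodes wait for each other again.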

> 
> > 2. If not, was this a conscious decision not to add it for some
> > reason, or was it simply never added? If it was consciously decided
> > to not have it, what was the reasoning behind it?
> > 
> >   I can replicate this behaviour in our code, but I don't want to do
> > that if there is a compelling reason that I am not aware of.
> > 
> > So,
> > 
> > A) is there a pacemaker version of post_join_delay?
> > B) is there a compelling argument NOT to use post_join_delay
> > behaviour in pacemaker I am not seeing?
> > 
> > Thanks!
> > 
> 
-- 
Ken Gaillot <kgaillot at redhat.com>