[ClusterLabs] Behavior of corosync kill

Ken Gaillot kgaillot at redhat.com
Tue Aug 25 10:36:50 EDT 2020


On Tue, 2020-08-25 at 12:28 +0530, Rohit Saini wrote:
> Hi All,
> I am seeing the following behavior. Can someone clarify whether
> this is intended behavior? If yes, then why? Please let me know if
> logs are needed for better clarity.
> 
> 1. Without Stonith:
> Continuously killing corosync on the master causes a switchover and
> makes another node the master. But as soon as corosync recovers, the
> original node becomes master again. Shouldn't it become the slave
> now?

Where resources are active, and which node takes on the master role,
depends on the cluster configuration, not on past node issues.

You may be interested in the resource-stickiness property:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#_resource_meta_attributes
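
For example, with pcs you could set a default stickiness (a minimal
sketch; the resource name "my-ms-resource" and the value 100 are
placeholders, and any positive score behaves similarly):

    # set a default stickiness for all resources
    pcs resource defaults resource-stickiness=100

    # or set it on a single resource's meta-attributes
    pcs resource meta my-ms-resource resource-stickiness=100

With a positive stickiness, an active or promoted instance prefers to
stay where it is once the original node rejoins, rather than failing
back to it.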


> 2. With Stonith:
> Sometimes, on corosync kill, the node gets shot by stonith, but
> sometimes not. I am not able to understand this fluctuating
> behavior. Does it have anything to do with faster recovery of
> corosync, which stonith fails to detect?

It's not a failure to detect it; rather, the node recovers
satisfactorily without fencing.

At any given time, one of the cluster nodes is elected the designated
controller (DC). When new events occur, such as a node leaving the
corosync ring unexpectedly, the DC runs pacemaker's scheduler to see
what needs to be done about it. In the case of a lost node, it will
also erase the node's resource history, to indicate that the state of
resources on the node is no longer accurately known.
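
As an aside, you can see which node currently holds the DC role in
the "Current DC:" line of crm_mon output:

    # one-shot cluster status; the DC is shown in the summary
    crm_mon -1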

If no further events happened during that time, the scheduler would
schedule fencing, and the cluster would carry it out.

However, systemd monitors corosync and will restart it if it dies. If
systemd respawns corosync fast enough (often in under a second), the
node will rejoin the cluster before the scheduler completes its
calculations and fencing is initiated. Rejoining the cluster includes
re-syncing its resource history with the other nodes.
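
Whether and how fast systemd restarts it depends on the Restart= and
RestartSec= settings of the corosync unit on your distribution; you
can inspect them with:

    # show the configured restart policy for the corosync unit
    systemctl show corosync.service -p Restart -p RestartSec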

The node join is considered new information, so the previous scheduler
run is cancelled (the "transition" is "aborted") and a new one is
started. Since the node is now happily part of the cluster, and the
resource history tells us the state of all resources on the node, no
fencing is needed.
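
If you want to see what the scheduler would decide at a given moment,
crm_simulate can run the same calculation against the live cluster
without executing anything:

    # simulate a transition using the current live CIB as input
    crm_simulate --simulate --live-check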


> I am using
> corosync-2.4.5-4.el7.x86_64
> pacemaker-1.1.19-8.el7.x86_64
> centos 7.6.1810
> 
> Thanks,
> Rohit
-- 
Ken Gaillot <kgaillot at redhat.com>


