[ClusterLabs] Strange fencing behavior with pacemaker-remoted node

zerapuka zerapuka at gmail.com
Mon Apr 8 13:41:32 EDT 2019


Hi,
I'm experimenting with a three-node test cluster: two full cluster nodes and one remote node.
I tried my config on CentOS 7.4 and Fedora 29 with
pacemaker-1.1.19, corosync-2.4.3-4, resource-agents-4.1.1 and
pacemaker-2.0.0-4, corosync-3.0.1-1, resource-agents-4.2.0-1.

My problem is that once remote node3 is fenced, it never comes back online
in the cluster, and the pacemaker:remote resource stays in the stopped
state. I can't see any packets on the pacemaker-remoted port after
fencing.
Also, node3 was fenced twice for some reason: first from node1 and then from node2.
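
To double-check the "no packets" observation from a cluster node, a plain TCP
connect test against the remote node can help. The following is only a minimal
sketch in Python: it assumes pacemaker_remote listens on its default port 3121
(adjust if the port was changed), and the function name remote_port_open is
made up for illustration.

```python
import socket


def remote_port_open(host, port=3121, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    3121 is assumed here as the pacemaker_remote port (the default);
    pass a different value if the setup uses another port.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # connection refused, timed out, unreachable, or name lookup failed
        return False
```

Calling remote_port_open("node3") from node1 or node2 after node3 has rebooted
shows whether the daemon is reachable at all. A True result only means that
something accepts TCP connections on that port; it says nothing about whether
the cluster itself is trying to reconnect.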

If I fence any other node (a corosync member), the fenced node rejoins the
cluster normally right after the system starts.

This is the corosync log from the remote node, from fence time to service start time:
https://paste.fedoraproject.org/paste/Nwqe7L~jeLWO8QXvAYMNbA

Output of cibadmin -Q: https://paste.fedoraproject.org/paste/glS9TF21y07XKNJUSK0RJA

My second attempt was to set the reconnect_interval=5 option on the
pacemaker:remote resource.
With that, the remote node successfully "joins" my cluster right after the OS
and service start when it was fenced manually (pcs stonith fence). But if
the remote node was 'autofenced' after I killed the pacemaker_remote service
or blocked the port, it joins the cluster only after a long delay of
about 10 minutes.
Service log from when the node started with a long delay after being
'autofenced': https://paste.fedoraproject.org/paste/6ZqferpgUJW8tI~XjAsJhA

Is this normal behavior?
Regards,
            Andrey