[ClusterLabs] Fwd: After failover Pacemaker moves resource back when dead node become up

Fri Jan 4 07:58:15 EST 2019

On Fri, 2019-01-04 at 15:27 +0300, Özkan Göksu wrote:
> Hello.
> 
> I'm using Pacemaker & Corosync for my cluster. When a node dies,
> Pacemaker moves my resources to another online node. Everything is OK
> there. But when the dead node comes back, Pacemaker moves the
> resources back. I don't have any "location" line in my config, and I
> also tried the "unmove" command, but nothing changed.
> The corosync & pacemaker services are enabled and start at boot. If I
> start them manually, the resources are not moved back.
> 
> How can I stop Pacemaker from moving a resource that is running
> normally?

Configuring a positive resource-stickiness should take care of this for
you, and you already have it set to 100, so there has to be something
else going on. Do you get any strange errors reported for the resources
on the second node? Check whether there is any failcount for the
resources on that node using "crm_mon --failcounts". Other than that,
looking in the logs for anything unusual would be my next move.
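
As a rough sketch of those checks, something like the following could be
run as root on one of the cluster nodes. The function name is just for
illustration, and it assumes the standard Pacemaker command-line tools
(crm_mon, crm_simulate) are installed. The placement scores from
crm_simulate are not something I mentioned above, but they are often the
quickest way to see why a resource prefers one node.

```shell
# Sketch only: show_cluster_diagnostics is a hypothetical helper name.
show_cluster_diagnostics() {
    if ! command -v crm_mon >/dev/null 2>&1; then
        echo "crm_mon not found; run this on a cluster node" >&2
        return 1
    fi
    # One-shot cluster status, including per-resource fail counts
    crm_mon -1 --failcounts
    # Placement scores computed from the live cluster configuration
    crm_simulate -sL
}
```

Call show_cluster_diagnostics once right after the failover and once
after the dead node has rejoined, then compare the two outputs. Since
the gui resource is a systemd unit, the Pacemaker logs are probably
available through "journalctl -u pacemaker" as well.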

Another thing that stands out to me is that you configure a monitor
action for the gui resource, but you don't set a timeout. I'm not sure
what the default is there, so I would configure a timeout explicitly.
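
For illustration, the gui primitive with an explicit monitor timeout
might look like this. The 30s value is only an assumption on my part;
pick something that fits how long a status check of the service can
realistically take.

```
primitive gui systemd:gui \
    op monitor interval=20s timeout=30s \
    meta target-role=Started
```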

Finally, it looks like you have a 2-node cluster with STONITH disabled.
That's not going to work. You need some kind of stonith, or things will
behave badly. So that could be why you're seeing strange behavior.
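
What stonith looks like depends entirely on your hardware, so take the
following only as a sketch. It assumes both nodes have IPMI-capable
management boards and uses the fence_ipmilan agent; the resource and
constraint names, addresses and credentials are all placeholders, and
the parameter names should be checked against the agent version you
have installed.

```
# Hypothetical fence devices: addresses and credentials are placeholders
primitive fence-dev1 stonith:fence_ipmilan \
    params pcmk_host_list=DEV1 ip=192.0.2.11 username=admin password=secret lanplus=1 \
    op monitor interval=60s
primitive fence-dev2 stonith:fence_ipmilan \
    params pcmk_host_list=DEV2 ip=192.0.2.12 username=admin password=secret lanplus=1 \
    op monitor interval=60s
# Keep each fence device off the node it is meant to fence
location l-fence-dev1 fence-dev1 -inf: DEV1
location l-fence-dev2 fence-dev2 -inf: DEV2
property stonith-enabled=true
```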

Cheers,
Kristoffer

> 
> *crm configure sh*
> 
> node 1: DEV1
> node 2: DEV2
> primitive poolip IPaddr2 \
>     params ip=10.1.60.33 nic=enp2s0f0 cidr_netmask=24 \
>     meta migration-threshold=2 target-role=Started \
>     op monitor interval=20 timeout=20 on-fail=restart
> primitive gui systemd:gui \
>     op monitor interval=20s \
>     meta target-role=Started
> primitive gui-ip IPaddr2 \
>     params ip=10.1.60.35 nic=enp2s0f0 cidr_netmask=24 \
>     meta migration-threshold=2 target-role=Started \
>     op monitor interval=20 timeout=20 on-fail=restart
> colocation cluster-gui inf: gui gui-ip
> order gui-after-ip Mandatory: gui-ip gui
> property cib-bootstrap-options: \
>     have-watchdog=false \
>     dc-version=2.0.0-1-8cf3fe749e \
>     cluster-infrastructure=corosync \
>     cluster-name=mycluster \
>     stonith-enabled=false \
>     no-quorum-policy=ignore \
>     last-lrm-refresh=1545920437
> rsc_defaults rsc-options: \
>     migration-threshold=10 \
>     resource-stickiness=100
> 
> *pcs resource defaults*
> 
> migration-threshold=10
> resource-stickiness=100
> 
> *pcs resource show gui*
> 
> Resource: gui (class=systemd type=gui)
>  Meta Attrs: target-role=Started
>  Operations: monitor interval=20s (gui-monitor-20s)