[ClusterLabs] reset of sticking service in peer node's reboot in Active/Passive configuration

Takehiro Matsushima takehiro.dreamizm at gmail.com
Mon May 1 06:03:16 EDT 2017

Hello Ishii-san,

I could not reproduce the issue in my environment (CentOS 7 with Pacemaker 1.1.15).
The following configuration works fine when rebooting the passive node.
(lighttpd is just an example of a systemd resource)

---- %< ----
primitive ipaddr IPaddr2 \
        params nic=enp0s10 ip= cidr_netmask=24 \
        op start interval=0 timeout=20 on-fail=restart \
        op stop interval=0 timeout=20 on-fail=ignore \
        op monitor interval=10 timeout=20 on-fail=restart
primitive lighttpd systemd:lighttpd \
        op start interval=0 timeout=20 on-fail=restart \
        op stop interval=0 timeout=20 on-fail=ignore \
        op monitor interval=10 timeout=20 on-fail=restart
colocation vip-colocation inf: ipaddr lighttpd
order web-order inf: lighttpd ipaddr
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.15-1.el7-e174ec8 \
        cluster-infrastructure=corosync \
        no-quorum-policy=ignore \
        startup-fencing=no \
        stonith-enabled=no
rsc_defaults rsc-options: \
        resource-stickiness=infinity
---- %< ----

I made sure the resources did not restart and did not move by changing
resource-stickiness to various values such as 10, 100 and 0.
It also works when the colocation and order constraints are replaced by a "group".
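
For reference, a sketch of the equivalent "group" form (assuming the resource
names above; a group implies both the colocation and the ordering of its
members, so lighttpd is listed first to match the order constraint, and the
two separate constraints can be dropped):

---- %< ----
group web-group lighttpd ipaddr
---- %< ----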

If you are watching the cluster's status with crm_mon, please run it with the "-t"
option and watch "last-run" on the "start" operation line for each resource.
If that time does not change when you reboot the passive node, the
resource was not actually restarted.
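
For example, a one-shot check might look like this (exact flags can differ
between Pacemaker versions; here -1 means one-shot, -o shows the operation
history, and -t adds timing details such as last-run):

---- %< ----
# crm_mon -1ot
---- %< ----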


Takehiro Matsushima

2017-04-30 19:32 GMT+09:00 石井 俊直 <i_j_e_x_a at yahoo.co.jp>:
> Hi.
> We have a 2-node Active/Passive cluster, each node running CentOS 7, with two cluster services:
> one is ocf:heartbeat:IPaddr2 and the other is a systemd-based service. They have a colocation constraint.
> The configuration looks almost right, so they normally run without problems.
> However, when one of the nodes reboots, something we do not want happens, namely 5) below.
> Suppose the nodes are node-1 and node-2, the cluster resources are running on node-1, and we reboot node-2.
> The following is the sequence of events:
>   1) node-2 shutdowns
>   2) node-1 detects node-2 is OFFLINE
>   3) node-2 boots up
>   4) node-1 detects node-2 is Online, node-2 detects both are Online
>   5) cluster services running on node-1 Stops
>   6) cluster services starts on node-1
> 6) follows from our configuration of resource-stickiness to something like 100. Since the service
> does not move to node-2, we do not want it stopped even for a short while.
> If someone knows how to configure Pacemaker so that 5) does not happen, please let us know.
> Thank you.
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
