[ClusterLabs] Resources are stopped and started when one node rejoins
Octavian Ciobanu
coctavian1979 at gmail.com
Sat Aug 26 12:36:30 EDT 2017
Thank you for your reply.

There is no reason to set location constraints for the resources, I think,
because all the resources are configured as clones, so they are started on
all nodes at the same time. As for stickiness, I forgot to mention it, but it
is set to 200. I also have stonith configured to use VMware ESXi.
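
For reference, the stickiness and fencing were set roughly along these lines
(quoting from memory, so the parameter values are approximate and the ESXi
address, credentials, and VM names below are placeholders):

# default stickiness so running resources prefer to stay where they are
pcs resource defaults resource-stickiness=200

# fencing through the ESXi host (placeholder host/credentials/VM mapping)
pcs stonith create vmfence fence_vmware_soap \
    ipaddr=<esxi-host> login=<user> passwd=<password> \
    pcmk_host_map="node1:vm-node1;node2:vm-node2;node3:vm-node3" \
    ssl=1 ssl_insecure=1 op monitor interval=60s
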
Best regards
Octavian Ciobanu
On Sat, Aug 26, 2017 at 6:16 PM, John Keates <john at keates.nl> wrote:
> While I am by no means a CRM/Pacemaker expert, I only see the resource
> primitives and the order constraints. Wouldn’t you need location and/or
> colocation as well as stickiness settings to prevent this from happening?
> What I think it might be doing is seeing the new node, then trying to move
> the resources (but not finding it a suitable target) and then moving them
> back to where they came from, fast enough that you only see it as a
> restart.
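>
> Roughly what I have in mind, as an untested sketch (the node name and
> score below are made up, so substitute your own):
>
> # keep each filesystem clone with the iSCSI clone it depends on
> pcs constraint colocation add Mount1-clone with iSCSI1-clone INFINITY
> pcs constraint colocation add Mount2-clone with iSCSI2-clone INFINITY
> pcs constraint colocation add Mount3-clone with iSCSI3-clone INFINITY
> # and/or express a node preference explicitly
> pcs constraint location Mount1-clone prefers node2=100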
>
> If you run crm_resource -P, it should also restart all resources, but put
> them in the preferred spot. If they end up in the same place, you probably
> didn’t put any weighting in the config or have stickiness set to INF.
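>
> For the stickiness part, something along these lines should do it (a
> cluster-wide default; set it per resource instead if you prefer):
>
> pcs resource defaults resource-stickiness=INFINITY
> # then re-probe the resources (-P is --reprobe on pacemaker 1.1)
> crm_resource -P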
>
> Kind regards,
>
> John Keates
>
> On 26 Aug 2017, at 14:23, Octavian Ciobanu <coctavian1979 at gmail.com>
> wrote:
>
> Hello all,
>
> While playing with the cluster configuration I noticed a strange behavior.
> If I stop/standby the cluster services on one node and reboot it, then when
> it rejoins the cluster, all the resources that were started and working on
> the active nodes get stopped and restarted.
>
> My test configuration is based on 4 nodes. One node is a storage node that
> makes 3 iSCSI targets available for the other nodes to use; it is not
> configured to join the cluster. The other three nodes are configured in a
> cluster using the following commands.
>
> pcs resource create DLM ocf:pacemaker:controld op monitor interval="60"
> on-fail="fence" clone meta clone-max="3" clone-node-max="1"
> interleave="true" ordered="true"
> pcs resource create iSCSI1 ocf:heartbeat:iscsi portal="10.0.0.1:3260"
> target="iqn.2017-08.example.com:tgt1" op start interval="0" timeout="20"
> op stop interval="0" timeout="20" op monitor interval="120" timeout="30"
> clone meta clone-max="3" clone-node-max="1"
> pcs resource create iSCSI2 ocf:heartbeat:iscsi portal="10.0.0.1:3260"
> target="iqn.2017-08.example.com:tgt2" op start interval="0" timeout="20"
> op stop interval="0" timeout="20" op monitor interval="120" timeout="30"
> clone meta clone-max="3" clone-node-max="1"
> pcs resource create iSCSI3 ocf:heartbeat:iscsi portal="10.0.0.1:3260"
> target="iqn.2017-08.example.com:tgt3" op start interval="0" timeout="20"
> op stop interval="0" timeout="20" op monitor interval="120" timeout="30"
> clone meta clone-max="3" clone-node-max="1"
> pcs resource create Mount1 ocf:heartbeat:Filesystem
> device="/dev/disk/by-label/MyCluster:Data1" directory="/mnt/data1"
> fstype="gfs2" options="noatime,nodiratime,rw" op monitor interval="90"
> on-fail="fence" clone meta clone-max="3" clone-node-max="1"
> interleave="true"
> pcs resource create Mount2 ocf:heartbeat:Filesystem
> device="/dev/disk/by-label/MyCluster:Data2" directory="/mnt/data2"
> fstype="gfs2" options="noatime,nodiratime,rw" op monitor interval="90"
> on-fail="fence" clone meta clone-max="3" clone-node-max="1"
> interleave="true"
> pcs resource create Mount3 ocf:heartbeat:Filesystem
> device="/dev/disk/by-label/MyCluster:Data3" directory="/mnt/data3"
> fstype="gfs2" options="noatime,nodiratime,rw" op monitor interval="90"
> on-fail="fence" clone meta clone-max="3" clone-node-max="1"
> interleave="true"
> pcs constraint order DLM-clone then iSCSI1-clone
> pcs constraint order DLM-clone then iSCSI2-clone
> pcs constraint order DLM-clone then iSCSI3-clone
> pcs constraint order iSCSI1-clone then Mount1-clone
> pcs constraint order iSCSI2-clone then Mount2-clone
> pcs constraint order iSCSI3-clone then Mount3-clone
>
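> (For what it's worth, I believe these shorthand order constraints default
> to mandatory, symmetrical ordering between the whole clones, i.e. each one
> is roughly equivalent to something like:
>
> pcs constraint order start DLM-clone then start iSCSI1-clone \
>     kind=Mandatory symmetrical=true
>
> and likewise for the other pairs.)
>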
> If I issue the command "pcs cluster standby node1" or "pcs cluster stop" on
> node 1 and after that reboot the node, then when the node gets back online
> (unstandby, if it was put in standby mode), all the "MountX" resources get
> stopped on nodes 3 and 4 and started again.
>
> Can anyone help me figure out where the mistake in my configuration is? I
> would like to keep the resources running on the active nodes (i.e. avoid
> the stop and start of resources).
>
> Thank you in advance
> Octavian Ciobanu
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org