[ClusterLabs] Antw: Re: Antw: Unexpected Resource movement after failover

Ken Gaillot kgaillot at redhat.com
Fri Oct 14 15:06:47 EDT 2016


On 10/14/2016 06:56 AM, Nikhil Utane wrote:
> Hi,
> 
> Thank you for the responses so far.
> I added reverse colocation as well. However, I am seeing another issue in
> resource movement that I am analyzing.
> 
> Thinking further on this, why doesn't "a not with b" imply "b
> not with a"?
> Because wouldn't putting "b with a" violate "a not with b"?
> 
> Can someone confirm that colocation is required to be configured both ways?

The anti-colocation should only be defined one-way. Otherwise, you get a
dependency loop (as seen in logs you showed elsewhere).
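
To illustrate the one-way rule, here is a minimal sketch that prints one
anti-colocation command per unordered pair of resources and never emits the
reverse direction. It is only a dry run that echoes the commands. The helper
name gen_anticolocation is made up for this example, the resource names are
the ones from the configuration quoted below, and the score=-INFINITY
spelling assumes a pcs version that accepts that form.

```shell
#!/bin/sh
# Dry run: print one anti-colocation command per unordered pair of resources.
# gen_anticolocation is a hypothetical helper, not part of pcs.
gen_anticolocation() {
    # "$@" holds the remaining resource names; each pass pairs the first
    # name (the target) with every name after it (the sources).
    while [ "$#" -gt 1 ]; do
        tgt=$1
        shift
        for src in "$@"; do
            echo "pcs constraint colocation add $src with $tgt score=-INFINITY"
        done
    done
}

gen_anticolocation cu_2 cu_3 cu_4 cu_5
```

For four resources this prints six commands, the same six pairs that appear
in the "pcs constraint" output quoted below.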

The one-way constraint is enough to keep the resources apart. However,
the question is whether the cluster might move resources around
unnecessarily.

For example, "A not with B" means that the cluster will place B first,
then place A somewhere else. So, if B's node fails, can the cluster
decide that A's node is now the best place for B, and move A to a free
node, rather than simply start B on the free node?

The cluster does take dependencies into account when placing a resource,
so I would hope that wouldn't happen. But I'm not sure. Having some
stickiness might help, so that A has some preference against moving.
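
As a rough sketch of the stickiness idea, assuming pcs is the management
tool: the snippet below sets a default resource-stickiness for all
resources. The value 100 is an arbitrary illustration, not a recommendation,
and set_stickiness is a made-up wrapper. It runs as a dry run unless PCS=pcs
is set. I believe newer pcs releases spell the same thing as
"pcs resource defaults update resource-stickiness=100", so check the man
page for your version.

```shell
#!/bin/sh
# Dry run by default: prints the command instead of running it.
# Set PCS=pcs in the environment to apply it on a real cluster node.
PCS=${PCS:-echo pcs}

# set_stickiness is a hypothetical wrapper; $1 is the stickiness score.
set_stickiness() {
    # $PCS is left unquoted on purpose so "echo pcs" splits into two words.
    $PCS resource defaults resource-stickiness="$1"
}

set_stickiness 100
```

With a positive stickiness, a running resource gets a preference for its
current node, which is the "preference against moving" mentioned above.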

> -Thanks
> Nikhil
> 
> 
> On Fri, Oct 14, 2016 at 1:09 PM, Vladislav Bogdanov
> <bubble at hoster-ok.com> wrote:
> 
>     On October 14, 2016 10:13:17 AM GMT+03:00, Ulrich Windl
>     <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>     >>>> Nikhil Utane <nikhil.subscribed at gmail.com> wrote on 13.10.2016
>     >at 16:43 in message
>     ><CAGNWmJUbPucnBGXroHkHSbQ0LXovwsLFPkUPg1R8gJqRFqM9Dg at mail.gmail.com>:
>     >> Ulrich,
>     >>
>     >> I have 4 resources only (not 5, nodes are 5). So then I only need 6
>     >> constraints, right?
>     >>
>     >>      [,1] [,2] [,3] [,4] [,5] [,6]
>     >> [1,] "A"  "A"  "A"  "B"  "B"  "C"
>     >> [2,] "B"  "C"  "D"  "C"  "D"  "D"
>     >
>     >Sorry for my confusion. As Andrei Borzenkov said in
>     ><CAA91j0W+epAHFLg9u6VX_X8LgFkf9Rp55g3nocY4oZNA9BbZ+g at mail.gmail.com>
>     >you probably have to add (A, B) _and_ (B, A)! Thinking about it, I
>     >wonder whether an easier solution would be using "utilization": If
>     >every node has one token to give, and every resource needs one token, no
>     >two resources will run on one node. Sounds like an easier solution to
>     >me.
>     >
>     >Regards,
>     >Ulrich
>     >
>     >
>     >>
>     >> I understand that if I configure a constraint of R1 with R2 with a score
>     >> of -infinity, then the same applies for R2 with R1 (I don't have to
>     >> configure it explicitly).
>     >> I am not having a problem of multiple resources getting scheduled on the
>     >> same node. Rather, one working resource is unnecessarily getting
>     >> relocated.
>     >>
>     >> -Thanks
>     >> Nikhil
>     >>
>     >>
>     >> On Thu, Oct 13, 2016 at 7:45 PM, Ulrich Windl <
>     >> Ulrich.Windl at rz.uni-regensburg.de> wrote:
>     >>
>     >>> Hi!
>     >>>
>     >>> Don't you need 10 constraints, excluding every possible pair of your 5
>     >>> resources (named A-E here), like in this table (produced with R):
>     >>>
>     >>>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>     >>> [1,] "A"  "A"  "A"  "A"  "B"  "B"  "B"  "C"  "C"  "D"
>     >>> [2,] "B"  "C"  "D"  "E"  "C"  "D"  "E"  "D"  "E"  "E"
>     >>>
>     >>> Ulrich
>     >>>
>     >>> >>> Nikhil Utane <nikhil.subscribed at gmail.com> wrote on 13.10.2016 at
>     >>> 15:59 in message
>     >>> <CAGNWmJW0CWMr3bvR3L9xZCAcJUzyczQbZEzUzpaJxi+Pn7Oj_A at mail.gmail.com>:
>     >>> > Hi,
>     >>> >
>     >>> > I have 5 nodes and 4 resources configured.
>     >>> > I have configured constraints such that no two resources can be
>     >>> > co-located.
>     >>> > I brought down a node (which happened to be the DC). I was expecting
>     >>> > the resource on the failed node to be migrated to the 5th, waiting
>     >>> > node (which is not running any resource).
>     >>> > However, what happened was that the failed node's resource was started
>     >>> > on another active node (after stopping its existing resource), and
>     >>> > that node's resource was moved to the waiting node.
>     >>> >
>     >>> > What could I be doing wrong?
>     >>> >
>     >>> > <nvpair id="cib-bootstrap-options-have-watchdog" value="true"
>     >>> > name="have-watchdog"/>
>     >>> > <nvpair id="cib-bootstrap-options-dc-version"
>     >>> > value="1.1.14-5a6cdd1" name="dc-version"/>
>     >>> > <nvpair id="cib-bootstrap-options-cluster-infrastructure"
>     >>> > value="corosync" name="cluster-infrastructure"/>
>     >>> > <nvpair id="cib-bootstrap-options-stonith-enabled" value="false"
>     >>> > name="stonith-enabled"/>
>     >>> > <nvpair id="cib-bootstrap-options-no-quorum-policy" value="ignore"
>     >>> > name="no-quorum-policy"/>
>     >>> > <nvpair id="cib-bootstrap-options-default-action-timeout"
>     >>> > value="240" name="default-action-timeout"/>
>     >>> > <nvpair id="cib-bootstrap-options-symmetric-cluster" value="false"
>     >>> > name="symmetric-cluster"/>
>     >>> >
>     >>> > # pcs constraint
>     >>> > Location Constraints:
>     >>> >   Resource: cu_2
>     >>> >     Enabled on: Redun_CU4_Wb30 (score:0)
>     >>> >     Enabled on: Redund_CU2_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU3_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU5_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU1_WB30 (score:0)
>     >>> >   Resource: cu_3
>     >>> >     Enabled on: Redun_CU4_Wb30 (score:0)
>     >>> >     Enabled on: Redund_CU2_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU3_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU5_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU1_WB30 (score:0)
>     >>> >   Resource: cu_4
>     >>> >     Enabled on: Redun_CU4_Wb30 (score:0)
>     >>> >     Enabled on: Redund_CU2_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU3_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU5_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU1_WB30 (score:0)
>     >>> >   Resource: cu_5
>     >>> >     Enabled on: Redun_CU4_Wb30 (score:0)
>     >>> >     Enabled on: Redund_CU2_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU3_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU5_WB30 (score:0)
>     >>> >     Enabled on: Redund_CU1_WB30 (score:0)
>     >>> > Ordering Constraints:
>     >>> > Colocation Constraints:
>     >>> >   cu_3 with cu_2 (score:-INFINITY)
>     >>> >   cu_4 with cu_2 (score:-INFINITY)
>     >>> >   cu_4 with cu_3 (score:-INFINITY)
>     >>> >   cu_5 with cu_2 (score:-INFINITY)
>     >>> >   cu_5 with cu_3 (score:-INFINITY)
>     >>> >   cu_5 with cu_4 (score:-INFINITY)
>     >>> >
>     >>> > -Thanks
>     >>> > Nikhil
>     >>>
>     >>>
>     >>>
>     >>>
>     >>>
>     >>> _______________________________________________
>     >>> Users mailing list: Users at clusterlabs.org
>     >>> http://clusterlabs.org/mailman/listinfo/users
>     >>>
>     >>> Project Home: http://www.clusterlabs.org
>     >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>     >>> Bugs: http://bugs.clusterlabs.org
>     >>>
>     >
>     >
>     >
>     >
> 
>     Hi,
> 
>     Use of utilization (balanced strategy) has one caveat: resources are
>     not moved just because another node's utilization is lower, when
>     nodes have the same allocation score for the resource.
>     So, after the simultaneous outage of two nodes in a 5-node cluster,
>     it may turn out that one node runs two resources and the two recovered
>     nodes run nothing.
> 
>     The original 'utilization' strategy only limits resource placement; it
>     is not considered when choosing a node for a resource.
> 
>     Vladislav



