[ClusterLabs] Antw: Re: Antw: Re: Antw: Unexpected Resource movement after failover
Nikhil Utane
nikhil.subscribed at gmail.com
Tue Oct 18 06:22:04 UTC 2016
Yes Ulrich, somehow I missed following up on that.
I will do both: set stickiness to INFINITY and use utilization
attributes.
That should probably take care of it.
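
To make the one-token-per-node idea concrete, here is a toy Python model. It is
only a sketch under my own assumptions, not Pacemaker's placement algorithm:
the function `place` and its arguments are made up for illustration, every node
has a capacity of one token, and infinite stickiness is modelled by letting
running resources keep their node before anything else is placed.

```python
def place(resources, capacity, current=None):
    """Toy placement: one token per node, running resources never move.

    resources: list of resource names
    capacity:  dict mapping each live node to its number of tokens
    current:   dict mapping a resource to the node it was running on
    """
    current = current or {}
    free = dict(capacity)  # copy so the caller's dict is not modified
    placement = {}

    # Pass 1: a resource whose node is still alive keeps that node
    # (this stands in for stickiness=INFINITY).
    for res in resources:
        node = current.get(res)
        if node in free and free[node] > 0:
            placement[res] = node
            free[node] -= 1

    # Pass 2: every other resource takes any node that still has a token.
    for res in resources:
        if res in placement:
            continue
        for node in sorted(free):
            if free[node] > 0:
                placement[res] = node
                free[node] -= 1
                break
        else:
            placement[res] = None  # no token left: the resource stays stopped
    return placement


# Scenario from this thread: Redund_CU1_WB30 (running cu_4) went down and
# the standby node was not up, so only two nodes are alive.
live = {"Redund_CU5_WB30": 1, "Redun_CU4_Wb30": 1}
running = {
    "cu_4": "Redund_CU1_WB30",
    "cu_2": "Redund_CU5_WB30",
    "cu_3": "Redun_CU4_Wb30",
}
result = place(["cu_2", "cu_3", "cu_4"], live, running)
print(result)
# {'cu_2': 'Redund_CU5_WB30', 'cu_3': 'Redun_CU4_Wb30', 'cu_4': None}
```

In this model cu_2 and cu_3 stay where they are and cu_4 waits until a node
with a free token shows up, which is the behaviour I was expecting from the
cluster.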
Thanks
Nikhil
On Tue, Oct 18, 2016 at 11:45 AM, Ulrich Windl <
Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >>> Nikhil Utane <nikhil.subscribed at gmail.com> wrote on 17.10.2016 at
> 16:46 in message
> <CAGNWmJUfVS1bcSfSG4=Rmu5u9ckC4HyUgE3psakrnWQsbi1O2w at mail.gmail.com>:
> > This is driving me insane.
>
> Why don't you try the utilization approach?
>
> >
> > This is how the resources were started. Redund_CU1_WB30 was the DC, which
> > I rebooted.
> > cu_4 (ocf::redundancy:RedundancyRA): Started Redund_CU1_WB30
> > cu_2 (ocf::redundancy:RedundancyRA): Started Redund_CU5_WB30
> > cu_3 (ocf::redundancy:RedundancyRA): Started Redun_CU4_Wb30
> >
> > Since the standby node was not up, I was expecting resource cu_4 to
> > wait to be scheduled.
> > But then it rearranged everything as below.
> > cu_4 (ocf::redundancy:RedundancyRA): Started Redun_CU4_Wb30
> > cu_2 (ocf::redundancy:RedundancyRA): Stopped
> > cu_3 (ocf::redundancy:RedundancyRA): Started Redund_CU5_WB30
> >
> > There is not much information available in the logs on the new DC. It just
> > shows what it has decided to do, but nothing to suggest why it did it that
> > way.
> >
> > notice: Start cu_4 (Redun_CU4_Wb30)
> > notice: Stop cu_2 (Redund_CU5_WB30)
> > notice: Move cu_3 (Started Redun_CU4_Wb30 -> Redund_CU5_WB30)
> >
> > I have default stickiness set to 100, which is higher than any score that
> > I have configured.
> > I have migration-threshold set to 1. Should I bump that up instead?
> >
> > -Thanks
> > Nikhil
> >
> > On Sat, Oct 15, 2016 at 12:36 AM, Ken Gaillot <kgaillot at redhat.com>
> > wrote:
> >
> >> On 10/14/2016 06:56 AM, Nikhil Utane wrote:
> >> > Hi,
> >> >
> >> > Thank you for the responses so far.
> >> > I added reverse colocation as well. However, I am seeing another issue in
> >> > resource movement that I am analyzing.
> >> >
> >> > Thinking further on this, why doesn't "a not with b" imply
> >> > "b not with a"?
> >> > Because wouldn't putting "b with a" violate "a not with b"?
> >> >
> >> > Can someone confirm that colocation is required to be configured both
> >> > ways?
> >>
> >> The anti-colocation should only be defined one-way. Otherwise, you get a
> >> dependency loop (as seen in logs you showed elsewhere).
> >>
> >> The one-way constraint is enough to keep the resources apart. However,
> >> the question is whether the cluster might move resources around
> >> unnecessarily.
> >>
> >> For example, "A not with B" means that the cluster will place B first,
> >> then place A somewhere else. So, if B's node fails, can the cluster
> >> decide that A's node is now the best place for B, and move A to a free
> >> node, rather than simply start B on the free node?
> >>
> >> The cluster does take dependencies into account when placing a resource,
> >> so I would hope that wouldn't happen. But I'm not sure. Having some
> >> stickiness might help, so that A has some preference against moving.
> >>
> >> > -Thanks
> >> > Nikhil
> >> >
> >> >
> >> > On Fri, Oct 14, 2016 at 1:09 PM, Vladislav Bogdanov
> >> > <bubble at hoster-ok.com> wrote:
> >> >
> >> > On October 14, 2016 10:13:17 AM GMT+03:00, Ulrich Windl
> >> > <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >> > >>>> Nikhil Utane <nikhil.subscribed at gmail.com> wrote on 13.10.2016 at
> >> > >16:43 in message
> >> > ><CAGNWmJUbPucnBGXroHkHSbQ0LXovwsLFPkUPg1R8gJqRFqM9Dg at mail.gmail.com>:
> >> > >> Ulrich,
> >> > >>
> >> > >> I have only 4 resources (not 5; there are 5 nodes). So then I only
> >> > >> need 6 constraints, right?
> >> > >>
> >> > >> [,1] [,2] [,3] [,4] [,5] [,6]
> >> > >> [1,] "A" "A" "A" "B" "B" "C"
> >> > >> [2,] "B" "C" "D" "C" "D" "D"
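
The pair tables in this exchange (produced with R) can also be reproduced in
Python with `itertools.combinations`. This is just an illustrative sketch, and
the helper name `anti_colocation_pairs` is hypothetical. Each pair stands for
one one-way anti-colocation constraint.

```python
from itertools import combinations


def anti_colocation_pairs(resources):
    """Return every unordered pair of resources, one per constraint."""
    return list(combinations(resources, 2))


pairs4 = anti_colocation_pairs(["A", "B", "C", "D"])
pairs5 = anti_colocation_pairs(["A", "B", "C", "D", "E"])

for first, second in pairs4:
    print(first, second)

print(len(pairs4), len(pairs5))  # 6 10
```

For n resources this gives n*(n-1)/2 pairs, so 6 for four resources and 10 for
five.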
> >> > >
> >> > >Sorry for my confusion. As Andrei Borzenkov said in
> >> > ><CAA91j0W+epAHFLg9u6VX_X8LgFkf9Rp55g3nocY4oZNA9BbZ+g at mail.gmail.com>,
> >> > >you probably have to add (A, B) _and_ (B, A)! Thinking about it, I
> >> > >wonder whether an easier solution would be using "utilization": if
> >> > >every node has one token to give, and every resource needs one token,
> >> > >no two resources will run on one node. Sounds like an easier solution
> >> > >to me.
> >> > >
> >> > >Regards,
> >> > >Ulrich
> >> > >
> >> > >
> >> > >>
> >> > >> I understand that if I configure a constraint of R1 with R2 with score
> >> > >> -infinity, then the same applies for R2 with R1 with score -infinity
> >> > >> (I don't have to configure it explicitly).
> >> > >> I am not having a problem with multiple resources getting scheduled on
> >> > >> the same node. Rather, one working resource is unnecessarily getting
> >> > >> relocated.
> >> > >>
> >> > >> -Thanks
> >> > >> Nikhil
> >> > >>
> >> > >>
> >> > >> On Thu, Oct 13, 2016 at 7:45 PM, Ulrich Windl
> >> > >> <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >> > >>
> >> > >>> Hi!
> >> > >>>
> >> > >>> Don't you need 10 constraints, excluding every possible pair of
> >> > >>> your 5 resources (named A-E here), like in this table (produced
> >> > >>> with R):
> >> > >>>
> >> > >>> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> >> > >>> [1,] "A" "A" "A" "A" "B" "B" "B" "C" "C" "D"
> >> > >>> [2,] "B" "C" "D" "E" "C" "D" "E" "D" "E" "E"
> >> > >>>
> >> > >>> Ulrich
> >> > >>>
> >> > >>> >>> Nikhil Utane <nikhil.subscribed at gmail.com> wrote on 13.10.2016
> >> > >>> at 15:59 in message
> >> > >>> <CAGNWmJW0CWMr3bvR3L9xZCAcJUzyczQbZEzUzpaJxi+Pn7Oj_A at mail.gmail.com>:
> >> > >>> > Hi,
> >> > >>> >
> >> > >>> > I have 5 nodes and 4 resources configured.
> >> > >>> > I have configured constraints such that no two resources can be
> >> > >>> > co-located.
> >> > >>> > I brought down a node (which happened to be the DC). I was expecting
> >> > >>> > that the resource on the failed node would be migrated to the 5th,
> >> > >>> > waiting node (which is not running any resource).
> >> > >>> > However, what happened was that the failed node's resource was started
> >> > >>> > on another active node (after stopping its existing resource), and
> >> > >>> > that node's resource was moved to the waiting node.
> >> > >>> >
> >> > >>> > What could I be doing wrong?
> >> > >>> >
> >> > >>> > <nvpair id="cib-bootstrap-options-have-watchdog" value="true" name="have-watchdog"/>
> >> > >>> > <nvpair id="cib-bootstrap-options-dc-version" value="1.1.14-5a6cdd1" name="dc-version"/>
> >> > >>> > <nvpair id="cib-bootstrap-options-cluster-infrastructure" value="corosync" name="cluster-infrastructure"/>
> >> > >>> > <nvpair id="cib-bootstrap-options-stonith-enabled" value="false" name="stonith-enabled"/>
> >> > >>> > <nvpair id="cib-bootstrap-options-no-quorum-policy" value="ignore" name="no-quorum-policy"/>
> >> > >>> > <nvpair id="cib-bootstrap-options-default-action-timeout" value="240" name="default-action-timeout"/>
> >> > >>> > <nvpair id="cib-bootstrap-options-symmetric-cluster" value="false" name="symmetric-cluster"/>
> >> > >>> >
> >> > >>> > # pcs constraint
> >> > >>> > Location Constraints:
> >> > >>> > Resource: cu_2
> >> > >>> > Enabled on: Redun_CU4_Wb30 (score:0)
> >> > >>> > Enabled on: Redund_CU2_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU3_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU5_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU1_WB30 (score:0)
> >> > >>> > Resource: cu_3
> >> > >>> > Enabled on: Redun_CU4_Wb30 (score:0)
> >> > >>> > Enabled on: Redund_CU2_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU3_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU5_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU1_WB30 (score:0)
> >> > >>> > Resource: cu_4
> >> > >>> > Enabled on: Redun_CU4_Wb30 (score:0)
> >> > >>> > Enabled on: Redund_CU2_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU3_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU5_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU1_WB30 (score:0)
> >> > >>> > Resource: cu_5
> >> > >>> > Enabled on: Redun_CU4_Wb30 (score:0)
> >> > >>> > Enabled on: Redund_CU2_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU3_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU5_WB30 (score:0)
> >> > >>> > Enabled on: Redund_CU1_WB30 (score:0)
> >> > >>> > Ordering Constraints:
> >> > >>> > Colocation Constraints:
> >> > >>> > cu_3 with cu_2 (score:-INFINITY)
> >> > >>> > cu_4 with cu_2 (score:-INFINITY)
> >> > >>> > cu_4 with cu_3 (score:-INFINITY)
> >> > >>> > cu_5 with cu_2 (score:-INFINITY)
> >> > >>> > cu_5 with cu_3 (score:-INFINITY)
> >> > >>> > cu_5 with cu_4 (score:-INFINITY)
> >> > >>> >
> >> > >>> > -Thanks
> >> > >>> > Nikhil
> >> > >>>
> >> > >>>
> >> > >>>
> >> > >>>
> >> > >>>
> >> > >>> _______________________________________________
> >> > >>> Users mailing list: Users at clusterlabs.org
> >> > >>> http://clusterlabs.org/mailman/listinfo/users
> >> > >>>
> >> > >>> Project Home: http://www.clusterlabs.org
> >> > >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> > >>> Bugs: http://bugs.clusterlabs.org
> >> > >>>
> >> >
> >> > Hi,
> >> >
> >> > Use of utilization (balanced strategy) has one caveat: resources are
> >> > not moved just because one node's utilization is lower, when
> >> > nodes have the same allocation score for the resource.
> >> > So, after the simultaneous outage of two nodes in a 5-node cluster,
> >> > it may turn out that one node runs two resources and the two recovered
> >> > nodes run nothing.
> >> >
> >> > The original 'utilization' strategy only limits resource placement; it
> >> > is not considered when choosing a node for a resource.
> >> >
> >> > Vladislav
> >>