[ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration

Klaus Wenninger kwenning at redhat.com
Wed Nov 9 16:42:44 UTC 2016


On 11/09/2016 05:30 PM, Kostiantyn Ponomarenko wrote:
> When one problem seems to be solved, another one appears.
> Now my script looks this way:
>
>     crm --wait configure rsc_defaults resource-stickiness=50
>     crm configure rsc_defaults resource-stickiness=150
>
> While I am now sure that transitions caused by the first command
> won't be aborted, I see another possible problem here.
> With a minimum load in the cluster it took 22 sec for this script to
> finish.
> I see a weakness here.
> If the node on which this script is called goes down for any reason,
> then "resource-stickiness" is not set back to its original value,
> which is very bad.
>
> So, now I am thinking of how to solve this problem. I would appreciate
> any thoughts about this.
>
> Is there a way to ask Pacemaker to do these commands sequentially so
> there is no need to wait in the script?
> If it is possible, then I think that my concern from above goes away.
>
> Another thing which comes to my mind is to use time-based rules.
> This way, when I need to do a manual fail-back, I simply set (or
> update) a time-based rule from the script.
> And the rule will basically say: set "resource-stickiness" to 50
> right now and expire in 10 min.
> This looks good at first glance, but there is no reliable way to
> pick a minimum sufficient time for it; at least none that I am aware of.
> And the thing is - it is important to me that "resource-stickiness" is
> set back to its original value as soon as possible.
>
> Those are my thoughts. As I said, I appreciate any ideas here.

I have never tried --wait with crmsh, but I would guess that the delay you
are observing is really the time your resources take to stop and start
somewhere else.

Actually, you need the reduced stickiness only during the stop phase, right?

Since there is no command like "wait till all stops are done", you could
still poll 'crm_simulate -Ls' and check that it doesn't want to stop
anything anymore. That way you save the time the starts would take.
Unfortunately, you have to repeat that call, which puts additional load on
Pacemaker and possibly slows things down if your poll cycle is too short.
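
A rough, untested sketch of such a polling loop follows. It assumes that
crm_simulate lists pending actions after a "Transition Summary:" line as
entries starting with " * Stop", " * Move" and so on; the exact wording
differs between Pacemaker versions, so check the output on your cluster
first. The function names and the 2-second default interval are made up
for illustration.

```shell
#!/bin/sh
# Sketch only, not a tested tool.
# ASSUMPTION: pending actions are printed after a "Transition Summary:" line
# as lines like " * Stop  rsc (node)"; adjust the pattern to your version.

has_pending_stops() {
    # Reads crm_simulate output on stdin.
    # Succeeds (exit 0) if a stop-like action is listed in the summary.
    sed -n '/^Transition Summary:/,$p' |
        grep -Eq '^[[:space:]]*\*[[:space:]]+(Stop|Move|Migrate)'
}

wait_for_stops() {
    # $1 = poll interval in seconds (hypothetical default: 2).
    interval="${1:-2}"
    # Keep polling the live cluster while stops are still planned.
    while crm_simulate -Ls | has_pending_stops; do
        sleep "$interval"
    done
}
```

With these two functions defined, the script would become something like:

    crm configure rsc_defaults resource-stickiness=50
    wait_for_stops 2
    crm configure rsc_defaults resource-stickiness=150

This only shortens the window; it does not help if the node running the
script dies before the last command.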

>
>
> Thank you,
> Kostia
>
> On Tue, Nov 8, 2016 at 10:19 PM, Dejan Muhamedagic
> <dejanmm at fastmail.fm> wrote:
>
>     On Tue, Nov 08, 2016 at 12:54:10PM +0100, Klaus Wenninger wrote:
>     > On 11/08/2016 11:40 AM, Kostiantyn Ponomarenko wrote:
>     > > Hi,
>     > >
>     > > I need a way to do a manual fail-back on demand.
>     > > To be clear, I don't want it to be ON/OFF; I want it to be more
>     > > like "one shot".
>     > > So far I found that the most reasonable way to do it is to set
>     > > "resource stickiness" to a different value, and then set it back
>     > > to what it was.
>     > > To do that I created a simple script with two lines:
>     > >
>     > >     crm configure rsc_defaults resource-stickiness=50
>     > >     crm configure rsc_defaults resource-stickiness=150
>     > >
>     > > There are no timeouts before setting the original value back.
>     > > If I call this script, I get what I want: Pacemaker moves
>     > > resources to their preferred locations, and "resource stickiness"
>     > > is set back to its original value.
>     > >
>     > > Although it works, I still have a few concerns about this approach.
>     > > Will I get the same behavior under a big load with delays on
>     > > systems in the cluster (which is truly possible and a normal case
>     > > in my environment)?
>     > > How does Pacemaker treat a fast change of this parameter?
>     > > I am worried that if "resource stickiness" is set back to its
>     > > original value too fast, then no fail-back will happen. Is that
>     > > possible, or shouldn't I worry about it?
>     >
>     > AFAIK pengine is interrupted when calculating a more complicated
>     > transition, and if the situation has changed, a transition that is
>     > just being executed is aborted if the input from pengine changed.
>     > So I would definitely worry!
>     > What you could do is to issue 'crm_simulate -Ls' in between and
>     > grep for an empty transition.
>     > There might be more elegant ways but that should be safe.
>
>     crmsh has an option (-w) to wait for the PE to settle after
>     committing configuration changes.
>
>     Thanks,
>
>     Dejan
>     >
>     > > Thank you,
>     > > Kostia
>     > >
>     > >
>     > > _______________________________________________
>     > > Users mailing list: Users at clusterlabs.org
>     > > http://clusterlabs.org/mailman/listinfo/users
>     > >
>     > > Project Home: http://www.clusterlabs.org
>     > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>     > > Bugs: http://bugs.clusterlabs.org
>     >
>     >
>     >
>
>
>




