[ClusterLabs] How Pacemaker reacts to fast changes of the same parameter in configuration
Kostiantyn Ponomarenko
konstantin.ponomarenko at gmail.com
Wed Nov 9 18:57:37 CET 2016
>> Actually you would need the reduced stickiness just during the stop phase - right.
Oh, that is good to know.
While I could shorten the wait by waiting only for the "stop" operations to
finish, I don't think it is worth it, because it doesn't fully address my
problem.
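
For reference, the polling variant could look roughly like the sketch below.
It is only an illustration: pending_stops, wait_for_stops and failback_once
are names I made up, the 60 polls at 2 s intervals are arbitrary, and the
pattern assumes that 'crm_simulate -Ls' lists pending actions as "* Stop",
"* Move" or "* Migrate" lines after a "Transition Summary:" line, which
should be checked against the installed Pacemaker version.

```shell
#!/bin/sh
# Sketch only. The output format matched here is an assumption and may
# differ between Pacemaker versions.

# Hypothetical helper: exit 0 if the text on stdin still lists an action
# after "Transition Summary:" that has to stop a resource somewhere.
pending_stops() {
    sed -n '/^Transition Summary:/,$p' |
        grep -Eq '^[[:space:]]*\*[[:space:]]+(Stop|Move|Migrate)[[:space:]]'
}

# Hypothetical helper: poll until no stop-like action is pending.
#   $1 = maximum number of polls, $2 = seconds between polls,
#   remaining arguments = command that prints the transition summary.
# Returns 0 when nothing is pending, 1 on timeout, 2 if the command fails.
wait_for_stops() {
    max=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$max" ]; do
        out=$("$@") || return 2
        if ! printf '%s\n' "$out" | pending_stops; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# One-shot fail-back using the two values from this thread (50 and 150).
failback_once() {
    crm configure rsc_defaults resource-stickiness=50 || return 1
    wait_for_stops 60 2 crm_simulate -Ls
    rc=$?
    # Restore the original value even if the wait failed or timed out.
    crm configure rsc_defaults resource-stickiness=150
    return "$rc"
}
```

Calling failback_once runs the whole sequence. A longer delay between polls
puts less load on Pacemaker. It still does not help if the node running the
script dies between the two 'crm configure' calls, which is exactly my
remaining concern.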
Does that mean that reality is cruel, and there is no way to tell
Pacemaker: here are these two commands, execute them sequentially?
It is all about usability for the end user.
As a last resort, I was thinking about not providing this one-shot
"do a fail-back" button to the user, but instead providing a
"fail-back ON/OFF" switch with some kind of "resources are placed
optimally" indicator.
Anyway, maybe there are still some other ideas?
I really want a rock-solid "one-shot fail-back" solution, and
maybe I am missing something here =)
Or maybe it can be a feature request =)
Thank you,
Kostia
On Wed, Nov 9, 2016 at 6:42 PM, Klaus Wenninger <kwenning at redhat.com> wrote:
> On 11/09/2016 05:30 PM, Kostiantyn Ponomarenko wrote:
> > When one problem seems to be solved, another one appears.
> > Now my script looks this way:
> >
> > crm --wait configure rsc_defaults resource-stickiness=50
> > crm configure rsc_defaults resource-stickiness=150
> >
> > While now I am sure that transitions caused by the first command
> > won't be aborted, I see another possible problem here.
> > With a minimum load in the cluster it took 22 sec for this script to
> > finish.
> > I see a weakness here.
> > If the node on which this script is called goes down for any reason,
> > then "resource-stickiness" is not set back to its original value,
> > which is very bad.
> >
> > So, now I am thinking of how to solve this problem. I would appreciate
> > any thoughts about this.
> >
> > Is there a way to ask Pacemaker to do these commands sequentially so
> > there is no need to wait in the script?
> > If it is possible, then I think that my concern from above goes away.
> >
> > Another thing which comes to my mind is to use time-based rules.
> > This way, when I need to do a manual fail-back, I simply set (or
> > update) a time-based rule from the script.
> > And the rule will basically say - set "resource-stickiness" to 50
> > right now and expire in 10 min.
> > This looks good at first glance, but there is no reliable way to
> > choose a minimum sufficient time for it; at least none that I am aware of.
> > And the thing is - it is important to me that "resource-stickiness" is
> > set back to its original value as soon as possible.
> >
> > Those are my thoughts. As I said, I appreciate any ideas here.
>
> I have never tried --wait with crmsh, but I would guess that the delay
> you are observing is really the time your resources are taking to stop
> and start somewhere else.
>
> Actually you would need the reduced stickiness just during the stop
> phase - right.
>
> So as there is no command like "wait till all stops are done", you could
> still run 'crm_simulate -Ls' and check that it doesn't want to stop
> anything anymore.
> That way you save the time the starts would take.
> Unfortunately you have to repeat that, and thus put additional load on
> Pacemaker, possibly slowing things down if your poll cycle is too short.
>
> >
> >
> > Thank you,
> > Kostia
> >
> > On Tue, Nov 8, 2016 at 10:19 PM, Dejan Muhamedagic
> > <dejanmm at fastmail.fm> wrote:
> >
> > On Tue, Nov 08, 2016 at 12:54:10PM +0100, Klaus Wenninger wrote:
> > > On 11/08/2016 11:40 AM, Kostiantyn Ponomarenko wrote:
> > > > Hi,
> > > >
> > > > I need a way to do a manual fail-back on demand.
> > > > To be clear, I don't want it to be ON/OFF; I want it to be more like
> > > > "one shot".
> > > > So far I have found that the most reasonable way to do it is to set
> > > > "resource stickiness" to a different value, and then set it back to
> > > > what it was.
> > > > To do that I created a simple script with two lines:
> > > >
> > > > crm configure rsc_defaults resource-stickiness=50
> > > > crm configure rsc_defaults resource-stickiness=150
> > > >
> > > > There are no timeouts before setting the original value back.
> > > > If I call this script, I get what I want - Pacemaker moves resources
> > > > to their preferred locations, and "resource stickiness" is set back to
> > > > its original value.
> > > >
> > > > Although it works, I still have a few concerns about this approach.
> > > > Will I get the same behavior under a big load with delays on systems
> > > > in the cluster (which is truly possible and a normal case in my
> > > > environment)?
> > > > How does Pacemaker treat a fast change of this parameter?
> > > > I am worried that if "resource stickiness" is set back to its original
> > > > value too fast, then no fail-back will happen. Is it possible, or
> > > > shouldn't I worry about it?
> > >
> > > AFAIK pengine is interrupted when calculating a more complicated
> > > transition, and if the situation has changed, a transition that is just
> > > being executed is aborted if the input from pengine changed.
> > > So I would definitely worry!
> > > What you could do is to issue 'crm_simulate -Ls' in between and grep for
> > > an empty transition.
> > > There might be more elegant ways but that should be safe.
> >
> > crmsh has an option (-w) to wait for the PE to settle after
> > committing configuration changes.
> >
> > Thanks,
> >
> > Dejan
> > >
> > > > Thank you,
> > > > Kostia
> > > >
> > > >
> > > > _______________________________________________
> > > > Users mailing list: Users at clusterlabs.org
> > > > http://clusterlabs.org/mailman/listinfo/users
> > > >
> > > > Project Home: http://www.clusterlabs.org
> > > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > > Bugs: http://bugs.clusterlabs.org