[ClusterLabs] Antw: Antw: notice: throttle_handle_load: High CPU load detected

Tue Apr 5 09:37:21 EDT 2016

Thank you, Ken.
This helps a lot.
Now I am sure that my current approach fits best for me =)
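
For the archive, here is a minimal sketch of the arithmetic behind that approach, assuming the check works as in the snippet quoted below. The helper name is_high_load is only an illustration and not Pacemaker code. It assumes at least one core is reported and ignores any special handling Pacemaker may apply in other cases:

```c
#include <assert.h>
#include <stdbool.h>

/* Factor taken from the throttle snippet quoted in this thread. */
#define THROTTLE_FACTOR_HIGH 2.0

/*
 * Hypothetical helper (not part of Pacemaker): returns true when the
 * "High ... detected" branch from the quoted snippet would be taken.
 * load_threshold_percent is the "load-threshold" option, e.g. 80 or 1000.
 */
bool is_high_load(double load_average, int cores, double load_threshold_percent)
{
    assert(cores > 0); /* assumption: core count is known and positive */

    /* Reported load is divided by the number of cores before comparing. */
    double adjusted_load = load_average / cores;

    /* 1000% becomes a target of 10, matching the numbers in the thread. */
    double throttle_load_target = load_threshold_percent / 100.0;

    return adjusted_load > THROTTLE_FACTOR_HIGH * throttle_load_target;
}
```

With a load average of 420 on 24 cores, the adjusted load is 17.5. At the default 80% the limit is 2 * 0.8 = 1.6, so the message is logged; at 1000% the limit is 2 * 10 = 20, so it is not.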

Thank you,
Kostia

On Wed, Mar 30, 2016 at 11:10 PM, Ken Gaillot <kgaillot at redhat.com> wrote:

> On 03/29/2016 08:22 AM, Kostiantyn Ponomarenko wrote:
> > Ken, thank you for the answer.
> >
> > Every node in my cluster has a "load average" of about 420 under normal
> > conditions. This is mainly caused by high disk I/O on the system.
> > My system is designed to use almost 100% of its hardware (CPU/RAM/disks),
> > so it is normal for the system to consume almost all HW resources.
>
> 420 suggests that HW resources are outstripped -- anything above the
> system's number of cores means processes are waiting for some resource.
> (Although with an I/O-bound workload like this, the number of cores
> isn't very important -- most will be sitting idle despite the high
> load.) And if that's during normal conditions, what will happen during a
> usage spike? It sounds like a recipe for less-than-HA.
>
> Under high load, there's a risk of a self-reinforcing feedback loop, where
> monitors time out, causing pacemaker to schedule recovery actions, which
> cause load to go higher and more monitors to time out, etc. That's why
> throttling is there.
>
> > I would like to get rid of the "High CPU load detected" messages in the
> > log, because they flood corosync.log as well as the system journal.
> >
> > Could you advise on the best way to do it?
> >
> > So far I have come up with the idea of setting "load-threshold" to 1000%,
> > because of:
> >     420 (load average) / 24 (cores) = 17.5 (adjusted_load);
> >     2 (THROTTLE_FACTOR_HIGH) * 10 (throttle_load_target) = 20
> >
> >     if(adjusted_load > THROTTLE_FACTOR_HIGH * throttle_load_target) {
> >         crm_notice("High %s detected: %f", desc, load);
>
> That should work, as far as reducing the log messages, though of course
> it also reduces the amount of throttling pacemaker will do.
>
> > In this case, do I need to set "node-action-limit" to something less than
> > "2 x cores" (which is the default)?
>
> It's not necessary, but it would help compensate for the reduced
> throttling by imposing a maximum number of actions run at one time.
>
> I usually wouldn't recommend reducing log verbosity, because detailed
> logs are often necessary for troubleshooting cluster issues, but if your
> logs are on the same I/O controller that is overloaded, you might
> consider logging only to syslog and not to an additional detail file.
> That would cut back on the amount of I/O due to pacemaker itself. You
> could even drop PCMK_logpriority to warning, but then you're losing even
> more information.
>
> > Because the logic is (crmd/throttle.c):
> >
> >     switch(r->mode) {
> >         case throttle_extreme:
> >         case throttle_high:
> >             jobs = 1; /* At least one job must always be allowed */
> >             break;
> >         case throttle_med:
> >             jobs = QB_MAX(1, r->max / 4);
> >             break;
> >         case throttle_low:
> >             jobs = QB_MAX(1, r->max / 2);
> >             break;
> >         case throttle_none:
> >             jobs = QB_MAX(1, r->max);
> >             break;
> >         default:
> >             crm_err("Unknown throttle mode %.4x on %s", r->mode, node);
> >             break;
> >     }
> >     return jobs;
> >
> >
> > The thing is, I know that there is "High CPU load" and that this is the
> > normal state, but I want Pacemaker to stop reporting it to me and to handle
> > this state as best it can.
>
> If you can't improve your I/O performance, what you suggested is
> probably the best that can be done.
>
> When I/O is that critical to you, there are many tweaks that can make a
> big difference in performance. I'm not sure how familiar you are with
> them already. Options depend on what your storage is (local or network,
> hardware/software/no RAID, etc.) and what your I/O-bound application is
> (database, etc.), but I'd look closely at cache/buffer settings at all
> levels from hardware to application, RAID stripe alignment, filesystem
> choice and tuning, log verbosity, etc.
>
> >
> > Thank you,
> > Kostia
> >
> > On Mon, Mar 14, 2016 at 7:18 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
> >
> >> On 02/29/2016 07:00 AM, Kostiantyn Ponomarenko wrote:
> >>> I am back to this question =)
> >>>
> >>> I am still trying to understand the impact of "High CPU load detected"
> >>> messages in the log.
> >>> Looking at the code, I figured out that setting the "load-threshold"
> >>> parameter to something higher than 100% solves the problem.
> >>> And actually, for 8 cores (12 with Hyper-Threading), load-threshold=400%
> >>> kind of works.
> >>>
> >>> Also, I noticed that this parameter may have an impact on "the maximum
> >>> number of jobs that can be scheduled per node", as there is a formula
> >>> that limits F_CRM_THROTTLE_MAX based on F_CRM_THROTTLE_MODE.
> >>>
> >>> Is my understanding correct that setting "load-threshold" high enough
> >>> (so that there are no noisy messages) will affect only
> >>> "throttle_job_max" and nothing more?
> >>> Also, if I understand correctly, "throttle_job_max" is the number of
> >>> parallel actions allowed per node in lrmd.
> >>> And a child of lrmd is actually an RA process running some action
> >>> (monitor, start, etc.).
> >>>
> >>> So there is no impact on how many RAs (resources) can run on a node, only
> >>> on how Pacemaker operates on them in parallel (I am not sure I understand
> >>> this part correctly).
> >>
> >> I believe that is an accurate description. I think the job limit applies
> >> to fence actions as well as lrmd actions.
> >>
> >> Note that if /proc/cpuinfo exists, pacemaker will figure out the number
> >> of cores from there, and divide the actual reported load by that number
> >> before comparing against load-threshold.
> >>
> >>> Thank you,
> >>> Kostia
> >>>
> >>> On Wed, Jun 3, 2015 at 12:17 AM, Andrew Beekhof <andrew at beekhof.net> wrote:
> >>>
> >>>>
> >>>>> On 27 May 2015, at 10:09 pm, Kostiantyn Ponomarenko <
> >>>> konstantin.ponomarenko at gmail.com> wrote:
> >>>>>
> >>>>> I think I wasn't precise in my questions.
> >>>>> So I will try to ask more precise questions.
> >>>>> 1. Why is the default value for "load-threshold" 80%?
> >>>>
> >>>> Experimentation showed it better to begin throttling before the node
> >>>> became saturated.
> >>>>
> >>>>> 2. What would be the impact on the cluster of "load-threshold=100%"?
> >>>>
> >>>> Your nodes will be busier.  Will they be able to handle your load, or
> >>>> will it result in additional recovery actions (creating more load and
> >>>> more failures)?  Only you will know when you try.
> >>>>
> >>>>>
> >>>>> Thank you,
> >>>>> Kostya
> >>>>>
> >>>>> On Mon, May 25, 2015 at 4:11 PM, Kostiantyn Ponomarenko <
> >>>> konstantin.ponomarenko at gmail.com> wrote:
> >>>>> Guys, please, if anyone can help me understand this parameter better,
> >>>>> I would appreciate it.
> >>>>>
> >>>>>
> >>>>> Thank you,
> >>>>> Kostya
> >>>>>
> >>>>> On Fri, May 22, 2015 at 4:15 PM, Kostiantyn Ponomarenko <
> >>>> konstantin.ponomarenko at gmail.com> wrote:
> >>>>> Another question: is measuring CPU usage by "I/O wait" specific to
> >>>>> crmd?
> >>>>> And if I need to get the most performance out of the resources running
> >>>>> in the cluster, should I set "load-threshold=95%" (or even 100%)?
> >>>>> Will it impact the cluster behavior in any way?
> >>>>> The man page for crmd says: "The cluster will slow down its recovery
> >>>>> process when the amount of system resources used (currently CPU)
> >>>>> approaches this limit".
> >>>>> Does it mean there will be delays in the cluster moving resources if
> >>>>> a node goes down, or something else?
> >>>>> I just want to understand it better.
> >>>>>
> >>>>> Thank you in advance for the help =)
> >>>>>
> >>>>> P.S.: The main resource does a lot of disk I/Os.
> >>>>>
> >>>>>
> >>>>> Thank you,
> >>>>> Kostya
> >>>>>
> >>>>> On Fri, May 22, 2015 at 3:30 PM, Kostiantyn Ponomarenko <
> >>>> konstantin.ponomarenko at gmail.com> wrote:
> >>>>> I didn't know that.
> >>>>> You mentioned "as opposed to other Linuxes", but I am using Debian Linux.
> >>>>> Does it also measure CPU usage by I/O waits?
> >>>>> You are right about "I/O waits" (a screenshot of "top" is attached).
> >>>>> But why does it show 50% CPU usage for a single process (the main
> >>>>> one) while "I/O waits" shows a bigger number?
> >>>>>
> >>>>>
> >>>>> Thank you,
> >>>>> Kostya
> >>>>>
> >>>>> On Fri, May 22, 2015 at 9:40 AM, Ulrich Windl <
> >>>> Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >>>>>>>> "Ulrich Windl" <Ulrich.Windl at rz.uni-regensburg.de> wrote on
> >>>>>>>> 22.05.2015 at 08:36 in message
> >>>>>>>> <555EEA72020000A10001A71D at gwsmtp1.uni-regensburg.de>:
> >>>>>> Hi!
> >>>>>>
> >>>>>> I Linux I/O waits are considered for load (as opposed to other
> >>>> Linuxes) Thus
> >>>>> ^^ "In"
> >>>>                             s/Linux/UNIX/
> >>>>>
> >>>>> (I should have my coffee now to awake ;-) Sorry.