[ClusterLabs] Antw: Re: Never join a list without a problem...
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Wed Mar 1 04:38:08 EST 2017
>>> Kai Dupke <kdupke at suse.com> wrote on 01.03.2017 at 09:55 in message
<bf07f253-6d47-6194-6460-bb31ec7f531f at suse.com>:
> On 02/27/2017 02:26 PM, Jeffrey Westgate wrote:
>> We use Nagios to monitor, and once every 20 to 40 hours - sometimes longer,
>> and we cannot set a clock by it - while the machine is 95% idle (or more
>> according to 'top'), the host load shoots up to 50 or 60%. It takes about 20
>> minutes to peak, and another 30 to 45 minutes to come back down to baseline,
>> which is mostly 0.00.
>
> So, you have a time window of ~1h where the system is under load, right?
> This is somewhat different to what Ulrich had, but his approach might be
> useful for you, too.
>
> Is there anything against running some monitoring that captures the
> processes, process states and load, say, every 5 minutes?
>
> Of course, the peaks might correlate to something in the logs - like
> cron, logins, logrotates or whatever.
The main issue is "expected load" vs. "unexpected load". In my case the system was expected to be completely idle at night, so I had set the thresholds rather low. Other systems can use different approaches. I hope to hear what caused the problem in your case.
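To illustrate the idea of sampling every 5 minutes and only recording details when the load is unexpected, here is a rough sketch rather than the script I actually used. The threshold value, the log file name and the function names are all made up for this example, and it assumes a Linux-like system where `ps -eo pid,stat,pcpu,comm` works.

```python
import os
import subprocess
import time

# Assumed values for illustration only; pick thresholds that match
# what "expected load" means on your own system.
LOAD_THRESHOLD = 0.5      # 1-minute load average considered "unexpected"
INTERVAL_SECONDS = 300    # sample every 5 minutes
LOG_PATH = "load_capture.log"  # hypothetical log file name


def is_unexpected(load1, threshold=LOAD_THRESHOLD):
    """Return True when the 1-minute load average exceeds the threshold."""
    return load1 > threshold


def snapshot():
    """Return the current process list with PID, state, CPU% and command."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def sample(log_path=LOG_PATH):
    """Take one load sample; append a process snapshot if load is unexpected."""
    load1, load5, load15 = os.getloadavg()
    if is_unexpected(load1):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(log_path, "a") as log:
            log.write(f"{stamp} load: {load1:.2f} {load5:.2f} {load15:.2f}\n")
            log.write(snapshot())
            log.write("\n")
    return load1


if __name__ == "__main__":
    while True:
        sample()
        time.sleep(INTERVAL_SECONDS)
```

The timestamps in the resulting log can then be compared with cron, login and logrotate entries in the system logs. Running `sample()` from cron instead of the endless loop would work just as well.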
Ulrich