[ClusterLabs] Antw: Re: CRIT: Emergency, Shutdown: Master Control process died.

Keisuke MORI keisuke.mori+ha at gmail.com
Thu May 7 09:03:25 UTC 2015


Hi,

I have seen the same issue on Ubuntu 12.04 once.

We investigated it with a systemtap script around
http://lxr.linux.no/#linux+v3.16/kernel/posix-cpu-timers.c#L993
and found that, depending on the system load, the kernel really did
consider a heartbeat process to have exceeded its CPU soft limit,
even though the heartbeat process was not that busy.
(We are not sure exactly when this happens, though.)


There was also another discussion about this on the ML:
http://www.gossamer-threads.com/lists/linuxha/users/25814#25814

The CPU soft limit is enforced by heartbeat itself
when you enable the "debug" directive in ha.cf.
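
To see the kernel side of this in isolation, here is a minimal sketch
in Python. It is not heartbeat's code; it only assumes that the limit
in question behaves like a standard RLIMIT_CPU soft limit, where the
kernel sends SIGXCPU once the process has used more CPU time than the
soft limit allows. The function name run_with_cpu_soft_limit is made up
for this illustration.

```python
import os
import resource
import signal


def run_with_cpu_soft_limit(soft_seconds):
    """Fork a child, give it a CPU soft limit, and let it spin.

    Returns the number of the signal that terminated the child,
    or None if the child exited without being signalled.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: its CPU time accounting starts from zero.
        try:
            _, hard = resource.getrlimit(resource.RLIMIT_CPU)
            # Lower only the soft limit; the hard limit stays as it was.
            resource.setrlimit(resource.RLIMIT_CPU, (soft_seconds, hard))
            # Default action for SIGXCPU is to terminate the process.
            signal.signal(signal.SIGXCPU, signal.SIG_DFL)
            while True:
                pass  # burn CPU until the kernel delivers SIGXCPU
        finally:
            # Only reached if something above raised an exception.
            os._exit(1)
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


if __name__ == "__main__":
    sig = run_with_cpu_soft_limit(1)
    print("child terminated by signal:", sig)
```

A process that wants to stay alive under such a limit has to keep
extending it before it is reached, which matches the description of
cl_cpu_limit_setpercent() quoted further down in this thread.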

In essence, "debug" should not be enabled on a production system,
because it is only meant for developers of Heartbeat itself.


So our solution was simply to turn off "debug" in ha.cf.
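
If you want to check a node quickly, something like the following
Python sketch could be used. It assumes the directive has the form
"debug <level>" with 0 meaning off, and that "#" starts a comment;
the helper name debug_enabled is hypothetical and not part of any
Heartbeat tooling.

```python
def debug_enabled(ha_cf_text):
    """Return True if the last active 'debug' directive has a non-zero level."""
    enabled = False
    for raw in ha_cf_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        # Match "debug" exactly, so that "debugfile" is not confused with it.
        if parts[0] != "debug":
            continue
        # Assumption: "debug <level>"; a missing or non-numeric level
        # is treated as enabled, to be on the safe side.
        level = parts[1] if len(parts) > 1 else "1"
        try:
            enabled = int(level) != 0
        except ValueError:
            enabled = True
    return enabled


if __name__ == "__main__":
    sample = "logfacility local0\ndebug 1\n"
    print(debug_enabled(sample))
```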

Hope it helps,

Keisuke MORI


2015-04-30 17:35 GMT+09:00 danielk <danielk_lists at z9d.de>:
> Hey,
>
> can anyone confirm that the CPU limiting comes from heartbeat itself ?
> Any ideas what i should do for debugging ?
>
> thanks,
> daniel
>
>
>
> On 04/24/2015 03:47 PM, danielk wrote:
>>
>> On 04/24/2015 03:17 PM, Ulrich Windl wrote:
>> ...
>>
>>> You don't have CPU-time quota, do you?
>>>
>> As far as i know: no.
>>
>> This may be outdated information but i think heartbeat is doing the
>> limiting. In http://www.gossamer-threads.com/lists/linuxha/users/33467
>> it says:
>>
>>   You have to be running with debug on. Heartbeat limits its CPU
>> consumption and then periodically extends it to keep from hitting the
>> limit. See cl_cpu_limit_setpercent() and friends.
>>
>> Nevertheless the question is: why MCP is hitting the CPU limit.
>
>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org



-- 
Keisuke MORI



