[Pacemaker] killing corosync leaves crmd, stonithd, lrmd, cib and attrd to hog up the cpu

Mon Nov 14 14:35:32 UTC 2011

On 11/14/2011 02:19 PM, ihjaz Mohamed wrote:
> nope. Am not using stonith.

Stonith is highly recommended for every Pacemaker cluster, and a must-have
if shared storage is in use. Since IPMI is available on most current
server hardware, no extra effort besides the Pacemaker configuration is
necessary.
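
As a rough sketch of what that configuration could look like with the crm
shell and the external/ipmi stonith plugin: the node name, BMC address and
credentials below are placeholders, not values from this thread, and the
helper function is hypothetical. Check the parameter names against
`stonith -t external/ipmi -n` on your own system. The function only prints
the commands, so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) crm shell commands for an
# IPMI-based stonith resource. All values passed in are placeholders.
ipmi_stonith_cmds() {
    node="$1"; bmc_ip="$2"; user="$3"; pass="$4"
    # One stonith resource per node, addressed via that node's BMC.
    echo "crm configure primitive st-$node stonith:external/ipmi params hostname=$node ipaddr=$bmc_ip userid=$user passwd=$pass interface=lan op monitor interval=60s"
    # Keep the fencing device off the node it is meant to fence.
    echo "crm configure location l-st-$node st-$node -inf: $node"
    # Fencing only takes effect once stonith is enabled cluster-wide.
    echo "crm configure property stonith-enabled=true"
}

# Example call with made-up values (192.0.2.0/24 is a documentation range).
ipmi_stonith_cmds node1 192.0.2.10 admin secret
```

Each node would get its own resource of this kind, pointing at that
node's IPMI interface.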

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

> 
> ------------------------------------------------------------------------
> *From:* Andreas Kurz <andreas at hastexo.com>
> *To:* pacemaker at oss.clusterlabs.org
> *Sent:* Monday, 14 November 2011 6:08 PM
> *Subject:* Re: [Pacemaker] killing corosync leaves crmd, stonithd, lrmd,
> cib and attrd to hog up the cpu
> 
> On 11/14/2011 12:32 PM, ihjaz Mohamed wrote:
>> Hi All,
>>
>> As part of a robustness test for my cluster, I tried killing the
>> corosync process using kill -9 <pid>. After this, I see that the
>> pacemakerd service is stopped, but the processes crmd, stonithd, lrmd,
>> cib and attrd are still running and are hogging the CPU.
> 
> Then fix your stonith setup if you want a "robust" cluster setup .... of
> course you are using stonith, aren't you?
> 
> Regards,
> Andreas
> 
>>
>>
>> top - 06:26:51 up  2:01,  4 users,  load average: 12.04, 12.01, 11.98
>> Tasks: 330 total,  13 running, 317 sleeping,  0 stopped,  0 zombie
>> Cpu(s):  7.1%us, 17.1%sy,  0.0%ni, 75.6%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
>> Mem:  8015444k total,  4804412k used,  3211032k free,    54800k buffers
>> Swap: 10256376k total,        0k used, 10256376k free,  1604464k cached
>>
>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>  2053 hacluste  RT  0 90492 3324 2476 R 100.0  0.0 113:40.61 crmd
>>  2047 root      RT  0 81480 2108 1712 R 99.8  0.0 113:40.43 stonithd
>>  2048 hacluste  RT  0 83404 5260 2992 R 99.8  0.1 113:40.90 cib
>>  2050 hacluste  RT  0 85896 2388 1952 R 99.8  0.0 113:40.43 attrd
>>  5018 root      20  0 8787m 345m  56m S  2.0  4.4  0:56.95 java
>> 19017 root      20  0 15068 1252  796 R  2.0  0.0  0:00.01 top
>>    1 root      20  0 19232 1444 1156 S  0.0  0.0  0:01.71 init
>>    2 root      20  0    0    0    0 S  0.0  0.0  0:00.00 kthreadd
>>    3 root      RT  0    0    0    0 S  0.0  0.0  0:00.00 migration/0
>>    4 root      20  0    0    0    0 S  0.0  0.0  0:00.00 ksoftirqd/0
>>
>>
>> Is there a way to clean up these processes? Or do I need to kill them
>> one by one before respawning corosync?
>>
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
