[Pacemaker] killing corosync leaves crmd, stonithd, lrmd, cib and attrd to hog up the cpu
    Florian Haas
    florian at hastexo.com
    Mon Nov 14 13:52:24 CET 2011
  
On 2011-11-14 13:18, Dan Frincu wrote:
> Hi,
> 
> On Mon, Nov 14, 2011 at 1:32 PM, ihjaz Mohamed <ihjazmohamed at yahoo.co.in> wrote:
>> Hi All,
>> As part of some robustness testing for my cluster, I tried killing the
>> corosync process using kill -9 <pid>. After this I see that the pacemakerd
>> service is stopped, but the processes crmd, stonithd, lrmd, cib and attrd
>> are still running and are hogging the CPU.
> 
> I have seen this kind of testing before and I have to say I don't
> consider it the recommended way of testing the cluster stack's
> "robustness". Pacemaker processes rely on corosync for proper
> functioning. You kill corosync and then want to "cleanup" the
> processes? You have to go through a lot more literature in order to
> understand how this cluster stack works.
Well, I for my part don't consider this kind of testing unreasonable at
all. If Corosync dies, say due to a segfault, then the cluster had
better recover to a consistent state.

Thus, this (very valid) testing highlights that the cluster is evidently
misconfigured: either it's not using the Pacemaker MCP at all, or it
doesn't have STONITH configured, or both.
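
For anyone wanting to check the first of these, here is a minimal
sketch, assuming a Corosync 1.x plugin-based setup where Pacemaker is
declared in a service stanza (in corosync.conf or in a file under
/etc/corosync/service.d/). In that setup, "ver: 0" means the plugin
spawns the Pacemaker daemons itself, while "ver: 1" means pacemakerd
(the MCP) is expected to be started separately. The helper name
pcmk_plugin_ver is made up for illustration, and the parsing is
deliberately naive (it assumes one directive per line).

```shell
#!/bin/sh
# Hypothetical helper: print the "ver:" value of the pacemaker service
# stanza found in a Corosync 1.x style config file passed as $1.
# Prints nothing if the file has no "service { name: pacemaker ... }" block.
pcmk_plugin_ver() {
    awk '
        # A line whose first word is "service" opens a new stanza.
        $1 == "service"         { in_svc = 1; name = ""; ver = "" }
        # Remember the directives we care about while inside the stanza.
        in_svc && $1 == "name:" { name = $2 }
        in_svc && $1 == "ver:"  { ver = $2 }
        # A closing brace ends the stanza; report only the pacemaker one.
        in_svc && $1 == "}"     { if (name == "pacemaker") print ver; in_svc = 0 }
    ' "$1"
}
```

Called as "pcmk_plugin_ver /etc/corosync/corosync.conf" (the path is an
assumption; adjust it for your distribution), it prints 0 or 1, or
nothing if no pacemaker stanza is found. For the STONITH side, running
"crm_verify -L -V" against a live cluster should complain if
stonith-enabled is true but no STONITH resources are defined.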
Florian
-- 
Need help with High Availability?
http://www.hastexo.com/now
    
    