[Pacemaker] corosync crash

Steven Dake sdake at redhat.com
Tue Mar 1 12:51:07 EST 2011


On 02/25/2011 12:38 AM, Andrew Beekhof wrote:
> This is the same one you sent to the openais list right?
> 

Andrew,

This was root-caused to a faulty network setup, which results in the
"failed to receive" abort we are currently working on.  One key detail
missing from this thread is that the implementation worked great on
VMware ESX 4.0 but then started having problems on ESX 4.1.

Regards
-steve

> On Thu, Feb 24, 2011 at 10:32 AM,  <u.schmeling at online.de> wrote:
>>
>> Hi,
>>
>> My configuration has 2 nodes; one has a set of virtual addresses and a web service. The situation before the crash:
>> node1: has all resources
>> node2: online, no resources
>>
>> action on node2: crm standby node2
>> result on node1: corosync crashes, and its child processes consume all available CPU time
>>
>> my actions: stop all child processes on node1 (kill -9) and restart corosync
>>
>> result on node1:
>> node1: online, all resources
>> node2: offline
>>
>> result on node2:
>> node1: offline
>> node2: online, all resources
>>
>> The only way I found to work around this problem: remove node2 from the cluster and add it again.
>> There should be other solutions; maybe someone can help. I have attached the core dump and the fplay output.
>>
>> Update: If I keep the cluster in the split-brain state, it recovers after about 9 hours (log file available).
>>
>> regards Uwe
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>
>>
> 




