[Pacemaker] Corosync 2.3 dies randomly

Andrew Beekhof andrew at beekhof.net
Tue May 7 00:12:46 EDT 2013


On 06/05/2013, at 3:27 AM, Robert Parsons <rparsons at tappublishing.com> wrote:

> 
> I'm trying to build out a web farm cluster using Corosync/Pacemaker. I started with the stock versions in Ubuntu 12.04 but did not have a lot of success. I removed the corosync (1.x) and pacemaker packages and built Corosync 2.3 and Pacemaker 1.1.9 from source. It generally seems to run better, but I am having big issues with Corosync. I have 14 completely identical nodes. They differ only in IP address and host name. Periodically, corosync will fail to start up on boot. It's not consistent and happens randomly. What's even worse, once the cluster is up, Corosync will occasionally die for no apparent reason. There are no errors logged. Nothing. The process simply disappears, taking the node offline.
> 
> My cluster has zero stability thanks to this Corosync issue. Anyone got any ideas?

Corosync has a blackbox - did you interrogate that too?
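
For reference, a rough sketch of what interrogating the blackbox could look like after a node has lost corosync. It assumes the flight data ends up in /var/lib/corosync/fdata (a source build with a different localstatedir will put it elsewhere) and that libqb's qb-blackbox tool is on the PATH. The function name read_blackbox is made up for illustration. If the process was killed outright (SIGKILL, OOM killer), there may be no dump to read at all.

```python
import os
import shutil
import subprocess

# Assumed default location of the corosync flight data; adjust for a
# source build configured with a different --localstatedir.
DEFAULT_FDATA = "/var/lib/corosync/fdata"


def read_blackbox(path=DEFAULT_FDATA, tool="qb-blackbox"):
    """Return the decoded blackbox text, or None if nothing can be read.

    None means either the dump file does not exist or the decoder
    (libqb's qb-blackbox by default) is not installed.
    """
    if not os.path.isfile(path):
        return None
    exe = shutil.which(tool)
    if exe is None:
        return None
    # qb-blackbox takes the dump file as its argument and prints the
    # recorded log entries; stderr is merged so decode errors are visible.
    result = subprocess.run(
        [exe, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    output = read_blackbox()
    if output is None:
        print("no blackbox dump found, or qb-blackbox is not installed")
    else:
        print(output)
```

While corosync is still running, the corosync-blackbox command should trigger a fresh dump and decode it in one step, which is simpler than the above.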




