[Pacemaker] Backup ring is marked faulty

Steven Dake sdake at redhat.com
Thu Aug 4 18:59:03 UTC 2011


On 08/04/2011 11:43 AM, Sebastian Kaps wrote:
> Hi Steven,
> 
> On 04.08.2011, at 18:27, Steven Dake wrote:
> 
>> redundant ring is only supported upstream in corosync 1.4.1 or later.
> 
> What does "supported" mean in this context, exactly? 
> 

It means the corosync community doesn't investigate redundant ring issues
in corosync versions prior to 1.4.1.

I expect the root of your problem (the retransmit list problem) is already
fixed, however, in the repos and in the latest released versions.
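
As a rough illustration of the two version thresholds mentioned in this
thread (retransmit list fix in 1.3.3 and later, redundant ring supported
upstream in 1.4.1 or later), a minimal sketch of a version check could look
like the following. The function names and the sample version strings are
made up for the example; it only compares a version string you supply and
does not query corosync itself.

```python
# Thresholds taken from this thread; everything else is illustrative.
RETRANSMIT_FIX = (1, 3, 3)   # retransmit list problem fixed in 1.3.3 and later
RRP_SUPPORTED = (1, 4, 1)    # redundant ring supported upstream in 1.4.1 or later


def parse_version(text):
    """Turn a dotted version string such as '1.4.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def check_corosync(text):
    """Return (has_retransmit_fix, redundant_ring_supported) for a version string."""
    version = parse_version(text)
    return version >= RETRANSMIT_FIX, version >= RRP_SUPPORTED


if __name__ == "__main__":
    # Hypothetical sample versions, not taken from any real system.
    for candidate in ("1.2.7", "1.3.3", "1.4.1"):
        print(candidate, check_corosync(candidate))
```

Comparing tuples of integers rather than raw strings avoids ordering
surprises such as "1.10.0" sorting before "1.4.1".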

Regards
-steve

> I'm asking, because we're having serious issues with these systems since 
> they went into production (the testing phase did not show any problems, 
> but we also couldn't use real workloads then).
> 
> Since the cluster went into production, we're having issues with seemingly random 
> STONITH events that seem to be related to a high I/O load on a DRBD-mirrored
> OCFS2 volume - but I don't see any pattern yet. We've had these machines 
> running for nearly two weeks without major problems and suddenly they went 
> back to killing each other :-(
> 
>> The retransmit list message issue you are having is fixed in corosync
>> 1.3.3 and later. This is what is triggering the redundant ring faulty
>> error.
> 
> Could it also cause the instability problems we're seeing?
> Thanks again, for helping!

yes

> 




