[Pacemaker] cluster failover under load

Mon Feb 20 05:59:19 EST 2012

On Fri, Feb 17, 2012 at 10:22 PM, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> Hi,
>
> On Fri, Feb 17, 2012 at 10:37:39AM +0100, fatcharly at gmx.de wrote:
>> Hi,
>>
>> we are using pacemaker 1.0.5-4.7

1.0.5 is pretty long in the tooth.  Please consider something a little
more recent.

>> with heartbeat 3.0.0-33.10
>> and a DRBD device on a CentOS 5.7 webserver cluster. When the
>> active node comes under heavy load, the cluster sometimes starts
>> to fail over.
>
> What fails over? Resources? All resources? Is the node
> considered down/lost?
>
>> Is there a way to make this behavior less
>> sensitive, such as changing the retry/recheck time or counter?
>
> Indeed there is, but it depends on what's happening. If it's at the
> heartbeat level, then you need to tweak ha.cf. If at the resource
> level, then I'd guess the monitor timeouts.
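
For the heartbeat-level case, the knobs in question are the timing
directives in ha.cf. A rough sketch, assuming the node is being declared
dead because heartbeats are delayed under load. The values below are
illustrative assumptions, not tested recommendations for this cluster:

```
# /etc/ha.d/ha.cf (timing directives only; values are illustrative assumptions)

# seconds between heartbeat packets
keepalive 2

# log a late-heartbeat warning after this many seconds without a heartbeat
warntime 10

# declare the peer dead after this many seconds without a heartbeat
deadtime 30

# longer grace period applied only at startup
initdead 120
```

If the logs show late-heartbeat warnings whose intervals approach the
current deadtime, raising deadtime is the usual first thing to try. For the
resource-level case, the corresponding setting is the timeout on the
resource's monitor operation in the CIB.
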
>
> Thanks,
>
> Dejan
>
>> Any suggestions are welcome.
>>
>> kind regards
>>
>> fatcharly
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org