[Pacemaker] Need to relax corosync due to backup of VM through snapshot

Steven Dake sdake at redhat.com
Sun Nov 24 15:47:45 UTC 2013


On 11/21/2013 06:26 AM, Gianluca Cecchi wrote:
> On Thu, Nov 21, 2013 at 9:09 AM, Lars Marowsky-Bree wrote:
>> On 2013-11-20T16:58:01, Gianluca Cecchi <gianluca.cecchi at gmail.com> wrote:
>>
>>> Based on the docs I thought that the timeout should be
>>>
>>> token x token_retransmits_before_loss_const
>> No, the comments in the corosync.conf.example and man corosync.conf
>> should be pretty clear, I hope. Can you recommend which phrasing we
>> should improve?
> I have not understood the exact relationship between token and
> token_retransmits_before_loss_const:
> when one comes into play, and when the other one does.
> So perhaps the second one could be given more detail,
> or some web links.

The token retransmit is a timer that is started each time a token is 
transmitted.  The token timeout is the maximum timer that exists; it is 
not token * token_retransmits_before_loss_const.

The token_retransmits_before_loss_const option says "please transmit a 
replacement token this many times within the token period".  Since the 
token is sent over UDP, it can be lost in network overflow situations 
or other scenarios.

Using a real-world example:

token: 10000
token_retransmits_before_loss_const: 10

The token will be retransmitted roughly every 1000 msec, and it will be 
declared lost after 10000 msec.
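
As a minimal corosync.conf sketch of those same values (only the two
options under discussion are shown; everything else is left at its
defaults):

    totem {
            # Maximum token timeout in msec: the token is declared lost,
            # and membership reformed, once it has been missing this long.
            token: 10000

            # Transmit a replacement token up to this many times within
            # the token timeout, i.e. roughly every 10000 / 10 = 1000 msec.
            token_retransmits_before_loss_const: 10
    }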

Regards
-steve

>>> So my current test config is:
>>>    # diff corosync.conf corosync.conf.pre181113
>>> 24,25c24
>>> < #token: 5000
>>> < token: 120000
>> A 120s node timeout? That is really, really long. Why is the backup tool
>> interfering with the scheduling of high-priority processes so much? That
>> sounds like the real bug.
> In fact I inherited the analysis for a previous production cluster, and I'm
> setting up a test environment to demonstrate that one realistic
> outcome could well be that a cluster is not the right solution
> here, because the underlying infra is not stable enough.
> I'm not given great visibility into the VMware and SAN details,
> but I'm pressing to get them.
> I have sometimes seen disk latencies reaching 8000 milliseconds.... ;-(
> So another possible outcome could be to make the infra more reliable
> before going with a cluster.
> I'm deliberately putting in high values to see what happens, then lowering
> them step by step.
> BTW: I remember a past thread where others had problems
> with NetBackup (or similar backup software) using snapshots, and
> setting higher values solved the sporadic problems (possibly 20000 for
> token and 10 for retransmit, but I couldn't find the thread ...)
>
>
>>> Any comment?
>>> Any different strategies successfully used in similar environments
>>> where high latencies occur at snapshot deletion, when the disk
>>> consolidation phase is executed?
>> A setup where a VM apparently can freeze for almost 120s is not suitable
>> for HA.
>>
> I see from previous logs that sometimes DRBD disconnects and reconnects
> only after 30-40 seconds with default timeouts...
>
> Thanks for your inputs.
>
> Gianluca
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org




