[ClusterLabs] Re: Establishing Timeouts

Mon Oct 10 06:38:39 UTC 2016

>>> Eric Robinson <eric.robinson at psmnv.com> wrote on 10.10.2016 at 06:51 in
message
<DM5PR03MB2729841C99A2D3F431E4064CFADB0 at DM5PR03MB2729.namprd03.prod.outlook.com>

> I have about a dozen corosync+pacemaker clusters and I am just now getting 
> around to understanding timeouts.
> 
> Most of my corosync.conf files look something like this:
> 
>         version:        2
>         token:          5000
>         token_retransmits_before_loss_const: 10
>         join:           1000
>         consensus:      7500
>         vsftype:        none
>         max_messages:   20
>         secauth:        off
>         threads:        0
>         clear_node_high_bit: yes
>         rrp_mode: active
> 
> If I understand this correctly, this means the node will wait 50 seconds 
> (5000ms x 10) before deciding that a cluster reconfig is necessary (perhaps 
> after a link failure). Is that correct?
> 
> I'm trying to understand how this works together with my bonded NIC's 
> arp_interval settings. I normally set arp_interval=1000. My question is, how 
> many arp losses are required before the bonding driver decides to failover to 
> the other link? If arp_interval=1000, how many times does the driver send an 
> arp and fail to receive a reply before it decides that the link is dead?

AFAIK, if _all_ ARP targets fail to respond even _once_, the link will be considered down after "Down Delay".
I guess you want to use multiple ARP IP targets (and the correct ones). If you have or need a gateway, it is not the worst choice to use that as one of them.
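
To put rough numbers on the timing question, here is a minimal Python sketch. It assumes the relationship described in corosync.conf(5): "token" is the whole token-loss timeout, and "token_retransmits_before_loss_const" only sets how often the token is retransmitted inside that window (roughly token / (const + 0.2)) rather than multiplying it. Please verify this against the man page for your corosync version. The function names and the "missed_intervals" parameter are hypothetical; how many ARP intervals the bonding driver tolerates depends on kernel version and bonding mode, so it is left as an input and not asserted here.

```python
# Rough timing sketch. All values are in milliseconds.
# Assumption: formulas follow corosync.conf(5); names here are illustrative only.

def token_retransmit_ms(token_ms, retransmits_before_loss):
    """Approximate interval between token retransmits within the token timeout."""
    return token_ms / (retransmits_before_loss + 0.2)


def consensus_is_valid(token_ms, consensus_ms):
    """corosync.conf(5) states that consensus must be at least 1.2 * token."""
    return consensus_ms >= 1.2 * token_ms


def bond_detection_ms(arp_interval_ms, missed_intervals):
    """Time for the bond to notice a dead link.

    missed_intervals is a hypothetical input: check the bonding
    documentation for your kernel to find the real figure.
    """
    return arp_interval_ms * missed_intervals


def bond_recovers_before_token_loss(token_ms, arp_interval_ms, missed_intervals):
    """True if the bond should fail over before corosync declares token loss."""
    return bond_detection_ms(arp_interval_ms, missed_intervals) < token_ms


if __name__ == "__main__":
    token, retransmits, consensus = 5000, 10, 7500  # values from the quoted config
    print("retransmit interval:", round(token_retransmit_ms(token, retransmits)), "ms")
    print("consensus valid:", consensus_is_valid(token, consensus))
    print("bond fails over in time:",
          bond_recovers_before_token_loss(token, 1000, missed_intervals=3))
```

Under these assumptions, the quoted settings give a failure-detection window of about 5 seconds (not 50), so the bonding failover would need to complete well inside that window to avoid a membership change.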

> 
> I think I need to know this so I can set my corosync.conf settings correctly 
> to avoid "false positive" cluster failovers. In other words, if there is a 
> link or switch failure, I want to make sure that the cluster allows plenty of 
> time for link communication to recover before deciding that a node has 
> actually died. 
> 
> --
> Eric Robinson