[ClusterLabs] Cluster node loss detection.
Digimer
lists at alteeve.ca
Fri Oct 16 15:03:40 UTC 2015
On 16/10/15 10:51 AM, Vallevand, Mark K wrote:
> It looks like it takes 20s for a cluster to detect that a node has been
> lost.
Loss is detected by corosync, and it declares loss after X lost totem
tokens, each token being declared lost after Y milliseconds. By default,
node loss should be detected in about 1 second of no network traffic,
but you need to check corosync's settings.
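As a rough illustration of how the timers add up, here is a minimal Python sketch. It assumes detection time is approximately the totem token timeout plus the consensus timeout, and that consensus falls back to 1.2 times the token when it is not set, as I understand corosync.conf(5). The function name and the 10000 ms value are made up for the example and are not taken from this cluster. Check your actual totem settings; with cman they may come from cluster.conf rather than corosync.conf.

```python
def estimated_detection_ms(token_ms=1000, consensus_ms=None):
    """Rough estimate (in ms) of how long before a lost node is declared gone.

    Assumption: detection time ~= token timeout + consensus timeout.
    Assumption: consensus defaults to 1.2 * token when not configured.
    """
    # Integer arithmetic avoids floating-point rounding surprises.
    min_consensus_ms = token_ms * 12 // 10
    if consensus_ms is None:
        consensus_ms = min_consensus_ms
    if consensus_ms < min_consensus_ms:
        raise ValueError("consensus should be at least 1.2 * token")
    return token_ms + consensus_ms


if __name__ == "__main__":
    # A 1000 ms token, with consensus derived from it.
    print(estimated_detection_ms())
    # Hypothetical: a 10000 ms token, purely to show the scaling.
    print(estimated_detection_ms(token_ms=10000))
```

Under these assumptions a token of 10000 ms works out to roughly 22 seconds, so a delay around 20s would be consistent with a token timeout much larger than 1000 ms. That is a guess until the running configuration is checked.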
> The detection seems to correlate to dlm reporting its lost connection to
> the node.
Negative. DLM is informed when a node is declared lost and blocks until
fenced/stonithd tells it that the peer has been successfully fenced,
after which it reaps the lost locks and recovers.
> Not sure if correlation is causation.
Correlation.
> Anyway, can someone tell me where that 20s might be coming from and if
> it is adjustable?
>
> Ubuntu 12.04 LTS
> pacemaker 1.1.10
> cman 3.1.7
> corosync 1.4.6
>
> Thanks!
>
> Regards.
> Mark K Vallevand Mark.Vallevand at Unisys.com
> Never try and teach a pig to sing: it's a waste of time, and it annoys
> the pig.
>
> THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
> MATERIAL and is thus for use only by the intended recipient. If you
> received this in error, please contact the sender and delete the e-mail
> and its attachments from all computers.
This suffix has zero legal bearing, just saying. Anything posted to this
list is 100% open and public.
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?