<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hi, <br>
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I’ve been researching the Corosync communication layer and would like to understand how to calculate the total failure detection time for a host. From what I’ve gathered, the relevant parameters are the
<b>base token</b> (<code>token</code> in <code>corosync.conf</code>), the <b>runtime token timeout</b> (<code>runtime.config.totem.token</code>), <b>token_coefficient</b>,
<b>token_retransmit</b>, <b>token_retransmits_before_loss_const</b>, and <b>consensus</b>. Could you please clarify how these values contribute to the overall failure detection time? My current understanding is:<br>
<br>
runtime.config.totem.token = base token + (number of nodes - 2) * token_coefficient</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Total failure detection time = runtime.config.totem.token + (token_retransmit * token_retransmits_before_loss_const)<br>
<br>
consensus = 1.2 * runtime.config.totem.token<br>
<br>
For example, with 3 servers:</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
base token (from corosync.conf) = 2000ms</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
runtime.config.totem.token = 2650ms</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
token_coefficient = 650ms</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
token_retransmit = 1000ms</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
token_retransmits_before_loss_const = 4</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
consensus = 3180ms</div>
<div class="elementToProof" style="margin-top: 1em; margin-bottom: 1em; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Are those values correct? <br>
<br>
For example, if Server 2 goes down and the runtime token timeout (runtime.config.totem.token) is 2650 ms, the token is retransmitted 4 times at 1000 ms intervals, for a total of 4000 ms. Adding the two gives a total failure timeout of 6650 ms before the node is declared
failed. Is that correct?</div>
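<div class="elementToProof" style="margin-top: 1em; margin-bottom: 1em; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
To make the arithmetic above explicit, here is a minimal Python sketch of my current understanding. The function names are my own illustration and not Corosync APIs, and the "assumed total" is the formula I am asking about, not confirmed Corosync behaviour. I have also assumed that the coefficient only applies from the third node onwards.</div>
<pre>
```python
# Sketch of the arithmetic in this message; names are illustrative only.

def runtime_token_ms(base_token_ms, nodes, token_coefficient_ms):
    """Runtime token timeout: base token + (nodes - 2) * token_coefficient."""
    # Assumption: with fewer than 3 nodes the coefficient adds nothing.
    extra_nodes = max(nodes - 2, 0)
    return base_token_ms + extra_nodes * token_coefficient_ms


def consensus_ms(token_ms):
    """Consensus taken as 1.2 * the runtime token timeout (integer ms)."""
    return token_ms * 12 // 10


def assumed_total_ms(token_ms, retransmit_ms, retransmits):
    """Unconfirmed formula under discussion: token + retransmit * count."""
    return token_ms + retransmit_ms * retransmits


token = runtime_token_ms(2000, 3, 650)
print("runtime token:", token, "ms")
print("consensus:", consensus_ms(token), "ms")
print("assumed total:", assumed_total_ms(token, 1000, 4), "ms")
```
</pre>
<div class="elementToProof" style="margin-top: 1em; margin-bottom: 1em; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
With the example values this gives 2650 ms, 3180 ms and 6650 ms, which is how I arrived at the numbers above.</div>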
<div class="elementToProof" style="margin-top: 1em; margin-bottom: 1em; font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Then how does the consensus timeout work? After the 6650 ms timeout, the node is declared down. Does the system then have to remove the node within the 3180 ms consensus timeout, or is there an additional grace period in Corosync? Is my analysis correct? Thank you!<br>
<br>
Best regards,<br>
<br>
Vicki Chen<br>
<br>
</div>
</body>
</html>