<div dir="ltr"><div dir="ltr">Hello Ulrich, hello team,<div><br></div><div>The servers of the cluster restarted together twice (at 08:17:07 and 08:50:04 on 22/02/2022 for server A, at 08:16:32 and 08:49:43 on 22/02/2022 for server B).</div><div><br></div><div>Here are the results of the up/down test:</div><div><br></div><div><u>ServerA:</u></div><div><br></div><div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><u>Log of Qdevice from ServerA:</u></div></blockquote><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><ul><li>None</li></ul></blockquote></div><div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><u>Log of ServerB from ServerA:</u></div></blockquote><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><ul><li>21/02/2022 18:48:45 Down between 0 and 4 seconds</li><li>21/02/2022 18:58:33 Down between 0 and 4 seconds</li><li>21/02/2022 19:19:43 Down between 0 and 3 seconds</li><li><i>No trace of lost communication for server B at 08:17 and 08:50, because the up/down test scripts were not restarted after the reboot.</i><br></li></ul></div></blockquote></div><div><u>ServerB:</u><br></div><div><br></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><u>Log of Qdevice from ServerB:</u></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><ul><li>21/02/2022 08:30:26 Down between 0 and 3 seconds</li><li>21/02/2022 23:02:14 Down between 0 and 3 seconds</li></ul></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><u>Log of ServerA from ServerB:</u><br></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><ul><li>21/02/2022 18:47:38 Down between 0 and 4 seconds</li><li>21/02/2022 19:25:06 Down between 0 and 4 seconds</li><li>21/02/2022 19:42:39 Down between 0 and 4 seconds</li><li><i>No trace of lost communication for server B at 08:16 and 08:49, because the up/down test scripts were not restarted after the reboot.</i><br></li></ul></blockquote>
<div><div><u>Qdevice:</u></div></div><div><br></div><div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><u>Log of ServerA from Qdevice:</u></div></blockquote><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><ul><li>22/02/2022 07:15:57 Down between 83 and 86 seconds => (this matches the restart of the server if we add 1 hour to the time)</li><li>22/02/2022 07:48:52 Down between 82 and 85 seconds => (this matches the restart of the server if we add 1 hour to the time)</li></ul></blockquote><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><u>Log of ServerB from Qdevice:</u><br></div></blockquote><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><ul><li>21/02/2022 23:02:22 Down between 0 and 4 seconds</li><li>22/02/2022 07:15:46 Down between 55 and 58 seconds => (this matches the restart of the server if we add 1 hour to the time)</li><li>22/02/2022 07:48:58 Down between 56 and 59 seconds => (this matches the restart of the server if we add 1 hour to the time)<br><br></li></ul></div></blockquote></div><div><div style="height:216px">Strangely, the clocks of the 3 machines show the same time, yet each time the Qdevice timestamp is 1 hour behind that of ServerA or ServerB.<br><br>I don't understand why there is no trace of a lost connection between the servers before they restarted.<br><br>If ServerA and ServerB lose their connection to the Qdevice, can someone confirm whether they restart (fencing) or not?<br><br>Thanks for your help.<br><br>
On Mon, 21 Feb 2022 at 10:08, Sebastien BASTARD <<a href="mailto:sebastien@domalys.com" target="_blank">sebastien@domalys.com</a>> wrote:</div></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hello Ulrich,<div><br></div><div>I modified your script so that it can also test TCP connectivity. There is currently a firewall between servers A and B and the Qdevice that does not answer ping requests, so I tested port 5403 instead.</div><div><br>Here are the results from the weekend:<br></div><div><br></div><div>Logs of Server A:<br><div><br></div></div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><div>==> log_up_down_ServerB_from_ServerA.txt <==</div></div><div><div>---START 1645111039 (2022-02-17_15:17:19)</div></div><div><div>0 (11) -> 1 1645111050 (2022-02-17_15:17:30)</div></div><div><div>---EXIT 1645177062 (2022-02-18_09:37:42)</div></div><div><div>---START 1645199714 (2022-02-18_15:55:14)</div></div><div><div>0 (4) -> 1 1645199718 (2022-02-18_15:55:18)</div></div></blockquote><div><div><br></div></div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><div>==> log_up_down_qdevice_from_ServerA.txt <==</div></div><div><div>---START 1645117334 (2022-02-17_17:02:14)</div></div><div><div>0 (10) -> 1 1645117344 (2022-02-17_17:02:24)</div></div><div><div><b>1 (27820) -> 0 1645145164 (2022-02-18_00:46:04)</b></div></div><div><div>0 (10) -> 1 1645145174 (2022-02-18_00:46:14)</div></div><div><div>---EXIT 1645177062 (2022-02-18_09:37:42)</div></div><div><div>---START 1645199684 (2022-02-18_15:54:44)</div></div><div><div>0 (3) -> 1 1645199687 (2022-02-18_15:54:47)</div></div><div><div><b>1 (19519) -> 0 1645219206 (2022-02-18_21:20:06)</b></div></div><div><div>0 (3) -> 1 1645219209 (2022-02-18_21:20:09)</div></div><div><br></div></blockquote><div>The scripts on Server A stopped working because I forgot to launch them 
in the background. But we can see that Server A lost its connection to the Qdevice twice.</div><div><br></div><div>Logs of Server B:</div><div><br></div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div>==> log_up_down_ServerA_from_ServerB.txt <==</div><div>---START 1645110964 (2022-02-17_15:16:04)</div><div>0 (11) -> 1 1645110975 (2022-02-17_15:16:15)</div><div>---EXIT 1645199533 (2022-02-18_15:52:13)</div><div>---START 1645199576 (2022-02-18_15:52:56)</div><div>0 (4) -> 1 1645199580 (2022-02-18_15:53:00)</div></blockquote><div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><br></div><div>==> log_up_down_qdevice_from_ServerB.txt <==</div><div>---START 1645117428 (2022-02-17_17:03:48)</div><div>0 (10) -> 1 1645117438 (2022-02-17_17:03:58)</div><div>---EXIT 1645199529 (2022-02-18_15:52:09)</div><div>---START 1645199546 (2022-02-18_15:52:26)</div><div>0 (3) -> 1 1645199549 (2022-02-18_15:52:29)</div><div><b>1 (232677) -> 0 1645432226 (2022-02-21_08:30:26)</b></div><div>0 (3) -> 1 1645432229 (2022-02-21_08:30:29)</div></blockquote></div><div><div><br></div></div><div><div>The scripts on Server B stopped working because I forgot to launch them in the background. But we can see that Server B lost its connection to the Qdevice once.</div><div><br></div></div><div>Logs of the Qdevice:</div><div><br></div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div>==> log_up_down_ServerA_from_qdevice.txt <==</div><div>---START 1645363302 (2022-02-20_13:21:42)</div><div>0 (4) -> 1 1645363306 (2022-02-20_13:21:46)</div></blockquote><br><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">==> log_up_down_ServerB_from_qdevice.txt <==<br>---START 1645363310 (2022-02-20_13:21:50)<div>0 (4) -> 1 1645363314 (2022-02-20_13:21:54)</div></blockquote><div><br></div><div><div>The scripts on the Qdevice stopped working because their input was still attached to the script and, after a few minutes, the OS killed them. We can see that the Qdevice never lost its connection to the 2 servers.</div><div><br></div><div>I will keep checking the output of the scripts to see when the servers lose their connections and when they are fenced.<br></div><div><br></div><div>Best regards.</div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 18 Feb 2022 at 08:07, Ulrich Windl <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">>>> Sebastien BASTARD <<a href="mailto:sebastien@domalys.com" target="_blank">sebastien@domalys.com</a>> wrote on 17.02.2022 at 16:28 in<br>
message<br>
<<a href="mailto:CAAjZqdz9a2OorPyoSjdRFWNgJT5snOH2KehkpXdEbAuZrWOvEw@mail.gmail.com" target="_blank">CAAjZqdz9a2OorPyoSjdRFWNgJT5snOH2KehkpXdEbAuZrWOvEw@mail.gmail.com</a>>:<br>
> Thank you Ulrich for your script !<br>
> <br>
> I launched it, with 10 seconds delay :<br>
> <br>
> - on Server A, to ping Server B<br>
> - on Server B, to ping server A<br>
> - on QDevice, to ping server A and Server B<br>
> <br>
> I currently can't ping Qdevice from server A and B, because it is behind a<br>
> firewall which only authorizes port 5403.<br>
> <br>
> Tomorrow, I will see the results.<br>
<br>
Maybe another remark: The script was not designed for cluster use, so it was good enough to redirect the output of the script to a file.<br>
However, bash may buffer some lines before they are written. If the script is killed, that's not a problem, but if the node is fenced, you might lose the last line(s).<br>
So maybe you want to change the echo statement in log_time() to:<br>
echo "$@ $t ($(date -d@"$t" -u +%F_%T))" >> your_log_file<br>
<br>
Maybe you want to use a variable or parameter for that.<br>
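<br>
A minimal sketch of that change, assuming a hypothetical LOG_FILE variable and a log_time() that receives the message as its arguments (the real script may be structured differently):<br>
<pre>
```shell
#!/bin/bash
# Hypothetical variable: path of the log file, overridable from the environment.
LOG_FILE="${LOG_FILE:-./log_up_down.txt}"

# Append one line to LOG_FILE: the message, the epoch time,
# and the same instant rendered as UTC text.
log_time() {
    local t
    t=$(date +%s)
    echo "$* $t ($(date -d@"$t" -u +%F_%T))" >> "$LOG_FILE"
}

# Example call, mirroring the "---START" lines seen in the logs.
log_time "---START"
```
</pre>
Each call opens the file, appends one line and closes it again, so no line waits in the shell's output buffer. The data may still sit in the kernel's page cache when a node is fenced, so the very last line can be lost even with this change.<br>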
<br>
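Because the echo statement above passes -u to date, the text in the log is UTC. The sketch below renders one epoch value from the ServerB log both ways; the Central European time zone is an assumption (the thread does not state the hosts' time zone), written as a POSIX TZ string so that it does not depend on installed zone files:<br>
<pre>
```shell
#!/bin/bash
# Epoch value copied from the ServerB log of the Qdevice (2022-02-21_08:30:26 in the log).
t=1645432226

# Same rendering as the log: UTC.
utc=$(date -d@"$t" -u +%F_%T)

# Assumption: wall clocks are set to Central European Time (UTC+1 in winter).
local_time=$(TZ='CET-1CEST,M3.5.0,M10.5.0/3' date -d@"$t" +%F_%T)

echo "UTC:   $utc"          # prints UTC:   2022-02-21_08:30:26
echo "local: $local_time"   # prints local: 2022-02-21_09:30:26
```
</pre>
If the restart times are read from local wall clocks while the up/down logs are in UTC, a constant one-hour difference between the two would be expected in winter.<br>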
Regards,<br>
Ulrich<br>
<br>
<br>
_______________________________________________<br>
Manage your subscription:<br>
<a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>
<br>
ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer" target="_blank">https://www.clusterlabs.org/</a><br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr"><div dir="ltr"><div><div><div dir="ltr"><div><div dir="ltr"><table>
<tbody>
<tr>
<td>
<table>
<tbody>
<tr>
<td>
<br><img src="https://res.cloudinary.com/hxdnwvezo/image/asset/v1581501005/sebastien-1ca4a93f85de46095e67fba629dd919a.png" width="100" height="100">
<br>
</td>
<td style="padding:5px 20px 0px">
<span style="color:rgb(239,125,0)">Sébastien BASTARD</span>
<br>
<b>Ingénieur R&D</b> | Domalys • Créateurs d’autonomie
<br>
<br>
<font color="#4f3c71"> | phone :</font> +33 5 49 83 00 08
<br>
<font color="#4f3c71"> | site : </font>
<a href="http://www.domalys.com" style="text-decoration:none" target="_blank">www.domalys.com</a>
<br>
<font color="#4f3c71"> | email :</font>
<a href="mailto:sebastien@domalys.com" style="text-decoration:none" target="_blank">sebastien@domalys.com</a>
<br>
<font color="#4f3c71"> | address
:</font> 58 Rue du Vercors 86240 Fontaine-Le-Comte
<br>
<br>
</td>
</tr>
</tbody>
</table>
<table>
<tbody>
<tr>
<td>
<table>
<tbody>
<tr align="center">
<td>
<a href="https://www.domalys.com/" style="text-decoration:none" target="_blank">
<img src="https://res.cloudinary.com/hxdnwvezo/image/asset/app_logo-afaff0e455909cd6f414a066feecb4d4.png" width="90">
</a>
</td>
<td>
<a href="https://www.facebook.com/domalys/" style="text-decoration:none" target="_blank">
<img src="https://res.cloudinary.com/hxdnwvezo/image/upload/v1539349318/facebook_ai3qkl.png" width="50">
</a>
</td>
<td>
<a href="https://twitter.com/domalysfr" style="text-decoration:none" target="_blank">
<img src="https://res.cloudinary.com/hxdnwvezo/image/upload/v1539349318/twitter_ihhmxh.png" width="50">
</a>
</td>
<td>
<a href="https://www.youtube.com/channel/UCRLVU19hjkZ0dv29FaPJacw" style="text-decoration:none" target="_blank">
<img src="https://res.cloudinary.com/hxdnwvezo/image/upload/v1539349324/youtube_ngllux.png" width="50">
</a>
</td>
<td>
<a href="https://www.linkedin.com/company/domalys/?originalSubdomain=fr" style="text-decoration:none" target="_blank">
<img src="https://res.cloudinary.com/hxdnwvezo/image/upload/v1539349318/linkedin_l9whfl.png" width="50">
</a>
</td>
<td>
<a href="https://youtu.be/77t5rETTwQs" style="text-decoration:none" target="_blank">
<img src="https://res.cloudinary.com/hxdnwvezo/image/upload/v1539349542/team_pztc1j.png" width="50">
</a>
</td>
<td>
<a href="https://www.ces.tech" style="text-decoration:none" target="_blank">
<img src="https://res.cloudinary.com/hxdnwvezo/image/asset/v1542279889/ces_icon-cbefc04feb1bb0064f5e0c2e80d2fe45.png" width="55">
</a>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td><td style="padding:5px 20px 0px"><br></td></tr></tbody></table><a href="https://www.ces.tech" style="text-decoration:none" target="_blank">
</a></div></div></div></div></div><div><img src="https://docs.google.com/uc?export=download&id=16v-5uIvzUX7FG9anOADm0utq96zDMs8w&revid=0B5aDicP2dRSsa2xHUTdBNTI3WTNRaDF6YmZkcW5xcEw2bzkwPQ"><br></div></div></div>
</blockquote></div>
</div>