<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p
{mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
p.m-38161929789188403msipheadera43f35f7, li.m-38161929789188403msipheadera43f35f7, div.m-38161929789188403msipheadera43f35f7
{mso-style-name:m_-38161929789188403msipheadera43f35f7;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
p.m-38161929789188403m-2632291695807224497msipheadera43f35f7, li.m-38161929789188403m-2632291695807224497msipheadera43f35f7, div.m-38161929789188403m-2632291695807224497msipheadera43f35f7
{mso-style-name:m_-38161929789188403m-2632291695807224497msipheadera43f35f7;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
p.m-38161929789188403m-2632291695807224497m-8500734528484967090msipheadera43f35f7, li.m-38161929789188403m-2632291695807224497m-8500734528484967090msipheadera43f35f7, div.m-38161929789188403m-2632291695807224497m-8500734528484967090msipheadera43f35f7
{mso-style-name:m_-38161929789188403m-2632291695807224497m-8500734528484967090msipheadera43f35f7;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
p.m-38161929789188403m-2632291695807224497m-8500734528484967090msipfooterfa6f9f96, li.m-38161929789188403m-2632291695807224497m-8500734528484967090msipfooterfa6f9f96, div.m-38161929789188403m-2632291695807224497m-8500734528484967090msipfooterfa6f9f96
{mso-style-name:m_-38161929789188403m-2632291695807224497m-8500734528484967090msipfooterfa6f9f96;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
p.m-38161929789188403m-2632291695807224497msipfooterfa6f9f96, li.m-38161929789188403m-2632291695807224497msipfooterfa6f9f96, div.m-38161929789188403m-2632291695807224497msipfooterfa6f9f96
{mso-style-name:m_-38161929789188403m-2632291695807224497msipfooterfa6f9f96;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
p.m-38161929789188403msipfooterfa6f9f96, li.m-38161929789188403msipfooterfa6f9f96, div.m-38161929789188403msipfooterfa6f9f96
{mso-style-name:m_-38161929789188403msipfooterfa6f9f96;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
span.EmailStyle25
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:#1F497D;}
p.msipheadera43f35f7, li.msipheadera43f35f7, div.msipheadera43f35f7
{mso-style-name:msipheadera43f35f7;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
p.msipfooterfa6f9f96, li.msipfooterfa6f9f96, div.msipfooterfa6f9f96
{mso-style-name:msipfooterfa6f9f96;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 70.85pt 70.85pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=FR link=blue vlink=purple><div class=WordSection1><p class=msipheadera43f35f7 style='margin:0cm;margin-bottom:.0001pt'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>Classified as: {OPEN}</span><o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'>Another strange thing.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'>On RHEL 7, corosync is restarted even though the “Restart=on-failure” line is commented out.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'>I also think that something changed in pacemaker's behavior, or somewhere else.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>De :</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Klaus Wenninger &lt;kwenning@redhat.com&gt; <br><b>Envoyé :</b> lundi 22 avril 2024 12:41<br><b>À :</b> NOLIBOS Christophe &lt;christophe.nolibos@thalesgroup.com&gt;<br><b>Cc :</b> Cluster Labs - All topics related to open-source clustering welcomed &lt;users@clusterlabs.org&gt;<br><b>Objet :</b> Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal><o:p> </o:p></p></div><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>On Mon, Apr 22, 2024 at 12:32 PM NOLIBOS Christophe <<a href="mailto:christophe.nolibos@thalesgroup.com">christophe.nolibos@thalesgroup.com</a>> 
wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><div><p class=m-38161929789188403msipheadera43f35f7 style='margin:0cm;margin-bottom:.0001pt'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>Classified as: {OPEN}</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>You are right: the “Restart=on-failure” line is commented out and is therefore disabled by default.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Uncommenting it resolves my issue.</span><o:p></o:p></p></div></div></div></blockquote><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Maybe pacemaker changed its behavior here without syncing enough with corosync's behavior.<o:p></o:p></p></div><div><p class=MsoNormal>We'll look into that to see which approach is better: restart corosync on failure, or have<o:p></o:p></p></div><div><p class=MsoNormal>pacemaker be restarted by systemd, which should in turn restart corosync as well.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Klaus <o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal 
style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Thanks a lot.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Christophe.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>De :</span></b><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Klaus Wenninger <</span><a href="mailto:kwenning@redhat.com" target="_blank"><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>kwenning@redhat.com</span></a><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>> <br><b>Envoyé :</b> lundi 22 avril 2024 11:06<br><b>À :</b> NOLIBOS Christophe <christophe</span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>.</span><a href="mailto:nolibos@thalesgroup.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>nolibos@thalesgroup.com</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Cc :</b> Cluster Labs - All topics related to open-source clustering welcomed <</span><a href="mailto:users@clusterlabs.org" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>users@clusterlabs.org</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Objet :</b> Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><div><div><p 
class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>On Mon, Apr 22, 2024 at 9:51 AM NOLIBOS Christophe <<a href="mailto:christophe.nolibos@thalesgroup.com" target="_blank">christophe.nolibos@thalesgroup.com</a>> wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><div><p class=m-38161929789188403m-2632291695807224497msipheadera43f35f7 style='margin:0cm;margin-bottom:.0001pt'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>Classified as: {OPEN}</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>With the ‘kill -9’ command.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Does that count as a graceful exit?</span><o:p></o:p></p></div></div></div></blockquote><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>It looks as if the corosync unit file has Restart=on-failure disabled by default.<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>I'm not aware of another mechanism that would restart corosync and I<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>think the default behavior is not to 
restart.<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Comments suggest just to enable if using watchdog but that might just<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>reference the RestartSec to provoke a watchdog-reboot instead of a<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>restart via systemd.<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Any signal that isn't handled by the process - so that the exit-code could<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>be set to 0 - should be fine.<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Klaus<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>De :</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Klaus Wenninger <</span><a href="mailto:kwenning@redhat.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>kwenning@redhat.com</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>> <br><b>Envoyé :</b> jeudi 18 avril 2024 
20:17<br><b>À :</b> NOLIBOS Christophe <</span><a href="mailto:christophe.nolibos@thalesgroup.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>christophe.nolibos@thalesgroup.com</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Cc :</b> Cluster Labs - All topics related to open-source clustering welcomed <</span><a href="mailto:users@clusterlabs.org" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>users@clusterlabs.org</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Objet :</b> Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;margin-bottom:12.0pt'> <o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>NOLIBOS Christophe <<a href="mailto:christophe.nolibos@thalesgroup.com" target="_blank">christophe.nolibos@thalesgroup.com</a>> schrieb am Do., 18. Apr. 
2024, 19:01:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><p class=m-38161929789188403m-2632291695807224497m-8500734528484967090msipheadera43f35f7 style='margin:0cm;margin-bottom:.0001pt'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>Classified as: {OPEN}</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Hmm… my RHEL 8.8 OS has been hardened.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>I wonder whether the problem comes from that.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>On the other hand, I get the same issue (i.e. 
corosync not restarted by systemd) with Pacemaker 2.1.5-8 deployed on RHEL 8.4 (not hardened).</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>I’m checking.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p></div></div></blockquote></div></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>How did you kill corosync? If it exits gracefully, it might not be restarted. Check the journal. Sorry, I can't try it myself; I'm on my mobile at the moment. Klaus<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=m-38161929789188403m-2632291695807224497m-8500734528484967090msipfooterfa6f9f96 align=center style='margin:0cm;margin-bottom:.0001pt;text-align:center'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>{OPEN}</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=m-38161929789188403m-2632291695807224497msipfooterfa6f9f96 align=center style='margin:0cm;margin-bottom:.0001pt;text-align:center'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>{OPEN}</span><o:p></o:p></p><p class=MsoNormal 
style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=m-38161929789188403msipfooterfa6f9f96 align=center style='margin:0cm;margin-bottom:.0001pt;text-align:center'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>{OPEN}</span><o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=msipfooterfa6f9f96 align=center style='margin:0cm;margin-bottom:.0001pt;text-align:center'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>{OPEN}</span><o:p></o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>De :</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Users <</span><a href="mailto:users-bounces@clusterlabs.org" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>users-bounces@clusterlabs.org</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>> <b>De la part de</b> NOLIBOS Christophe via Users<br><b>Envoyé :</b> jeudi 18 avril 2024 18:34<br><b>À :</b> Klaus Wenninger <</span><a href="mailto:kwenning@redhat.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>kwenning@redhat.com</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>>; Cluster Labs - All topics related to open-source clustering welcomed <</span><a href="mailto:users@clusterlabs.org" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>users@clusterlabs.org</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Cc :</b> NOLIBOS Christophe <</span><a href="mailto:christophe.nolibos@thalesgroup.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>christophe.nolibos@thalesgroup.com</span></a><span 
style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Objet :</b> Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix</span><o:p></o:p></p></div></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=m-38161929789188403m-2632291695807224497m-8500734528484967090msipheadera43f35f7 style='margin:0cm;margin-bottom:.0001pt'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>Classified as: {OPEN}</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>So, the issue is on systemd?</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>If I run the same test on RHEL 7 (3.10.0-693.11.1.el7) with pacemaker 1.1.13-10, corosync is correctly restarted by systemd.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>[RHEL7 ~]# journalctl -f</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>-- Logs begin at Wed 2024-01-03 13:15:41 UTC. 
--</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - systemd[1]: corosync.service failed.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - systemd[1]: pacemaker.service holdoff time over, scheduling restart.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - systemd[1]: Starting Corosync Cluster Engine...</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - corosync[12179]: Starting Corosync Cluster Engine (corosync): [ OK ]</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - systemd[1]: Started Corosync Cluster Engine.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - systemd[1]: Started Pacemaker High Availability Cluster Manager.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - systemd[1]: Starting Pacemaker High Availability Cluster Manager...</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US 
style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - pacemakerd[12192]: notice: Additional logging available in /var/log/pacemaker.log</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - pacemakerd[12192]: notice: Switching to /var/log/cluster/corosync.log</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 16:26:55 - pacemakerd[12192]: notice: Additional logging available in /var/log/cluster/corosync.log</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>De :</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Klaus Wenninger <</span><a href="mailto:kwenning@redhat.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>kwenning@redhat.com</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>> <br><b>Envoyé :</b> jeudi 18 avril 2024 18:12<br><b>À :</b> NOLIBOS Christophe <</span><a href="mailto:christophe.nolibos@thalesgroup.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>christophe.nolibos@thalesgroup.com</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>>; Cluster Labs - All topics related to open-source clustering welcomed <</span><a href="mailto:users@clusterlabs.org" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>users@clusterlabs.org</span></a><span 
style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Objet :</b> Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:57.75pt'>On Thu, Apr 18, 2024 at 6:09 PM Klaus Wenninger <<a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>> wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:80.85pt'>On Thu, Apr 18, 2024 at 6:06 PM NOLIBOS Christophe <<a href="mailto:christophe.nolibos@thalesgroup.com" target="_blank">christophe.nolibos@thalesgroup.com</a>> wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><div><p style='margin:0cm;margin-bottom:.0001pt'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>Classified as: {OPEN}</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Well… 
why do you say that “</span><span lang=EN-US>Well if corosync isn't there that this is to be expected and pacemaker won't recover corosync.”?</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>In my mind, Corosync is managed by Pacemaker like any other cluster resource, and the "pacemakerd: recover properly from Corosync crash" fix implemented in version 2.1.2 seems to confirm that.</span><o:p></o:p></p></div></div></div></blockquote><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Nope. Startup of the stack is done by systemd. Pacemaker is just started after corosync is up, and<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>systemd should be responsible for keeping the stack up.<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>For completeness: if you have sbd in the mix, that is started by systemd as well, but kind of in<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>parallel with corosync, as part of it (systemd terminology).<o:p></o:p></p></div></div></div></blockquote><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>The "recover" above refers to pacemaker recovering from corosync going away and coming back.<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 
6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Klaus <o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p align=center style='margin:0cm;margin-bottom:.0001pt;text-align:center'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>{OPEN}</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=m-38161929789188403m-2632291695807224497m-8500734528484967090msipfooterfa6f9f96 align=center style='margin:0cm;margin-bottom:.0001pt;text-align:center'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>{OPEN}</span><o:p></o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>De :</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> NOLIBOS Christophe <br><b>Envoyé :</b> jeudi 18 avril 2024 17:56<br><b>À :</b> 'Klaus Wenninger' <</span><a href="mailto:kwenning@redhat.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>kwenning@redhat.com</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>>; Cluster Labs - All topics related to open-source clustering 
welcomed <</span><a href="mailto:users@clusterlabs.org" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>users@clusterlabs.org</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Cc:</b> Ken Gaillot <</span><a href="mailto:kgaillot@redhat.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>kgaillot@redhat.com</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Subject:</b> RE: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix</span><o:p></o:p></p></div></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p style='margin:0cm;margin-bottom:.0001pt'><span style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>Classified as: {OPEN}</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>[~]$ systemctl status corosync</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:161.7pt'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>● corosync.service - Corosync Cluster Engine</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> Loaded: loaded (/usr/lib/systemd/system/corosync.service; enabled; vendor preset: disabled)</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US 
style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> Active: failed (Result: signal) since Thu 2024-04-18 14:58:42 UTC; 53min ago</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> Docs: man:corosync</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> man:corosync.conf</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> man:corosync_overview</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> Process: 2027251 ExecStop=/usr/sbin/corosync-cfgtool -H --force (code=exited, status=0/SUCCESS)</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> Process: 1324906 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=killed, signal=KILL)</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Main PID: 1324906 (code=killed, signal=KILL)</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 13:16:04 - corosync[1324906]: 
[QUORUM] Sync joined[1]: 1</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 13:16:04 - corosync[1324906]: [TOTEM ] A new membership (1.1c8) was formed. Members joined: 1</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 13:16:04 - corosync[1324906]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 13:16:04 - corosync[1324906]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 13:16:04 - corosync[1324906]: [VOTEQ ] Waiting for all cluster members. 
Current votes: 1 expected_votes: 2</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 13:16:04 - corosync[1324906]: [QUORUM] Members[1]: 1</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 13:16:04 - corosync[1324906]: [MAIN ] Completed service synchronization, ready to provide service.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 13:16:04 - systemd[1]: Started Corosync Cluster Engine.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 14:58:42 - systemd[1]: corosync.service: Main process exited, code=killed, status=9/KILL</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Apr 18 14:58:42 - systemd[1]: corosync.service: Failed with result 'signal'.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>[~]$</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal 
style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>From:</span></b><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Klaus Wenninger <</span><a href="mailto:kwenning@redhat.com" target="_blank"><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>kwenning@redhat.com</span></a><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>> <br><b>Sent:</b> Thursday, April 18, 2024 17:43<br><b>To:</b> Cluster Labs - All topics related to open-source clustering welcomed <</span><a href="mailto:users@clusterlabs.org" target="_blank"><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>users@clusterlabs.org</span></a><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Cc:</b> Ken Gaillot <</span><a href="mailto:kgaillot@redhat.com" target="_blank"><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>kgaillot@redhat.com</span></a><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>>; NOLIBOS Christophe <</span><a href="mailto:christophe.nolibos@thalesgroup.com" target="_blank"><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>christophe.nolibos@thalesgroup.com</span></a><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>><br><b>Subject:</b> Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US> </span><o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US> </span><o:p></o:p></p></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US> </span><o:p></o:p></p><div><div><p class=MsoNormal 
style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:184.8pt'>On Thu, Apr 18, 2024 at 5:07 PM NOLIBOS Christophe via Users <<a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a>> wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><p class=MsoNormal style='mso-margin-top-alt:auto;margin-bottom:12.0pt'>Classified as: {OPEN}<br><br>I'm using RedHat 8.8 (4.18.0-477.21.1.el8_8.x86_64).<br>When I kill Corosync, no new corosync process is created and pacemaker ends up in a failed state.<br>The only way to recover is to restart the pacemaker service.<br><br>[~]$ pcs status<br>Error: unable to get cib<br>[~]$<br><br>[~]$ systemctl status pacemaker<br>● pacemaker.service - Pacemaker High Availability Cluster Manager<br> Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)<br> Active: active (running) since Thu 2024-04-18 13:16:04 UTC; 1h 43min ago<br> Docs: man:pacemakerd<br> <a href="https://clusterlabs.org/pacemaker/doc/" target="_blank">https://clusterlabs.org/pacemaker/doc/</a><br> Main PID: 1324923 (pacemakerd)<br> Tasks: 91<br> Memory: 132.1M<br> CGroup: /system.slice/pacemaker.service<br>...<br>Apr 18 14:59:02 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY<br>Apr 18 14:59:03 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY<br>Apr 18 14:59:04 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY<br>Apr 18 14:59:05 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY<br>Apr 18 14:59:06 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY<br>Apr 18 14:59:07 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY<br>Apr 18 14:59:08 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: 
CS_ERR_LIBRARY<br>Apr 18 14:59:09 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY<br>Apr 18 14:59:10 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY<br>Apr 18 14:59:11 - pacemakerd[1324923]: crit: Could not connect to Corosync CFG: CS_ERR_LIBRARY<br>[~]$<o:p></o:p></p></blockquote><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Well, if corosync isn't there then this is to be expected, and pacemaker won't recover corosync.<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Can you check what systemd thinks about corosync (status/journal)? <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Klaus<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><br>{OPEN}<br><br>-----Original Message-----<br>From: Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>> <br>Sent: Thursday, April 18, 2024 16:40<br>To: Cluster Labs - All topics related to open-source clustering welcomed <<a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a>><br>Cc: NOLIBOS Christophe <<a href="mailto:christophe.nolibos@thalesgroup.com" target="_blank">christophe.nolibos@thalesgroup.com</a>><br>Subject: Re: [ClusterLabs] "pacemakerd: recover properly from Corosync crash" fix<br><br>What OS are you using? 
Does it use systemd?<br><br>What happens when you kill Corosync?<br><br>On Thu, 2024-04-18 at 13:13 +0000, NOLIBOS Christophe via Users wrote:<br>> Classified as: {OPEN}<br>> <br>> Dear All,<br>> <br>> I have a question about the "pacemakerd: recover properly from <br>> Corosync crash" fix implemented in version 2.1.2.<br>> I observed the issue when testing pacemaker version 2.0.5, just <br>> by killing the ‘corosync’ process: Corosync was not recovered.<br>> <br>> I am now using pacemaker version 2.1.5-8.<br>> Doing the same test, I get the same result: Corosync is still not <br>> recovered.<br>> <br>> Please confirm whether the "pacemakerd: recover properly from Corosync crash"<br>> fix implemented in version 2.1.2 covers this scenario.<br>> If it does, did I miss something in the configuration of my cluster?<br>> <br>> Best regards,<br>> <br>> Christophe.<br>> <br>> <br>> <br>> {OPEN}<br>> _______________________________________________<br>> Manage your subscription:<br>> <a href="https://lists.clusterlabs.org/mailman/listinfo/users" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>> <br>> ClusterLabs home: <a href="https://www.clusterlabs.org/" target="_blank">https://www.clusterlabs.org/</a><br>--<br>Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>><br>_______________________________________________<br>Manage your subscription:<br><a href="https://lists.clusterlabs.org/mailman/listinfo/users" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br><br>ClusterLabs home: <a href="https://www.clusterlabs.org/" target="_blank">https://www.clusterlabs.org/</a><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p align=center style='margin:0cm;margin-bottom:.0001pt;text-align:center'><span 
style='font-size:10.0pt;font-family:"Calibri",sans-serif;color:black'>{OPEN}</span><o:p></o:p></p></blockquote></div></div></div></div></div></blockquote></div></div></blockquote></div></div></div></div></blockquote></div></div></div></div></div></div></blockquote></div></div></div></div></div></blockquote></div></div></div></body></html>