<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
</head>
<body>
<div dir="ltr">
<div dir="ltr">Hi Ken,</div>
<div dir="ltr"><br>
</div>
<div dir="ltr"><span>Thank you for the quick response.</span></div>
<div dir="ltr"><span><span><br>
</span></span></div>
<div dir="ltr"><span><span>We checked the Pacemaker logs and found signal 15 (SIGTERM) reported for the pacemaker component. After that we ran pcs cluster start; the pacemaker and corosync services then started properly and the node rejoined the cluster.<span></span></span></span></div>
<div dir="ltr"><span><span><br>
</span></span></div>
<div dir="ltr"><span><span>Regarding the reboot question: in our application's Pacemaker cluster, neither quorum nor fencing is configured. Below is the reboot procedure from our upgrade <span>procedure, which is executed in parallel on all nodes of the 9-node cluster.
Is this the recommended way to reboot?</span></span></span></div>
<div dir="ltr"><span><span><span><br>
</span></span></span></div>
<div dir="ltr">
<ol data-editing-info='{"orderedStyleType":3,"unorderedStyleType":1}' style="margin-top: 0px; margin-bottom: 0px;" data-listchain="__List_Chain_81">
<li style='list-style-type: "1) ";'><span>Put the Pacemaker cluster in maintenance mode.</span></li>
<li style='list-style-type: "2) ";'><span>Bring down the Pacemaker cluster services using the commands below.</span>
<div># pcs cluster stop</div>
<div># pcs <span>cluster disable</span></div>
</li>
<li style='list-style-type: "3) ";'><span>Reboot.</span></li>
<li style='list-style-type: "4) ";'><span>Bring up the <span>Pacemaker cluster services</span>.</span></li>
</ol>
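<div dir="ltr"><span>For reference, the four steps above could be sketched per node roughly as follows. This is only an illustrative sketch, not our exact upgrade script: the commands for steps 1 and 4 (pcs property set maintenance-mode=..., pcs cluster enable, pcs cluster start) are assumptions about how those steps would be carried out, and the run, pre_reboot and post_reboot helpers and the DRY_RUN variable are hypothetical names. By default the sketch only prints the commands.</span></div>
<pre>
```shell
#!/bin/sh
# Sketch of the per-node reboot steps. DRY_RUN=1 (the default) only prints
# the commands; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

pre_reboot() {
    run pcs property set maintenance-mode=true   # step 1 (assumed command)
    run pcs cluster stop                         # step 2, as in the procedure
    run pcs cluster disable                      # step 2, as in the procedure
    run reboot                                   # step 3
}

post_reboot() {
    run pcs cluster enable                       # assumed counterpart of "disable"
    run pcs cluster start                        # step 4
    run pcs property set maintenance-mode=false  # assumed: leave maintenance mode
}

# Usage: script.sh pre   (before the reboot)
#        script.sh post  (after the node is back up)
case "${1:-}" in
    pre)  pre_reboot ;;
    post) post_reboot ;;
esac
```
</pre>
<div dir="ltr"><span>Running the sketch with the argument pre prints the four pre-reboot commands in order; running it with post prints the commands that bring the services back.</span></div>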
<div dir="ltr"><span><span><br>
</span></span></div>
<div dir="ltr"><br>
</div>
</div>
<div id="ms-outlook-mobile-signature" dir="ltr">Regards,</div>
<div id="ms-outlook-mobile-signature" dir="ltr">S Sathish S</div>
<div id="mail-editor-reference-message-container" class="ms-outlook-mobile-reference-message">
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif"><b>From:</b> Ken Gaillot &lt;kgaillot@redhat.com&gt;<br>
<b>Sent:</b> Tuesday, July 16, 2024 7:53 PM<br>
<b>To:</b> Cluster Labs - All topics related to open-source clustering welcomed &lt;users@clusterlabs.org&gt;<br>
<b>Cc:</b> S Sathish S &lt;s.s.sathish@ericsson.com&gt;<br>
<b>Subject:</b> Re: [ClusterLabs] 9 nodes pacemaker cluster setup non-DC nodes reboot parallelly
<div> </div>
</font></div>
<meta name="Generator" content="Microsoft Exchange Server">
<!-- converted from text --><font size="2"><span style="font-size:11pt;">
<div class="PlainText">On Tue, 2024-07-16 at 11:18 +0000, S Sathish S via Users wrote:<br>
> Hi Team,<br>
> <br>
> In our product we have a 9-node Pacemaker cluster setup in which the<br>
> non-DC nodes are rebooted in parallel. Most of the nodes join the<br>
> cluster properly, but on one node the pacemaker and corosync services<br>
> did not come up properly and gave the error message below.<br>
> <br>
> Error Message:<br>
> Error: error running crm_mon, is pacemaker running?<br>
> crm_mon: Connection to cluster failed: Connection refused<br>
<br>
All that indicates is that Pacemaker is not responding. You'd have to<br>
look at the system log and/or pacemaker.log from that time to find out<br>
more.<br>
<br>
> <br>
> Query: Is it recommended to reboot the non-DC nodes in parallel?<br>
<br>
As long as they are cleanly rebooted, there should be no fencing or<br>
other actual problems. However, the cluster will lose quorum and have to<br>
stop all resources. If you reboot fewer than half of the nodes at one<br>
time and wait for them to rejoin before rebooting more, you would avoid<br>
that.<br>
<br>
> <br>
> Thanks and Regards,<br>
> S Sathish S<br>
> _______________________________________________<br>
> Manage your subscription:<br>
> <a href="https://lists.clusterlabs.org/mailman/listinfo/users">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>
> <br>
> ClusterLabs home: <a href="https://www.clusterlabs.org/">https://www.clusterlabs.org/</a><br>
-- <br>
Ken Gaillot &lt;kgaillot@redhat.com&gt;<br>
<br>
</div>
</span></font></div>
</div>
</body>
</html>