<div dir="ltr"><div>Thank you very much, Alexey. I will certainly try that and update you on the result. <br></div><div><br></div><div>Best regards!</div><div><br></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Mon, 12 May 2025 at 22:36, &lt;<a href="mailto:alexey@pavlyuts.ru">alexey@pavlyuts.ru</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg7756243052031874241"><div lang="RU" style="overflow-wrap: break-word;"><div class="m_7756243052031874241WordSection1"><p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Hi,<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">As it happens, I use Pacemaker as the base layer of a custom clustering solution, and I have a script to rebuild the second node from the first one. I can’t share the script itself, as it has a lot of solution-dependent references, but I can share the sequence to rebuild the failed node:<u></u><u></u></span></p><ol style="margin-top:0cm" start="1" type="1"><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Set up the new node with the same IP and hostname<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">(Optional) Set up passwordless mutual key-based SSH access. 
It is not necessary, but it makes a lot of things easier.<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Copy these files from the surviving host to the new one:<u></u><u></u></span></li><ol style="margin-top:0cm" start="1" type="a"><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm;line-height:14.25pt;background:white"><span style="font-size:10.5pt;font-family:"Courier New";color:black">/etc/corosync/authkey</span><span style="font-size:10.5pt;font-family:"Courier New""><u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm;line-height:14.25pt;background:white"><span style="font-size:10.5pt;font-family:"Courier New";color:black">/etc/corosync/corosync.conf</span><span style="font-size:10.5pt;font-family:"Courier New""><u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm;line-height:14.25pt;background:white"><span lang="EN-US" style="font-size:10.5pt;font-family:Consolas;color:black">/</span><span style="font-size:10.5pt;font-family:"Courier New";color:black">etc/drbd.d/*.res</span><span lang="EN-US" style="font-size:10.5pt;font-family:Consolas"><u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm;line-height:14.25pt;background:white"><span style="font-size:10.5pt;font-family:"Courier New";color:black">/etc/pacemaker/authkey</span><span style="font-size:10.5pt;font-family:"Courier New""><u></u><u></u></span></li></ol><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Set the <b>hacluster</b> user's password to the same value as on the surviving node.<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" 
style="font-size:11pt;font-family:"Calibri",sans-serif">Re-authenticate the pcs nodes with the command<br></span><span style="font-size:10.5pt;font-family:"Courier New"">pcs host auth &lt;host1_name&gt; &lt;host2_name&gt; -u hacluster -p &lt;ha_cluster_pass&gt;<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Reboot the restored server<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">PROFIT!!!<u></u><u></u></span></li></ol><p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">If you do not use an arbiter (corosync-qnetd), this should be enough to get your new cluster node up and running. If you use corosync-qnetd, you also need to restore the corosync-qdevice nssdb keys so that the second host can connect to the arbiter node:<u></u><u></u></span></p><ol style="margin-top:0cm" start="1" type="1"><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">On the surviving host, extract your arbiter certificate from the nssdb:<br></span><span style="font-size:10.5pt;font-family:"Courier New"">certutil -L -d /etc/corosync/qdevice/net/nssdb -n 'QNet CA' -r &gt; /root/qnetd-cert.crt<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Copy the certificate to the new host (assuming the same path on the new host)<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">On the new host, initialize a new nssdb 
with the certificate:<br></span><span style="font-size:10.5pt;font-family:"Courier New"">corosync-qdevice-net-certutil -i -c /root/qnetd-cert.crt<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Copy the certificate and key file at </span><span style="font-size:10.5pt;font-family:"Courier New"">/etc/corosync/qdevice/net/nssdb/qdevice-net-node.p12</span><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"> from the old node to the new one<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">On the new node, import the certificate and key:<br></span><span style="font-size:10.5pt;font-family:"Courier New"">corosync-qdevice-net-certutil -m -c /etc/corosync/qdevice/net/nssdb/qdevice-net-node.p12</span><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"><u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Enable or restart corosync-qdevice:<br></span><span lang="EN-US" style="font-size:10.5pt;font-family:"Courier New"">systemctl enable --now corosync-qdevice.service<br></span><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">or<br></span><span lang="EN-US" style="font-size:10.5pt;font-family:"Courier New"">systemctl restart corosync-qdevice.service<u></u><u></u></span></li><li class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Enjoy!<u></u><u></u></span></li></ol><p class="m_7756243052031874241MsoListParagraph"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"><u></u> <u></u></span></p><p class="m_7756243052031874241MsoListParagraph" 
style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">That’s what works for me in practice, and it is included in the service scripts of our Pacemaker-based product.<u></u><u></u></span></p><p class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"><u></u> <u></u></span></p><p class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Hope this helps!<u></u><u></u></span></p><p class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"><u></u> <u></u></span></p><p class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Sincerely,<u></u><u></u></span></p><p class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"><u></u> <u></u></span></p><p class="m_7756243052031874241MsoListParagraph" style="margin-left:0cm"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif">Alex<u></u><u></u></span></p><p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"><u></u> <u></u></span></p><p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:"Calibri",sans-serif"><u></u> <u></u></span></p><div style="border-width:medium medium medium 1.5pt;border-style:none none none solid;border-color:currentcolor currentcolor currentcolor blue;padding:0cm 0cm 0cm 4pt"><div><div style="border-width:1pt medium medium;border-style:solid none none;border-color:rgb(225,225,225) currentcolor currentcolor;padding:3pt 0cm 0cm"><p class="MsoNormal"><b><span style="font-size:11pt;font-family:"Calibri",sans-serif">From:</span></b><span 
style="font-size:11pt;font-family:"Calibri",sans-serif"> Users &lt;<a href="mailto:users-bounces@clusterlabs.org" target="_blank">users-bounces@clusterlabs.org</a>&gt; <b>On Behalf Of </b>Fabrizio Ermini<br><b>Sent:</b> Friday, May 9, 2025 5:26 PM<br><b>To:</b> <a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a><br><b>Subject:</b> [ClusterLabs] Rebuild of failed node<u></u><u></u></span></p></div></div><p class="MsoNormal"><u></u> <u></u></p><div><div><p class="MsoNormal">Hi all! Freshman here, just joined. <u></u><u></u></p></div><div><p class="MsoNormal"><u></u> <u></u></p></div><div><p class="MsoNormal">I currently need to rebuild a failed node on a Pacemaker 2.1 / Corosync 3.1 two-node cluster with DRBD storage. <u></u><u></u></p></div><div><p class="MsoNormal">I've searched the Pacemaker docs and the list archives, but I haven't found a clear guide on how to proceed with this task. So far, I've installed a new server, configured the same IP and hostname as the failed one, and installed all the software. I've also fixed the DRBD layer and started the resync of the volumes. But it's not clear to me how to proceed: I've found some hints online pointing to the need to manually copy the corosync config, but they were quite old and probably obsolete. I'm using pcs as a shell and I haven't found a command designed to replace a node, only commands to add or remove one. 
<u></u><u></u></p></div><div><p class="MsoNormal">It seems really strange to me that there isn't a guide, since this should be a very basic operation and it's quite important to know how to do it - HW breaks, as a matter of fact :D<u></u><u></u></p></div><div><p class="MsoNormal">So I'll be very grateful if anyone can point me in the right direction.<u></u><u></u></p></div><div><p class="MsoNormal">Thanks in advance, and best regards<u></u><u></u></p></div><div><p class="MsoNormal"><u></u> <u></u></p></div><div><p class="MsoNormal">Fabrizio<u></u><u></u></p></div><div><p class="MsoNormal"><u></u> <u></u></p></div></div></div></div></div>_______________________________________________<br>
</div></blockquote></div>
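<div dir="ltr">The rebuild sequence quoted above could be sketched as a dry-run shell script run from the surviving node. This is only a minimal sketch under assumptions, not Alexey's actual script: the run helper, the DRY_RUN and USE_QDEVICE switches, the OLD_NODE, NEW_NODE and HA_PASS variables with their placeholder defaults, and the use of root SSH/scp to reach the new node are all made up for illustration. Only the file paths and the pcs, certutil, corosync-qdevice-net-certutil and systemctl commands come from the message.</div>

```shell
#!/usr/bin/env bash
# Dry-run sketch of the node rebuild sequence, meant to run on the surviving node.
# Assumptions (not from the original message): DRY_RUN, USE_QDEVICE, OLD_NODE,
# NEW_NODE, HA_PASS, the run() helper, and root SSH access to the new node.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"            # 1 = only print the commands (default)
OLD_NODE="${OLD_NODE:-node1}"      # hypothetical name of the surviving node
NEW_NODE="${NEW_NODE:-node2}"      # hypothetical name of the rebuilt node
HA_PASS="${HA_PASS:-CHANGE_ME}"    # placeholder for the hacluster password
NSSDB=/etc/corosync/qdevice/net/nssdb

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Copy the cluster files listed in the message to the new node.
copy_cluster_files() {
  local f
  for f in /etc/corosync/authkey /etc/corosync/corosync.conf /etc/pacemaker/authkey; do
    run scp -p "$f" "root@${NEW_NODE}:${f}"
  done
  run scp -p /etc/drbd.d/*.res "root@${NEW_NODE}:/etc/drbd.d/"
}

# Re-authenticate both nodes in pcs (the hacluster password must match on both).
reauth_pcs() {
  run pcs host auth "$OLD_NODE" "$NEW_NODE" -u hacluster -p "$HA_PASS"
}

# Only needed with a corosync-qnetd arbiter: restore the qdevice nssdb keys.
restore_qdevice() {
  run sh -c "certutil -L -d $NSSDB -n 'QNet CA' -r > /root/qnetd-cert.crt"
  run scp -p /root/qnetd-cert.crt "root@${NEW_NODE}:/root/qnetd-cert.crt"
  run ssh "root@${NEW_NODE}" corosync-qdevice-net-certutil -i -c /root/qnetd-cert.crt
  run scp -p "$NSSDB/qdevice-net-node.p12" "root@${NEW_NODE}:$NSSDB/qdevice-net-node.p12"
  run ssh "root@${NEW_NODE}" corosync-qdevice-net-certutil -m -c "$NSSDB/qdevice-net-node.p12"
  run ssh "root@${NEW_NODE}" systemctl enable --now corosync-qdevice.service
}

# Steps left manual here: set the hacluster password on the new node first,
# and reboot the new node once everything is in place.
copy_cluster_files
reauth_pcs
if [ "${USE_QDEVICE:-0}" = "1" ]; then
  restore_qdevice
fi
```

<div dir="ltr">With the defaults the script only prints the commands it would run, so the plan can be reviewed first; setting DRY_RUN=0 would execute them, and USE_QDEVICE=1 adds the arbiter steps.</div>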