<div dir="ltr">That appears to support IPMI, so fence_ipmilan is likely an option. Further, it probably has a watchdog device. If so, then sbd is an option.<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jul 30, 2020 at 2:00 AM Gabriele Bulfon <<a href="mailto:gbulfon@sonicle.com">gbulfon@sonicle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="font-family:Tahoma;font-size:14px;color:rgb(0,0,0)">It is this system:<br><br><a href="https://www.supermicro.com/products/system/1u/1029/SYS-1029TP-DC0R.cfm" target="_blank">https://www.supermicro.com/products/system/1u/1029/SYS-1029TP-DC0R.cfm</a></div>

<div style="font-family:Tahoma;font-size:14px;color:rgb(0,0,0)"> </div>

<div style="font-family:Tahoma;font-size:14px;color:rgb(0,0,0)">It has a SAS3 backplane with hot-swap SAS disks that are visible to both nodes at the same time.</div>

<div style="font-family:Tahoma;font-size:14px;color:rgb(0,0,0)"> </div>

<div style="font-family:Tahoma;font-size:14px;color:rgb(0,0,0)">Gabriele <br><br>

<div id="gmail-m_5685836564098136189wt-mailcard">

<div> </div>

<div> </div>

<div><span style="font-size:14px;font-family:Helvetica"><strong>Sonicle S.r.l. </strong>: <a href="http://www.sonicle.com/" target="_blank">http://www.sonicle.com</a></span></div>

<div><span style="font-size:14px;font-family:Helvetica"><strong>Music: </strong><a href="http://www.gabrielebulfon.com/" target="_blank">http://www.gabrielebulfon.com</a></span></div>

<div><span style="font-size:14px;font-family:Helvetica"><strong>Quantum Mechanics : </strong><a href="http://www.cdbaby.com/cd/gabrielebulfon" target="_blank">http://www.cdbaby.com/cd/gabrielebulfon</a></span></div>

</div>

<tt><br><br><br>----------------------------------------------------------------------------------<br><br>From: Ulrich Windl <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a>><br>To: <a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a> <br>Date: 29 July 2020 15:15:17 CEST<br>Subject: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Stonith failing<br><br></tt>

<blockquote style="border-left:2px solid rgb(0,0,128);margin-left:5px;padding-left:5px"><tt>>>> Gabriele Bulfon <<a href="mailto:gbulfon@sonicle.com" target="_blank">gbulfon@sonicle.com</a>> wrote on 29.07.2020 at 14:18 in<br>message <479956351.444.1596025101064@www>:<br>> Hi, it's a single controller, shared by both nodes, SM server.<br><br>You mean an external controller, like a NAS or SAN? I thought you were talking about<br>an internal controller like SCSI...<br>I don't know what an "SM server" is.<br><br>Regards,<br>Ulrich<br><br>> <br>> Thanks!<br>> Gabriele<br>> <br>> ----------------------------------------------------------------------------------<br>> From: Ulrich Windl<br>> To: <a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a> <br>> Date: 29 July 2020 09:26:39 CEST<br>> Subject: [ClusterLabs] Antw: Re: Antw: [EXT] Stonith failing<br>> Gabriele Bulfon wrote on 29.07.2020 at 08:01 in message:<br>> That one was taken from a specific implementation on Solaris 11.<br>> The situation is a dual-node server with a shared storage controller: both<br>> nodes see the same disks concurrently.<br>> You mean you have a dual-controller setup (one controller on each node, both<br>> connected to the same bus)? 
If so, use sbd!<br>> Here we must be sure that the two nodes are not going to import/mount the<br>> same zpool at the same time, or we will encounter data corruption: node 1<br>> will be preferred for pool 1, node 2 for pool 2. Only if one of the nodes<br>> goes down or is taken offline should its resources first be freed by the<br>> leaving node and then taken over by the other node.<br>> Would you suggest one of the available stonith agents in this case?<br>> Thanks!<br>> Gabriele<br>> ----------------------------------------------------------------------------------<br>> From: Strahil Nikolov<br>> To: Cluster Labs - All topics related to open-source clustering welcomed,<br>> Gabriele Bulfon<br>> Date: 29 July 2020 06:39:08 CEST<br>> Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing<br>> Do you have a reason not to use any stonith agent already available?<br>> Best Regards,<br>> Strahil Nikolov<br>> On 28 July 2020 at 13:26:52 GMT+03:00, Gabriele Bulfon wrote:<br>
> Thanks, I attach the script here.<br>> It basically runs commands via ssh on the other node with no password (must be<br>> preconfigured via authorized keys).<br>> This was taken from a script by OpenIndiana (I think).<br>> As stated in the comments, we don't want to halt or boot via ssh,<br>> only reboot.<br>> Maybe this is the problem; we should at least have it shut down when<br>> asked to.<br>> Actually, if I stop corosync on node 2, I don't want it to shut down the<br>> system but just let node 1 keep control of all resources.<br>> The same applies if I manually shut down node 2:<br>> node 1 should keep control of all resources and release them back on<br>> reboot.<br>> Instead, when I stopped corosync on node 2, the log showed an<br>> attempt to stonith node 2: why?<br>> Thanks!<br>> Gabriele<br>> From: Reid Wahl<br>> To: Cluster Labs - All topics related to open-source clustering welcomed<br>> Date: 28 July 2020 12:03:46 CEST<br>> Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing<br>> Gabriele,<br>> "No route to host" is a somewhat generic error message when we can't<br>> find anyone to fence the node. It doesn't mean there's necessarily a<br>> network routing issue at fault; no need to focus on that error message.<br>> I agree with Ulrich about needing to know what the script does. But<br>> based on your initial message, it sounds like your custom fence agent<br>> returns 1 in response to "on" and "off" actions. Am I understanding<br>> correctly? If so, why does it behave that way? 
Pacemaker is trying to<br>> run a poweroff action based on the logs, so it needs your script to<br>> support an off action.<br>> On Tue, Jul 28, 2020 at 2:47 AM Ulrich Windl<br>> <a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a> <br>> wrote:<br>> Gabriele Bulfon<br>> <a href="mailto:gbulfon@sonicle.com" target="_blank">gbulfon@sonicle.com</a> <br>> wrote on 28.07.2020 at 10:56 in message:<br>> Hi, now I have my two nodes (xstha1 and xstha2) with IPs configured by<br>> Corosync.<br>> To check how stonith would work, I turned off the Corosync service on the<br>> second node.<br>> The first node tries to stonith the second node and take over its<br>> resources, but this fails.<br>> The stonith action is configured to run a custom script that runs ssh<br>> commands,<br>> I think you should explain what that script does exactly.<br>> [...]<br>> _______________________________________________<br>> Manage your subscription:<br>> <a href="https://lists.clusterlabs.org/mailman/listinfo/users" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a> <br>> ClusterLabs home:<br>> <a href="https://www.clusterlabs.org/" target="_blank">https://www.clusterlabs.org/</a> <br>> --<br>> Regards,<br>> Reid Wahl, RHCA<br>> Software Maintenance Engineer, Red Hat<br>> CEE - Platform Support Delivery - ClusterHA<br><br><br></tt></blockquote>

</div></div>

</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div>Regards,<br><br></div>Reid Wahl, RHCA<br></div><div>Software Maintenance Engineer, Red Hat<br></div>CEE - Platform Support Delivery - ClusterHA</div></div></div></div></div></div></div></div></div></div></div></div></div></div>