<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 12 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
@font-face
{font-family:Baskerville;
panose-1:0 0 0 0 0 0 0 0 0 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";
color:black;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";
color:black;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";
color:black;}
span.apple-converted-space
{mso-style-name:apple-converted-space;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;
color:black;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1469395080;
mso-list-type:hybrid;
mso-list-template-ids:-964102742 67698703 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1
{mso-list-id:1686784591;
mso-list-type:hybrid;
mso-list-template-ids:1658195548 67698703 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l1:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body bgcolor=white lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Sometimes an IPMI fence device shares its power supply with the node it fences, and this cannot be avoided.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>In such scenarios the HA cluster is NOT able to handle a power failure of that node, because the fence device loses power along with it.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>IPMI-based fencing can also fail for other reasons.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>A failure to fence the failed node will cause the cluster to mark that node UNCLEAN.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>To recover, the following command needs to be run on the surviving node.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>pcs stonith confirm &lt;failed_node_name&gt; --force<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>This can be automated by hooking in a recovery script that runs when the Stonith resource reports a ‘Timed Out’ event.<o:p></o:p></span></p><p class=MsoNormal><span 
style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>To be more specific, Pacemaker Alerts can be used to watch for Stonith timeouts and failures.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>In that script, essentially all that needs to be executed is the aforementioned command.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Since the alerts are issued from the ‘hacluster’ login, sudo permissions for ‘hacluster’ need to be configured.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Thanx.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><div style='border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt'><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'> Klaus Wenninger [mailto:kwenning@redhat.com] <br><b>Sent:</b> Monday, July 24, 2017 9:24 PM<br><b>To:</b> Kristián Feldsam; Cluster Labs - All topics related to open-source clustering welcomed<br><b>Subject:</b> Re: [ClusterLabs] Two nodes cluster issue<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>On 07/24/2017 05:37 PM, Kristián Feldsam 
wrote:<o:p></o:p></p></div><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><p class=MsoNormal>I personally think that powering off the node via a switched PDU is safer, or not?<o:p></o:p></p></blockquote><p class=MsoNormal><br>True, if that works in your environment. If you can't do a physical setup<br>where you aren't simultaneously losing the connection to both your node and<br>the switch device (or you just want to cover cases where that happens),<br>you have to come up with something else.<br><br><br><o:p></o:p></p><div><p class=MsoNormal><br><span style='font-family:"Baskerville","serif"'>Regards, Kristián Feldsam<br>Tel.: +420 773 303 353, +421 944 137 535<br>E-mail: <a href="mailto:support@feldhost.cz">support@feldhost.cz</a><br><br><a href="http://www.feldhost.cz">www.feldhost.cz</a> -<span class=apple-converted-space> </span><b>Feld</b>Host</span>™<span class=apple-converted-space><span style='font-family:"Baskerville","serif"'> </span></span><span style='font-family:"Baskerville","serif"'>– professional hosting and server services at reasonable prices.<br><br>FELDSAM s.r.o.<br>V rohu 434/3<br>Praha 4 – Libuš, PSČ 142 00<br>Company ID: 290 60 958, VAT ID: CZ290 60 958<br>File C 200350, registered at the Municipal Court in Prague<br><br>Bank: Fio banka a.s.<br>Account number: 2400330446/2010<br>BIC: FIOBCZPPXX<br>IBAN: CZ82 2010 0000 0024 0033 0446</span> <o:p></o:p></p></div><p class=MsoNormal><o:p> </o:p></p><div><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><div><p class=MsoNormal>On 24 Jul 2017, at 17:27, Klaus Wenninger <<a href="mailto:kwenning@redhat.com">kwenning@redhat.com</a>> wrote:<o:p></o:p></p></div><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal style='background:white'><span style='font-family:"Baskerville","serif"'>On 07/24/2017 05:15 PM, Tomer Azran wrote:<o:p></o:p></span></p></div><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt;font-variant-caps: normal;orphans:
auto;text-align:start;widows: auto;-webkit-text-size-adjust: auto;-webkit-text-stroke-width: 0px;background-color:rgb(255,
255, 255);word-spacing:0px'><div><p class=MsoNormal style='background:white'><span style='font-size:11.0pt;font-family:"Arial","sans-serif"'>I still don't understand why the qdevice concept doesn't help in this situation. Since the master node is down, I would expect the quorum to declare it as dead.<o:p></o:p></span></p></div><div><p class=MsoNormal style='background:white'><span style='font-size:11.0pt;font-family:"Arial","sans-serif"'>Why doesn't it happen?<o:p></o:p></span></p></div></blockquote><p class=MsoNormal><span style='font-family:"Baskerville","serif"'><br>That is not how quorum works. It just limits the decision-making to the quorate subset of the cluster.<br>The unknown nodes are still not guaranteed to be down.<br>That is why I suggested using quorum-based watchdog fencing with sbd.<br>That would ensure that, within a certain time, all nodes in the non-quorate part<br>of the cluster are down.<br><br style='font-variant-caps: normal;text-align:start;-webkit-text-stroke-width: 0px;background-color:rgb(255,
255, 255);word-spacing:0px'><br></span><o:p></o:p></p><p class=MsoNormal style='margin-bottom:12.0pt'><span style='font-family:"Baskerville","serif"'><br><br><o:p></o:p></span></p><div><p class=MsoNormal style='margin-bottom:12.0pt'><span style='font-family:"Baskerville","serif"'>On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk"<span class=apple-converted-space> </span><<a href="mailto:dmitri.maziuk@gmail.com" target="_blank">dmitri.maziuk@gmail.com</a>><span class=apple-converted-space> </span>wrote:<o:p></o:p></span></p><div><pre>On 2017-07-24 07:51, Tomer Azran wrote:<o:p></o:p></pre><pre>> We don't have the ability to use it.<o:p></o:p></pre><pre>> Is that the only solution?<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>No, but I'd recommend thinking about it first. Are you sure you will <o:p></o:p></pre><pre>care about your cluster working when your server room is on fire? 'Cause <o:p></o:p></pre><pre>unless you have halon suppression, your server room is a complete <o:p></o:p></pre><pre>write-off anyway. 
(Think water from sprinklers hitting rich chunky volts <o:p></o:p></pre><pre>in the servers.)<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Dima<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>_______________________________________________<o:p></o:p></pre><pre>Users mailing list: <a href="mailto:Users@clusterlabs.org">Users@clusterlabs.org</a><o:p></o:p></pre><pre><a href="http://lists.clusterlabs.org/mailman/listinfo/users">http://lists.clusterlabs.org/mailman/listinfo/users</a><o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Project Home: <a href="http://www.clusterlabs.org/">http://www.clusterlabs.org</a><o:p></o:p></pre><pre>Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><o:p></o:p></pre><pre>Bugs: <a href="http://bugs.clusterlabs.org/">http://bugs.clusterlabs.org</a><o:p></o:p></pre></div></div><p class=MsoNormal><span style='font-family:"Baskerville","serif"'><br><br><br><o:p></o:p></span></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;font-variant-caps: normal;text-align:start;-webkit-text-stroke-width: 0px;background-color:rgb(255,
255, 255);word-spacing:0px'><span style='font-family:"Baskerville","serif"'><o:p> </o:p></span></p><pre style='background:white;font-variant-caps: normal;text-align:start;-webkit-text-stroke-width: 0px;word-spacing:0px'><span style='font-size:12.0pt'>-- <o:p></o:p></span></pre><pre style='background:white'><span style='font-size:12.0pt'>Klaus Wenninger<o:p></o:p></span></pre><pre style='background:white'><span style='font-size:12.0pt'><o:p> </o:p></span></pre><pre style='background:white'><span style='font-size:12.0pt'>Senior Software Engineer, EMEA ENG Openstack Infrastructure<o:p></o:p></span></pre><pre style='background:white'><span style='font-size:12.0pt'><o:p> </o:p></span></pre><pre style='background:white'><span style='font-size:12.0pt'>Red Hat<o:p></o:p></span></pre><pre style='background:white'><span style='font-size:12.0pt'><o:p> </o:p></span></pre><pre style='background:white'><span style='font-size:12.0pt'><a href="mailto:kwenning@redhat.com">kwenning@redhat.com</a> <o:p></o:p></span></pre></div></blockquote></div><p class=MsoNormal><o:p> </o:p></p></div></div></body></html>