<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body>
<div class="moz-cite-prefix">On 5/31/21 10:53 AM, Emil Penchev
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:AM0PR0402MB38902F4689E059BF96F741EAE53F9@AM0PR0402MB3890.eurprd04.prod.outlook.com">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<style type="text/css" style="display:none;">P {margin-top:0;margin-bottom:0;}</style>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
Hi all,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<div><br>
</div>
<div>I'm writing about an issue reported to us by a Pacemaker
user concerning resource agent (RA) timeouts.</div>
<div>Some users have hit a timeout in an RA script or program,
and this led to a major outage for them.</div>
<div>As is typical in such cases, no additional useful
information is available to explain why it happened.</div>
<div>The user has proposed a solution, a POC that instruments
Pacemaker directly and adds a hook to trigger further
debugging via an external callout program.</div>
<span>One can set an environment variable, for example
<b>PCMK_timeout_prog</b>, that points to an external program or
script to be executed to collect more useful debug
information.</span><br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
Here is the proposed POC, with minor modifications.<br>
<a
href="https://github.com/tickbg/pacemaker/compare/master...a453d30"
id="LPlnk299320" moz-do-not-send="true">https://github.com/tickbg/pacemaker/compare/master...a453d30</a><br>
<br>
</div>
</blockquote>
<p>If you create a pull request directly, we can use GitHub
for the discussion.</p>
<p><br>
</p>
<p>Pacemaker already has the alerts feature, which allows
calling scripts on various occasions. One of those is
resource actions. So it might make sense to consider
extending that feature to cover your case as well.</p>
<p>At the moment your script would get the return code of the
RA passed to it. I'm actually unsure what happens in case of
a timeout.</p>
<p>For the script to be called only on a timeout, additional
filtering on the Pacemaker side might be handy, since
filtering in the script itself still generates load for every
alert. A synchronous-call flag could also be useful (at the
moment alerts are called in more of a fire-and-forget manner
so as not to throttle Pacemaker actions).</p>
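<p>To make the idea concrete, here is a rough sketch of an
alert agent that does the filtering in the script and starts
a debug collector without waiting for it. It relies on the
CRM_alert_* environment variables passed to alert agents.
Two things are assumptions on my side: that a timed-out
action shows up as execution status 2 in CRM_alert_status
(please verify against your version), and DEBUG_PROG, which
is a made-up variable name for the collector and not a
Pacemaker option.</p>
<pre>
```python
#!/usr/bin/env python3
"""Sketch of an alert agent that reacts only to timed-out resource actions."""
import os
import subprocess

# Assumption: a timed-out action is reported as execution status 2 in
# CRM_alert_status. Verify this against your Pacemaker version.
TIMED_OUT_STATUS = "2"


def is_resource_timeout(env):
    """Return True if the alert describes a timed-out resource action."""
    return (env.get("CRM_alert_kind") == "resource"
            and env.get("CRM_alert_status") == TIMED_OUT_STATUS)


def build_debug_command(env):
    """Build the argument list for the hypothetical debug collector."""
    # DEBUG_PROG is a hypothetical variable name, not a Pacemaker option.
    prog = env.get("DEBUG_PROG", "/bin/true")
    return [prog,
            env.get("CRM_alert_node", ""),
            env.get("CRM_alert_rsc", ""),
            env.get("CRM_alert_task", "")]


def main(env=None):
    env = os.environ if env is None else env
    if not is_resource_timeout(env):
        return 0  # ignore every alert that is not a resource timeout
    # Popen returns immediately, so the agent does not wait for the collector.
    subprocess.Popen(build_debug_command(env))
    return 0


if __name__ == "__main__":
    main()
```
</pre>
<p>Note that the agent is still spawned for every alert, which
is exactly the load that filtering inside Pacemaker would
avoid.</p>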
<p><br>
</p>
<p>Klaus<br>
</p>
<blockquote type="cite"
cite="mid:AM0PR0402MB38902F4689E059BF96F741EAE53F9@AM0PR0402MB3890.eurprd04.prod.outlook.com">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
Emil.<br>
</div>
<br>
</blockquote>
<br>
</body>
</html>