<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Hi all,</p>

    <p>As Ken said</p>

    <p>"Not currently, but that is planned for a future version",</p>

    <p>I just want to point out how useful it would be to have "ignore X
      monitoring timeouts" as an option in the newest Pacemaker.</p>

    <p>I am still having big problems with resources restarting because of
      lost monitoring requests, which leads to service interruptions.</p>
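    <p>To make the request concrete, here is a minimal sketch in Python of
      the behaviour I mean. It is not Pacemaker code or configuration; the
      class name MonitorFailureFilter and the parameter ignore_count are
      made up for illustration. A timeout is treated the same as a failed
      monitor, since currently the two are not differentiated.</p>
    <pre wrap="">
```python
# Hypothetical sketch: this is NOT Pacemaker code or configuration.
# It only illustrates the requested "ignore X monitoring timeouts" idea.


class MonitorFailureFilter:
    """Tolerate up to `ignore_count` consecutive failed or timed-out monitors."""

    def __init__(self, ignore_count):
        self.ignore_count = ignore_count  # X: failures in a row to ignore
        self.consecutive_failures = 0

    def record(self, monitor_ok):
        """Record one monitor result; return True if recovery should start."""
        if monitor_ok:
            # A successful monitor resets the streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Recover only once the streak exceeds the tolerated count.
        return self.consecutive_failures > self.ignore_count
```
</pre>
    <p>With ignore_count=1, a single lost monitor followed by a successful
      one would not restart anything; only two failures in a row would.</p>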

    <p>Best regards,</p>

    <p>Klecho<br>

    </p>

    <br>

    <div class="moz-cite-prefix">On 1.09.2017 17:52, Klechomir wrote:<br>

    </div>

    <blockquote

      cite="mid:765cbf85-2b93-e755-9ff0-b8aba2dad260@gmail.com"

      type="cite">


      <div class="moz-cite-prefix">On 1.09.2017 17:21, Jan Pokorný

        wrote:<br>

      </div>

      <blockquote cite="mid:20170901142103.GB29380@redhat.com"

        type="cite">

        <pre wrap="">On 01/09/17 09:48 +0300, Klechomir wrote:

</pre>

        <blockquote type="cite">

          <pre wrap="">I have cases, when for an unknown reason a single monitoring request
never returns result.
So having bigger timeouts doesn't resolve this problem.
</pre>

        </blockquote>

        <pre wrap="">If I get you right, the pain point here is a command called by the
resource agents during monitor operation, while this command under
some circumstances _never_ terminates (for dead waiting, infinite
loop, or whatever other reason) or possibly terminates based on
external/asynchronous triggers (e.g. network connection gets
reestablished).

Stating obvious, the solution should be:
- work towards fixing such particular command if blocking
  is an unexpected behaviour (clarify this with upstream
  if needed)
- find more reliable way for the agent to monitor the resource

For the planned soft-recovery options Ken talked about, I am not
sure if it would be trivially possible to differentiate exceeded
monitor timeout from a plain monitor failure.</pre>

      </blockquote>

      In any case, there is currently no differentiation between a failed
      monitoring request and a timeout, so a parameter for ignoring X
      failures in a row would be very welcome.<br>

      <br>

      Here is one very fresh example, entirely unrelated to LV&amp;I/O:<br>

      Aug 30 10:44:19 [1686093] CLUSTER-1       crmd:    error:

      process_lrm_event:    LRM operation p_PingD_monitor_0 (1148) Timed

      Out (timeout=20000ms)<br>

      Aug 30 10:44:56 [1686093] CLUSTER-1       crmd:   notice:

      process_lrm_event:    LRM operation p_PingD_stop_0 (call=1234,

      rc=0, cib-update=40, confirmed=true) ok<br>

      Aug 30 10:45:26 [1686093] CLUSTER-1       crmd:   notice:

      process_lrm_event:    LRM operation p_PingD_start_0 (call=1240,

      rc=0, cib-update=41, confirmed=true) ok<br>

      In this case PingD is fencing DRBD and causes a restart of all
      related resources, which is unneeded because the next monitoring
      request is OK.<br>


      <p><br>

      </p>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Klecho</pre>

  </body>

</html>