<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">On 8/7/19 12:26 PM, Momcilo Medic
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CA+fgXntdCmPjL-ix8zR_4KP=e+8kjiWymCKjwXq5Agd2noGtEA@mail.gmail.com">
      <div dir="ltr"> We have a three-node cluster that is set up to stop
        resources on lost quorum.<br>
        Failure handling (network going down) works properly, but
        recovery doesn't seem to work.<br>
      </div>
    </blockquote>
    <tt>What do you mean by 'network going down'?</tt><tt><br>
    </tt><tt>Loss of link? Does the IP persist on the interface</tt><tt><br>
    </tt><tt>in that case?</tt><tt><br>
    </tt><tt>That there are issues reconnecting to the CPG API</tt><tt><br>
    </tt><tt>sounds strange to me, as does the fact that</tt><tt><br>
    </tt><tt>anything has to be reconnected at all. I take it</tt><tt><br>
    </tt><tt>that your nodes stayed up throughout the</tt><tt><br>
    </tt><tt>network disconnection, although I would have</tt><tt><br>
    </tt><tt>expected fencing to kick in at least on the nodes</tt><tt><br>
    </tt><tt>that are part of the non-quorate cluster partition.</tt><tt><br>
    </tt><tt>A few more words on your scenario</tt><tt><br>
    </tt><tt>(the fencing setup, for example) would help to understand what</tt><tt><br>
    </tt><tt>is going on.</tt><tt><br>
    </tt><tt><br>
    </tt><tt>Klaus</tt><br>
    <blockquote type="cite"
cite="mid:CA+fgXntdCmPjL-ix8zR_4KP=e+8kjiWymCKjwXq5Agd2noGtEA@mail.gmail.com">
      <div dir="ltr"><br>
        What happens is that the services crash when we re-enable the
        network connection.<br>
        <br>
        From journal:<br>
        <br>
        ```<br>
        ...<br>
        Jul 12 00:27:32 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        corosync[9069]: corosync: totemsrp.c:1328:
        memb_consensus_agreed: Assertion `token_memb_entries >= 1'
        failed.<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        attrd[9104]:    error: Connection to the CPG API failed: Library
        error (2)<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        stonith-ng[9100]:    error: Connection to the CPG API failed:
        Library error (2)<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        systemd[1]: corosync.service: Main process exited, code=dumped,
        status=6/ABRT<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        cib[9098]:    error: Connection to the CPG API failed: Library
        error (2)<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        systemd[1]: corosync.service: Failed with result 'core-dump'.<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        pacemakerd[9087]:    error: Connection to the CPG API failed:
        Library error (2)<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        systemd[1]: pacemaker.service: Main process exited, code=exited,
        status=107/n/a<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        systemd[1]: pacemaker.service: Failed with result 'exit-code'.<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        systemd[1]: Stopped Pacemaker High Availability Cluster Manager.<br>
        Jul 12 00:27:33 <a href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        lrmd[9102]:  warning: new_event_notification (9102-9107-7): Bad
        file descriptor (9)<br>
        ...<br>
        ```<br>
        Pacemaker's log shows no relevant info.<br>
        <br>
        This is from corosync's log:<br>
        <br>
        ```<br>
        Jul 12 00:27:33 [9107] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>      
        crmd:     info: qb_ipcs_us_withdraw:    withdrawing server
        sockets<br>
        Jul 12 00:27:33 [9104] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>    
         attrd:    error: pcmk_cpg_dispatch:      Connection to the CPG
        API failed: Library error (2)<br>
        Jul 12 00:27:33 [9100] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        stonith-ng:    error: pcmk_cpg_dispatch:      Connection to the
        CPG API failed: Library error (2)<br>
        Jul 12 00:27:33 [9098] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>      
         cib:    error: pcmk_cpg_dispatch:      Connection to the CPG
        API failed: Library error (2)<br>
        Jul 12 00:27:33 [9087] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        pacemakerd:    error: pcmk_cpg_dispatch:      Connection to the
        CPG API failed: Library error (2)<br>
        Jul 12 00:27:33 [9104] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>    
         attrd:     info: qb_ipcs_us_withdraw:    withdrawing server
        sockets<br>
        Jul 12 00:27:33 [9087] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        pacemakerd:     info: crm_xml_cleanup:        Cleaning up memory
        from libxml2<br>
        Jul 12 00:27:33 [9107] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>      
        crmd:     info: crm_xml_cleanup:        Cleaning up memory from
        libxml2<br>
        Jul 12 00:27:33 [9100] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        stonith-ng:     info: qb_ipcs_us_withdraw:    withdrawing server
        sockets<br>
        Jul 12 00:27:33 [9104] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>    
         attrd:     info: crm_xml_cleanup:        Cleaning up memory
        from libxml2<br>
        Jul 12 00:27:33 [9098] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>      
         cib:     info: qb_ipcs_us_withdraw:    withdrawing server
        sockets<br>
        Jul 12 00:27:33 [9100] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>
        stonith-ng:     info: crm_xml_cleanup:        Cleaning up memory
        from libxml2<br>
        Jul 12 00:27:33 [9098] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>      
         cib:     info: qb_ipcs_us_withdraw:    withdrawing server
        sockets<br>
        Jul 12 00:27:33 [9098] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>      
         cib:     info: qb_ipcs_us_withdraw:    withdrawing server
        sockets<br>
        Jul 12 00:27:33 [9098] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>      
         cib:     info: crm_xml_cleanup:        Cleaning up memory from
        libxml2<br>
        Jul 12 00:27:33 [9102] <a
          href="http://itaftestkvmls02.dc.itaf.eu"
          moz-do-not-send="true">itaftestkvmls02.dc.itaf.eu</a>      
        lrmd:  warning: qb_ipcs_event_sendv:    new_event_notification
        (9102-9107-7): Bad file descriptor (9)<br>
        ```<br>
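        <div>A minimal sketch, in Python, of how the excerpt above could
          be summarized to see which daemons lost their CPG connection.
          The regular expression and the helper names (parse,
          cpg_failures, severity_counts) are assumptions made for
          illustration and are not part of Pacemaker or corosync; the
          sample lines are copied from the excerpt with the HTML links
          stripped.</div>
<pre>
```python
import re
from collections import Counter

# Sample lines copied from the log excerpt above (links stripped).
LOG = """\
Jul 12 00:27:33 [9107] itaftestkvmls02.dc.itaf.eu       crmd:     info: qb_ipcs_us_withdraw:    withdrawing server sockets
Jul 12 00:27:33 [9104] itaftestkvmls02.dc.itaf.eu      attrd:    error: pcmk_cpg_dispatch:      Connection to the CPG API failed: Library error (2)
Jul 12 00:27:33 [9100] itaftestkvmls02.dc.itaf.eu stonith-ng:    error: pcmk_cpg_dispatch:      Connection to the CPG API failed: Library error (2)
Jul 12 00:27:33 [9098] itaftestkvmls02.dc.itaf.eu        cib:    error: pcmk_cpg_dispatch:      Connection to the CPG API failed: Library error (2)
Jul 12 00:27:33 [9087] itaftestkvmls02.dc.itaf.eu pacemakerd:    error: pcmk_cpg_dispatch:      Connection to the CPG API failed: Library error (2)
Jul 12 00:27:33 [9102] itaftestkvmls02.dc.itaf.eu       lrmd:  warning: qb_ipcs_event_sendv:    new_event_notification (9102-9107-7): Bad file descriptor (9)
"""

# Assumed line layout, inferred from the excerpt only:
# timestamp [pid] host daemon: severity: function: message
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+[\d:]+)\s+\[(\d+)\]\s+(\S+)\s+([\w-]+):"
    r"\s+(\w+):\s+(\w+):\s+(.*)$"
)


def parse(text):
    """Yield (daemon, pid, severity, function, message) per matching line."""
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            _ts, pid, _host, daemon, severity, func, msg = match.groups()
            yield daemon, int(pid), severity, func, msg


def cpg_failures(text):
    """Return the sorted names of daemons that logged a CPG failure."""
    return sorted(
        daemon
        for daemon, _pid, severity, _func, msg in parse(text)
        if severity == "error" and "Connection to the CPG API failed" in msg
    )


def severity_counts(text):
    """Count parsed lines per severity level."""
    return Counter(severity for _d, _p, severity, _f, _m in parse(text))


if __name__ == "__main__":
    print(cpg_failures(LOG))
    print(dict(severity_counts(LOG)))
```
</pre>
        <div>Run against the full log file instead of the embedded
          sample, this kind of tally makes it easier to confirm that
          every CPG client failed at the same second, right after
          corosync aborted.</div>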
        <br>
        Please let me know if you need any further info; I'll be more
        than happy to provide it.<br>
        <br>
        This is always reproducible in our environment:<br>
        Ubuntu 18.04.2<br>
        corosync 2.4.3-0ubuntu1.1<br>
        pcs 0.9.164-1<br>
        <div>pacemaker 1.1.18-0ubuntu1.1</div>
        <div><br>
        </div>
        <div>Kind regards,</div>
        <div>Momo.<br>
        </div>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>