<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 8/7/19 12:26 PM, Momcilo Medic
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CA+fgXntdCmPjL-ix8zR_4KP=e+8kjiWymCKjwXq5Agd2noGtEA@mail.gmail.com">
<div dir="ltr"> We have a three-node cluster that is set up to stop
resources on lost quorum.<br>
Failure (network going down) handling is done properly, but
recovery doesn't seem to work.<br>
</div>
</blockquote>
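<p>For reference, here is a minimal sketch of the majority rule behind "lost quorum". It assumes corosync's default votequorum behaviour of one vote per node and no special options such as two_node or last_man_standing, which may not match your configuration. The function names are hypothetical and not part of any corosync or Pacemaker API:</p>
<pre>
```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for a strict majority (assumes one vote per node)."""
    return expected_votes // 2 + 1


def is_quorate(partition_votes: int, expected_votes: int) -> bool:
    """True if a partition holds at least the majority threshold."""
    return partition_votes >= quorum_threshold(expected_votes)


# Three-node cluster split into one isolated node and the remaining pair.
print(quorum_threshold(3))  # 2
print(is_quorate(1, 3))     # False: the isolated node loses quorum
print(is_quorate(2, 3))     # True: the pair keeps quorum
```
</pre>
<p>With three nodes, an isolated node holds one vote where two are required, so it forms the non-quorate partition. If "stop resources on lost quorum" means no-quorum-policy=stop, that node is the one expected to stop its resources.</p>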
<tt>What do you mean by 'network going down'?</tt><tt><br>
</tt><tt>Loss of link? Does the IP persist on the interface</tt><tt><br>
</tt><tt>in that case?</tt><tt><br>
</tt><tt>That there are issues reconnecting to the CPG API</tt><tt><br>
</tt><tt>sounds strange to me, as does the fact that anything</tt><tt><br>
</tt><tt>has to be reconnected at all. As I understand it, your</tt><tt><br>
</tt><tt>nodes stayed up throughout the network disconnection,</tt><tt><br>
</tt><tt>although I would have expected fencing to kick in at</tt><tt><br>
</tt><tt>least on the nodes in the non-quorate cluster partition.</tt><tt><br>
</tt><tt>A few more words on your scenario (the fencing setup,</tt><tt><br>
</tt><tt>for example) would help to understand what is going on.</tt><tt><br>
</tt><tt><br>
</tt><tt>Klaus</tt><br>
<blockquote type="cite"
cite="mid:CA+fgXntdCmPjL-ix8zR_4KP=e+8kjiWymCKjwXq5Agd2noGtEA@mail.gmail.com">
<div dir="ltr"><br>
What happens is that the services crash when we re-enable the
network connection.<br>
<br>
From journal:<br>
<br>
```<br>
...<br>
Jul 12 00:27:32 itaftestkvmls02.dc.itaf.eu corosync[9069]: corosync: totemsrp.c:1328: memb_consensus_agreed: Assertion `token_memb_entries >= 1' failed.<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu attrd[9104]: error: Connection to the CPG API failed: Library error (2)<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu stonith-ng[9100]: error: Connection to the CPG API failed: Library error (2)<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: corosync.service: Main process exited, code=dumped, status=6/ABRT<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu cib[9098]: error: Connection to the CPG API failed: Library error (2)<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: corosync.service: Failed with result 'core-dump'.<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu pacemakerd[9087]: error: Connection to the CPG API failed: Library error (2)<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: pacemaker.service: Main process exited, code=exited, status=107/n/a<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: pacemaker.service: Failed with result 'exit-code'.<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu systemd[1]: Stopped Pacemaker High Availability Cluster Manager.<br>
Jul 12 00:27:33 itaftestkvmls02.dc.itaf.eu lrmd[9102]: warning: new_event_notification (9102-9107-7): Bad file descriptor (9)<br>
...<br>
```<br>
Pacemaker's log shows no relevant info.<br>
<br>
This is from corosync's log:<br>
<br>
```<br>
Jul 12 00:27:33 [9107] itaftestkvmls02.dc.itaf.eu crmd: info: qb_ipcs_us_withdraw: withdrawing server sockets<br>
Jul 12 00:27:33 [9104] itaftestkvmls02.dc.itaf.eu attrd: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)<br>
Jul 12 00:27:33 [9100] itaftestkvmls02.dc.itaf.eu stonith-ng: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)<br>
Jul 12 00:27:33 [9098] itaftestkvmls02.dc.itaf.eu cib: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)<br>
Jul 12 00:27:33 [9087] itaftestkvmls02.dc.itaf.eu pacemakerd: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)<br>
Jul 12 00:27:33 [9104] itaftestkvmls02.dc.itaf.eu attrd: info: qb_ipcs_us_withdraw: withdrawing server sockets<br>
Jul 12 00:27:33 [9087] itaftestkvmls02.dc.itaf.eu pacemakerd: info: crm_xml_cleanup: Cleaning up memory from libxml2<br>
Jul 12 00:27:33 [9107] itaftestkvmls02.dc.itaf.eu crmd: info: crm_xml_cleanup: Cleaning up memory from libxml2<br>
Jul 12 00:27:33 [9100] itaftestkvmls02.dc.itaf.eu stonith-ng: info: qb_ipcs_us_withdraw: withdrawing server sockets<br>
Jul 12 00:27:33 [9104] itaftestkvmls02.dc.itaf.eu attrd: info: crm_xml_cleanup: Cleaning up memory from libxml2<br>
Jul 12 00:27:33 [9098] itaftestkvmls02.dc.itaf.eu cib: info: qb_ipcs_us_withdraw: withdrawing server sockets<br>
Jul 12 00:27:33 [9100] itaftestkvmls02.dc.itaf.eu stonith-ng: info: crm_xml_cleanup: Cleaning up memory from libxml2<br>
Jul 12 00:27:33 [9098] itaftestkvmls02.dc.itaf.eu cib: info: qb_ipcs_us_withdraw: withdrawing server sockets<br>
Jul 12 00:27:33 [9098] itaftestkvmls02.dc.itaf.eu cib: info: qb_ipcs_us_withdraw: withdrawing server sockets<br>
Jul 12 00:27:33 [9098] itaftestkvmls02.dc.itaf.eu cib: info: crm_xml_cleanup: Cleaning up memory from libxml2<br>
Jul 12 00:27:33 [9102] itaftestkvmls02.dc.itaf.eu lrmd: warning: qb_ipcs_event_sendv: new_event_notification (9102-9107-7): Bad file descriptor (9)<br>
```<br>
<br>
Please let me know if you need any further info, I'll be more
than happy to provide it.<br>
<br>
This is always reproducible in our environment:<br>
Ubuntu 18.04.2<br>
corosync 2.4.3-0ubuntu1.1<br>
pcs 0.9.164-1<br>
<div>pacemaker 1.1.18-0ubuntu1.1</div>
<div><br>
</div>
<div>Kind regards,</div>
<div>Momo.<br>
</div>
</div>
<br>
</blockquote>
<br>
</body>
</html>