<div dir="ltr"><div><div>Hey guys<br><br></div>After removing qdisk configuration from /etc/cluster/cluster.conf got fencing success!? <br><span style="font-family:monospace,monospace">&lt;quorumd interval=&quot;1&quot; label=&quot;QuorumDisk&quot; status_file=&quot;/qdisk_status&quot; tko=&quot;70&quot;/&gt;<br><br></span></div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">Maybe </font>tko=&quot;70&quot;<font face="arial,helvetica,sans-serif"> is cause of the issue, I have to inspect this further?</font><br></span><div><div><br><span style="font-family:monospace,monospace">[node1:~]# fence_node -vv node3<br>fence node3 dev 0.0 agent fence_pcmk result: success<br>agent args: action=off port=node3 nodename=node3 agent=fence_pcmk<br>fence node3 success </span><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jun 22, 2015 at 10:21 AM, Milos Buncic <span dir="ltr">&lt;<a href="mailto:htchak19@gmail.com" target="_blank">htchak19@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hey first of all thank you for you answer<br></div><span class=""><div> </div><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote">Does &#39;fence_ipmilan ...&#39; work when called manually from the command line?  <br></blockquote></span><div class="gmail_extra"><br>Yes it does (off, on, status...)<br><br><span style="font-family:monospace,monospace">[node1:~]# fence_ipmilan -v -p </span><span style="font-family:monospace,monospace"><span>********</span> -l fencer -L OPERATOR -P -a 1.1.1.1 -o status<br>Getting status of IPMI:1.1.1.1...Spawning: &#39;/usr/bin/ipmitool -I lanplus -H &#39;1.1.1.1&#39; -U &#39;fencer&#39; -L &#39;OPERATOR&#39; -P &#39;[set]&#39; -v chassis power status&#39;...<br>Chassis power = On<br>Done <br></span></div><div class="gmail_extra"><span class=""><br><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote">Looks like the same node is defined twice, instead of &#39;node3&#39;.<br></blockquote>
Sorry about that; I mistyped the hostname just after I pasted the configuration.

The configuration looks like this:

<?xml version="1.0"?>
<cluster config_version="10" name="mycluster">
        <fence_daemon/>
        <clusternodes>
                <clusternode name="node1" nodeid="1">
                        <fence>
                                <method name="pcmk-redirect">
                                        <device action="off" name="pcmk" port="node1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node2" nodeid="2">
                        <fence>
                                <method name="pcmk-redirect">
                                        <device action="off" name="pcmk" port="node2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node3" nodeid="3">
                        <fence>
                                <method name="pcmk-redirect">
                                        <device action="off" name="pcmk" port="node3"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices>
                <fencedevice agent="fence_pcmk" name="pcmk"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
        <logging debug="on"/>
        <quorumd interval="1" label="QuorumDisk" status_file="/qdisk_status" tko="70"/>
        <totem token="108000"/>
</cluster>

> Run 'fence_check' (this tests cman's fencing which is hooked into
> pacemaker's stonith).

fence_check run at Mon Jun 22 09:35:38 CEST 2015 pid: 16091
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Get node list: node1 node2 node3

Testing node1 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node node1
Found 1 method(s) to test for node node1
Testing node1 method 1 status
Testing node1 method 1: success

Testing node2 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node node2
Found 1 method(s) to test for node node2
Testing node2 method 1 status
Testing node2 method 1: success

Testing node3 fencing
Checking if cman is running: running
Checking if node is quorate: quorate
Checking if node is in fence domain: yes
Checking if node is fence master: this node is fence master
Checking if real fencing is in progress: no fencing in progress
Checking how many fencing methods are configured for node node3
Found 1 method(s) to test for node node3
Testing node3 method 1 status
Testing node3 method 1: success
cleanup: 0

> Also, I'm not sure how well qdisk is tested/supported. Do you even need
> it with three nodes?

Qdisk is tested in production where we're using rgmanager, so I just mirrored that configuration.
Hm, yes, in a three-node cluster we probably don't need it.
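For reference, a rough timing sketch; my assumption, from the qdisk(5) description, is that eviction takes about interval * tko seconds:

[node1:~]# interval=1 tko=70 token_ms=108000
[node1:~]# echo "qdiskd eviction window: $((interval * tko)) s"
qdiskd eviction window: 70 s
[node1:~]# echo "totem token timeout: $((token_ms / 1000)) s"
totem token timeout: 108 s

So qdiskd needs roughly 70 s to declare a node dead, and the original token="108000" presumably had to sit above that window; both stretch how long a membership change (and with it a confirmed fence) can take.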

> <totem token="108000"/>
> That is a VERY high number!

You're probably right; I've changed this value to the default (10 sec):

<totem token="10000"/>

[node1:~]# fence_node -vv node3
fence node3 dev 0.0 agent fence_pcmk result: error from agent
agent args: action=off port=node3 nodename=node3 agent=fence_pcmk
fence node3 failed

Messages log captured on node2, which is running the fencing resource for node3:

[node2:~]# tail -0f /var/log/messages
...
Jun 22 10:04:28 node2 stonith-ng[7382]:   notice: can_fence_host_with_device: node1-ipmi can not fence (off) node3: static-list
Jun 22 10:04:28 node2 stonith-ng[7382]:   notice: can_fence_host_with_device: node3-ipmi can fence (off) node3: static-list
Jun 22 10:04:28 node2 stonith-ng[7382]:   notice: can_fence_host_with_device: node1-ipmi can not fence (off) node3: static-list
Jun 22 10:04:28 node2 stonith-ng[7382]:   notice: can_fence_host_with_device: node3-ipmi can fence (off) node3: static-list
Jun 22 10:04:38 node2 stonith-ng[7382]:   notice: log_operation: Operation 'off' [5288] (call 2 from stonith_admin.cman.7377) for host 'node3' with device 'node3-ipmi' returned: 0 (OK)
Jun 22 10:05:44 node2 qdiskd[5948]: Node 3 evicted

This is where the delay happens (~3.5 min):

Jun 22 10:08:06 node2 corosync[5861]:   [QUORUM] Members[2]: 1 2
Jun 22 10:08:06 node2 corosync[5861]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jun 22 10:08:06 node2 crmd[7386]:   notice: crm_update_peer_state: cman_event_callback: Node node3[3] - state is now lost (was member)
Jun 22 10:08:06 node2 crmd[7386]:  warning: match_down_event: No match for shutdown action on node3
Jun 22 10:08:06 node2 crmd[7386]:   notice: peer_update_callback: Stonith/shutdown of node3 not matched
Jun 22 10:08:06 node2 crmd[7386]:   notice: do_state_transition: State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=check_join_state ]
Jun 22 10:08:06 node2 rsyslogd-2177: imuxsock begins to drop messages from pid 5861 due to rate-limiting
Jun 22 10:08:06 node2 kernel: dlm: closing connection to node 3
Jun 22 10:08:06 node2 attrd[7384]:   notice: attrd_local_callback: Sending full refresh (origin=crmd)
Jun 22 10:08:06 node2 attrd[7384]:   notice: attrd_trigger_update: Sending flush op to all hosts for: shutdown (0)
Jun 22 10:08:06 node2 crmd[7386]:  warning: match_down_event: No match for shutdown action on node3
Jun 22 10:08:06 node2 crmd[7386]:   notice: peer_update_callback: Stonith/shutdown of node3 not matched
Jun 22 10:08:06 node2 stonith-ng[7382]:   notice: remote_op_done: Operation off of node3 by node2 for stonith_admin.cman.7377@node1.753ce4e5: OK
Jun 22 10:08:06 node2 fenced[6211]: fencing deferred to node1
Jun 22 10:08:06 node2 attrd[7384]:   notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Jun 22 10:08:06 node2 crmd[7386]:   notice: tengine_stonith_notify: Peer node3 was terminated (off) by node2 for node1: OK (ref=753ce4e5-a84a-491b-8ed9-044667946381) by client stonith_admin.cman.7377
Jun 22 10:08:06 node2 crmd[7386]:   notice: tengine_stonith_notify:
Notified CMAN that 'node3' is now fenced
Jun 22 10:08:07 node2 rsyslogd-2177: imuxsock lost 108 messages from pid 5861 due to rate-limiting
Jun 22 10:08:07 node2 pengine[7385]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Start   testvm102#011(node1)
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Migrate testvm103#011(Started node1 -> node2)
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Start   testvm105#011(node1)
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Start   testvm108#011(node1)
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Migrate testvm109#011(Started node1 -> node2)
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Start   testvm111#011(node1)
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Start   testvm114#011(node1)
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Migrate testvm115#011(Started node1 -> node2)
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Start   testvm117#011(node1)
Jun 22 10:08:07 node2 pengine[7385]:   notice: LogActions: Start   node1-ipmi#011(node2)
...
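Lining up those timestamps with GNU date (a rough check; same-day times assumed):

[node2:~]# echo "$(( $(date -d 10:05:44 +%s) - $(date -d 10:04:38 +%s) )) s from fence OK to qdisk eviction"
66 s from fence OK to qdisk eviction
[node2:~]# echo "$(( $(date -d 10:08:06 +%s) - $(date -d 10:05:44 +%s) )) s from eviction to new membership"
142 s from eviction to new membership

The first gap (~66 s) lands almost exactly on the qdisk window of interval * tko = 70 s, so qdiskd accounts for a big slice of the stall; where the remaining ~2.4 min goes before corosync forms the new membership is the part I can't explain yet.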

> I noticed you're not a mailing list member. Please register if you want
> your emails to come through without getting stuck in the moderator queue.

Thanks man, I will.

The problem still persists :(

On Mon, Jun 22, 2015 at 8:02 AM, Digimer <lists@alteeve.ca> wrote:

On 21/06/15 02:12 PM, Milos Buncic wrote:
> Hey people
>
> I'm experiencing a very strange issue, and it appears every time I try
> to fence a node.
> I have a test environment with a three-node cluster (CentOS 6.6 x86_64)
> where rgmanager is replaced with pacemaker (CMAN + pacemaker).
>
> I've configured fencing with pcs for all three nodes.
>
> Pacemaker:
> pcs stonith create node1-ipmi \
> fence_ipmilan pcmk_host_list="node1" ipaddr=1.1.1.1 login=fencer
> passwd=******** privlvl=OPERATOR power_wait=10 lanplus=1 action=off \
> op monitor interval=10s timeout=30s

Does 'fence_ipmilan ...' work when called manually from the command line?
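Something along these lines, reusing the exact parameters from the stonith resource above (-o takes status/on/off):

[node1:~]# fence_ipmilan -v -a 1.1.1.1 -l fencer -L OPERATOR -P -p ******** -o status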

> pcs constraint location node1-ipmi avoids node1
>
> pcs property set stonith-enabled=true
>
>
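For cross-checking what stonith-ng actually registered against this config, something like the following should work (pcs 0.9 / pacemaker 1.1 era syntax assumed; going from memory here):

[node1:~]# pcs stonith
[node1:~]# stonith_admin --list-registered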
> CMAN - /etc/cluster/cluster.conf:
> <?xml version="1.0"?>
> <cluster config_version="10" name="mycluster">
>         <fence_daemon/>
>         <clusternodes>
>                 <clusternode name="node1" nodeid="1">
>                         <fence>
>                                 <method name="pcmk-redirect">
>                                         <device action="off" name="pcmk"
> port="node1"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="node2" nodeid="2">
>                         <fence>
>                                 <method name="pcmk-redirect">
>                                         <device action="off" name="pcmk"
> port="node2"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="node2" nodeid="3">
>                         <fence>
>                                 <method name="pcmk-redirect">
>                                         <device action="off" name="pcmk"
> port="node2"/>
>                                 </method>
>                         </fence>
>                 </clusternode>

Looks like the same node is defined twice, instead of 'node3'.
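Presumably that third stanza was meant to mirror the node1/node2 entries:

                <clusternode name="node3" nodeid="3">
                        <fence>
                                <method name="pcmk-redirect">
                                        <device action="off" name="pcmk" port="node3"/>
                                </method>
                        </fence>
                </clusternode>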

>         </clusternodes>
>         <cman/>
>         <fencedevices>
>                 <fencedevice agent="fence_pcmk" name="pcmk"/>
>         </fencedevices>
>         <rm>
>                 <failoverdomains/>
>                 <resources/>
>         </rm>
>         <logging debug="on"/>
>         <quorumd interval="1" label="QuorumDisk"
> status_file="/qdisk_status" tko="70"/>

Also, I'm not sure how well qdisk is tested/supported. Do you even need
it with three nodes?

>         <totem token="108000"/>

That is a VERY high number!

> </cluster>
>
> Every time I try to fence a node I'm getting a timeout error, with the node
> being fenced in the end (on the second attempt), but I'm wondering why it
> takes so long to fence a node?

Run 'fence_check' (this tests cman's fencing which is hooked into
pacemaker's stonith).
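A bare run as root on one node is enough; it walks every node's configured fence methods using the status action:

[node1:~]# fence_check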

> So when I run stonith_admin or fence_node (which in the end also runs
> stonith_admin, you can see that clearly from the log file) it's always
> failing on the first attempt, my guess is because it doesn't get a
> status code or something like that:
> strace stonith_admin --fence node1 --tolerance 5s --tag cman
>
> Partial output from strace:
>   ...
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 500)   = 0 (Timeout)
>   poll([{fd=4, events=POLLIN}], 1, 291)   = 0 (Timeout)
>   fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 8), ...}) = 0
>   mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x7fb2a8c37000
>   write(1, "Command failed: Timer expired\n", 30Command failed: Timer
> expired
>   ) = 30
>   poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
>   shutdown(4, 2 /* send and receive */)   = 0
>   close(4)                                = 0
>   munmap(0x7fb2a8b98000, 270336)          = 0
>   munmap(0x7fb2a8bda000, 8248)            = 0
>   munmap(0x7fb2a8b56000, 270336)          = 0
>   munmap(0x7fb2a8c3b000, 8248)            = 0
>   munmap(0x7fb2a8b14000, 270336)          = 0
>   munmap(0x7fb2a8c38000, 8248)            = 0
>   munmap(0x7fb2a8bdd000, 135168)          = 0
>   munmap(0x7fb2a8bfe000, 135168)          = 0
>   exit_group(-62)                         = ?
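Side note on the tail of that trace: stonith_admin is just polling fd 4 (presumably its connection to stonith-ng) in 500 ms slices until its own timer lapses, and exit_group(-62) matches ETIME, the same "Timer expired" it printed:

[node1:~]# grep -w ETIME /usr/include/asm-generic/errno.h
#define ETIME           62      /* Timer expired */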
>
>
> Or via cman:
> [node1:~]# fence_node -vv node3
> fence node3 dev 0.0 agent fence_pcmk result: error from agent
> agent args: action=off port=node3 timeout=15 nodename=node3 agent=fence_pcmk
> fence node3 failed
>
>
> /var/log/messages:
>   Jun 19 10:57:43 node1 stonith_admin[3804]:   notice: crm_log_args:
> Invoked: stonith_admin --fence node1 --tolerance 5s --tag cman
>   Jun 19 10:57:43 node1 stonith-ng[8283]:   notice: handle_request:
> Client stonith_admin.cman.3804.65de6378 wants to fence (off) 'node1'
> with device '(any)'
>   Jun 19 10:57:43 node1 stonith-ng[8283]:   notice:
> initiate_remote_stonith_op: Initiating remote operation off for node1:
> fbc7fe61-9451-4634-9c12-57d933ccd0a4 (  0)
>   Jun 19 10:57:43 node1 stonith-ng[8283]:   notice:
> can_fence_host_with_device: node2-ipmi can not fence (off) node1:
> static-list
>   Jun 19 10:57:43 node1 stonith-ng[8283]:   notice:
> can_fence_host_with_device: node3-ipmi can fence (off) node3: static-list
>   Jun 19 10:57:54 node1 stonith-ng[8283]:  warning: get_xpath_object: No
> match for //@st_delegate in /st-reply
>   Jun 19 10:59:00 node1 qdiskd[7409]: Node 3 evicted
>   Jun 19 10:59:31 node1 corosync[7349]:   [TOTEM ] A processor failed,
> forming new configuration.
>   Jun 19 11:01:21 node1 corosync[7349]:   [QUORUM] Members[2]: 1 2
>   Jun 19 11:01:21 node1 corosync[7349]:   [TOTEM ] A processor joined or
> left the membership and a new membership was formed.
>   Jun 19 11:01:21 node1 crmd[8287]:   notice: crm_update_peer_state:
> cman_event_callback: Node node3[3] - state is now lost (was member)
>   Jun 19 11:01:21 node1 kernel: dlm: closing connection to node 3
>   Jun 19 11:01:21 node1 stonith-ng[8283]:   notice: remote_op_done:
> Operation off of node3 by node2 for stonith_admin.cman.3804@node1.
> com.fbc7fe61: OK
>   Jun 19 11:01:21 node1 crmd[8287]:   notice: tengine_stonith_notify:
> Peer node3 was terminated (off) by node2 for node1: OK (
> ref=fbc7fe61-9451-4634-9c12-57d933ccd0a4) by client stonith_admin.cman.3804
>   Jun 19 11:01:21 node1 crmd[8287]:   notice: tengine_stonith_notify:
> Notified CMAN that 'node3' is now fenced
>
>   Jun 19 11:01:21 node1 fenced[7625]: fencing node node3
>   Jun 19 11:01:22 node1 fence_pcmk[8067]: Requesting Pacemaker fence
> node3 (off)
>   Jun 19 11:01:22 node1 stonith_admin[8068]:   notice: crm_log_args:
> Invoked: stonith_admin --fence node3 --tolerance 5s --tag cman
>   Jun 19 11:01:22 node1 stonith-ng[8283]:   notice: handle_request:
> Client stonith_admin.cman.8068.fcd7f751 wants to fence (off) 'node3'
> with device '(any)'
>   Jun 19 11:01:22 node1 stonith-ng[8283]:   notice:
> stonith_check_fence_tolerance: Target node3 was fenced (off) less than
> 5s ago by node2 on   behalf of node1
>   Jun 19 11:01:22 node1 fenced[7625]: fence node3 success
>
>
>
>   [node1:~]# ls -ahl /proc/22505/fd
>   total 0
>   dr-x------ 2 root root  0 Jun 19 11:55 .
>   dr-xr-xr-x 8 root root  0 Jun 19 11:55 ..
>   lrwx------ 1 root root 64 Jun 19 11:56 0 -> /dev/pts/8
>   lrwx------ 1 root root 64 Jun 19 11:56 1 -> /dev/pts/8
>   lrwx------ 1 root root 64 Jun 19 11:55 2 -> /dev/pts/8
>   lrwx------ 1 root root 64 Jun 19 11:56 3 -> socket:[4061683]
>   lrwx------ 1 root root 64 Jun 19 11:56 4 -> socket:[4061684]
>
>   [node1:~]# lsof -p 22505
>   ...
>   stonith_admin 22505 root    3u  unix 0xffff880c14889b80      0t0
> 4061683 socket
>   stonith_admin 22505 root    4u  unix 0xffff880c2a4fbc40      0t0
> 4061684 socket
>
>
> Obviously it's trying to read some data from a unix socket but doesn't get
> anything from the other side. Is there anyone here who can explain to me
> why the fence command always fails on the first attempt?
>
> Thanks

I noticed you're not a mailing list member. Please register if you want
your emails to come through without getting stuck in the moderator queue.

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?