<html><body><p><font size="2">Tomas, </font><br><br><font size="2">Yes, I have an IBM internal build we're using for KVM on System Z. I tried the --force option and, while it didn't complain, </font><br><font size="2">it didn't work either (as expected, as per bug </font><tt><font size="2"><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1225423">https://bugzilla.redhat.com/show_bug.cgi?id=1225423</a></font></tt><font size="2">), so </font><br><font size="2">it does at least appear to be a recognized option in this build. </font><br><br><font size="2">[root@zs95kj VD]# date; pcs cluster node remove zs95KLpcs1 --force</font><br><font size="2">Wed Apr 19 10:14:10 EDT 2017</font><br><font size="2">Error: pcsd is not running on zs95KLpcs1</font><br><font size="2">[root@zs95kj VD]#</font><br><br><font size="2">Hopefully we'll roll in </font><tt><font size="2">pcs-0.9.143-15.el7_2.1</font></tt><font size="2"> with our next release of KVM. </font><br><br><font size="2">In the meantime, thanks very much for all the valuable feedback. I'm good to go for now with the workaround. </font><br><br><font size="2">Scott</font><br><font size="2"><br>Scott Greenlese ... 
KVM on System Z - Solutions Test, IBM Poughkeepsie, N.Y.<br> INTERNET: swgreenl@us.ibm.com <br></font><br><br><font size="2" color="#5F5F5F">From: </font><font size="2">Tomas Jelinek <tojeline@redhat.com></font><br><font size="2" color="#5F5F5F">To: </font><font size="2">users@clusterlabs.org</font><br><font size="2" color="#5F5F5F">Date: </font><font size="2">04/19/2017 03:25 AM</font><br><font size="2" color="#5F5F5F">Subject: </font><font size="2">Re: [ClusterLabs] How to force remove a cluster node?</font><br><hr width="100%" size="2" align="left" noshade style="color:#8091A5; "><br><br><br><tt><font size="2">On 18.4.2017 at 19:52, Scott Greenlese wrote:<br>> My thanks to both Ken Gaillot and Tomas Jelinek for the workaround. The<br>> procedure(s) worked like a champ.<br>><br>> I just have a few side comments / observations ...<br>><br>> First - Tomas, in the bugzilla you show this error message on your<br>> cluster remove command, directing you to use the --force option:<br>><br>> [root@rh72-node1:~]# pcs cluster node remove rh72-node3<br>> Error: pcsd is not running on rh72-node3, use --force to override<br>><br>> When I issue the cluster remove, I do not get any reference to the<br>> --force option in the error message:<br>><br>> [root@zs93kl ]# pcs cluster node remove zs95KLpcs1<br>> Error: pcsd is not running on zs95KLpcs1<br>> [root@zs93kl ]#<br>><br>> The man page doesn't mention --force at my level.<br><br>The man page doesn't mention --force for most commands in which --force <br>can be used. 
One shouldn't really draw any conclusions from that.<br><br>> Is this a feature added after pcs-0.9.143-15.el7_2.ibm.2.s390x ?<br><br>The feature has been backported to pcs-0.9.143-15.el7_2.1. I cannot <br>really check whether it is present in pcs-0.9.143-15.el7_2.ibm.2.s390x <br>because I don't have access to that particular build. Based on the name <br>I would say it was built internally at IBM. However, if the error <br>message doesn't suggest using --force, then the feature is most likely <br>not present in that build.<br><br>><br>> Also, in your workaround procedure, you have me do: 'pcs cluster<br>> *localnode* remove <name_of_node_to_be_removed> '.<br>> However, I'm wondering why the 'localnode' option is not in the pcs man page<br>> for the pcs cluster command?<br>> The command / option worked great, just curious why it's not documented ...<br><br>It's an internal pcs command which is not meant to be run by users. It <br>exists mostly for the sake of the current pcs/pcsd architecture ("pcs <br>cluster node" calls the pcsd instance on all nodes over the network and each <br>pcsd instance runs "pcs cluster localnode" to do the actual job) and is <br>likely to be removed in the future. It is useful for the workaround because <br>the check that all nodes are running is done in the "pcs cluster <br>node" command.<br><br>Regards,<br>Tomas<br><br>><br>> [root@zs93kl ]# pcs cluster localnode remove zs93kjpcs1<br>> zs93kjpcs1: successfully removed!<br>><br>> My man page level:<br>><br>> [root@zs93kl VD]# rpm -q --whatprovides /usr/share/man/man8/pcs.8.gz<br>> pcs-0.9.143-15.el7_2.ibm.2.s390x<br>> [root@zs93kl VD]#<br>><br>> Thanks again,<br>><br>> Scott G.<br>><br>> Scott Greenlese ... 
KVM on System Z - Solutions Test, IBM Poughkeepsie, N.Y.<br>> INTERNET: swgreenl@us.ibm.com<br>><br>><br>> From: Tomas Jelinek <tojeline@redhat.com><br>> To: users@clusterlabs.org<br>> Date: 04/18/2017 09:04 AM<br>> Subject: Re: [ClusterLabs] How to force remove a cluster node?<br>><br>> ------------------------------------------------------------------------<br>><br>><br>><br>> On 17.4.2017 at 17:28, Ken Gaillot wrote:<br>>> On 04/13/2017 01:11 PM, Scott Greenlese wrote:<br>>>> Hi,<br>>>><br>>>> I need to remove some nodes from my existing pacemaker cluster which are<br>>>> currently unbootable / unreachable.<br>>>><br>>>> Referenced<br>>>><br>> </font></tt><tt><font size="2"><a href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Reference/s1-clusternodemanage-HAAR.html#s2-noderemove-HAAR">https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Reference/s1-clusternodemanage-HAAR.html#s2-noderemove-HAAR</a></font></tt><tt><font size="2"><br>>>><br>>>> *4.4.4. Removing Cluster Nodes*<br>>>> The following command shuts down the specified node and removes it from<br>>>> the cluster configuration file, corosync.conf, on all of the other nodes<br>>>> in the cluster. 
For information on removing all information about the<br>>>> cluster from the cluster nodes entirely, thereby destroying the cluster<br>>>> permanently, refer to _Section 4.6, “Removing the Cluster<br>>>> Configuration”_<br>>>><br>> <</font></tt><tt><font size="2"><a href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Reference/s1-clusterremove-HAAR.html#s2-noderemove-HAAR">https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Reference/s1-clusterremove-HAAR.html#s2-noderemove-HAAR</a></font></tt><tt><font size="2">>.<br>>>><br>>>> pcs cluster node remove /node/<br>>>><br>>>> I ran the command with the cluster active on 3 of the 5 available<br>>>> cluster nodes (with quorum). The command fails with:<br>>>><br>>>> [root@zs90KP VD]# date;*pcs cluster node remove zs93kjpcs1*<br>>>> Thu Apr 13 13:40:59 EDT 2017<br>>>> *Error: pcsd is not running on zs93kjpcs1*<br>>>><br>>>><br>>>> The node was not removed:<br>>>><br>>>> [root@zs90KP VD]# pcs status |less<br>>>> Cluster name: test_cluster_2<br>>>> Last updated: Thu Apr 13 14:08:15 2017 Last change: Wed Apr 12 16:40:26<br>>>> 2017 by root via cibadmin on zs93KLpcs1<br>>>> Stack: corosync<br>>>> Current DC: zs90kppcs1 (version 1.1.13-10.el7_2.ibm.1-44eb2dd) -<br>>>> partition with quorum<br>>>> 45 nodes and 180 resources configured<br>>>><br>>>> Node zs95KLpcs1: UNCLEAN (offline)<br>>>> Online: [ zs90kppcs1 zs93KLpcs1 zs95kjpcs1 ]<br>>>> *OFFLINE: [ zs93kjpcs1 ]*<br>>>><br>>>><br>>>> Is there a way to force remove a node that's no longer bootable? If not,<br>>>> what's the procedure for removing a rogue cluster node?<br>>>><br>>>> Thank you...<br>>>><br>>>> Scott Greenlese ... KVM on System Z - Solutions Test, IBM<br>> Poughkeepsie, N.Y.<br>>>> INTERNET: swgreenl@us.ibm.com<br>>><br>>> Yes, the pcs command is just a convenient shorthand for a series of<br>>> commands. 
You want to ensure pacemaker and corosync are stopped on the<br>>> node to be removed (in the general case, obviously already done in this<br>>> case), remove the node from corosync.conf and restart corosync on all<br>>> other nodes, then run "crm_node -R <nodename>" on any one active node.<br>><br>> Hi Scott,<br>><br>> It is possible to remove an offline node from a cluster with upstream<br>> pcs 0.9.154 or RHEL pcs-0.9.152-5 (available in RHEL7.3) or newer.<br>><br>> If you have an older version, here's a workaround:<br>> 1. run 'pcs cluster localnode remove <nodename>' on all remaining nodes<br>> 2. run 'pcs cluster reload corosync' on one node<br>> 3. run 'crm_node -R <nodename> --force' on one node<br>> It's basically the same procedure Ken described.<br>><br>> See </font></tt><tt><font size="2"><a href="https://bugzilla.redhat.com/show_bug.cgi?id=1225423">https://bugzilla.redhat.com/show_bug.cgi?id=1225423</a></font></tt><tt><font size="2"> for more details.<br>><br>> Regards,<br>> Tomas<br>><br><br>_______________________________________________<br>Users mailing list: Users@clusterlabs.org<br></font></tt><tt><font size="2"><a href="http://lists.clusterlabs.org/mailman/listinfo/users">http://lists.clusterlabs.org/mailman/listinfo/users</a></font></tt><tt><font size="2"><br><br>Project Home: </font></tt><tt><font size="2"><a href="http://www.clusterlabs.org">http://www.clusterlabs.org</a></font></tt><tt><font size="2"><br>Getting started: </font></tt><tt><font size="2"><a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a></font></tt><tt><font size="2"><br>Bugs: </font></tt><tt><font size="2"><a href="http://bugs.clusterlabs.org">http://bugs.clusterlabs.org</a></font></tt><tt><font size="2"><br><br></font></tt><br><br><BR>
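<p><font size="2">For anyone finding this thread later: the three-step workaround discussed above can be sketched as a short command sequence. This is only a sketch based on the steps Tomas posted, for pcs versions older than 0.9.152 (upstream) / 0.9.143-15.el7_2.1 (RHEL); the node name is a placeholder taken from the thread, and it must be run against a live cluster with root privileges.</font></p>

```shell
# Sketch of the workaround for removing an unreachable node with an
# older pcs (pre-0.9.152 upstream / pre-0.9.143-15.el7_2.1 on RHEL).
# DEADNODE is a placeholder for the node to be removed.
DEADNODE=zs93kjpcs1

# 1. On EVERY remaining cluster node, remove the dead node from the
#    local corosync.conf. 'localnode' is an internal pcs command; it
#    skips the pcsd reachability check that makes the normal
#    'pcs cluster node remove' fail when the node is down.
pcs cluster localnode remove "$DEADNODE"

# 2. On any one node, make corosync pick up the updated configuration
#    cluster-wide:
pcs cluster reload corosync

# 3. On any one node, purge the node from pacemaker's membership cache
#    and the CIB:
crm_node -R "$DEADNODE" --force
```

<p><font size="2">Step 1 is the only one that must be repeated per node; steps 2 and 3 act cluster-wide from a single node, mirroring the procedure Ken described.</font></p>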
</body></html>