<html><body><p><font size="2">Tomas, </font><br><br><font size="2">Yes, that was it.  </font><br><br><font size="2">[root@zs95KL corosync]# pcs cluster destroy</font><br><font size="2">Shutting down pacemaker/corosync services...</font><br><font size="2">Redirecting to /bin/systemctl stop  pacemaker.service</font><br><font size="2">Redirecting to /bin/systemctl stop  corosync.service</font><br><font size="2">Killing any remaining services...</font><br><font size="2">Removing all cluster configuration files...</font><br><font size="2">[root@zs95KL corosync]#</font><br><br><br><font size="2">[root@zs93kl corosync]# pcs cluster node add zs95KLpcs1,zs95KLpcs2</font><br><font size="2">zs95kjpcs1: Corosync updated</font><br><font size="2">zs93KLpcs1: Corosync updated</font><br><font size="2">zs95KLpcs1: Succeeded</font><br><font size="2">Synchronizing pcsd certificates on nodes zs95KLpcs1...</font><br><font size="2">zs95KLpcs1: Success</font><br><br><font size="2">Restaring pcsd on the nodes in order to reload the certificates...</font><br><font size="2">zs95KLpcs1: Success</font><br><font size="2">[root@zs93kl corosync]#</font><br><br><font size="2">Thank you very much for this quick fix. </font><br><br><font size="2">- Scott</font><br><font size="2"><br>Scott Greenlese ... 
KVM on System Z - Solutions Test, IBM Poughkeepsie, N.Y.<br>  INTERNET:  swgreenl@us.ibm.com   <br></font><br><br><font size="2" color="#5F5F5F">From:        </font><font size="2">Tomas Jelinek &lt;tojeline@redhat.com&gt;</font><br><font size="2" color="#5F5F5F">To:        </font><font size="2">users@clusterlabs.org</font><br><font size="2" color="#5F5F5F">Date:        </font><font size="2">06/29/2017 12:13 PM</font><br><font size="2" color="#5F5F5F">Subject:        </font><font size="2">Re: [ClusterLabs] Unable to add 'NodeX' to cluster: node is already in a cluster</font><br><hr width="100%" size="2" align="left" noshade style="color:#8091A5; "><br><br><br><tt><font size="2">Hi Scott,<br><br>It looks like some of the cluster configuration files still exist on your <br>node 'zs95KLpcs1'. Try running "pcs cluster destroy" on that node. This <br>will delete all cluster config files on the node. So make sure it is the <br>right node before running the command. 
Then you should be able to add <br>the node to your cluster.<br><br><br>Regards,<br>Tomas<br><br><br><br>On 29.6.2017 at 17:32, Scott Greenlese wrote:<br>> Hi all...<br>><br>> When I try to add a previously removed cluster node back into my<br>> pacemaker cluster, I get the following error:<br>><br>> [root@zs93kl]# pcs cluster node add zs95KLpcs1,zs95KLpcs2<br>> Error: Unable to add 'zs95KLpcs1' to cluster: node is already in a cluster<br>><br>> The node I am adding was recently removed from the cluster, but<br>> apparently the removal<br>> was incomplete.<br>><br>> I am looking for some help to thoroughly remove zs95KLpcs1 from this (or<br>> any other)<br>> cluster that this host may be a part of.<br>><br>><br>> Background:<br>><br>> I had removed node (zs95KLpcs1) from my 3 node, single ring protocol<br>> pacemaker cluster while that node<br>> (which happens to be a KVM on System Z Linux host), was deactivated /<br>> shut down due to<br>> relentless, unsolicited STONITH events. My thought was that there was<br>> some issue with the ring0<br>> interface (on vlan1293) causing the cluster to initiate fence (power<br>> off) actions, just minutes after<br>> joining the cluster. That's why I went ahead and deactivated that node.<br>><br>> The first procedure I used to remove zs95KLpcs1 was flawed, because I<br>> forgot that there's an issue with<br>> attempting to remove an unreachable cluster node on the older pacemaker<br>> code:<br>><br>> [root@zs95kj ]# date;pcs cluster node remove zs95KLpcs1<br>> Tue Jun 27 18:28:23 EDT 2017<br>> Error: pcsd is not running on zs95KLpcs1<br>><br>> I then followed this procedure (courtesy of Tomas and Ken in this user<br>> group):<br>><br>> 1. run 'pcs cluster localnode remove &lt;nodename&gt;' on all remaining nodes<br>> 2. run 'pcs cluster reload corosync' on one node<br>> 3. 
run 'crm_node -R &lt;nodename&gt; --force' on one node<br>><br>> My execution:<br>><br>> I made the mistake of manually removing the target node (zs95KLpcs1)<br>> stanza from corosync.conf file before<br>> executing the above procedure:<br>><br>> [root@zs95kj ]# vi /etc/corosync/corosync.conf<br>><br>> Removed this stanza:<br>><br>> node {<br>> ring0_addr: zs95KLpcs1<br>> nodeid: 3<br>> }<br>><br>> I then followed the recommended steps ...<br>><br>> [root@zs95kj ]# pcs cluster localnode remove zs95KLpcs1<br>> Error: unable to remove zs95KLpcs1 ### I assume this was because I<br>> manually removed the stanza (above)<br>><br>> [root@zs93kl ]# pcs cluster localnode remove zs95KLpcs1<br>> zs95KLpcs1: successfully removed!<br>> [root@zs93kl ]#<br>><br>> [root@zs95kj ]# pcs cluster reload corosync<br>> Corosync reloaded<br>> [root@zs95kj ]#<br>><br>> [root@zs95kj ]# crm_node -R zs95KLpcs1 --force<br>> [root@zs95kj ]#<br>><br>><br>> [root@zs95kj ]# pcs status |less<br>> Cluster name: test_cluster_2<br>> Last updated: Tue Jun 27 18:39:14 2017 Last change: Tue Jun 27 18:38:56<br>> 2017 by root via crm_node on zs95kjpcs1<br>> Stack: corosync<br>> Current DC: zs93KLpcs1 (version 1.1.13-10.el7_2.ibm.1-44eb2dd) -<br>> partition with quorum<br>> 45 nodes and 227 resources configured<br>><br>> Online: [ zs93KLpcs1 zs95kjpcs1 ]<br>><br>><br>> This seemed to work well, at least I'm showing only the two cluster nodes.<br>><br>> Later on, once I was able to activate zs95KLpcs1 (former cluster member)<br>> ... 
I did what I thought<br>> I should do to tell that node that it's no longer a member of the cluster:<br>><br>> [root@zs95kj ]# cat neuter.sh<br>> ssh root@zs95KL "/usr/sbin/pcs cluster localnode remove zs95KLpcs1"<br>> ssh root@zs95KL "/usr/sbin/pcs cluster reload corosync"<br>> ssh root@zs95KL "/usr/sbin/crm_node -R zs95KLpcs1 --force"<br>><br>> [root@zs95kj ]# ./neuter.sh<br>> zs95KLpcs1: successfully removed!<br>> Corosync reloaded<br>> [root@zs95kj ]#<br>><br>><br>> Next, I followed a procedure to convert my current 2-node, single ring<br>> cluster to RRP ... which seems to be running<br>> well, and the corosync config looks like this:<br>><br>> [root@zs93kl ]# for host in zs95kjpcs1 zs93KLpcs1 ; do ssh $host<br>> "hostname;corosync-cfgtool -s"; done<br>> zs95kj<br>> Printing ring status.<br>> Local node ID 2<br>> RING ID 0<br>> id = 10.20.93.12<br>> status = ring 0 active with no faults<br>> RING ID 1<br>> id = 10.20.94.212<br>> status = ring 1 active with no faults<br>><br>> zs93kl<br>> Printing ring status.<br>> Local node ID 5<br>> RING ID 0<br>> id = 10.20.93.13<br>> status = ring 0 active with no faults<br>> RING ID 1<br>> id = 10.20.94.213<br>> status = ring 1 active with no faults<br>> [root@zs93kl ]#<br>><br>><br>> So now, when I try to add zs95KLpcs1 (and the second ring interface,<br>> zs95KLpcs2) to the RRP config,<br>> I get the error:<br>><br>> [root@zs93kl]# pcs cluster node add zs95KLpcs1,zs95KLpcs2<br>> Error: Unable to add 'zs95KLpcs1' to cluster: node is already in a cluster<br>><br>><br>> I re-ran the node removal procedures, and also deleted<br>> /etc/corosync/corosync.conf<br>> on the target node zs95KLpcs1, and nothing I've tried resolves my problem.<br>><br>> I checked to see if zs95KLpcs1 exists in any "corosync.conf" file on the<br>> 3 nodes, and it does not.<br>><br>> [root@zs95kj corosync]# grep zs95KLpcs1 *<br>> [root@zs95kj corosync]#<br>><br>> [root@zs93kl corosync]# grep zs95KLpcs1 *<br>> [root@zs95kj corosync]#<br>><br>> 
[root@zs95KL corosync]# grep zs95KLpcs1 *<br>> [root@zs95kj corosync]#<br>><br>> Thanks in advance ..<br>><br>> Scott Greenlese ... KVM on System Z - Solutions Test, IBM Poughkeepsie, N.Y.<br>> INTERNET: swgreenl@us.ibm.com<br>><br>><br>><br>> _______________________________________________<br>> Users mailing list: Users@clusterlabs.org<br>> </font></tt><tt><font size="2"><a href="http://lists.clusterlabs.org/mailman/listinfo/users">http://lists.clusterlabs.org/mailman/listinfo/users</a></font></tt><tt><font size="2"><br>><br>> Project Home: </font></tt><tt><font size="2"><a href="http://www.clusterlabs.org">http://www.clusterlabs.org</a></font></tt><tt><font size="2"><br>> Getting started: </font></tt><tt><font size="2"><a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a></font></tt><tt><font size="2"><br>> Bugs: </font></tt><tt><font size="2"><a href="http://bugs.clusterlabs.org">http://bugs.clusterlabs.org</a></font></tt><tt><font size="2"><br>><br></font></tt><br><br><BR>
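<font size="2">For anyone who hits the same "node is already in a cluster" error: the thread suggests that leftover cluster configuration files on the node being added were the cause, and that "pcs cluster destroy" on that node cleared them. Below is a minimal sketch of a pre-check that can be run on the node before re-adding it. The function name find_cluster_leftovers is hypothetical, and the two file paths are assumptions based on a typical RHEL 7 layout; the thread does not confirm which files pcs actually inspects, so treat this as a hint, not a definitive test.</font><br><br>
<pre>
```shell
#!/bin/sh
# Sketch: report leftover cluster config files that may make pcs treat a node
# as already belonging to a cluster. Both paths are assumptions (typical
# RHEL 7 layout); adjust them for your distribution.

# Prints one line per leftover file found under the optional root prefix.
# Returns 1 if anything was found, 0 otherwise. The prefix argument exists
# only so the function can be exercised against a scratch directory.
find_cluster_leftovers() {
    root="${1:-}"
    found=0
    for f in /etc/corosync/corosync.conf /var/lib/pacemaker/cib/cib.xml; do
        if [ -e "${root}${f}" ]; then
            echo "leftover: ${f}"
            found=1
        fi
    done
    return "$found"
}

if find_cluster_leftovers; then
    echo "no leftover cluster config found"
else
    echo "consider running 'pcs cluster destroy' on this node before re-adding it"
fi
```
</pre>
<font size="2">As Tomas notes above, "pcs cluster destroy" deletes all cluster config files on the node where it runs, so double-check the hostname before acting on this output.</font><br><br>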

</body></html>