<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">I’ve asked this question on Server Fault and I’ll re-ask the whole thing here for posterity's sake:<div class=""><br class=""></div><div class=""><a href="https://serverfault.com/questions/995981/pacemaker-wont-start-because-duplicate-node-but-cant-remove-dupe-node-because" class="">https://serverfault.com/questions/995981/pacemaker-wont-start-because-duplicate-node-but-cant-remove-dupe-node-because</a></div><div class=""><br class=""></div><div class=""><div class=""><font size="3" class="">OK! Really new to pacemaker/corosync, like 1 day new.</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">Software: Ubuntu 18.04 LTS and the versions associated with that distro.</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">pacemakerd: 1.1.18</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">corosync: 2.4.3</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">I accidentally removed all the nodes from my entire test cluster (3 nodes).</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">When I tried to bring everything back up using the `pcsd` GUI, that failed because the nodes were "wiped out". Cool.</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">So. I had a copy of the last `corosync.conf` from my "primary" node. I copied it to the other two nodes. I fixed the `bindnetaddr` in the respective confs. 
I ran `pcs cluster start` on my "primary" node.</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">One of the nodes failed to come up. I took a look at the status of `pacemaker` on that node and found the following error:</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class=""> Dec 18 06:33:56 region-ctrl-2 crmd[1049]: crit: Nodes 1084777441 and 2 share the same name 'region-ctrl-2': shutting down</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">I tried running `crm_node -R --force 1084777441` on the machine where `pacemaker` won't start, but of course `pacemaker` isn't running, so I get a `crmd: connection refused (111)` error. So I ran the same command on one of the healthy nodes. It shows no errors, but the node never goes away and `pacemaker` on the affected machine continues to show the same error.</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">So, I decided to tear down the entire cluster and start again. I purged all the packages from the machine. I reinstalled everything fresh. I copied the `corosync.conf` to the machine and fixed it. I recreated the cluster. I get the exact same bloody error.</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">So this node named `1084777441` is not a machine I created. This is one the cluster created for me. Earlier in the day I realized that I was using IP addresses in `corosync.conf` instead of names. 
I fixed the `/etc/hosts` of the machines, removed the IP addresses from the corosync config, and that's why I inadvertently deleted my whole cluster in the first place (I removed the nodes that were IP addresses).</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">The following is my corosync.conf:</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class=""> totem {</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> version: 2</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> cluster_name: maas-cluster</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> token: 3000</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> token_retransmits_before_loss_const: 10</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> clear_node_high_bit: yes</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> crypto_cipher: none</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> crypto_hash: none</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> interface {</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> ringnumber: 0</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> bindnetaddr: 192.168.99.225</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> mcastport: 
5405</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> ttl: 1</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> }</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>}</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>logging {</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> fileline: off</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> to_stderr: no</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> to_logfile: no</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> to_syslog: yes</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> syslog_facility: daemon</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> debug: off</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> timestamp: on</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> logger_subsys {</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> subsys: QUORUM</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> debug: off</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> }</font></div><div 
class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>}</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>quorum {</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> provider: corosync_votequorum</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> expected_votes: 3</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> two_node: 1</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>}</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>nodelist {</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> node {</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> ring0_addr: postgres-sb</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> nodeid: 3</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> }</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> node {</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> ring0_addr: region-ctrl-2</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> nodeid: 2</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" 
style="white-space:pre"> </span> }</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> node {</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> ring0_addr: region-ctrl-1</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> nodeid: 1</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span> }</font></div><div class=""><font size="3" class=""><span class="Apple-tab-span" style="white-space:pre"> </span>}</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">The only thing that differs in this conf between the nodes is the `bindnetaddr`.</font></div><div class=""><font size="3" class=""><br class=""></font></div><div class=""><font size="3" class="">There seems to be a chicken/egg issue here, unless there's some way I'm not aware of to remove a node from a flat-file db or sqlite db somewhere, or some other, more authoritative way to remove a node from a cluster.</font></div></div></body></html>