[ClusterLabs] Antw: [EXT] What/how to clean up when bootstrapping new cluster (or: I have a phantom node)

Wed May 25 01:42:57 EDT 2022

>>> Andreas Hasenack <andreas at canonical.com> wrote on 24.05.2022 at 22:05 in
message
<CANYNYEEXVakCyeojRZ2N6uhVt0GuGZQX-4KH4=HACx=C1VxEOQ at mail.gmail.com>:
> Hi,
> 
> I'm trying to find out the correct steps to start a corosync/pacemaker
> cluster right after installing its packages in Debian or Ubuntu.
> 
> I'm not using crmsh or pcs on purpose, I really wanted to get this
> basic initial step working without those.
> 
> Right after install, the default config has this nodelist:
> nodelist {
>         # Change/uncomment/add node sections to match cluster configuration
> 
>         node {
>                 # Hostname of the node
>                 name: node1
>                 # Cluster membership node identifier
>                 nodeid: 1
>                 # Address of first link
>                 ring0_addr: 127.0.0.1
>                 # When knet transport is used it's possible to define up to 8 links
>                 #ring1_addr: 192.168.1.1
>         }
>         # ...
> }
> 
> 
> (full default pristine config: https://pastebin.ubuntu.com/p/htBkCvBWqr/)
> 
> This results in a crm_mon output of:
> 
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: node1 (version 2.0.3-4b1f869f0f) - partition with quorum
>   * Last updated: Tue May 24 19:57:05 2022
>   * Last change:  Tue May 24 19:56:59 2022 by hacluster via crmd on node1
>   * 1 node configured
>   * 0 resource instances configured
> 
> Node List:
>   * Online: [ node1 ]
> 
> Active Resources:
>   * No active resources
> 
> I also tried with corosync 3.1.6 and pacemaker 2.1.2, btw.
> 
> I then proceed to making changes to corosync.conf. I give it a real
> hostname, ring IP and node id:
> nodelist {
>         # Change/uncomment/add node sections to match cluster configuration
> 
>         node {
>                 # Hostname of the node
>                 name: f4
>                 # Cluster membership node identifier
>                 nodeid: 104
>                 # Address of first link
>                 ring0_addr: 10.226.63.102
>                 # When knet transport is used it's possible to define up to 8 links
>                 #ring1_addr: 192.168.1.1
>         }
>         # ...
> }
> 
> 
> Restart the services:
> systemctl restart pacemaker corosync
> 
> But now I have this phantom "node1" in the cluster, and the cluster
> thinks it has two nodes:

I guess that Pacemaker creates a CIB (which includes the node list) when it
first starts. Changing corosync.conf afterwards does not affect the CIB, so the
stale "node1" entry stays there and you'll have to clean it up (as Ken noted).
The real question is: why did you start the cluster with a placeholder node at all?
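
For illustration, here is a minimal sketch of what such a cleanup could look
like. It assumes that crm_node's --remove option (together with --force) is
the appropriate tool here, that it is run as root on the active node, and that
the stale node is offline. I am not certain this is exactly what Ken
suggested. The helper names run and remove_stale_node, and the DRY_RUN
variable, are made up for this example.

```shell
#!/bin/sh
# Sketch only: drop a stale node entry (e.g. "node1") from Pacemaker's
# configuration (CIB) and caches.
# Assumptions: run as root on an active cluster node; the stale node is offline.
# With DRY_RUN=1 the command is only printed, not executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remove_stale_node() {
    stale="$1"
    if [ -z "$stale" ]; then
        echo "usage: remove_stale_node <node-name>" >&2
        return 1
    fi
    # crm_node --remove asks Pacemaker to forget the named (stopped) node
    run crm_node --force --remove "$stale"
}
```

Calling "DRY_RUN=1 remove_stale_node node1" only prints the crm_node command,
so you can check it before running it for real. Afterwards, crm_mon should
show whether node1 is gone from the node list.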

Regards,
Ulrich

> 
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: f4 (version 2.0.3-4b1f869f0f) - partition with quorum
>   * Last updated: Tue May 24 19:59:56 2022
>   * Last change:  Tue May 24 19:59:22 2022 by hacluster via crmd on f4
>   * 2 nodes configured
>   * 0 resource instances configured
> 
> Node List:
>   * Node node1: UNCLEAN (offline)
>   * Online: [ f4 ]
> 
> Active Resources:
>   * No active resources
> 
> 
> What is the cleanup step (or steps) that I'm missing? Or are there so
> many details that it's best to leave this to pcs/crmsh?