[ClusterLabs] pacemaker and cluster hostname reconfiguration

Ken Gaillot kgaillot at redhat.com
Thu Oct 1 10:41:31 EDT 2020


On Thu, 2020-10-01 at 10:40 +0200, Riccardo Manfrin wrote:
> Ciao,
> 
> I'm among the people who have to deal with the infamous two-node
> problem (http://www.beekhof.net/blog/2018/two-node-problems).
> I am not sure whether to open a bug for this, so I'm reporting it on
> the list first, in the hope of getting fast feedback.
>
> Problem statement
>
> I have a cluster made of two nodes with a DRBD shared partition which
> some resources (systemd services) have to stick to.
>
> Software versions
> corosync -v
> Corosync Cluster Engine, version '2.4.5'
> Copyright (c) 2006-2009 Red Hat, Inc.
> pacemakerd --version
> Pacemaker 1.1.21-4.el7
> drbdadm --version
> DRBDADM_BUILDTAG=GIT-hash:\ fb98589a8e76783d2c56155c645dbaf02ac7ece7\ build\ by\ mockbuild@\,\ 2020-04-05\ 03:21:05
> DRBDADM_API_VERSION=2
> DRBD_KERNEL_VERSION_CODE=0x090010
> DRBD_KERNEL_VERSION=9.0.16
> DRBDADM_VERSION_CODE=0x090c02
> DRBDADM_VERSION=9.12.2
> corosync.conf nodes:
> nodelist {
>     node {
>         ring0_addr: 10.1.3.1
>         nodeid: 1
>     }
>     node {
>         ring0_addr: 10.1.3.2
>         nodeid: 2
>     }
> }
> quorum {
>     provider: corosync_votequorum
>     two_node: 1
> }
> drbd nodes config:
> resource myresource {
> 
>   volume 0 {
>     device    /dev/drbd0;
>     disk      /dev/mapper/vg0-res--etc;
>     meta-disk internal;
>   }
> 
>   on 123z555666y0 {
>     node-id 0;
>     address 10.1.3.1:7789;
>   }
> 
>   on 123z555666y1 {
>     node-id 1;
>     address 10.1.3.2:7789;
>   }
> 
>   connection {
>     host 123z555666y0;
>     host 123z555666y1;
>   }
> 
>   handlers {
>     before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh";
>     after-resync-target "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
>   }
> 
> }
> I need to reconfigure the hostname of both nodes of the cluster.
> I've gathered some literature on the topic:
> https://pacemaker.oss.clusterlabs.narkive.com/csHZkR5R/change-hostname
> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-node-name.html
> https://www.suse.com/support/kb/doc/?id=000018878 <- DIDN'T WORK
> https://bugs.clusterlabs.org/show_bug.cgi?id=5265 <- DIDN'T WORK
> but I have not yet found a way to address this (other than rebooting
> both nodes simultaneously).
> The procedure:
> 1. Update the hostname on both Master and Slave nodes:
>    - update /etc/hostname
>    - update /etc/hosts
>    - update the system with hostname -F /etc/hostname
> 2. Reconfigure drbd on Master and Slave nodes:
>    - modify drbd.01.conf (attached) to reflect the new hostname
>    - invoke drbdadm adjust all
> 3. Update the pacemaker config on the Master node only:
> crm configure property maintenance-mode=true
> crm configure delete --force 1
> crm configure delete --force 2
> crm configure xml ' <node id="1" uname="newhostname0">
>         <instance_attributes id="node-1">
>           <nvpair id="node-1-standby" name="standby" value="off"/>
>         </instance_attributes>
>       </node>'
> crm configure xml ' <node id="2" uname="newhostname1">
>         <instance_attributes id="node-2">
>           <nvpair id="node-2-standby" name="standby" value="off"/>
>         </instance_attributes>
>       </node>'
> crm resource reprobe
> crm configure refresh
> crm configure property maintenance-mode=false
> Let's say, for example, that I migrate the hostnames like this:
> hostname10 -> hostname20
> hostname11 -> hostname21
> After the above procedure is concluded, the cluster is correctly
> reconfigured: when I check with crm_mon, crm status, or crm
> configure show xml, or even by inspecting the cib.xml, I find the
> proper new hostnames picked up by pacemaker/corosync (hostname20 and
> hostname21).
> The documentation reports that the pacemaker node name is taken from:
> 1. corosync.conf nodelist->ring0_addr, if it is not an IP address:
>    NOT MY CASE => skip
> 2. corosync.conf nodelist->name, if available: NOT MY CASE => skip
> 3. uname -n [SHOULD BE IN HERE]
> Apparently case number 3 does not apply:
> [root@hostname20 ~]# crm_node -n
> hostname10
> [root@hostname20 ~]# uname -n
> hostname20
> This becomes evident as soon as I reboot or power off one of the two
> nodes: crm_mon, which after the reconfiguration was correctly showing
> Online: [ hostname21 hostname20 ]
> silently "rolls back" the configuration and starts showing the old
> one:
> Online: [ hostname10 ]
> OFFLINE: [ hostname11 ]
> Do you have any idea where on earth pacemaker is recovering the old
> hostnames from?
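
For reference, the documented lookup order quoted above can be sketched
as a small shell function. This only illustrates the order as the
documentation describes it; it is not taken from the Pacemaker source,
and resolve_node_name and its crude "looks like an IP address" test are
assumptions made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: mirrors the documented node-name precedence.
# Usage: resolve_node_name RING0_ADDR NODELIST_NAME UNAME_N
resolve_node_name() {
    ring0_addr=$1
    nodelist_name=$2
    uname_n=$3

    case $ring0_addr in
        ''|*:*)
            # Empty, or contains ':' (assumed to be IPv6): not usable as a name.
            ;;
        *[!0-9.]*)
            # Contains something other than digits and dots:
            # assumed to be a host name rather than an IPv4 address.
            echo "$ring0_addr"
            return
            ;;
    esac

    # Next choice: the nodelist "name" value, if one is set.
    if [ -n "$nodelist_name" ]; then
        echo "$nodelist_name"
        return
    fi

    # Last resort: whatever uname -n reports.
    echo "$uname_n"
}
```

With the configuration in this thread (ring0_addr is an IP address and
no name is set), resolve_node_name 10.1.3.1 "" "$(uname -n)" falls
through to the uname -n value.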

Does "uname -n" also revert?

It looks like you're using RHEL 7 or a derivative -- if so, use
hostnamectl to change the host name. That will make sure it's updated
in the right places.
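
A minimal sketch of that change plus a follow-up check, assuming
hostname20 is the new name (as in the example above). check_node_name
is a hypothetical helper written for this message, not part of any
package; the hostnamectl call is left as a comment because it must be
run as root on each node.

```shell
#!/bin/sh
# Set the static host name (run as root on the node being renamed):
#   hostnamectl set-hostname hostname20

# Hypothetical helper: compare the system's name with the cluster's view.
# Usage: check_node_name SYSTEM_NAME CLUSTER_NAME
# Prints "match" and returns 0 when they agree; otherwise prints a
# "mismatch" line and returns 1.
check_node_name() {
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch: system=$1 cluster=$2"
        return 1
    fi
}

# Typical use on a cluster node after the rename:
#   check_node_name "$(uname -n)" "$(crm_node -n)"
```

Running the check again after a reboot of one node would show whether
the system name itself reverts or only the cluster's view of it.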

> 
> I've even checked the code and see that there are cmaps involved, so
> I suspect some caching issue is at play.
> It looks like it retains the old hostnames in memory and restores
> them when something "fails".
> Besides, don't blame me for this use case (reconfiguring hostnames in
> a two-node cluster), as I didn't make it up. I just carry the pain.
> R
> 
> 
> 
>  Riccardo Manfrin
-- 
Ken Gaillot <kgaillot at redhat.com>


