[ClusterLabs] Upgrading Ubuntu 14.04 to 16.04 with corosync/pacemaker failed

Wed Feb 19 13:20:37 EST 2020

On February 19, 2020 6:31:19 PM GMT+02:00, Rasca <rasca.gmelch at artcom.de> wrote:
>Hi,
>
>we run a 2-system cluster for Samba with Ubuntu 14.04 and Samba,
>Corosync and Pacemaker from the Ubuntu repos. We wanted to update
>to Ubuntu 16.04 but it failed:
>
>I checked the versions before and because of just minor updates
>of corosync and pacemaker I thought it should be possible to
>update node by node.
>
>* Put srv2 into standby
>* Upgraded srv2 to Ubuntu 16.04 with reboot and so on
>* Added a nodelist to corosync.conf because it looked
>  like corosync on srv2 didn't know the names of the
>  node ids anymore
>
>But it still does not work on srv2. srv1 (the active
>server with Ubuntu 14.04) is fine. It looks like
>it's an upstart/systemd issue, but maybe there is more to it.
>Why does srv1 say UNCLEAN about srv2? On srv2 I see that
>corosync sees both systems. But srv2 says srv1 is
>OFFLINE!?
>
>crm status
>
>
>srv1____________________________________________________________
>Last updated: Wed Feb 19 17:22:03 2020
>Last change: Tue Feb 18 11:05:47 2020 via crm_attribute on srv2
>Stack: corosync
>Current DC: srv1 (1084766053) - partition with quorum
>Version: 1.1.10-42f2063
>2 Nodes configured
>9 Resources configured
>
>
>Node srv2 (1084766054): UNCLEAN (offline)
>Online: [ srv1 ]
>
> Resource Group: samba_daemons
>     samba-nmbd	(upstart:nmbd):	Started srv1
>[..]
>
>
>srv2____________________________________________________________
>Last updated: Wed Feb 19 17:25:14 2020		Last change: Tue Feb 18 18:29:29 2020 by hacluster via crmd on srv2
>Stack: corosync
>Current DC: srv2 (version 1.1.14-70404b0) - partition with quorum
>2 nodes and 9 resources configured
>
>Node srv2: standby
>OFFLINE: [ srv1 ]
>
>Full list of resources:
>
> Resource Group: samba_daemons
>     samba-nmbd	(upstart:nmbd):	Stopped
>[..]
>
>Failed Actions:
>* samba-nmbd_monitor_0 on srv2 'not installed' (5): call=5, status=Not installed, exitreason='none',
>    last-rc-change='Wed Feb 19 14:13:20 2020', queued=0ms, exec=1ms
>[..]
>
>
>Any suggestions or ideas? Is there a nice HowTo for this upgrade situation?
>
>Regards,
> Rasca
>

Are you sure that there is no cluster protocol mismatch?

A major-version OS upgrade (even if supported by the vendor) must be done offline (with proper testing in advance).

What happens when you upgrade the other node, or when you roll back the upgrade?
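
As a first check on the mismatch question, the Pacemaker versions that each node reports in its own "crm status" output can be compared. The following is only a minimal sketch: the function names and the two sample strings are hypothetical, the sample lines are copied from the status output quoted above, and a difference in version numbers is merely a hint to investigate further, not proof of a protocol incompatibility.

```python
import re

# Matches both "Version: 1.1.10-42f2063" (older output format)
# and "(version 1.1.14-70404b0)" (newer output format).
VERSION_RE = re.compile(r"[Vv]ersion:?\s+(\d+(?:\.\d+)+)")


def pacemaker_version(status_text):
    """Return the reported version as a tuple of ints, or None if absent."""
    match = VERSION_RE.search(status_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def versions_differ(status_a, status_b):
    """True only when both outputs report a version and the two differ."""
    a = pacemaker_version(status_a)
    b = pacemaker_version(status_b)
    return a is not None and b is not None and a != b


# Sample lines taken from the status output quoted in this thread.
SRV1_STATUS = (
    "Stack: corosync\n"
    "Current DC: srv1 (1084766053) - partition with quorum\n"
    "Version: 1.1.10-42f2063\n"
)
SRV2_STATUS = (
    "Stack: corosync\n"
    "Current DC: srv2 (version 1.1.14-70404b0) - partition with quorum\n"
)

if __name__ == "__main__":
    print("srv1:", pacemaker_version(SRV1_STATUS))
    print("srv2:", pacemaker_version(SRV2_STATUS))
    print("versions differ:", versions_differ(SRV1_STATUS, SRV2_STATUS))
```

In practice the two strings would be the saved output of "crm status" from each node rather than hard-coded samples.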

Best Regards,
Strahil Nikolov