[ClusterLabs] Never join a list without a problem...

Fri Feb 24 10:37:31 EST 2017

On 02/24/2017 08:36 AM, Jeffrey Westgate wrote:
> Greetings all.
> 
> I have inherited a pair of Scientific Linux 6 boxes used as front-end load balancers for our DNS cluster. (Yes, I inherited that, too.)
> 
> It was time to update them, so we pulled snapshots (they are VMware VMs, very small: 1 CPU, 2G RAM, 10G disk), did a "yum update -y", watched everything update, then rebooted.  Pacemaker kept the system from booting.
> Reverted to the snapshot, ran a "yum update -y --exclude=pacemaker\*", and everything is hunky-dory.
> 
> # yum list pacemaker\*
> Installed Packages
> pacemaker.x86_64                                         1.1.10-14.el6                                @sl
> pacemaker-cli.x86_64                                     1.1.10-14.el6                                @sl
> pacemaker-cluster-libs.x86_64                            1.1.10-14.el6                                @sl
> pacemaker-libs.x86_64                                    1.1.10-14.el6                                @sl
> Available Packages
> pacemaker.x86_64                                         1.1.14-8.el6_8.2                             sl-security
> pacemaker-cli.x86_64                                     1.1.14-8.el6_8.2                             sl-security
> pacemaker-cluster-libs.x86_64                            1.1.14-8.el6_8.2                             sl-security
> pacemaker-libs.x86_64                                    1.1.14-8.el6_8.2                             sl-security
> 
> I searched clusterlabs.org looking for issues with updates, and came up empty.
> 
> # cat /etc/redhat-release
> Scientific Linux release 6.5 (Carbon)
> 
> ... is there something post-install/pre-reboot that I need to do?
> 
> 
> --
> Jeff Westgate
> UNIX/Linux System Administrator
> Arkansas Dept. of Information Systems

Welcome! I joined the list with a problem, too, and now I'm technical
lead for the project, so be prepared ... ;-)

I don't know of any issues that would cause problems in that upgrade,
much less prevent a boot. Try disabling pacemaker at boot, doing the
upgrade, and then starting pacemaker by hand. Then pastebin any relevant
messages from /var/log/cluster/corosync.log.
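
A rough sketch of that sequence on an EL6-style system, in case it helps.
Assumptions on my part: the init script is named "pacemaker", and the
run wrapper, the DRY_RUN variable, and the upgrade_pacemaker function
are hypothetical helpers for illustration. By default it only prints
each command, so you can read it through before running anything.

```shell
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 (as root) to actually run the steps.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_pacemaker() {
    # 1. Keep pacemaker from starting at boot (SysV init on EL6).
    run chkconfig pacemaker off
    # 2. Do the full upgrade, pacemaker included.
    run yum update -y
    # 3. Start pacemaker by hand so a failure cannot block the boot.
    #    (If you reboot after the upgrade, do that before this step.)
    run service pacemaker start
    # 4. Show recent log lines to paste when reporting a problem.
    run tail -n 100 /var/log/cluster/corosync.log
}
```

Calling upgrade_pacemaker prints the four commands in order; rerun with
DRY_RUN=0 once they look right for your setup.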

If you're on SL 6, you should be using CMAN as the underlying cluster
layer. If you're using the pacemaker plugin for corosync 1 instead, that
combination is not well tested on that platform.
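
If you're not sure which layer is in use, here is a heuristic sketch
that guesses from the config files present. It assumes the usual
locations (/etc/cluster/cluster.conf for CMAN, and a "name: pacemaker"
service entry in /etc/corosync/corosync.conf or
/etc/corosync/service.d/ for the plugin). The detect_stack function and
its optional root-prefix argument are hypothetical, and the result is a
hint rather than proof.

```shell
# Guess the cluster layer from config files. Heuristic only.
# $1 is an optional directory prefix (hypothetical, handy for checking
# a copy of /etc taken from a snapshot).
detect_stack() {
    root="${1:-}"
    if [ -f "$root/etc/cluster/cluster.conf" ]; then
        # CMAN keeps its configuration in cluster.conf.
        echo "cman"
    elif grep -qs 'name:[[:space:]]*pacemaker' \
            "$root/etc/corosync/corosync.conf" \
            "$root"/etc/corosync/service.d/*; then
        # The corosync 1 plugin is loaded through a service entry.
        echo "corosync-plugin"
    else
        echo "unknown"
    fi
}
```

Run detect_stack with no argument to check the live system.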

Some general tips:

* You can run crm_verify (with either --live-check on a running cluster,
or -x /var/lib/pacemaker/cib/cib.xml on a stopped one) before and after
the upgrade to make sure you don't have any unaddressed configuration
issues.

* You can also run cibadmin --upgrade before and after the upgrade, to
make sure your configuration is using the latest schema.

Skipping these steps shouldn't prevent a boot, but running them may help
uncover any issues.
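
The two tips above could be wrapped up roughly like this. The run
wrapper, DRY_RUN, check_cib and upgrade_schema are hypothetical helper
names; the crm_verify and cibadmin options are the ones mentioned above,
plus -V for more verbose output. It prints the commands by default
rather than running them.

```shell
# Dry-run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Verify the configuration: "live" for a running cluster,
# "stopped" to check the on-disk CIB of a stopped one.
check_cib() {
    case "$1" in
        live)    run crm_verify --live-check -V ;;
        stopped) run crm_verify -x /var/lib/pacemaker/cib/cib.xml -V ;;
        *)       echo "usage: check_cib live|stopped" >&2; return 2 ;;
    esac
}

# Ask the cluster to move the configuration to the latest schema.
# Assumption: run this while the cluster is up.
upgrade_schema() {
    run cibadmin --upgrade
}
```

For example, run "check_cib stopped" before the upgrade with pacemaker
down, then "check_cib live" and "upgrade_schema" once it is running
again.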