[ClusterLabs] Proper procedure for pacemaker RPM upgrades in active cluster
Ken Gaillot
kgaillot at redhat.com
Mon Jan 15 18:10:31 EST 2018
On Mon, 2018-01-15 at 15:42 -0500, Doug Cahill wrote:
> Hello,
>
> I'm looking for some guidance on pacemaker RPM upgrades in a running
> cluster environment. I'm looking to automate the process of upgrading
> the RPMs when we decide to plan an upgrade cycle for our clusters.
>
> What I found is that during the RPM upgrade process the
> pacemaker.x86_64 RPM will shutdown the pacemaker service. My question
> regarding this is...is it possible to upgrade the RPM component but
> delay the restart part of the pacemaker service to a later time? If
> delaying the restart isn't possible, what is the preferred process for
> people with existing clusters that require package upgrades? Should I
> upgrade the passive side first and then fail over to it and then
> upgrade the other node which is now passive? Does pacemaker support
> running two nodes at different version levels during the upgrade
> process? Would enabling maintenance mode be appropriate/ideal for
> this?
Yes to most of those :)
Detailed information about upgrade techniques:
http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html#_upgrading
Basically, the failover scenario you mentioned is the "rolling upgrade"
technique, and the maintenance mode scenario you mentioned is the
"detach and reattach" technique.
Each has advantages and disadvantages. A rolling upgrade lets you keep
one node on a known working setup as long as possible, while a
detach-and-reattach gives you zero downtime (as long as the upgrade has
no problems ...).
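
Since the goal is automation, here is a rough sketch of how the two
techniques might be laid out as ordered command lists. It only builds
the commands as strings and does not run anything. Several things in it
are assumptions rather than anything stated above: the node names
(node1, node2), reaching each node over ssh, upgrading with yum, and
stopping and starting pacemaker through the CentOS 6 "service" command.
It also leaves out corosync and any checks between steps, so treat it
as a starting point to compare against the documentation linked above,
not as a tested procedure.

```python
def rolling_upgrade_plan(nodes, packages=("pacemaker",)):
    """Build the commands for a rolling upgrade, one node at a time.

    List the passive node first so the active node keeps running the
    known working version as long as possible. The crm commands are
    meant to be run from a node that is still part of the cluster.
    """
    pkgs = " ".join(packages)
    plan = []
    for node in nodes:
        plan += [
            f"crm node standby {node}",             # move resources off this node
            f"ssh {node} service pacemaker stop",   # assumption: ssh + SysV service
            f"ssh {node} yum -y update {pkgs}",     # assumption: yum-based upgrade
            f"ssh {node} service pacemaker start",
            f"crm node online {node}",              # let resources run here again
        ]
    return plan


def detach_reattach_plan(nodes, packages=("pacemaker",)):
    """Build the commands for a detach-and-reattach upgrade.

    Maintenance mode tells the cluster to stop managing resources, so
    the services keep running while pacemaker itself is down.
    """
    pkgs = " ".join(packages)
    plan = ["crm_attribute --name maintenance-mode --update true"]
    plan += [f"ssh {n} service pacemaker stop" for n in nodes]
    plan += [f"ssh {n} yum -y update {pkgs}" for n in nodes]
    plan += [f"ssh {n} service pacemaker start" for n in nodes]
    # Only reattach once the cluster status looks sane on every node.
    plan.append("crm_attribute --name maintenance-mode --delete")
    return plan


if __name__ == "__main__":
    # Hypothetical node names; node2 is assumed to be the passive node.
    for command in rolling_upgrade_plan(["node2", "node1"]):
        print(command)
```

Printing the plan first, and only later feeding it to something that
executes it, makes it easy to review the order of operations before
touching a live cluster.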
>
> I last experienced this situation when I upgraded from 1.1.15 to
> 1.1.17. Now that pacemaker 1.1.18 is available I'm looking to plan
> this process a little better and would like to know what others use as
> a procedure.
>
> Basic software config:
> CentOS 6.x (2.6.32-696.13.2.el6.x86_64)
> pacemaker.x86_64 1.1.17-1.el6
> corosync.x86_64 2.4.2-1.el6
> crmsh.noarch 3.0.1_283-0
> Two-node Cluster resources are configured for active/passive
> operation.
>
> Thanks,
> -Doug
On a side note, if you're building 1.1.18 packages yourself, it's a
good idea to use the latest upstream 1.1 branch, because it fixes an
important regression in 1.1.18.
--
Ken Gaillot <kgaillot at redhat.com>