[ClusterLabs] Rolling upgrade failure: v2.1.6 node cannot connect after upgrading peer to v3.0.1
Tomas Jelinek
tojeline at redhat.com
Thu Feb 12 09:03:50 UTC 2026
Hi S Sathish S,
If I remember correctly, pacemaker nodes agree on keeping the feature set
at the highest version that all of them support.
At the start, you run v2.1.6, and all nodes agree on that. Then you upgrade
one node to v3.0.1, so both v3.0.1 and v2.1.6 are running in the cluster.
The nodes agree on whatever feature set v2.1.6 defines, because that is
the highest version they all support, and it all works fine. However, once
you turn the v2.1.6 node off, there are only v3.0.1 nodes in the cluster.
They then agree on whatever feature set v3.0.1 defines and bump the feature
set to that version. If you start the v2.1.6 node again, it cannot connect to
the cluster, because it doesn't support the new feature set.
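The negotiation described above can be sketched in a few lines of Python. This is only a toy model of the explanation in this message, not Pacemaker's actual code: the function names are made up for illustration, and the feature-set numbers 3.1.0 and 3.2.0 are hypothetical placeholders chosen only so that the older release sorts below the newer one.

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn a dotted version string such as "3.1.0" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def negotiated_feature_set(active_nodes: Dict[str, Version]) -> Version:
    """Highest feature set that every active node supports.

    Each value is the newest feature set a node understands, so the
    common ground is the minimum of those values.
    """
    if not active_nodes:
        raise ValueError("no active nodes")
    return min(active_nodes.values())


def can_join(node_max: Version, cluster_feature_set: Version) -> bool:
    """A node can join only if it supports the feature set already in use."""
    return node_max >= cluster_feature_set


# Hypothetical feature-set numbers (assumptions, not the real values).
OLD = parse_version("3.1.0")  # stands in for the v2.1.6 node
NEW = parse_version("3.2.0")  # stands in for the v3.0.1 node

# Mixed cluster: both nodes are up, so the older feature set is used.
mixed = negotiated_feature_set({"node1": NEW, "node2": OLD})
# The old node is turned off: only the upgraded node remains.
only_new = negotiated_feature_set({"node1": NEW})

print(mixed == OLD)             # True
print(can_join(OLD, mixed))     # True
print(can_join(OLD, only_new))  # False
```

The last line corresponds to the reported failure: once the remaining node has moved to the newer feature set, the old node no longer qualifies to join.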
In summary, the situation you describe looks like expected behavior to
me. It also matches the behavior you expect: both nodes remain operational
and communicate with each other. Up until the point you reboot the
non-upgraded node, that is.
To debug this, check the pacemaker log on the node that doesn't connect and
see if there is a message about an incompatible feature set or version.
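One possible way to do that search is a small Python filter. This is a rough sketch under assumptions: the exact wording of Pacemaker's messages is not given in this thread, so the pattern below is deliberately broad (and the "version" term may match unrelated lines), and the helper names are hypothetical.

```python
import re
from typing import Iterable, List

# Assumed search terms: broad on purpose, since the real message text
# is not known here. Adjust once you see what the log actually says.
PATTERN = re.compile(r"feature[ _-]?set|incompatib|version", re.IGNORECASE)


def matching_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention a feature set, incompatibility, or version."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan_log(path: str) -> List[str]:
    """Read a log file and return only the lines of interest."""
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, errors="replace") as handle:
        return matching_lines(handle)
```

On the node that fails to connect, this could be called as scan_log("/var/log/pacemaker/pacemaker.log"). That path is an assumption about the default log location and may differ with your logging configuration.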
The Pacemaker team can correct me if I'm wrong.
Regards,
Tomas
On 12 Feb 2026 at 5:50, S Sathish S via Users wrote:
Hi Team,
Is there any update on the issue below?
Thanks and Regards,
S Sathish S
*From:* S Sathish S
*Sent:* 09 February 2026 16:19
*To:* Cluster Labs - All topics related to open-source clustering welcomed
<users at clusterlabs.org>
*Cc:* Devakumar K <devakumar.k at ericsson.com>
*Subject:* Rolling upgrade failure: v2.1.6 node cannot connect after
upgrading peer to v3.0.1
Hi Team,
Issue: We are performing a rolling upgrade from Pacemaker v2.1.6 to v3.0.1
in a two-node cluster. According to the Pacemaker 3.0 Changes documentation
<https://projects.clusterlabs.org/w/projects/pacemaker/pacemaker_3.0_changes/>,
rolling upgrades from Pacemaker 2.0.0 and later should be supported with
minimal changes.
However, after upgrading the first node (Node 1) to Pacemaker 3.0.1, the
second node (Node 2), still running Pacemaker 2.1.6, is unable to connect to
the cluster once Node 2 is rebooted. Pacemaker fails to start on Node 2 with
the error:
Error: error running crm_mon, is pacemaker running?
crm_mon: Connection to cluster failed: Connection refused
Expected Behaviour:
During a rolling upgrade, both nodes should remain operational and
communicate with each other until the second node is upgraded. The
non-upgraded node (2.1.6) should continue to function while the upgraded
node (3.0.1) acts as DC.
Questions:
1. Is this a known limitation with the rolling upgrade path from 2.1.6 to
3.0.1?
2. Are there specific compatibility issues between Pacemaker 3.0.1 and
2.1.6 that prevent cluster communication?
3. Are there any additional logs or diagnostic information needed to
troubleshoot this issue?
Environment:
- Cluster: 2-node RHEL 8 cluster with Corosync
- Node 1 (node1): Upgraded to Pacemaker 3.0.1-1.el8
- Node 2 (node2): Running Pacemaker 2.1.6-1.el8 (not yet upgraded)
Node 1 (node1) - Upgraded:
Pacemaker: 3.0.1-1.el8
Corosync: 3.1.10-1.el8
PCS: 0.12.2-1.el8
libknet1: 1.33-1.el8
resource-agents: 4.17.0-1.el8
Node 2 (node2) - Not Upgraded:
Pacemaker: 2.1.6-1.el8
Corosync: 3.1.7-1.el8
PCS: 0.10.19-2.el8
libknet1: 1.25-1.el8
resource-agents: 4.12.0-1.el8
Node1:
[root at node1 testadmin]# pcs status pcsd
node1: Online
node2: Online
[root at node1 testadmin]# pcs status
Cluster Summary:
* Stack: corosync (Pacemaker is running)
* Current DC: node1 (version 3.0.1-1.el8-3.0.1) - partition with quorum
* Last updated: Sat Feb 7 11:39:00 2026 on node1
* Last change: Sat Feb 7 11:22:01 2026 by hacluster via hacluster on node1
* 2 nodes configured
* 23 resource instances configured (2 DISABLED)
Node List:
* Online: [ node1 ]
* OFFLINE: [ node2 ]
Full List of Resources:
* SNMP_node2 (ocf:pacemaker:ClusterMon): Stopped
* SNMP_node1 (ocf:pacemaker:ClusterMon): Started node1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root at node1 testadmin]# rpm -qa | egrep "pcs|pacemaker|corosy|pacemak|resource-a|libknet"
libknet1-1.33-1.el8.x86_64
libknet1-compress-lz4-plugin-1.33-1.el8.x86_64
libknet1-compress-zstd-plugin-1.33-1.el8.x86_64
libknet1-crypto-plugins-all-1.33-1.el8.x86_64
pacemaker-libs-3.0.1-1.el8.x86_64
pcs-0.12.2-1.el8.x86_64
corosynclib-3.1.10-1.el8.x86_64
libknet1-compress-bzip2-plugin-1.33-1.el8.x86_64
libknet1-compress-lzma-plugin-1.33-1.el8.x86_64
libknet1-compress-zlib-plugin-1.33-1.el8.x86_64
libknet1-compress-plugins-all-1.33-1.el8.x86_64
libknet1-crypto-openssl-plugin-1.33-1.el8.x86_64
libknet1-plugins-all-1.33-1.el8.x86_64
pacemaker-schemas-3.0.1-1.el8.noarch
pacemaker-cluster-libs-3.0.1-1.el8.x86_64
pacemaker-3.0.1-1.el8.x86_64
corosync-3.1.10-1.el8.x86_64
libknet1-compress-lzo2-plugin-1.33-1.el8.x86_64
libknet1-crypto-nss-plugin-1.33-1.el8.x86_64
resource-agents-4.17.0-1.el8.x86_64
pacemaker-cli-3.0.1-1.el8.x86_64
[root at node1 testadmin]#
Node2:
[root at node2 testadmin]# pcs status
Error: error running crm_mon, is pacemaker running?
crm_mon: Connection to cluster failed: Connection refused
[root at node2 testadmin]# pcs status pcsd
node2: Online
node1: Online
[root at node2 testadmin]#
[root at node2 testadmin]# rpm -qa | egrep "pcs|pacemaker|corosy|pacemak|resource-a|libknet"
libknet1-compress-bzip2-plugin-1.25-1.el8.x86_64
libknet1-compress-zlib-plugin-1.25-1.el8.x86_64
libknet1-crypto-openssl-plugin-1.25-1.el8.x86_64
pacemaker-schemas-2.1.6-1.el8.noarch
pacemaker-2.1.6-1.el8.x86_64
libknet1-1.25-1.el8.x86_64
corosync-3.1.7-1.el8.x86_64
libknet1-compress-lz4-plugin-1.25-1.el8.x86_64
libknet1-compress-lzo2-plugin-1.25-1.el8.x86_64
libknet1-compress-zstd-plugin-1.25-1.el8.x86_64
libknet1-crypto-nss-plugin-1.25-1.el8.x86_64
libknet1-crypto-plugins-all-1.25-1.el8.x86_64
resource-agents-4.12.0-1.el8.x86_64
pacemaker-libs-2.1.6-1.el8.x86_64
pacemaker-cli-2.1.6-1.el8.x86_64
pcs-0.10.19-2.el8.x86_64
corosynclib-3.1.7-1.el8.x86_64
libknet1-compress-lzma-plugin-1.25-1.el8.x86_64
libknet1-compress-plugins-all-1.25-1.el8.x86_64
libknet1-plugins-all-1.25-1.el8.x86_64
pacemaker-cluster-libs-2.1.6-1.el8.x86_64
Thanks and Regards,
S Sathish S