[ClusterLabs] Rolling upgrade failure: v2.1.6 node cannot connect after upgrading peer to v3.0.1

Tomas Jelinek tojeline at redhat.com
Thu Feb 12 09:03:50 UTC 2026


Hi S Sathish S,

If I remember correctly, pacemaker nodes agree on keeping the feature set
at the highest version that all of them support.

At the start, you run v2.1.6, and all nodes agree on that. Then you upgrade
one node to v3.0.1, and you have both v3.0.1 and v2.1.6 running in the
cluster. So the nodes agree on whatever feature set v2.1.6 defines, because
that is the highest version they all support, and it all works fine.
However, once you turn the v2.1.6 node off, there are only v3.0.1 nodes in
the cluster. So they agree on whatever feature set v3.0.1 defines and bump
the feature set to that version. If you start the v2.1.6 node again, it
cannot connect to the cluster, because it doesn't support the new feature
set.
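
To make the sequence above concrete, here is a small Python model of it.
This is only a sketch of the behavior as I described it, not Pacemaker's
actual code. The function names are made up, the feature set numbers are
placeholders rather than the real values for these releases, and the sketch
assumes the feature set is never lowered once it has been bumped.

```python
from typing import Iterable, Tuple

Version = Tuple[int, ...]


def parse(version: str) -> Version:
    """Turn a dotted version such as "3.17.4" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def negotiate(current: str, node_maxima: Iterable[str]) -> str:
    """Feature set the active nodes settle on.

    It is the highest version every active node supports. This sketch
    assumes the value is never lowered once it has been bumped.
    """
    common = min(parse(v) for v in node_maxima)
    return ".".join(str(n) for n in max(common, parse(current)))


def can_join(node_max: str, cluster_feature_set: str) -> bool:
    """A node can join only if it supports the cluster's feature set."""
    return parse(node_max) >= parse(cluster_feature_set)


# Placeholder feature set numbers, NOT the real values for these releases.
OLD = "1.0.0"  # stands in for what v2.1.6 supports
NEW = "2.0.0"  # stands in for what v3.0.1 supports

mixed = negotiate(OLD, [NEW, OLD])  # both nodes up: stays at OLD
only_new = negotiate(mixed, [NEW])  # old node down: bumped to NEW
print(mixed, only_new, can_join(OLD, only_new))
```

In this model the last call returns False, which corresponds to the
rebooted v2.1.6 node being unable to rejoin.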

In summary, the situation you describe looks like expected behavior to me.
It also matches the behavior you expect: both nodes remain operational and
communicate with each other. Up until the point you reboot the non-upgraded
node, that is.

To debug this, check the pacemaker log on the node that doesn't connect.
See if there is a message about an incompatible feature set or version.
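
If it helps, something like the following Python snippet could filter the
log for candidate lines. It is a rough sketch: the default path is an
assumption (adjust it to wherever your build logs), the function names are
made up, and the keywords are just guesses at what such a message might
contain, not known Pacemaker log strings.

```python
import re
from typing import Iterable, List

# Guessed keywords; not taken from actual Pacemaker log messages.
PATTERN = re.compile(r"feature[ _-]?set|incompatible", re.IGNORECASE)


def matching_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention a feature set or an incompatibility."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan_log(path: str = "/var/log/pacemaker/pacemaker.log") -> List[str]:
    """Filter a log file; the default path is an assumption."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return matching_lines(handle)

# Usage on the node that fails to connect:
#     for line in scan_log():
#         print(line)
```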

The Pacemaker team can correct me if I'm wrong.

Regards,
Tomas


On 12. 02. 26 at 5:50, S Sathish S via Users wrote:

Hi Team,



Any update on the issue below?



Thanks and Regards,

S Sathish S

*From:* S Sathish S
*Sent:* 09 February 2026 16:19
*To:* Cluster Labs - All topics related to open-source clustering welcomed
<users at clusterlabs.org>
*Cc:* Devakumar K <devakumar.k at ericsson.com>
*Subject:* Rolling upgrade failure: v2.1.6 node cannot connect after
upgrading peer to v3.0.1



Hi Team,



Issue: We are performing a rolling upgrade from Pacemaker v2.1.6 to v3.0.1
in a two-node cluster. According to the Pacemaker 3.0 Changes documentation
<https://projects.clusterlabs.org/w/projects/pacemaker/pacemaker_3.0_changes/>,
rolling upgrades from Pacemaker 2.0.0 and later should be supported with
minimal changes.



However, after upgrading the first node (Node 1) to Pacemaker 3.0.1, the
second node (Node 2), still running Pacemaker 2.1.6, is unable to connect
to the cluster after it is rebooted. Pacemaker fails to start on Node 2
with the error:



Error: error running crm_mon, is pacemaker running?

  crm_mon: Connection to cluster failed: Connection refused



Expected Behaviour:

During a rolling upgrade, both nodes should remain operational and
communicate with each other until the second node is upgraded. The
non-upgraded node (2.1.6) should continue to function while the upgraded
node (3.0.1) acts as DC.



Questions:

1. Is this a known limitation with the rolling upgrade path from 2.1.6 to
3.0.1?

2. Are there specific compatibility issues between Pacemaker 3.0.1 and
2.1.6 that prevent cluster communication?

3. Are there any additional logs or diagnostic information needed to
troubleshoot this issue?



Environment:

- Cluster: 2-node RHEL 8 cluster with Corosync

- Node 1 (node1): Upgraded to Pacemaker 3.0.1-1.el8

- Node 2 (node2): Running Pacemaker 2.1.6-1.el8 (not yet upgraded)



Node 1 (node1) - Upgraded:

Pacemaker: 3.0.1-1.el8

Corosync: 3.1.10-1.el8

PCS: 0.12.2-1.el8

libknet1: 1.33-1.el8

resource-agents: 4.17.0-1.el8



Node 2 (node2) - Not Upgraded:

Pacemaker: 2.1.6-1.el8

Corosync: 3.1.7-1.el8

PCS: 0.10.19-2.el8

libknet1: 1.25-1.el8

resource-agents: 4.12.0-1.el8





Node1:

[root@node1 testadmin]# pcs status pcsd

  node1: Online

  node2: Online



[root@node1 testadmin]# pcs status

Cluster Summary:

  * Stack: corosync (Pacemaker is running)

  * Current DC: node1 (version 3.0.1-1.el8-3.0.1) - partition with quorum

  * Last updated: Sat Feb  7 11:39:00 2026 on node1

  * Last change:  Sat Feb  7 11:22:01 2026 by hacluster via hacluster on node1

  * 2 nodes configured

  * 23 resource instances configured (2 DISABLED)



Node List:

  * Online: [ node1 ]

  * OFFLINE: [ node2 ]



Full List of Resources:

  * SNMP_node2       (ocf:pacemaker:ClusterMon):      Stopped

  * SNMP_node1       (ocf:pacemaker:ClusterMon):      Started node1



Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled





[root@node1 testadmin]# rpm -qa | egrep "pcs|pacemaker|corosy|pacemak|resource-a|libknet"

libknet1-1.33-1.el8.x86_64

libknet1-compress-lz4-plugin-1.33-1.el8.x86_64

libknet1-compress-zstd-plugin-1.33-1.el8.x86_64

libknet1-crypto-plugins-all-1.33-1.el8.x86_64

pacemaker-libs-3.0.1-1.el8.x86_64

pcs-0.12.2-1.el8.x86_64

corosynclib-3.1.10-1.el8.x86_64

libknet1-compress-bzip2-plugin-1.33-1.el8.x86_64

libknet1-compress-lzma-plugin-1.33-1.el8.x86_64

libknet1-compress-zlib-plugin-1.33-1.el8.x86_64

libknet1-compress-plugins-all-1.33-1.el8.x86_64

libknet1-crypto-openssl-plugin-1.33-1.el8.x86_64

libknet1-plugins-all-1.33-1.el8.x86_64

pacemaker-schemas-3.0.1-1.el8.noarch

pacemaker-cluster-libs-3.0.1-1.el8.x86_64

pacemaker-3.0.1-1.el8.x86_64

corosync-3.1.10-1.el8.x86_64

libknet1-compress-lzo2-plugin-1.33-1.el8.x86_64

libknet1-crypto-nss-plugin-1.33-1.el8.x86_64

resource-agents-4.17.0-1.el8.x86_64

pacemaker-cli-3.0.1-1.el8.x86_64

[root@node1 testadmin]#





Node2:

[root@node2 testadmin]# pcs status

Error: error running crm_mon, is pacemaker running?

  crm_mon: Connection to cluster failed: Connection refused



[root@node2 testadmin]# pcs status pcsd

  node2: Online

  node1: Online

[root@node2 testadmin]#



[root@node2 testadmin]# rpm -qa | egrep "pcs|pacemaker|corosy|pacemak|resource-a|libknet"

libknet1-compress-bzip2-plugin-1.25-1.el8.x86_64

libknet1-compress-zlib-plugin-1.25-1.el8.x86_64

libknet1-crypto-openssl-plugin-1.25-1.el8.x86_64

pacemaker-schemas-2.1.6-1.el8.noarch

pacemaker-2.1.6-1.el8.x86_64

libknet1-1.25-1.el8.x86_64

corosync-3.1.7-1.el8.x86_64

libknet1-compress-lz4-plugin-1.25-1.el8.x86_64

libknet1-compress-lzo2-plugin-1.25-1.el8.x86_64

libknet1-compress-zstd-plugin-1.25-1.el8.x86_64

libknet1-crypto-nss-plugin-1.25-1.el8.x86_64

libknet1-crypto-plugins-all-1.25-1.el8.x86_64

resource-agents-4.12.0-1.el8.x86_64

pacemaker-libs-2.1.6-1.el8.x86_64

pacemaker-cli-2.1.6-1.el8.x86_64

pcs-0.10.19-2.el8.x86_64

corosynclib-3.1.7-1.el8.x86_64

libknet1-compress-lzma-plugin-1.25-1.el8.x86_64

libknet1-compress-plugins-all-1.25-1.el8.x86_64

libknet1-plugins-all-1.25-1.el8.x86_64

pacemaker-cluster-libs-2.1.6-1.el8.x86_64



Thanks and Regards,
S Sathish S
