[ClusterLabs] Cluster unable to find back together

Jan Friesse jfriesse at redhat.com
Thu May 19 08:55:40 EDT 2022


Hi,

On 19/05/2022 10:16, Leditzky, Fabian via Users wrote:
> Hello
> 
> We have been dealing with our pacemaker/corosync clusters becoming unstable.
> The OS is Debian 10 and we use Debian packages for pacemaker and corosync,
> version 3.0.1-5+deb10u1 and 3.0.1-2+deb10u1 respectively.

Seems like the pcmk version is not so important for the behavior you've 
described. Corosync 3.0.1 is very old; are you able to reproduce the 
behavior with 3.1.6? What is the version of knet? There have been quite 
a few fixes, so the latest one (1.23) is really recommended.

You can try to compile it yourself, or use the Proxmox repo 
(http://download.proxmox.com/debian/pve/), which contains newer 
versions of the packages.
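
To double-check what you're actually running (just a quick sketch, 
assuming the standard Debian packaging where knet ships as libknet1):

    corosync -v                         # corosync version
    dpkg -s libknet1 | grep Version     # version of the knet library package

corosync 3.1.x together with knet 1.23 or newer is what I'd aim for.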

> We use knet over UDP transport.
> 
> We run multiple 2-node and 4-8 node clusters, primarily managing VIP resources.
> The issue we experience presents itself as a spontaneous disagreement about
> the status of cluster members. In two-node clusters, each node spontaneously
> sees the other node as offline, despite network connectivity being OK.
> In larger clusters, the status can be inconsistent across the nodes.
> E.g.: node1 sees 2,4 as offline, node 2 sees 1,4 as offline while node 3 and 4 see every node as online.

This really shouldn't happen.
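
If you can, capture each node's own view of the membership the moment 
it happens, e.g. (assuming the standard corosync tools are installed):

    corosync-quorumtool -s    # quorum state and member list as this node sees it
    corosync-cfgtool -s       # knet link status towards every other node

Comparing that output across the nodes usually shows quickly whether 
corosync itself lost the membership or pacemaker just has a stale view.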

> The cluster becomes generally unresponsive to resource actions in this state.

Expected

> Thus far we have been unable to restore cluster health without restarting corosync.
> 
> We are running packet captures 24/7 on the clusters and have custom tooling
> to detect lost UDP packets on knet ports. So far we could not see significant
> packet loss trigger an event, at most we have seen a single UDP packet dropped
> some seconds before the cluster fails.
> 
> However, even if the root cause is indeed a flaky network, we do not understand
> why the cluster cannot recover on its own in any way. The issues definitely persist
> beyond the presence of any intermittent network problem.

Try a newer version. If the problem persists, it's a good idea to 
monitor whether packets are really getting through. Corosync always 
creates (at least) a single-node membership.
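
For the monitoring, something as simple as tcpdump on the totem port 
can be enough; a rough sketch (assuming the default knet port 5405 and 
the eth0 interface from your config):

    tcpdump -ni eth0 udp port 5405

If the nodes keep exchanging packets there while the membership stays 
broken, that points at corosync/knet rather than the network.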

Regards,
   Honza

> 
> We were able to artificially break clusters by inducing packet loss with an iptables rule.
> Dropping packets on a single node of an 8-node cluster can cause malfunctions on
> multiple other cluster nodes. The expected behavior would be detecting that the
> artificially broken node failed but keeping the rest of the cluster stable.
> We were able to reproduce this also on Debian 11 with more recent corosync/pacemaker
> versions.
> 
> Our configuration is basic; we do not significantly deviate from the defaults.
> 
> We will be very grateful for any insights into this problem.
> 
> Thanks,
> Fabian
> 
> // corosync.conf
> totem {
>      version: 2
>      cluster_name: cluster01
>      crypto_cipher: aes256
>      crypto_hash: sha512
>      transport: knet
> }
> logging {
>      fileline: off
>      to_stderr: no
>      to_logfile: no
>      to_syslog: yes
>      debug: off
>      timestamp: on
>      logger_subsys {
>          subsys: QUORUM
>          debug: off
>      }
> }
> quorum {
>      provider: corosync_votequorum
>      two_node: 1
>      expected_votes: 2
> }
> nodelist {
>      node {
>          name: node01
>          nodeid: 01
>          ring0_addr: 10.0.0.10
>      }
>      node {
>          name: node02
>          nodeid: 02
>          ring0_addr: 10.0.0.11
>      }
> }
> 
> // crm config show
> node 1: node01 \
>      attributes standby=off
> node 2: node02 \
>      attributes standby=off maintenance=off
> primitive IP-clusterC1 IPaddr2 \
>      params ip=10.0.0.20 nic=eth0 cidr_netmask=24 \
>      meta migration-threshold=2 target-role=Started is-managed=true \
>      op monitor interval=20 timeout=60 on-fail=restart
> primitive IP-clusterC2 IPaddr2 \
>      params ip=10.0.0.21 nic=eth0 cidr_netmask=24 \
>      meta migration-threshold=2 target-role=Started is-managed=true \
>      op monitor interval=20 timeout=60 on-fail=restart
> location STICKY-IP-clusterC1 IP-clusterC1 100: node01
> location STICKY-IP-clusterC2 IP-clusterC2 100: node02
> property cib-bootstrap-options: \
>      have-watchdog=false \
>      dc-version=2.0.1-9e909a5bdd \
>      cluster-infrastructure=corosync \
>      cluster-name=cluster01 \
>      stonith-enabled=no \
>      no-quorum-policy=ignore \
>      last-lrm-refresh=1632230917
> 
> 


