[ClusterLabs] Redundant Ring Network failure
ROHWEDER-NEUBECK, MICHAEL (EXTERN)
michael.rohweder-neubeck.sp at dlh.de
Wed Jun 10 05:46:13 EDT 2020
Jan,
we are currently using these versions:
[root at lvm-nfscpdata-05ct::~ 100 ]# apt show corosync
Package: corosync
Version: 3.0.1-2+deb10u1
[root at lvm-nfscpdata-05ct::~]# apt show libknet1
Package: libknet1
Version: 1.8-2
These are the newest versions provided on the mirror.
Sitz der Gesellschaft / Corporate Headquarters: Deutsche Lufthansa Aktiengesellschaft, Koeln, Registereintragung / Registration: Amtsgericht Koeln HR B 2168
Vorsitzender des Aufsichtsrats / Chairman of the Supervisory Board: Dr. Karl-Ludwig Kley
Vorstand / Executive Board: Carsten Spohr (Vorsitzender / Chairman), Thorsten Dirks, Christina Foerster, Harry Hohmeister, Dr. Detlef Kayser, Dr. Michael Niggemann
-----Original Message-----
From: Jan Friesse <jfriesse at redhat.com>
Sent: Wednesday, June 10, 2020 09:24
To: Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>; ROHWEDER-NEUBECK, MICHAEL (EXTERN) <michael.rohweder-neubeck.sp at dlh.de>; users at lists.clusterlabs.org
Subject: Re: [ClusterLabs] Redundant Ring Network failure
Michael,
which version of knet are you using? We had quite a few problems with older versions of knet, so the current stable release (1.16) is recommended. The same applies to corosync, because 3.0.4 has a vastly improved display of link status.
> Hello,
> We have massive problems with the redundant ring operation of our 3-node Corosync/Pacemaker NFS clusters.
>
> Most of the nodes either have an entire ring offline or only 1 node in a ring.
> Example: (Node1 Ring0 333 Ring1 n33 | Node2 Ring0 033 Ring1 3n3 |
> Node3 Ring0 333 Ring1 33n)
That doesn't seem completely wrong. You can ignore the 'n' on Ring 1, because that is localhost, which is connected only on Ring 0 (3.0.4 makes this output more consistent), so all nodes are connected at least via Ring 1.
Ring 0 on node 2 seems to have some trouble connecting to node 1, but node 1 (and node 3) seem to be connected to node 2 just fine, so I think it is either a bug in knet (probably already fixed) or some kind of firewall blocking only the connection from node 2 to node 1 on Ring 0.
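To make the reading above easier to repeat on other clusters, here is a minimal Python sketch. It assumes, based only on the interpretation given in this thread, that each character of a status string refers to one node in node order, that '3' means the link is connected, and that 'n' marks the local node and can be ignored. The function name `suspect_links` and the dictionary layout are hypothetical and not part of corosync.

```python
def suspect_links(status_by_node):
    """Return (local_node, ring, peer_node, flag) for every entry
    that is neither '3' (connected) nor 'n' (local node, ignorable).

    Assumption from this thread: character position i in a status
    string refers to node i (counting from 1).
    """
    problems = []
    for node, rings in status_by_node.items():
        for ring, status in rings.items():
            for peer, flag in enumerate(status, start=1):
                if flag in ("3", "n"):
                    continue
                problems.append((node, ring, peer, flag))
    return problems


# Status strings as reported in the original message.
reported = {
    1: {0: "333", 1: "n33"},
    2: {0: "033", 1: "3n3"},
    3: {0: "333", 1: "33n"},
}

print(suspect_links(reported))
```

With the values from the original message, the only entry flagged is node 2, Ring 0, peer node 1, which matches the analysis above.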
>
> corosync-cfgtool -R doesn't help.
> All nodes are VMs that build the ring together using 2 VLANs.
> Which logs do you need in order to help me?
syslog/journal should contain everything needed, especially when debug is enabled (corosync.conf - logging.debug: on).
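As a sketch of what that setting might look like in corosync.conf: only `debug: on` comes from the advice above. The `to_syslog: yes` line is an assumption added so the debug output ends up in syslog/journal, and any existing entries in your own logging section should be kept.

```
logging {
    # Assumption: send log output to syslog so it shows up in the journal
    to_syslog: yes
    # From the advice above: enable debug output
    debug: on
}
```

The change only takes effect once corosync has picked up the modified configuration.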
Regards,
Honza
>
> Corosync Cluster Engine, version '3.0.1'
> Copyright (c) 2006-2018 Red Hat, Inc.
> Debian Buster
>
>
> --
> Kind regards
> Michael Rohweder-Neubeck
>
> NSB GmbH – Nguyen Softwareentwicklung & Beratung GmbH Röntgenstraße 27
> D-64291 Darmstadt
> E-Mail: mrn at nsb-software.de
> Manager: Van-Hien Nguyen, Jörg Jaspert
> USt-ID: DE 195 703 354; HRB 7131 Amtsgericht Darmstadt