[ClusterLabs] Redundant Ring Network failure
ROHWEDER-NEUBECK, MICHAEL (EXTERN)
michael.rohweder-neubeck.sp at dlh.de
Wed Jun 10 05:28:30 EDT 2020
Hi,
Yesterday we restarted the whole cluster and all rings were OK.
Today one node has a broken ring.
Ring 0 is broken: 033
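For readers unfamiliar with this notation: a status string such as 033 appears to carry one character per node. The sketch below decodes it under an assumption that is not stated anywhere in this thread, namely that each digit is a bit field in which 1 means the link is enabled and 2 means it is connected (so 3 would be a healthy link), and that n marks a position without a link, as seen at the local node's own position on ring 1 in the quoted example further down. The function name decode_link_status is made up for illustration, and which position maps to which node is not confirmed here.

```shell
#!/bin/sh
# Decode a per-link status string (e.g. "033") one character at a time.
# ASSUMPTION: digits are a bit field (1 = enabled, 2 = connected) and
# "n" means no link at that position. Anything else is left uninterpreted.
decode_link_status() {
    status=$1
    pos=1
    while [ -n "$status" ]; do
        rest=${status#?}          # everything after the first character
        c=${status%"$rest"}       # the first character itself
        case $c in
            n)
                desc="no link"
                ;;
            [0-3])
                if [ $((c & 1)) -ne 0 ]; then desc="enabled"; else desc="not enabled"; fi
                if [ $((c & 2)) -ne 0 ]; then desc="$desc, connected"; else desc="$desc, not connected"; fi
                ;;
            *)
                desc="not interpreted here"
                ;;
        esac
        echo "position $pos: $c = $desc"
        pos=$((pos + 1))
        status=$rest
    done
}

decode_link_status 033
```

Under that assumption, 033 would mean that the link to the node in the first position is down on ring 0 while the other two positions are fine.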
This is my config:
[root at lvm-nfscpdata-05ct::~]# less /etc/corosync/corosync.conf
totem {
    version: 2
    transport: knet
    cluster_name: nfscpdata
    token: 2000
    token_retransmits_before_loss_const: 10
    max_messages: 150
    window_size: 300
    crypto_cipher: aes256
    crypto_hash: sha256
    interface {
        ringnumber: 0
    }
    interface {
        ringnumber: 1
    }
}
logging {
    fileline: off
    to_stderr: yes
    to_logfile: no
    to_syslog: yes
    syslog_facility: daemon
    syslog_priority: info
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
quorum {
    # Enable and configure quorum subsystem (default: off)
    # see also corosync.conf.5 and votequorum.5
    provider: corosync_votequorum
}
nodelist {
    node {
        ring0_addr: 10.28.63.138
        ring1_addr: 10.28.98.138
        name: lvm-nfscpdata-04ct
        nodeid: 1688
    }
    node {
        ring0_addr: 10.28.63.139
        ring1_addr: 10.28.98.139
        name: lvm-nfscpdata-05ct
        nodeid: 1689
    }
    node {
        ring0_addr: 10.28.63.140
        ring1_addr: 10.28.98.140
        name: lvm-nfscpdata-06ct
        nodeid: 1690
    }
}
Ring 1 is managed by the host firewall, but the ports are opened.
Ring 0 has no firewall settings.
Sitz der Gesellschaft / Corporate Headquarters: Deutsche Lufthansa Aktiengesellschaft, Koeln, Registereintragung / Registration: Amtsgericht Koeln HR B 2168
Vorsitzender des Aufsichtsrats / Chairman of the Supervisory Board: Dr. Karl-Ludwig Kley
Vorstand / Executive Board: Carsten Spohr (Vorsitzender / Chairman), Thorsten Dirks, Christina Foerster, Harry Hohmeister, Dr. Detlef Kayser, Dr. Michael Niggemann
-----Original Message-----
From: Strahil Nikolov <hunter86_bg at yahoo.com>
Sent: Tuesday, 9 June 2020 21:34
To: ROHWEDER-NEUBECK, MICHAEL (EXTERN) <michael.rohweder-neubeck.sp at dlh.de>; Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>
Subject: Re: [ClusterLabs] Redundant Ring Network failure
It will be hard to guess if you are using sctp or udp/udpu.
If possible share the corosync.conf (you can remove sensitive data, but make it meaningful).
Are you using a firewall? If yes:
1. Check that the node firewall is not blocking the communication on the specific interfaces.
2. Verify with tcpdump that the heartbeats are received from the remote side.
3. Check for retransmissions or packet loss.
Usually you can find more details in the log specified in corosync.conf or in /var/log/messages (and also the journal).
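The checks above could be run roughly as follows. This is only a sketch: the interface names eth0 and eth1 are placeholders, the peer addresses are the ring0_addr and ring1_addr values from the nodelist posted in this thread as seen from lvm-nfscpdata-05ct, and the capture filter assumes knet is running over UDP, which is its default when knet_transport is not set. The helper name print_capture_cmds is made up. The script only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Print one tcpdump command per peer so each ring can be checked on its own.
# $1 is the local interface (placeholder name); the rest are peer addresses.
print_capture_cmds() {
    iface=$1
    shift
    for peer in "$@"; do
        # -n: no name resolution, -c 20: stop after 20 packets
        echo "tcpdump -n -i $iface -c 20 udp and host $peer"
    done
}

# Ring 0 and ring 1 peers of lvm-nfscpdata-05ct, taken from the nodelist.
print_capture_cmds eth0 10.28.63.138 10.28.63.140
print_capture_cmds eth1 10.28.98.138 10.28.98.140

# Link state as corosync sees it, and today's corosync entries in the journal.
echo "corosync-cfgtool -s"
echo "journalctl -u corosync --since today"
```

If a capture on one ring shows packets leaving but none arriving from a peer, that points at the firewall or the VLAN between the two VMs and not at corosync itself.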
Best Regards,
Strahil Nikolov
On 9 June 2020 at 21:11:02 GMT+03:00, "ROHWEDER-NEUBECK, MICHAEL (EXTERN)" <michael.rohweder-neubeck.sp at dlh.de> wrote:
>Hi,
>
>we are using unicast ("knet")
>
>Greetings
>
>Michael
>
>-----Original Message-----
>From: Strahil Nikolov <hunter86_bg at yahoo.com>
>Sent: Tuesday, 9 June 2020 19:30
>To: Cluster Labs - All topics related to open-source clustering
>welcomed <users at clusterlabs.org>; ROHWEDER-NEUBECK, MICHAEL (EXTERN)
><michael.rohweder-neubeck.sp at dlh.de>
>Subject: Re: [ClusterLabs] Redundant Ring Network failure
>
>Are you using multicast ?
>
>Best Regards,
>Strahil Nikolov
>
>On 9 June 2020 at 10:28:25 GMT+03:00, "ROHWEDER-NEUBECK, MICHAEL
>(EXTERN)" <michael.rohweder-neubeck.sp at dlh.de> wrote:
>>Hello,
>>We have massive problems with the redundant ring operation of our
>>Corosync / Pacemaker 3-node NFS clusters.
>>
>>Most of the nodes either have an entire ring offline or only 1 node in
>>a ring.
>>Example: (Node1 Ring0 333 Ring1 n33 | Node2 Ring0 033 Ring1 3n3 |
>>Node3 Ring0 333 Ring1 33n)
>>
>>corosync-cfgtool -R doesn't help.
>>All nodes are VMs that build the ring together using 2 VLANs.
>>Which logs do you need to hopefully help me?
>>
>>Corosync Cluster Engine, version '3.0.1'
>>Copyright (c) 2006-2018 Red Hat, Inc.
>>Debian Buster
>>
>>
>>--
>>Kind regards
>> Michael Rohweder-Neubeck
>>
>>NSB GmbH – Nguyen Softwareentwicklung & Beratung GmbH Röntgenstraße 27
>>D-64291 Darmstadt
>>E-Mail: mrn at nsb-software.de
>>Manager: Van-Hien Nguyen, Jörg Jaspert
>>USt-ID: DE 195 703 354; HRB 7131 Amtsgericht Darmstadt