[ClusterLabs] Two node cluster goes into split brain scenario during CPU intensive tasks

Lentes, Bernd bernd.lentes at helmholtz-muenchen.de
Mon Jun 24 11:16:42 EDT 2019



----- On Jun 23, 2019, at 1:40 PM, Somanath Jeeva somanath.jeeva at ericsson.com wrote:

> Hi All,
> I have a two node cluster with multicast (udp) transport. The multicast IP used
> is 224.1.1.1.
> Whenever there is a CPU intensive task, the pcs cluster goes into a split brain
> scenario and doesn’t recover automatically. We have to do a manual restart of
> services to bring both nodes online again. Before the nodes go into split
> brain, the corosync log shows:
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:51:42 server1 corosync[4745]: [TOTEM ] A processor failed, forming new configuration.
> May 24 16:41:42 server1 corosync[4745]: [TOTEM ] A new membership (10.241.31.12:29276) was formed. Members left: 1
> May 24 16:41:42 server1 corosync[4745]: [TOTEM ] Failed to receive the leave message. failed: 1
> Is there any way we can overcome this, or might this be due to multicast issues
> on the network side?
> With Regards
> Somanath Thilak J

I have atop running on all of my systems. It helps a lot with debugging and troubleshooting.
It should be available for all distros.
In /etc/atop/atop.daily (on SUSE systems) I set "INTERVAL=1",
so that resource usage is logged every second. Be careful:
this creates big logfiles in /var/log/atop (on SUSE),
between two and three gigabytes.
Make sure you have enough storage, or configure atop.daily so that only some logs are kept.
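
As a rough illustration of keeping only some logs, here is a minimal Python sketch that lists (and optionally deletes) atop raw logs older than a given number of days. It assumes the logs are named atop_* and live in one directory such as /var/log/atop. The function names and the keep_days value are hypothetical, and this is not how atop.daily itself does its cleanup.

```python
import time
from pathlib import Path


def old_atop_logs(log_dir, keep_days, now=None):
    """Return atop_* files in log_dir last modified more than keep_days ago."""
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400  # 86400 seconds per day
    return sorted(
        p for p in Path(log_dir).glob("atop_*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def prune(log_dir, keep_days, dry_run=True):
    """Delete old logs; with dry_run=True (the default) only report them."""
    victims = old_atop_logs(log_dir, keep_days)
    if not dry_run:
        for path in victims:
            path.unlink()
    return victims
```

Calling prune("/var/log/atop", keep_days=7) only reports what would be removed; pass dry_run=False to actually delete.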

With atop you can see precisely what the system was doing in the seconds before fencing,
e.g. which processes were using a lot of resources.
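
To find the right time window, one option is to take the timestamp of the "A processor failed" message from the corosync log and replay the atop raw log around it. The sketch below assumes syslog-style lines like the ones quoted above, a raw log named atop_YYYYMMDD in /var/log/atop, and atop's -r, -b and -e options for reading a raw file between two times. The helper names and window sizes are hypothetical, and the year has to be supplied because the log lines do not carry one.

```python
import re
from datetime import datetime, timedelta

# Matches the corosync line that marks a detected node failure.
EVENT = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ corosync\[\d+\]: "
    r"\[TOTEM \] A processor failed"
)


def atop_windows(log_lines, year, before_s=120, after_s=60):
    """Return (begin, end) datetimes around each 'A processor failed' line."""
    windows = []
    for line in log_lines:
        m = EVENT.match(line)
        if m:
            t = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            windows.append((t - timedelta(seconds=before_s),
                            t + timedelta(seconds=after_s)))
    return windows


def atop_command(begin, end, log_dir="/var/log/atop"):
    """Build an atop call that replays the raw log for the given window."""
    raw = f"{log_dir}/atop_{begin:%Y%m%d}"  # assumed file naming
    return f"atop -r {raw} -b {begin:%H:%M} -e {end:%H:%M}"


lines = [
    "May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e",
    "May 24 15:51:42 server1 corosync[4745]: [TOTEM ] A processor failed, forming new",
]
for begin, end in atop_windows(lines, year=2019):
    print(atop_command(begin, end))
# prints: atop -r /var/log/atop/atop_20190524 -b 15:49 -e 15:52
```

Inside the replayed log, stepping forward sample by sample shows which processes were busy just before the membership change.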

Bernd
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671


