[ClusterLabs] Two node cluster goes into split brain scenario during CPU intensive tasks
Jan Friesse
jfriesse at redhat.com
Mon Jun 24 02:52:33 EDT 2019
Somanath,
> Hi All,
>
> I have a two node cluster with multicast (udp) transport. The multicast IP used is 224.1.1.1.
Would you mind giving UDPU (unicast) a try? For a two-node cluster
there is going to be no difference in terms of speed/throughput.
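
A minimal sketch of what the UDPU switch could look like in corosync.conf,
assuming corosync 2.x. The cluster name, node IDs and the 192.0.2.x addresses
are placeholders, not values from your setup, and all other sections of your
existing file stay as they are:

```
totem {
    version: 2
    cluster_name: mycluster      # placeholder, keep your existing name
    transport: udpu              # unicast UDP instead of multicast
}

nodelist {
    node {
        ring0_addr: 192.0.2.1    # placeholder address of the first node
        nodeid: 1
    }
    node {
        ring0_addr: 192.0.2.2    # placeholder address of the second node
        nodeid: 2
    }
}
```

The file has to be identical on both nodes, and corosync needs a restart
for a transport change to take effect.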
>
> Whenever there is a CPU-intensive task, the pcs cluster goes into a split-brain scenario and doesn't recover automatically. We have to manually restart the services to bring both nodes online again.
> Before the nodes go into split brain, the corosync log shows:
>
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
> May 24 15:10:02 server1 corosync[4745]: [TOTEM ] Retransmit List: 7c 7e
This usually happens when:
- multicast is somehow rate-limited on the switch side (configuration/bad
switch implementation/...)
- the MTU of the network is smaller than 1500 bytes and fragmentation is not
allowed -> try reducing totem.netmtu
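
A sketch of the netmtu change, assuming the network path really does carry
less than 1500 bytes. The value 1200 is only an illustration, not a
recommendation for your network; pick something below what the path
actually supports:

```
totem {
    version: 2
    netmtu: 1200    # illustrative value; the corosync default is 1500
}
```

As with any totem option, the setting should be the same on both nodes.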
Regards,
Honza
> May 24 15:51:42 server1 corosync[4745]: [TOTEM ] A processor failed, forming new configuration.
> May 24 16:41:42 server1 corosync[4745]: [TOTEM ] A new membership (10.241.31.12:29276) was formed. Members left: 1
> May 24 16:41:42 server1 corosync[4745]: [TOTEM ] Failed to receive the leave message. failed: 1
>
> Is there any way we can overcome this, or might this be due to multicast issues on the network side?
>
> With Regards
> Somanath Thilak J