[ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

Jan Friesse jfriesse at redhat.com
Wed Oct 5 07:01:16 UTC 2016


Martin,

> Hello all,
>
> I am trying to understand the results of the following 2 Corosync heartbeat
> ring failure scenarios I have been testing, and I hope somebody can explain
> why they make sense.
>
>
> Consider the following cluster:
>
>      * 3x Nodes: A, B and C
>      * 2x NICs for each Node
>      * Corosync 2.3.5 configured with "rrp_mode: passive" and
>        udpu transport with ring id 0 and 1 on each node.
>      * On each node "corosync-cfgtool -s" shows:
>          [...] ring 0 active with no faults
>          [...] ring 1 active with no faults
>
>
> Consider the following scenarios:
>
>      1. On node A only, block all communication on the first NIC, configured
>         with ring id 0
>      2. On node A only, block all communication on all NICs, configured with
>         ring id 0 and 1
>
>
> The result of the above scenarios is as follows:
>
>      1. Nodes A, B and C (!) display the following ring status:
>          [...] Marking ringid 0 interface <IP-Address> FAULTY
>          [...] ring 1 active with no faults
>      2. Node A is shown as OFFLINE - B and C display the following ring status:
>          [...] ring 0 active with no faults
>          [...] ring 1 active with no faults
>
>
> Questions:
>      1. Is this the expected outcome ?

Yes

>      2. In experiment 1, B and C can still communicate with each other over
>         both NICs, so why are B and C not displaying a "no faults" status for
>         ring id 0 and 1 just like in experiment 2.

Because this is how RRP works. RRP marks the whole ring as failed, so 
every node sees that ring as failed.

>         when node A is completely unreachable ?

Because it's a different scenario. In scenario 1 there is a 3-node 
membership where one of the nodes has one failed ring -> the whole ring 
is marked as failed. In scenario 2 there is a 2-node membership where 
both rings work as expected. Node A is completely unreachable and is 
not in the membership.
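
The difference between the two scenarios can be sketched as a small toy
model. This is only an illustration of the rule described above, not how
Corosync implements RRP internally. The function names (membership_after,
ring_status) and the data layout are hypothetical, and the model only shows
the view of the nodes that remain in the common membership.

```python
def membership_after(nodes, blocked, rings=(0, 1)):
    """Assumption: a node stays in the membership while at least one of
    its rings still works. `blocked` is a set of (node, ring) pairs."""
    return {n for n in nodes if any((n, r) not in blocked for r in rings)}


def ring_status(members, blocked, rings=(0, 1)):
    """A ring is marked faulty for every member as soon as any node that
    is in the membership cannot use it. Nodes outside the membership are
    not considered at all."""
    status = {}
    for ring in rings:
        faulty = any((node, ring) in blocked for node in members)
        status[ring] = "FAULTY" if faulty else "no faults"
    return status


nodes = {"A", "B", "C"}

# Scenario 1: only ring 0 is blocked on node A. A stays a member via ring 1,
# so ring 0 is faulty for the whole 3-node membership.
blocked = {("A", 0)}
members = membership_after(nodes, blocked)
print(sorted(members), ring_status(members, blocked))

# Scenario 2: both rings are blocked on node A. A drops out of the
# membership, and both rings are healthy between the remaining B and C.
blocked = {("A", 0), ("A", 1)}
members = membership_after(nodes, blocked)
print(sorted(members), ring_status(members, blocked))
```

In this sketch scenario 1 yields a membership of A, B and C with ring 0
FAULTY, and scenario 2 yields a membership of B and C with no faults on
either ring, which matches what was observed.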

Regards,
   Honza

>
>
> Regards,
> Martin Schlegel