[ClusterLabs] Q: Corosync (totemrrp.c:961): (max - recv_count[i] > threshold

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed May 13 08:00:32 EDT 2015


Hi!

I have a simple question for those who know the answer ;-):

Can someone explain in plain English what the condition "(max - recv_count[i] > threshold" (in corosync-1.4.7/exec/totemrrp.c near line 961) means?
The condition triggers "Marking ringid %u interface %s FAULTY", and I cannot find out why it is triggered periodically in our configuration.

(I tried to read and understand the code, but because of the comment-free programming style I can't follow it)

"threshold" is either rrp_instance->totem_config->rrp_problem_count_threshold or rrp_instance->totem_config->rrp_problem_count_mcast_threshold, and "max" is either the maximum of token_recv_count or of mcast_recv_count, sometimes normalized relative to the minimum recv_count. I guess the code in passive_monitor() assumes that all the counters are frozen while they are inspected multiple times.
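
To make my reading of the check concrete, here is a minimal sketch of how I understand it. This is not the code from totemrrp.c: the function name lags_behind and the plain counter array are made up for illustration, and the normalization step is left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from corosync): returns 1 if interface i has
 * received more than `threshold` fewer messages than the interface with
 * the highest receive count, 0 otherwise.
 */
static int lags_behind(const unsigned int *recv_count, size_t n,
                       size_t i, unsigned int threshold)
{
    unsigned int max = 0;
    size_t k;

    assert(i < n);

    /* Find the highest receive count over all interfaces. */
    for (k = 0; k < n; k++) {
        if (recv_count[k] > max) {
            max = recv_count[k];
        }
    }

    /* max >= recv_count[i], so the unsigned subtraction cannot wrap. */
    return (max - recv_count[i] > threshold) ? 1 : 0;
}
```

With counters {120, 105} and a threshold of 10, interface 1 is 15 behind and would be flagged, while interface 0 would not. If the counters change between finding max and the comparison, the result could differ, which is why I suspect the frozen-counter assumption matters.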

Maybe it would be helpful to see max and recv_count[i], as well as threshold in the FAULTY message...
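
Something along these lines is what I have in mind. It is only a sketch: format_faulty_msg is a made-up helper that uses snprintf instead of corosync's logging macros, and the wording of the appended part is my own.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical formatter (not part of corosync): builds the FAULTY text
 * and appends the three values that decided the check.
 * Returns what snprintf returns (the length the full message needs).
 */
static int format_faulty_msg(char *buf, size_t len, unsigned int ringid,
                             const char *iface, unsigned int max,
                             unsigned int recv_count, unsigned int threshold)
{
    assert(buf != NULL && iface != NULL);

    return snprintf(buf, len,
        "Marking ringid %u interface %s FAULTY "
        "(max=%u, recv_count=%u, threshold=%u)",
        ringid, iface, max, recv_count, threshold);
}
```

With the three numbers in the log it would be possible to tell at once how far the interface had fallen behind when it was marked.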

(We never see the other FAULTY message: "Marking seqid %d ringid %u interface %s FAULTY")

Regards,
Ulrich





