[ClusterLabs] corosync race condition when node leaves immediately after joining

Tue Oct 31 06:41:56 EDT 2017

Jonathan,

> Hi Honza,
>
> On 19/10/17 17:05, Jonathan Davies wrote:
>> On 19/10/17 16:56, Jan Friesse wrote:
>>> Jonathan,
>>>
>>>>
>>>>
>>>> On 18/10/17 16:18, Jan Friesse wrote:
>>>>> Jonathan,
>>>>>
>>>>>>
>>>>>> On 18/10/17 14:38, Jan Friesse wrote:
>>>>>>> Can you please try to remove
>>>>>>> "votequorum_exec_send_nodeinfo(us->node_id);" line from votequorum.c
>>>>>>> in the votequorum_exec_init_fn function (around line 2306) and
>>>>>>> let me
>>>>>>> know if problem persists?
>>>>>>
>>>>>> Wow! With that change, I'm pleased to say that I'm not able to
>>>>>> reproduce
>>>>>> the problem at all!
>>>>>
>>>>> Sounds good.
>>>>>
>>>>>>
>>>>>> Is this a legitimate fix, or do we still need the call to
>>>>>> votequorum_exec_send_nodeinfo for other reasons?
>>>>>
>>>>> That is a good question. Calling votequorum_exec_send_nodeinfo should
>>>>> not be needed because it's called by sync_process only slightly later.
>>>>>
>>>>> But to mark this as a legitimate fix, I would like to find out why
>>>>> this is happening and if it is legal or not. Basically because I'm not
>>>>> able to reproduce the bug at all (and I was really trying also with
>>>>> various usleeps/packet loss/...) I would like to have more information
>>>>> about notworking_cluster1.log. Because tracing doesn't work, we need
>>>>> to try blackbox. Could you please add
>>>>>
>>>>> icmap_set_string("runtime.blackbox.dump_flight_data", "yes");
>>>>>
>>>>> line before api->shutdown_request(); in cmap.c ?
>>>>>
>>>>> It should trigger dumping blackbox in /var/lib/corosync. When you
>>>>> reproduce the nonworking_cluster1, could you please either:
>>>>> - compress the file pointed to by the /var/lib/corosync/fdata symlink
>>>>> - or execute corosync-blackbox
>>>>> - or execute qb-blackbox "/var/lib/corosync/fdata"
>>>>>
>>>>> and send it?
>>>>
>>>> Attached, along with the "debug: trace" log from cluster2.
>>>
>>> Thanks a lot for the logs. I'm - finally!!!! - able to reproduce the bug
>>> (with the 2 artificial pauses - included at the end of the mail).
>>> I'll try to fix the main bug (which may take some time, even though I
>>> have some idea of what is happening) and let you know.
>>
>> Glad to hear that the logs are useful and you're able to reproduce the
>> problem! I look forward to hearing what you come up with, and am happy
>> to test out patches if that would help.
>
> Did you get a chance to confirm whether the workaround to remove the
> final call to votequorum_exec_send_nodeinfo from votequorum_exec_init_fn
> is safe?

I haven't had time to find out exactly what is happening, but I can
confirm that the workaround is safe. It's just not a full fix, and there
can still be situations where the bug appears.

>
> The patch works well in our testing, but I'm keen to hear whether you
> think this is likely to be safe for use in production.

It's safe but it's just a workaround.

Regards,
   Honza

>
> Thanks,
> Jonathan