[ClusterLabs] corosync race condition when node leaves immediately after joining

Wed Oct 18 09:38:14 EDT 2017

Jonathan,

>
>
> On 16/10/17 15:58, Jan Friesse wrote:
>> Jonathan,
>>
>>>
>>>
>>> On 13/10/17 17:24, Jan Friesse wrote:
>>>>> I've done a bit of digging and am getting closer to the root cause of
>>>>> the race.
>>>>>
>>>>> We rely on having votequorum_sync_init called twice -- once when
>>>>> node 1
>>>>> joins (with member_list_entries=2) and once when node 1 leaves (with
>>>>> member_list_entries=1). This is important because votequorum_sync_init
>>>>> marks nodes as NODESTATE_DEAD if they are not in quorum_members[]
>>>>> -- so
>>>>> it needs to have seen the node appear then disappear. This is
>>>>> important
>>>>> because get_total_votes only counts votes from nodes in state
>>>>> NODESTATE_MEMBER.
>>>>
>>>> So there are basically two problems.
>>>>
>>>> Actually first (main) problem is that votequorum_sync_init is ever
>>>> called when that node joins. It really shouldn't. And problem is
>>>> simply because calling api->shutdown_request() is not enough. Can you
>>>> try replace it with exit(1) (for testing) and reproduce the problem?
>>>> I'm pretty sure problem disappears.
>>>
>>> No, the problem still happens :-(
>>
>> Not good.
>>
>>>
>>> I am using the following patch:
>>>
>>> diff --git a/exec/cmap.c b/exec/cmap.c
>>> index de730d2..1125cef 100644
>>> --- a/exec/cmap.c
>>> +++ b/exec/cmap.c
>>> @@ -406,7 +406,7 @@ static void cmap_sync_activate (void)
>>>                  log_printf(LOGSYS_LEVEL_ERROR,
>>>                      "Received config version (%"PRIu64") is different
>>> than my config version (%"PRIu64")! Exiting",
>>>                      cmap_highest_config_version_received,
>>> cmap_my_config_version);
>>> -               api->shutdown_request();
>>> +               exit(1);
>>>                  return ;
>>>          }
>>>   }
>>> diff --git a/exec/main.c b/exec/main.c
>>> index b0d5639..4fd3e68 100644
>>> --- a/exec/main.c
>>> +++ b/exec/main.c
>>> @@ -627,6 +627,7 @@ static void deliver_fn (
>>>                          ((void *)msg);
>>>          }
>>>
>>> +       log_printf(LOGSYS_LEVEL_NOTICE, "executing '%s' exec_handler_fn
>>> %p for node %d (fn %d)", corosync_service[service]->name,
>>> corosync_service[service]->exec_engine[fn_id].exec_handler_fn, nodeid,
>>> fn_id);
>>>          corosync_service[service]->exec_engine[fn_id].exec_handler_fn
>>>                  (msg, nodeid);
>>>   }
>>> diff --git a/exec/votequorum.c b/exec/votequorum.c
>>> index 1a97c6d..7c0f34f 100644
>>> --- a/exec/votequorum.c
>>> +++ b/exec/votequorum.c
>>> @@ -2099,6 +2100,7 @@ static void
>>> message_handler_req_exec_votequorum_nodeinfo (
>>>          node->flags = req_exec_quorum_nodeinfo->flags;
>>>          node->votes = req_exec_quorum_nodeinfo->votes;
>>>          node->state = NODESTATE_MEMBER;
>>> +       log_printf(LOGSYS_LEVEL_NOTICE,
>>> "message_handler_req_exec_votequorum_nodeinfo (%p) marking node %d as
>>> MEMBER", message_handler_req_exec_votequorum_nodeinfo, nodeid);
>>>
>>>          if (node->flags & NODE_FLAGS_LEAVING) {
>>>                  node->state = NODESTATE_LEAVING;
>>>
>>> When it's working correctly I see this:
>>>
>>> 1508151960.072927 notice  [TOTEM ] A new membership (10.71.218.17:2304)
>>> was formed. Members joined: 1
>>> 1508151960.073082 notice  [SYNC  ] calling sync_init on service
>>> 'corosync configuration map access' (0) with my_member_list_entries = 2
>>> 1508151960.073150 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x55b5eb504ca0 for node 1 (fn 0)
>>> 1508151960.073197 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x55b5eb504ca0 for node 2 (fn 0)
>>> 1508151960.073238 notice  [SYNC  ] calling sync_init on service
>>> 'corosync cluster closed process group service v1.01' (1) with
>>> my_member_list_entries = 2
>>> 1508151961.073033 notice  [TOTEM ] A processor failed, forming new
>>> configuration.
>>>
>>> When it's not working correctly I see this:
>>>
>>> 1508151908.447584 notice  [TOTEM ] A new membership (10.71.218.17:2292)
>>> was formed. Members joined: 1
>>> 1508151908.447757 notice  [MAIN  ] executing 'corosync vote quorum
>>> service v1.0' exec_handler_fn 0x558b39fbbaa0 for node 1 (fn 0)
>>> 1508151908.447866 notice  [VOTEQ ]
>>> message_handler_req_exec_votequorum_nodeinfo (0x558b39fbbaa0) marking
>>> node 1 as MEMBER
>>> 1508151908.447972 notice  [VOTEQ ] get_total_votes: node 1 is a MEMBER
>>> so counting vote
>>> 1508151908.448045 notice  [VOTEQ ] get_total_votes: node 2 is a MEMBER
>>> so counting vote
>>> 1508151908.448091 notice  [QUORUM] This node is within the primary
>>> component and will provide service.
>>> 1508151908.448134 notice  [QUORUM] Members[1]: 2
>>> 1508151908.448175 notice  [SYNC  ] calling sync_init on service
>>> 'corosync configuration map access' (0) with my_member_list_entries = 2
>>> 1508151908.448205 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x558b39fb3ca0 for node 1 (fn 0)
>>> 1508151908.448247 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x558b39fb3ca0 for node 2 (fn 0)
>>> 1508151908.448307 notice  [SYNC  ] calling sync_init on service
>>> 'corosync cluster closed process group service v1.01' (1) with
>>> my_member_list_entries = 2
>>> 1508151909.447182 notice  [TOTEM ] A processor failed, forming new
>>> configuration.
>>>
>>> ... and at that point I already see "Total votes: 2" in the
>>> corosync-quorumtool output.
>>>
>>> The key difference seems to be whether votequorum's nodeinfo
>>> exec_handler_fn is called. If it is called, it marks the node as
>>> NODESTATE_MEMBER, which is sufficient to trigger the bug because it is
>>> never subsequently marked as NODESTATE_DEAD.
>>>
>>> (Note that in both cases there is now a one-second delay before the "A
>>> processor failed" message. This is due to calling exit(1) rather than
>>> api->shutdown_request().)
>>>
>>>>> When it goes wrong, I see that votequorum_sync_init is only called
>>>>> *once* (with member_list_entries=1) -- after node 1 has joined and
>>>>> left.
>>>>> So it never sees node 1 in member_list, hence never marks it as
>>>>> NODESTATE_DEAD. But message_handler_req_exec_votequorum_nodeinfo has
>>>>> indepedently marked the node as NODESTATE_MEMBER, hence
>>>>> get_total_votes
>>>>> counts it and quorate is set to 1.
>>>>
>>>> This is second problem and it has to be also fixed. You have to be
>>>> pretty lucky to reproduce it so often.
>>>>
>>>> Anyway, here is theory:
>>>> - node 1 calls api->shutdown_request() and continue processing (this
>>>> is problem 1)
>>>> - both nodes gets to a state where they should call
>>>> votequorum_sync_init, but node 2 is now not scheduled (as I said, it
>>>> must be pretty luck)
>>>> - node 1 calls votequorum_sync_init and votequorum_sync_process
>>>> (sending nodeinfo) and it's shutdown
>>>> - node 2 gets nodeinfo
>>>> - node 2 sees node 1 shutdown
>>>> - node 2 calls votequorum_sync_init and votequorum_sync_process
>>>>
>>>> If this theory is true, we must probably fix the sync.c to have 2 or
>>>> ideally 3-4 barriers instead of 1.
>>>>
>>>>>
>>>>> So why is votequorum_sync_init sometimes only called once? It looks
>>>>> like
>>>>> it's all down to whether we manage to iterate through all the calls to
>>>>> schedwrk_processor before entering the OPERATIONAL state. I haven't
>>>>> yet
>>>>> looked into exactly what controls the timing of these two things.
>>>>>
>>>>> Adding the following patch helps me to demonstrate the problem more
>>>>> clearly:
>>>>>
>>>>> diff --git a/exec/sync.c b/exec/sync.c
>>>>> index e7b71bd..a2fb06d 100644
>>>>> --- a/exec/sync.c
>>>>> +++ b/exec/sync.c
>>>>> @@ -544,6 +545,7 @@ static int schedwrk_processor (const void
>>>>> *context)
>>>>>                  }
>>>>>
>>>>>                  if
>>>>> (my_sync_callbacks_retrieve(my_service_list[my_processing_idx].service_id,
>>>>>
>>>>>
>>>>> NULL) != -1) {
>>>>> +                       log_printf(LOGSYS_LEVEL_NOTICE, "calling
>>>>> sync_init on service '%s' (%d) with my_member_list_entries = %d",
>>>>> my_service_list[my_processing_idx].name, my_processing_idx,
>>>>> my_member_list_entries);
>>>>>                          my_service_list[my_processing_idx].sync_init
>>>>> (my_trans_list,
>>>>>                                  my_trans_list_entries,
>>>>> my_member_list,
>>>>>                                  my_member_list_entries,
>>>>> diff --git a/exec/votequorum.c b/exec/votequorum.c
>>>>> index d5f06c1..aab6c15 100644
>>>>> --- a/exec/votequorum.c
>>>>> +++ b/exec/votequorum.c
>>>>> @@ -2336,6 +2353,8 @@ static void votequorum_sync_init (
>>>>>          int left_nodes;
>>>>>          struct cluster_node *node;
>>>>>
>>>>> +       log_printf(LOGSYS_LEVEL_NOTICE, "votequorum_sync_init has %d
>>>>> member_list_entries", member_list_entries);
>>>>> +
>>>>>          ENTER();
>>>>>
>>>>>          sync_in_progress = 1;
>>>>>
>>>>> When it works correctly I see the following (selected log lines):
>>>>>
>>>>> notice  [TOTEM ] A new membership (10.71.218.17:2016) was formed.
>>>>> Members joined: 1
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync configuration
>>>>> map access' (0) with my_member_list_entries = 2
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync cluster closed
>>>>> process group service v1.01' (1) with my_member_list_entries = 2
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync vote quorum
>>>>> service v1.0' (2) with my_member_list_entries = 2
>>>>> notice  [VOTEQ ] votequorum_sync_init has 2 member_list_entries
>>>>> notice  [TOTEM ] A new membership (10.71.218.18:2020) was formed.
>>>>> Members left: 1
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync configuration
>>>>> map access' (0) with my_member_list_entries = 1
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync cluster closed
>>>>> process group service v1.01' (1) with my_member_list_entries = 1
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync vote quorum
>>>>> service v1.0' (2) with my_member_list_entries = 1
>>>>> notice  [VOTEQ ] votequorum_sync_init has 1 member_list_entries
>>>>>
>>>>>   -- Notice that votequorum_sync_init is called once with 2 members
>>>>> and
>>>>> once with 1 member.
>>>>>
>>>>> When it goes wrong I see the following (selected log lines):
>>>>>
>>>>> notice  [TOTEM ] A new membership (10.71.218.17:2004) was formed.
>>>>> Members joined: 1
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync configuration
>>>>> map access' (0) with my_member_list_entries = 2
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync cluster closed
>>>>> process group service v1.01' (1) with my_member_list_entries = 2
>>>>> notice  [TOTEM ] A new membership (10.71.218.18:2008) was formed.
>>>>> Members left: 1
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync configuration
>>>>> map access' (0) with my_member_list_entries = 1
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync cluster closed
>>>>> process group service v1.01' (1) with my_member_list_entries = 1
>>>>> notice  [SYNC  ] calling sync_init on service 'corosync vote quorum
>>>>> service v1.0' (2) with my_member_list_entries = 1
>>>>> notice  [VOTEQ ] votequorum_sync_init has 1 member_list_entries
>>>>>
>>>>>   -- Notice the value of my_member_list_entries in the different
>>>>> sync_init calls, and that votequorum_sync_init is only called once.
>>>>>
>>>>> Does this help explain the issue?
>>>>
>>>> It's definitively helpful. I will try to thing about what may be
>>>> happening a little deeper. Can you please add similar debug info as
>>>> you have for sync_init also to other sync_ and send me the log of both
>>>> nodes when everything works and when do not?
>>>
>>> Okay. I've rolled back to make the call to api->shutdown_request() on
>>> config_version mismatch, and have added debug lines in sync.c just
>>> before calls to sync_activate, sync_abort and sync_process.
>>>
>>> When it's working correctly:
>>>
>>> 1508158482.991782 notice  [TOTEM ] A new membership (10.71.218.17:2384)
>>> was formed. Members joined: 1
>>> 1508158482.991930 notice  [MAIN  ] executing 'corosync vote quorum
>>> service v1.0' exec_handler_fn 0x56492b0d6bf0 for node 1 (fn 0)
>>> 1508158482.992029 notice  [VOTEQ ]
>>> message_handler_req_exec_votequorum_nodeinfo (0x56492b0d6bf0) marking
>>> node 1 as MEMBER
>>> 1508158482.992122 notice  [VOTEQ ] get_total_votes: node 1 is a MEMBER
>>> so counting vote
>>> 1508158482.992206 notice  [VOTEQ ] get_total_votes: node 2 is a MEMBER
>>> so counting vote
>>> 1508158482.992263 notice  [QUORUM] This node is within the primary
>>> component and will provide service.
>>> 1508158482.992307 notice  [QUORUM] Members[1]: 2
>>> 1508158482.992351 notice  [SYNC  ] calling sync_init on service
>>> 'corosync configuration map access' (0) with my_member_list_entries = 2
>>> 1508158482.992382 notice  [SYNC  ] calling sync_process on service
>>> 'corosync configuration map access' (0)
>>> 1508158482.992425 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x56492b0cedf0 for node 1 (fn 0)
>>> 1508158482.992456 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x56492b0cedf0 for node 2 (fn 0)
>>> 1508158482.992498 notice  [SYNC  ] calling sync_activate on service
>>> 'corosync configuration map access' (0)
>>> 1508158482.992529 notice  [SYNC  ] calling sync_init on service
>>> 'corosync cluster closed process group service v1.01' (1) with
>>> my_member_list_entries = 2
>>> 1508158482.992572 notice  [SYNC  ] calling sync_process on service
>>> 'corosync cluster closed process group service v1.01' (1)
>>> 1508158482.992603 notice  [MAIN  ] executing 'corosync cluster closed
>>> process group service v1.01' exec_handler_fn 0x56492b0d2940 for node 2
>>> (fn 5)
>>> 1508158482.992646 notice  [MAIN  ] executing 'corosync cluster closed
>>> process group service v1.01' exec_handler_fn 0x56492b0d2940 for node 1
>>> (fn 5)
>>> 1508158482.992677 notice  [SYNC  ] calling sync_activate on service
>>> 'corosync cluster closed process group service v1.01' (1)
>>> 1508158482.992720 notice  [SYNC  ] calling sync_init on service
>>> 'corosync vote quorum service v1.0' (2) with my_member_list_entries = 2
>>> 1508158482.992750 notice  [VOTEQ ] votequorum_sync_init has 2
>>> member_list_entries
>>> 1508158482.992792 notice  [SYNC  ] calling sync_process on service
>>> 'corosync vote quorum service v1.0' (2)
>>> 1508158482.992850 notice  [TOTEM ] A new membership (10.71.218.18:2388)
>>> was formed. Members left: 1
>>> 1508158482.992895 notice  [MAIN  ] executing 'corosync vote quorum
>>> service v1.0' exec_handler_fn 0x56492b0d6bf0 for node 2 (fn 0)
>>> 1508158482.992938 notice  [VOTEQ ]
>>> message_handler_req_exec_votequorum_nodeinfo (0x56492b0d6bf0) marking
>>> node 2 as MEMBER
>>> 1508158482.992968 notice  [VOTEQ ] get_total_votes: node 1 is a MEMBER
>>> so counting vote
>>> 1508158482.993011 notice  [VOTEQ ] get_total_votes: node 2 is a MEMBER
>>> so counting vote
>>> 1508158482.993042 notice  [MAIN  ] executing 'corosync vote quorum
>>> service v1.0' exec_handler_fn 0x56492b0d6bf0 for node 2 (fn 0)
>>> 1508158482.993084 notice  [SYNC  ] calling sync_init on service
>>> 'corosync configuration map access' (0) with my_member_list_entries = 1
>>> 1508158482.993115 notice  [SYNC  ] calling sync_process on service
>>> 'corosync configuration map access' (0)
>>> 1508158482.993157 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x56492b0cedf0 for node 2 (fn 0)
>>> 1508158482.993188 notice  [SYNC  ] calling sync_activate on service
>>> 'corosync configuration map access' (0)
>>> 1508158482.993231 notice  [SYNC  ] calling sync_init on service
>>> 'corosync cluster closed process group service v1.01' (1) with
>>> my_member_list_entries = 1
>>> 1508158482.993261 notice  [SYNC  ] calling sync_process on service
>>> 'corosync cluster closed process group service v1.01' (1)
>>> 1508158482.993304 notice  [MAIN  ] executing 'corosync cluster closed
>>> process group service v1.01' exec_handler_fn 0x56492b0d2940 for node 2
>>> (fn 5)
>>> 1508158482.993335 notice  [SYNC  ] calling sync_activate on service
>>> 'corosync cluster closed process group service v1.01' (1)
>>> 1508158482.993378 notice  [SYNC  ] calling sync_init on service
>>> 'corosync vote quorum service v1.0' (2) with my_member_list_entries = 1
>>> 1508158482.993409 notice  [VOTEQ ] votequorum_sync_init has 1
>>> member_list_entries
>>> 1508158482.993452 notice  [VOTEQ ] votequorum_sync_init marking node 1
>>> as DEAD
>>> 1508158482.993482 notice  [SYNC  ] calling sync_process on service
>>> 'corosync vote quorum service v1.0' (2)
>>> 1508158482.993525 notice  [MAIN  ] executing 'corosync vote quorum
>>> service v1.0' exec_handler_fn 0x56492b0d6bf0 for node 2 (fn 0)
>>> 1508158482.993555 notice  [VOTEQ ]
>>> message_handler_req_exec_votequorum_nodeinfo (0x56492b0d6bf0) marking
>>> node 2 as MEMBER
>>> 1508158482.993598 notice  [VOTEQ ] get_total_votes: node 2 is a MEMBER
>>> so counting vote
>>> 1508158482.993628 notice  [MAIN  ] executing 'corosync vote quorum
>>> service v1.0' exec_handler_fn 0x56492b0d6bf0 for node 2 (fn 0)
>>> 1508158482.993673 notice  [SYNC  ] calling sync_activate on service
>>> 'corosync vote quorum service v1.0' (2)
>>> 1508158482.993703 notice  [VOTEQ ] get_total_votes: node 2 is a MEMBER
>>> so counting vote
>>> 1508158482.993746 notice  [QUORUM] This node is within the non-primary
>>> component and will NOT provide any services.
>>> 1508158482.993776 notice  [QUORUM] Members[1]: 2
>>> 1508158482.993818 notice  [MAIN  ] Completed service synchronization,
>>> ready to provide service.
>>>
>>> When it's working incorrectly:
>>>
>>> 1508158493.905246 notice  [TOTEM ] A new membership (10.71.218.17:2392)
>>> was formed. Members joined: 1
>>> 1508158493.905392 notice  [MAIN  ] executing 'corosync vote quorum
>>> service v1.0' exec_handler_fn 0x56492b0d6bf0 for node 1 (fn 0)
>>> 1508158493.905522 notice  [VOTEQ ]
>>> message_handler_req_exec_votequorum_nodeinfo (0x56492b0d6bf0) marking
>>> node 1 as MEMBER
>>> 1508158493.905623 notice  [VOTEQ ] get_total_votes: node 1 is a MEMBER
>>> so counting vote
>>> 1508158493.905697 notice  [VOTEQ ] get_total_votes: node 2 is a MEMBER
>>> so counting vote
>>> 1508158493.905742 notice  [QUORUM] This node is within the primary
>>> component and will provide service.
>>> 1508158493.905773 notice  [QUORUM] Members[1]: 2
>>> 1508158493.905815 notice  [SYNC  ] calling sync_init on service
>>> 'corosync configuration map access' (0) with my_member_list_entries = 2
>>> 1508158493.905889 notice  [SYNC  ] calling sync_process on service
>>> 'corosync configuration map access' (0)
>>> 1508158493.905921 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x56492b0cedf0 for node 1 (fn 0)
>>> 1508158493.905951 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x56492b0cedf0 for node 2 (fn 0)
>>> 1508158493.905993 notice  [SYNC  ] calling sync_activate on service
>>> 'corosync configuration map access' (0)
>>> 1508158493.906023 notice  [SYNC  ] calling sync_init on service
>>> 'corosync cluster closed process group service v1.01' (1) with
>>> my_member_list_entries = 2
>>> 1508158493.906066 notice  [SYNC  ] calling sync_process on service
>>> 'corosync cluster closed process group service v1.01' (1)
>>> 1508158493.906096 notice  [MAIN  ] executing 'corosync cluster closed
>>> process group service v1.01' exec_handler_fn 0x56492b0d2940 for node 2
>>> (fn 5)
>>> 1508158493.906138 notice  [TOTEM ] A new membership (10.71.218.18:2396)
>>> was formed. Members left: 1
>>> 1508158493.906168 notice  [SYNC  ] calling sync_init on service
>>> 'corosync configuration map access' (0) with my_member_list_entries = 1
>>> 1508158493.906210 notice  [SYNC  ] calling sync_process on service
>>> 'corosync configuration map access' (0)
>>> 1508158493.906240 notice  [MAIN  ] executing 'corosync configuration map
>>> access' exec_handler_fn 0x56492b0cedf0 for node 2 (fn 0)
>>> 1508158493.906282 notice  [SYNC  ] calling sync_activate on service
>>> 'corosync configuration map access' (0)
>>> 1508158493.906312 notice  [SYNC  ] calling sync_init on service
>>> 'corosync cluster closed process group service v1.01' (1) with
>>> my_member_list_entries = 1
>>> 1508158493.906354 notice  [SYNC  ] calling sync_process on service
>>> 'corosync cluster closed process group service v1.01' (1)
>>> 1508158493.906384 notice  [MAIN  ] executing 'corosync cluster closed
>>> process group service v1.01' exec_handler_fn 0x56492b0d2940 for node 2
>>> (fn 5)
>>> 1508158493.906427 notice  [SYNC  ] calling sync_activate on service
>>> 'corosync cluster closed process group service v1.01' (1)
>>> 1508158493.906457 notice  [SYNC  ] calling sync_init on service
>>> 'corosync vote quorum service v1.0' (2) with my_member_list_entries = 1
>>> 1508158493.906499 notice  [VOTEQ ] votequorum_sync_init has 1
>>> member_list_entries
>>> 1508158493.906528 notice  [SYNC  ] calling sync_process on service
>>> 'corosync vote quorum service v1.0' (2)
>>> 1508158493.906570 notice  [MAIN  ] executing 'corosync vote quorum
>>> service v1.0' exec_handler_fn 0x56492b0d6bf0 for node 2 (fn 0)
>>> 1508158493.906600 notice  [VOTEQ ]
>>> message_handler_req_exec_votequorum_nodeinfo (0x56492b0d6bf0) marking
>>> node 2 as MEMBER
>>> 1508158493.906642 notice  [VOTEQ ] get_total_votes: node 1 is a MEMBER
>>> so counting vote
>>> 1508158493.906672 notice  [VOTEQ ] get_total_votes: node 2 is a MEMBER
>>> so counting vote
>>> 1508158493.906714 notice  [MAIN  ] executing 'corosync vote quorum
>>> service v1.0' exec_handler_fn 0x56492b0d6bf0 for node 2 (fn 0)
>>> 1508158493.906743 notice  [SYNC  ] calling sync_activate on service
>>> 'corosync vote quorum service v1.0' (2)
>>> 1508158493.906785 notice  [VOTEQ ] get_total_votes: node 1 is a MEMBER
>>> so counting vote
>>> 1508158493.906815 notice  [VOTEQ ] get_total_votes: node 2 is a MEMBER
>>> so counting vote
>>> 1508158493.906856 notice  [QUORUM] Members[1]: 2
>>> 1508158493.906886 notice  [MAIN  ] Completed service synchronization,
>>> ready to provide service.
>>>
>>> Now it seems that sync calls votequorum's exec_handler_fn in both
>>> working and non-working cases. So now it relies on votequorum_sync_init
>>> being called once with both members and once without node 1 in order to
>>> set the node as NODESTATE_DEAD.
>>
>> votequorum's exec_handler_fn is probably executed because of icmap
>> events.
>>
>> Can you please try to set debug to trace (so config file will look like
>>
>> logging {
>>    to_syslog: yes
>>    to_stderr: yes
>>    debug: trace
>> }
>>
>> and send full log (so including "notice  [MAIN  ] Corosync Cluster
>> Engine ('2.4.2.45-2fbb'): started and ready to provide service.") from
>> both nodes with description which one is which (ip + nodeid) for
>> failing case?
>
> Interestingly I didn't manage to reproduce the problem when I use
> "debug: trace" on the node that joins and immediately leaves (cluster1).
> That suggests to me that it's a timing issue (and is probably the reason
> I can reproduce it consistently but you can't).

Exactly.

>
> Therefore the best I can do is give you "debug: trace" logs from
> cluster2 but only "debug: off" logs from cluster1 in the case where it
> ends up in the wrong state. Is that at all useful?

Sure

>
> IP addresses:
>    10.71.218.18    cluster2
>    10.71.218.17    cluster1
>
> Nodeid mapping:
>    nodelist {
>      node {
>        nodeid: 1
>        ring0_addr: cluster1
>      }
>      node {
>        nodeid: 2
>        ring0_addr: cluster2
>      }
>    }
>
> Method:
>   * I run the following command on cluster2:
>      corosync -f 2>&1 | ts '%.s' | tee cluster2.log
>
>   * Then run the following command on cluster1:
>      corosync -f 2>&1 | ts '%.s' | tee cluster1.log
>
>   * Then I take a look at corosync-quorumtool on cluster2 to check
> whether it triggered the bug or not.
>
>   * Then I press Ctrl-C on cluster2.
>
> I have attached zipped versions of cluster1.log and cluster2.log from
> when it worked and when it broke. I hope you're able to spot something
> useful!

Can you please try to remove 
"votequorum_exec_send_nodeinfo(us->node_id);" line from votequorum.c in 
the votequorum_exec_init_fn function (around line 2306) and let me know 
if problem persists?

Honza

>
> Thanks,
> Jonathan