[ClusterLabs] two virtual domains start and stop every 15 minutes
Lentes, Bernd
bernd.lentes at helmholtz-muenchen.de
Wed Jun 19 12:46:07 EDT 2019
----- On Jun 15, 2019, at 4:30 PM, Bernd Lentes bernd.lentes at helmholtz-muenchen.de wrote:
> ----- On Jun 14, 2019, at 9:20 PM, kgaillot kgaillot at redhat.com wrote:
>
>> On Fri, 2019-06-14 at 18:27 +0200, Lentes, Bernd wrote:
>>> Hi,
>>>
>>> I had this problem once before, but it's still not clear to me what
>>> really happens.
>>> I had this problem some days ago:
>>> I have a 2-node cluster with several virtual domains as resources. I
>>> put one node (ha-idg-2) into standby, and two running virtual domains
>>> were migrated to the other node (ha-idg-1). The other virtual domains
>>> were already running on ha-idg-1.
>>> Since then, the two migrated domains (vm_idcc_devel and vm_severin)
>>> have been started and stopped every 15 minutes on ha-idg-1.
>>> ha-idg-2 remains in standby.
>>> I know that the 15-minute interval is related to the
>>> "cluster-recheck-interval".
>>> But why are these two domains started and stopped?
>>> I dug through the logs, checked the pe-input files, and looked at
>>> graphs created by crm_simulate with dotty ...
>>> I always see that the domains are started, stopped 15 minutes later,
>>> started again 15 minutes after that ...
>>> but I don't see WHY. I would really like to know that.
>>> And why are the domains not restarted by the recurring monitor
>>> operation? It should recognize that a domain is stopped and start
>>> it again. My monitor interval is 30 seconds.
>>> I had two pending errors concerning these domains, failed
>>> migrations from ha-idg-1 to ha-idg-2, from some time before.
>>> Could that be the culprit?
>>>
>>> I still have all the logs from that time, if you need information
>>> just let me know.
>>
>> Yes, the logs and pe-input files would be helpful. It sounds like a bug
>> in the scheduler. What version of Pacemaker are you running?
>>
>
> Hi,
>
> here are the log and some pe-input files:
> https://hmgubox.helmholtz-muenchen.de/d/f28f6961722f472eb649/
> On June 6th at 15:41:28 I issued "crm node standby ha-idg-2"; then the
> trouble began.
> I'm running pacemaker-1.1.19+20181105.ccd6b5b10-3.10.1.x86_64 on SLES 12 SP4 and
> kernel 4.12.14-95.13.
>
Hi,
the problem arose again.
And what attracted my attention: when I make a change in the configuration, e.g. some slight change
to a resource, the cluster immediately starts or stops the domains, depending on their previous state.
The fence resource itself is not affected by this start/stop.
Example (a slight change to a stonith resource):
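The change was essentially this (a sketch using crm_resource; I actually applied it via a shadow CIB with crm, hence the cib_replace from crm_shadow in the log below):

# sketch: add pcmk_off_action="Off" to the fence resource for ha-idg-2
crm_resource --resource fence_ilo_ha-idg-2 --set-parameter pcmk_off_action --parameter-value Off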
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_process_request: Forwarding cib_replace operation for section configuration to all (origin=local/crm_shadow/2)
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: __xml_diff_object: Moved nvpair at id (0 -> 2)
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: __xml_diff_object: Moved nvpair at name (1 -> 0)
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: --- 2.6990.1043 2
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: +++ 2.6991.0 6a5f09a19ae7d0a7bae55bddb9d1564f <===================== new epoch
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: -- /cib/configuration/resources/primitive[@id='fence_ilo_ha-idg-2']/instance_attributes[@id='fence_ha-idg-2-instance_attributes']/nvpair[@id='fence_ha-idg-2-instance_attributes-action']
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: -- /cib/configuration/resources/primitive[@id='fence_ilo_ha-idg-2']/instance_attributes[@id='fence_ha-idg-2-instance_attributes-0']/nvpair[@id='fence_ha-idg-2-instance_attributes-0-ipaddr']
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: -- /cib/configuration/resources/primitive[@id='fence_ilo_ha-idg-2']/instance_attributes[@id='fence_ha-idg-2-instance_attributes-1']/nvpair[@id='fence_ha-idg-2-instance_attributes-1-login']
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: -- /cib/configuration/resources/primitive[@id='fence_ilo_ha-idg-2']/instance_attributes[@id='fence_ha-idg-2-instance_attributes-2']/nvpair[@id='fence_ha-idg-2-instance_attributes-2-passwd']
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: -- /cib/configuration/resources/primitive[@id='fence_ilo_ha-idg-2']/instance_attributes[@id='fence_ha-idg-2-instance_attributes-3']/nvpair[@id='fence_ha-idg-2-instance_attributes-3-ssl_insecure']
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: -- /cib/configuration/resources/primitive[@id='fence_ilo_ha-idg-2']/instance_attributes[@id='fence_ha-idg-2-instance_attributes-4']/nvpair[@id='fence_ha-idg-2-instance_attributes-4-delay']
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: + /cib: @epoch=6991, @num_updates=0
Jun 18 18:07:09 [9577] ha-idg-1 cib: info: cib_perform_op: ++ /cib/configuration/resources/primitive[@id='fence_ilo_ha-idg-2']/instance_attributes[@id='fence_ha-idg-2-instance_attributes']: <nvpair name="pcmk_off_action" value="Off" id="fence_ha-idg-2-instance_attributes-pcmk_off_action"/>
...
The cluster reacts immediately:
Jun 18 18:07:10 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_sim ( ha-idg-2 )
Jun 18 18:07:10 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_geneious ( ha-idg-2 )
Jun 18 18:07:10 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_idcc_devel ( ha-idg-2 )
Jun 18 18:07:10 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_genetrap ( ha-idg-2 )
Jun 18 18:07:10 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_mouseidgenes ( ha-idg-2 )
Jun 18 18:07:10 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_greensql ( ha-idg-2 )
Jun 18 18:07:10 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_severin ( ha-idg-2 )
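(A live simulation shows the same pending actions without executing anything; I find this a quick way to double-check what the scheduler intends, assuming nothing else changes in between:)

# ask the scheduler what it would do right now, based on the live CIB
crm_simulate -S -L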
What else surprises me:
with the change to the stonith resource a new epoch (6991) was created. Afterwards, all start/stop actions on the domains, even those happening hours later, still relate to this epoch:
...
Jun 18 21:15:07 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: --- 2.6991.178 2 <========================
Jun 18 21:15:07 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: +++ 2.6991.179 (null) <========================
Jun 18 21:15:07 [9577] ha-idg-1 cib: info: cib_perform_op: + /cib: @num_updates=179
Jun 18 21:15:07 [9577] ha-idg-1 cib: info: cib_perform_op: + /cib/status/node_state[@id='1084777492']/lrm[@id='1084777492']/lrm_resources/lrm_resource[@id='vm_sim']/lrm_rsc_op[@id='vm_sim_last_0']: @operation_key=vm_sim_start_0, @operation=start, @transition-key=33:178:0:612aaad1-30d6-4a94-978e-fcbece63cb8f, @transition-magic=0:0;33:178:0:612aaad1-30d6-4a94-978e-fcbece63cb8f, @call-id=4693, @exec-time=2928
Jun 18 21:15:07 [9577] ha-idg-1 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=ha-idg-2/crmd/1977, version=2.6991.179)
Jun 18 21:15:07 [9583] ha-idg-1 crmd: info: match_graph_event: Action vm_sim_start_0 (33) confirmed on ha-idg-2 (rc=0)
Jun 18 21:15:07 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating monitor operation vm_sim_monitor_30000 on ha-idg-2 | action 34
Jun 18 21:15:07 [9578] ha-idg-1 stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: modify lrm_rsc_op[@id='vm_sim_last_0']
Jun 18 21:15:07 [9578] ha-idg-1 stonith-ng: info: cib_devices_update: Updating devices to version 2.6991.179
Jun 18 21:15:07 [9578] ha-idg-1 stonith-ng: notice: unpack_config: On loss of CCM Quorum: Ignore
Jun 18 21:15:07 [9578] ha-idg-1 stonith-ng: info: cib_device_update: Device fence_ilo_ha-idg-1 has been disabled on ha-idg-1: score=-INFINITY
Jun 18 21:15:07 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: --- 2.6991.179 2
Jun 18 21:15:07 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: +++ 2.6991.180 (null)
...
This continues until the next change of the configuration, when a new epoch is created.
Normally these epochs change very frequently; here they remain unchanged for a long time.
Does that mean that the actions, even hours later, still relate to this epoch? What is the purpose of this epoch?
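As far as I understand it, the version is admin_epoch.epoch.num_updates: the epoch is bumped on every configuration change, while num_updates only counts status updates since then and is reset to 0 by the next configuration change. So 2.6991.179 would mean 179 status changes since configuration epoch 6991. The current values can be read from the root element:

# the root <cib> element carries admin_epoch, epoch and num_updates
cibadmin --query | head -n 1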
Also interesting:
Jun 17 16:47:05 [9581] ha-idg-1 pengine: notice: process_pe_message: Calculated transition 62, saving inputs in /var/lib/pacemaker/pengine/pe-input-1085.bz2
Jun 17 16:47:05 [9583] ha-idg-1 crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
Jun 17 16:47:05 [9583] ha-idg-1 crmd: notice: do_te_invoke: Processing graph 62 (ref=pe_calc-dc-1560782825-446) derived from /var/lib/pacemaker/pengine/pe-input-1085.bz2
Jun 17 16:47:05 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_mausdb_stop_0 on ha-idg-2 | action 37
Jun 17 16:47:05 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_sim_stop_0 on ha-idg-2 | action 39
Jun 17 16:47:05 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_geneious_stop_0 on ha-idg-2 | action 41
Jun 17 16:47:05 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_idcc_devel_stop_0 on ha-idg-2 | action 43
Jun 17 16:47:05 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_genetrap_stop_0 on ha-idg-2 | action 45
Jun 17 16:47:05 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_mouseidgenes_stop_0 on ha-idg-2 | action 47
Jun 17 16:47:05 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_greensql_stop_0 on ha-idg-2 | action 49
Jun 17 16:47:05 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_severin_stop_0 on ha-idg-2 | action 51
...
Jun 17 16:48:20 [9583] ha-idg-1 crmd: notice: run_graph: Transition 62 (Complete=9, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1085.bz2): Complete
Jun 17 16:48:20 [9583] ha-idg-1 crmd: info: do_log: Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd
Jun 17 16:48:20 [9583] ha-idg-1 crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd
15 min later:
...
Jun 17 17:03:20 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_mausdb ( ha-idg-2 )
Jun 17 17:03:20 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_sim ( ha-idg-2 )
Jun 17 17:03:20 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_geneious ( ha-idg-2 )
Jun 17 17:03:20 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_idcc_devel ( ha-idg-2 )
Jun 17 17:03:20 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_genetrap ( ha-idg-2 )
Jun 17 17:03:20 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_mouseidgenes ( ha-idg-2 )
Jun 17 17:03:20 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_greensql ( ha-idg-2 )
Jun 17 17:03:20 [9581] ha-idg-1 pengine: notice: LogAction: * Restart vm_severin ( ha-idg-2 )
...
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_mausdb_stop_0 on ha-idg-2 | action 29
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_sim_stop_0 on ha-idg-2 | action 32
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_geneious_stop_0 on ha-idg-2 | action 35
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_idcc_devel_stop_0 on ha-idg-2 | action 38
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_genetrap_stop_0 on ha-idg-2 | action 41
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_mouseidgenes_stop_0 on ha-idg-2 | action 44
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_greensql_stop_0 on ha-idg-2 | action 47
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_severin_stop_0 on ha-idg-2 | action 50
...
Although the domains are already stopped, they are stopped again!?
And immediately after being stopped they are started again:
Jun 17 17:03:20 [9583] ha-idg-1 crmd: info: match_graph_event: Action vm_mausdb_stop_0 (29) confirmed on ha-idg-2 (rc=0)
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating start operation vm_mausdb_start_0 on ha-idg-2 | action 30
Jun 17 17:03:20 [9583] ha-idg-1 crmd: info: match_graph_event: Action vm_idcc_devel_stop_0 (38) confirmed on ha-idg-2 (rc=0)
Jun 17 17:03:20 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating start operation vm_idcc_devel_start_0 on ha-idg-2 | action 39
Stopped and immediately started again!?
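(The graphs for dotty mentioned in my first mail were produced by replaying the saved pe-input files offline, e.g. for transition 62:)

# replay a saved scheduler input: show the planned actions and allocation scores
crm_simulate -S -x /var/lib/pacemaker/pengine/pe-input-1085.bz2 -s
# write the transition graph in dot format, for viewing with dotty
crm_simulate -S -x /var/lib/pacemaker/pengine/pe-input-1085.bz2 -D transition-62.dot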
Also interesting:
...
Jun 19 14:47:42 [9583] ha-idg-1 crmd: info: crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped (900000ms)
Jun 19 14:47:42 [9583] ha-idg-1 crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped
Jun 19 14:47:42 [9583] ha-idg-1 crmd: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
Jun 19 14:47:42 [9581] ha-idg-1 pengine: notice: unpack_config: On loss of CCM Quorum: Ignore
Jun 19 14:47:42 [9581] ha-idg-1 pengine: info: unpack_status: Node ha-idg-1 is in standby-mode
Jun 19 14:47:42 [9581] ha-idg-1 pengine: info: determine_online_status_fencing: Node ha-idg-1 is active
Jun 19 14:47:42 [9581] ha-idg-1 pengine: info: determine_online_status: Node ha-idg-1 is standby
Jun 19 14:47:42 [9581] ha-idg-1 pengine: info: determine_online_status_fencing: Node ha-idg-2 is active
Jun 19 14:47:42 [9581] ha-idg-1 pengine: info: determine_online_status: Node ha-idg-2 is online
Jun 19 14:47:42 [9581] ha-idg-1 pengine: warning: unpack_rsc_op_failure: Processing failed monitor of vm_mausdb on ha-idg-2: not running | rc=7
Jun 19 14:47:42 [9581] ha-idg-1 pengine: warning: unpack_rsc_op_failure: Processing failed migrate_to of vm_mouseidgenes on ha-idg-2: unknown error | rc=1
Jun 19 14:47:42 [9581] ha-idg-1 pengine: warning: unpack_rsc_op_failure: Processing failed migrate_to of vm_idcc_devel on ha-idg-2: unknown error | rc=1
Jun 19 14:47:42 [9581] ha-idg-1 pengine: warning: unpack_rsc_op_failure: Processing failed migrate_to of vm_sim on ha-idg-2: unknown error | rc=1
Jun 19 14:47:42 [9581] ha-idg-1 pengine: warning: unpack_rsc_op_failure: Processing failed migrate_to of vm_genetrap on ha-idg-2: unknown error | rc=1
Jun 19 14:47:42 [9581] ha-idg-1 pengine: warning: unpack_rsc_op_failure: Processing failed migrate_to of vm_geneious on ha-idg-2: unknown error | rc=1
Jun 19 14:47:42 [9581] ha-idg-1 pengine: warning: unpack_rsc_op_failure: Processing failed migrate_to of vm_greensql on ha-idg-2: unknown error | rc=1
Jun 19 14:47:42 [9581] ha-idg-1 pengine: warning: unpack_rsc_op_failure: Processing failed migrate_to of vm_severin on ha-idg-2: unknown error | rc=1
Jun 19 14:47:42 [9581] ha-idg-1 pengine: info: unpack_node_loop: Node 1084777482 is already processed
Jun 19 14:47:42 [9581] ha-idg-1 pengine: info: unpack_node_loop: Node 1084777492 is already processed
Jun 19 14:47:42 [9581] ha-idg-1 pengine: info: unpack_node_loop: Node 1084777482 is already processed
Jun 19 14:47:42 [9581] ha-idg-1 pengine: info: unpack_node_loop: Node 1084777492 is already processed
...
Jun 19 14:47:42 [9581] ha-idg-1 pengine: notice: process_pe_message: ===============> Calculated transition 250 <===============, saving inputs in /var/lib/pacemaker/pengine/pe-input-1273.bz2
Jun 19 14:47:42 [9583] ha-idg-1 crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
Jun 19 14:47:42 [9583] ha-idg-1 crmd: notice: do_te_invoke: Processing graph 250 (ref=pe_calc-dc-1560948462-3241) derived from /var/lib/pacemaker/pengine/pe-input-1273.bz2 <===============================
Jun 19 14:47:42 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_sim_stop_0 on ha-idg-2 | action 32
Jun 19 14:47:42 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_geneious_stop_0 on ha-idg-2 | action 35
Jun 19 14:47:42 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_idcc_devel_stop_0 on ha-idg-2 | action 38
Jun 19 14:47:42 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_genetrap_stop_0 on ha-idg-2 | action 41
Jun 19 14:47:42 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_mouseidgenes_stop_0 on ha-idg-2 | action 44
Jun 19 14:47:42 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_greensql_stop_0 on ha-idg-2 | action 47
Jun 19 14:47:42 [9583] ha-idg-1 crmd: notice: te_rsc_command: Initiating stop operation vm_severin_stop_0 on ha-idg-2 | action 50
...
The changes are calculated as transition 250, which completes:
...
Jun 19 14:47:46 [9583] ha-idg-1 crmd: notice: run_graph: =========> Transition 250 (Complete=22, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1273.bz2): Complete <===========
Jun 19 14:47:46 [9583] ha-idg-1 crmd: info: do_log: Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd
Jun 19 14:47:46 [9583] ha-idg-1 crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd
Jun 19 14:47:46 [9578] ha-idg-1 stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: modify lrm_rsc_op[@id='vm_mouseidgenes_monitor_30000']
Jun 19 14:47:46 [9578] ha-idg-1 stonith-ng: info: cib_devices_update: Updating devices to version 2.7007.890
Jun 19 14:47:46 [9578] ha-idg-1 stonith-ng: notice: unpack_config: On loss of CCM Quorum: Ignore
Jun 19 14:47:46 [9578] ha-idg-1 stonith-ng: info: cib_device_update: Device fence_ilo_ha-idg-1 has been disabled on ha-idg-1: score=-INFINITY
Transition 250 is complete. But:
...
Jun 19 14:47:51 [9577] ha-idg-1 cib: info: cib_process_ping: Reporting our current digest to ha-idg-1: ae505b65505c427ab0a45d36717b4135 for 2.7007.890 (0x1ea7670 0)
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_process_request: Forwarding cib_delete operation for section //node_state[@uname='ha-idg-1']//lrm_resource[@id='vm_mouseidgenes'] to all (origin=local/crmd/707)
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: --- 2.7007.890 2
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: +++ 2.7007.891 2d0272cf2594d336b179085d63b67c6f
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: -- /cib/status/node_state[@id='1084777482']/lrm[@id='1084777482']/lrm_resources/lrm_resource[@id='vm_mouseidgenes']
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: + /cib: @num_updates=891
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_process_request: Completed cib_delete operation for section //node_state[@uname='ha-idg-1']//lrm_resource[@id='vm_mouseidgenes']: OK (rc=0, origin=ha-idg-1/crmd/707, version=2.7007.890)
Jun 19 14:57:32 [9583] ha-idg-1 crmd: info: delete_resource: Removing resource vm_mouseidgenes for 4460a5c3-c009-44f6-a01d-52f93e731fda (root) on ha-idg-1
Jun 19 14:57:32 [9583] ha-idg-1 crmd: info: notify_deleted: Notifying 4460a5c3-c009-44f6-a01d-52f93e731fda on ha-idg-1 that vm_mouseidgenes was deleted
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_process_request: Forwarding cib_delete operation for section //node_state[@uname='ha-idg-1']//lrm_resource[@id='vm_mouseidgenes'] to all (origin=local/crmd/708)
Jun 19 14:57:32 [9583] ha-idg-1 crmd: warning: qb_ipcs_event_sendv: new_event_notification (9583-10294-15): Broken pipe (32)
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: --- 2.7007.890 2
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: +++ 2.7007.891 (null)
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: -- /cib/status/node_state[@id='1084777482']/lrm[@id='1084777482']/lrm_resources/lrm_resource[@id='vm_mouseidgenes']
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: + /cib: @num_updates=891
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_process_request: Completed cib_delete operation for section //node_state[@uname='ha-idg-1']//lrm_resource[@id='vm_mouseidgenes']: OK (rc=0, origin=ha-idg-1/crmd/708, version=2.7007.891)
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_process_request: Forwarding cib_modify operation for section crm_config to all (origin=local/crmd/710)
Jun 19 14:57:32 [9583] ha-idg-1 crmd: info: abort_transition_graph: ========> Transition 250 aborted <============== by deletion of lrm_resource[@id='vm_mouseidgenes']: Resource state removal | cib=2.7007.891 source=abort_unless_down:344 path=/cib/status/node_state[@id='1084777482']/lrm[@id='1084777482']/lrm_resources/lrm_resource[@id='vm_mouseidgenes'] complete=true
Jun 19 14:57:32 [9583] ha-idg-1 crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
Jun 19 14:57:32 [9578] ha-idg-1 stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: delete lrm_resource[@id='vm_mouseidgenes']
Jun 19 14:57:32 [9578] ha-idg-1 stonith-ng: info: cib_devices_update: Updating devices to version 2.7007.891
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: --- 2.7007.891 2
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: Diff: +++ 2.7008.0 (null)
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: + /cib: @epoch=7008, @num_updates=0
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_perform_op: + /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']/nvpair[@id='cib-bootstrap-options-last-lrm-refresh']: @value=1560949052
Jun 19 14:57:32 [9578] ha-idg-1 stonith-ng: notice: unpack_config: On loss of CCM Quorum: Ignore
Jun 19 14:57:32 [9577] ha-idg-1 cib: info: cib_process_request: Completed cib_modify operation for section crm_config: OK (rc=0, origin=ha-idg-1/crmd/710, version=2.7008.0)
Jun 19 14:57:32 [9583] ha-idg-1 crmd: info: abort_transition_graph: Transition 250 aborted by cib-bootstrap-options-last-lrm-refresh doing modify last-lrm-refresh=1560949052: Configuration change | cib=2.7008.0 source=te_update_diff_v2:500 path=/cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']/nvpair[@id='cib-bootstrap-options-last-lrm-refresh'] complete=true
A few minutes later transition 250 is aborted. How can something which is already complete be aborted?
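What I plan to try next: clear the old failed migrate_to operations that still show up for the domains, in case those stale failures are what keeps triggering the recalculation (a sketch, assuming a cleanup is acceptable here):

# clear the stale failed operations for one domain on ha-idg-2
crm resource cleanup vm_idcc_devel ha-idg-2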
Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671