[ClusterLabs] Stop one VM, another tries to migrate

Ken Gaillot kgaillot at redhat.com
Tue Jun 26 12:40:11 EDT 2018


On Tue, 2018-06-26 at 07:19 -0400, Jason Gauthier wrote:
> Greetings,
> 
>    I am using my cluster platform primarily for virtual machines.
> Although I'm still in implementation mode, things felt somewhat
> stable. However, I've noticed that sometimes when I stop a
> resource, another resource tries to migrate. I did so this morning,
> and that scenario occurred. Basically, I ran 'crm resource stop Omicron',
> and the machine 'Lapras' tried to migrate as well. I've included
> cluster logs since I can't make heads or tails of this decision.

Have a look at resource-stickiness.

Basically, the cluster will by default try to balance the number of
resources across all nodes (subject to your constraints, of course).
Here, stopping Omicron left alpha with fewer resources than beta, which
is most likely why Lapras was picked to move. Stickiness tells the
cluster to prefer keeping running resources where they are, and to
consider balancing only when starting a resource.
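
As a rough sketch of what that could look like with the crm shell you
are already using: the value 100 below is an arbitrary illustration and
not a recommendation, and Lapras is used purely as an example resource.
Any positive score makes a running resource prefer its current node.

```shell
# Assumed example value; choose a score that fits your own constraints.
# Cluster-wide default applied to all resources:
crm configure rsc_defaults resource-stickiness=100

# Or set it on a single resource only (Lapras is just an example here):
crm resource meta Lapras set resource-stickiness 100
```

The default stickiness is 0, so without a setting like this nothing
holds a running resource on its current node. Running "crm_simulate -sL"
before and after the change shows the allocation scores computed from
the live CIB, which lets you check the effect without stopping anything.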

> 
> I've attached a cluster log, but also put it inline here since I'm
> not sure which way is preferred.  This log only covers the actions
> taken after I issued the resource stop.
> 
> Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
>  Diff: --- 1.442.64 2
> Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
>  Diff: +++ 1.443.0 92508eef9d32f83b93e7f1ed2dff3340
> Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
>  +  /cib:  @epoch=443, @num_updates=0
> Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
>  +  /cib/configuration/resources/primitive[@id='Omicron']/meta_attributes[@id='Omicron-meta_attributes']/nvpair[@id='Omicron-meta_attributes-target-role']:  @value=Stopped
> Jun 26 07:01:49 [4557] alpha       crmd:     info:
> abort_transition_graph:      Transition aborted by
> Omicron-meta_attributes-target-role doing modify target-role=Stopped:
> Configuration change | cib=1.443.0 source=te_update_diff:444
> path=/cib/configuration/resources/primitive[@id='Omicron']/meta_attributes[@id='Omicron-meta_attributes']/nvpair[@id='Omicron-meta_attributes-target-role']
> complete=true
> Jun 26 07:01:49 [4557] alpha       crmd:   notice:
> do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE |
> input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
> Jun 26 07:01:49 [4553] alpha stonith-ng:     info:
> update_cib_stonith_devices_v2:       Updating device list from the
> cib: modify nvpair[@id='Omicron-meta_attributes-target-role']
> Jun 26 07:01:49 [4553] alpha stonith-ng:     info:
> cib_devices_update:
>  Updating devices to version 1.443.0
> Jun 26 07:01:49 [4552] alpha        cib:     info:
> cib_process_request: Completed cib_apply_diff operation for section
> 'all': OK (rc=0, origin=alpha/cibadmin/2, version=1.443.0)
> Jun 26 07:01:49 [4553] alpha stonith-ng:     info: cib_device_update:
>  Device ipmi_alpha has been disabled on alpha: score=-INFINITY
> Jun 26 07:01:49 [4552] alpha        cib:     info: cib_file_backup:
>  Archived previous version as /var/lib/pacemaker/cib/cib-83.raw
> Jun 26 07:01:49 [4552] alpha        cib:     info:
> cib_file_write_with_digest:  Wrote version 1.443.0 of the CIB to disk
> (digest: 2a60981d2eceb59a6ed3015ce20f9dff)
> Jun 26 07:01:49 [4552] alpha        cib:     info:
> cib_file_write_with_digest:  Reading cluster configuration file
> /var/lib/pacemaker/cib/cib.g9gWwY (digest:
> /var/lib/pacemaker/cib/cib.tslkpk)
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_online_status_fencing:     Node beta is active
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_online_status:     Node beta is online
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_online_status_fencing:     Node alpha is active
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_online_status:     Node alpha is online
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Calibre active
> on beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Calibre active
> on beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Iota active on
> beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Iota active on
> beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Lapras active
> on beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Lapras active
> on beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Tau active on
> beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Tau active on
> beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Omicron active
> on alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Omicron active
> on alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Plex active on
> alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Plex active on
> alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Umbreon active
> on alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Umbreon active
> on alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Nu active on
> alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info:
> determine_op_status: Operation monitor found resource Nu active on
> alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  Omicron (ocf::heartbeat:VirtualDomain): Started alpha (disabled)
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  Calibre (ocf::heartbeat:VirtualDomain): Started beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  Iota    (ocf::heartbeat:VirtualDomain): Started beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  Plex    (ocf::heartbeat:VirtualDomain): Started alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  Nu      (ocf::heartbeat:VirtualDomain): Started alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  ipmi_alpha      (stonith:external/ipmi):        Started beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  ipmi_beta       (stonith:external/ipmi):        Started alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  Tau     (ocf::heartbeat:VirtualDomain): Started beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  Lapras  (ocf::heartbeat:VirtualDomain): Started beta
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
>  Umbreon (ocf::heartbeat:VirtualDomain): Started alpha
> Jun 26 07:01:49 [4556] alpha    pengine:     info: native_color:
>  Resource Omicron cannot run anywhere
> Jun 26 07:01:49 [4556] alpha    pengine:     info: RecurringOp:
>  Start recurring monitor (10s) for Lapras on alpha
> Jun 26 07:01:49 [4556] alpha    pengine:   notice: LogActions:  Stop
>  Omicron (alpha)
> Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
>  Calibre (Started beta)
> Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
>  Iota    (Started beta)
> Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
>  Plex    (Started alpha)
> Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
>  Nu      (Started alpha)
> Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
>  ipmi_alpha      (Started beta)
> Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
>  ipmi_beta       (Started alpha)
> Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
>  Tau     (Started beta)
> Jun 26 07:01:49 [4556] alpha    pengine:   notice: LogActions:
> Migrate Lapras  (Started beta -> alpha)
> Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
>  Umbreon (Started alpha)
> Jun 26 07:01:49 [4556] alpha    pengine:   notice:
> process_pe_message:
>  Calculated transition 268, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-548.bz2
> Jun 26 07:01:49 [4557] alpha       crmd:     info:
> do_state_transition: State transition S_POLICY_ENGINE ->
> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE
> origin=handle_response
> Jun 26 07:01:49 [4557] alpha       crmd:     info: do_te_invoke:
>  Processing graph 268 (ref=pe_calc-dc-1530010909-426) derived from
> /var/lib/pacemaker/pengine/pe-input-548.bz2
> Jun 26 07:01:49 [4557] alpha       crmd:   notice: te_rsc_command:
>  Initiating stop operation Omicron_stop_0 locally on alpha | action
> 12
> Jun 26 07:01:49 [4554] alpha       lrmd:     info:
> cancel_recurring_action:     Cancelling ocf operation
> Omicron_monitor_10000
> Jun 26 07:01:49 [4557] alpha       crmd:     info: do_lrm_rsc_op:
>  Performing key=12:268:0:a472c072-7fdc-4996-abe6-64a46331a1df
> op=Omicron_stop_0
> Jun 26 07:01:49 [4554] alpha       lrmd:     info: log_execute:
> executing - rsc:Omicron action:stop call_id:162
> Jun 26 07:01:49 [4557] alpha       crmd:   notice: te_rsc_command:
>  Initiating migrate_to operation Lapras_migrate_to_0 on beta | action
> 30
> Jun 26 07:01:49 [4557] alpha       crmd:     info: process_lrm_event:
>  Result of monitor operation for Omicron on alpha: Cancelled |
> call=156 key=Omicron_monitor_10000 confirmed=true
> Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
>  Diff: --- 1.443.0 2
> Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
>  Diff: +++ 1.443.1 (null)
> Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
>  +  /cib:  @num_updates=1
> Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
>  +  /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_0']:
>  @operation_key=Lapras_migrate_to_0, @operation=migrate_to,
> @transition-key=30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @transition-magic=-1:193;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1530010909,
> @last-rc-change=1530010909, @exec-time=0, @mi
> Jun 26 07:01:49 [4552] alpha        cib:     info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=beta/crmd/117, version=1.443.1)
> VirtualDomain(Omicron)[30885]:  2018/06/26_07:01:49 INFO: Issuing
> graceful shutdown request for domain Omicron.
> Jun 26 07:01:54 [4552] alpha        cib:     info: cib_process_ping:
>  Reporting our current digest to alpha:
> 31cad76e7e3b084f6b7ed1ea3e909c4b for 1.443.1 (0x560b20c971a0 0)
> Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
>  Diff: --- 1.443.1 2
> Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
>  Diff: +++ 1.443.2 (null)
> Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
>  +  /cib:  @num_updates=2
> Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
>  +  /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_failure_0']:
>  @operation_key=Lapras_migrate_to_0, @operation=migrate_to,
> @transition-key=30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @transition-magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @call-id=139, @rc-code=1, @op-status=2, @last-run=1530010909,
> @last-rc-change=1530010909, @exec-time=200
> Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
>  +  /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_0']:
>  @transition-magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @call-id=139, @rc-code=1, @op-status=2, @exec-time=20004,
> @queue-time=1
> Jun 26 07:02:09 [4552] alpha        cib:     info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=beta/crmd/118, version=1.443.2)
> Jun 26 07:02:09 [4557] alpha       crmd:  warning: status_from_rc:
>  Action 30 (Lapras_migrate_to_0) on beta failed (target: 0 vs. rc:
> 1): Error
> Jun 26 07:02:09 [4557] alpha       crmd:   notice:
> abort_transition_graph:      Transition aborted by operation
> Lapras_migrate_to_0 'modify' on beta: Event failed |
> magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df cib=1.443.2
> source=match_graph_event:310 complete=false
> Jun 26 07:02:09 [4557] alpha       crmd:     info: match_graph_event:
>  Action Lapras_migrate_to_0 (30) confirmed on beta (rc=1)
> Jun 26 07:02:09 [4557] alpha       crmd:     info:
> process_graph_event: Detected action (268.30)
> Lapras_migrate_to_0.139=unknown error: failed
> Jun 26 07:02:09 [4557] alpha       crmd:  warning: status_from_rc:
>  Action 30 (Lapras_migrate_to_0) on beta failed (target: 0 vs. rc:
> 1): Error
> 
> Jun 26 07:02:09 [4557] alpha       crmd:     info:
> abort_transition_graph:      Transition aborted by operation
> Lapras_migrate_to_0 'modify' on beta: Event failed |
> magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df cib=1.443.2
> source=match_graph_event:310 complete=false
> Jun 26 07:02:09 [4557] alpha       crmd:     info: match_graph_event:
>  Action Lapras_migrate_to_0 (30) confirmed on beta (rc=1)
> Jun 26 07:02:09 [4557] alpha       crmd:     info:
> process_graph_event: Detected action (268.30)
> Lapras_migrate_to_0.139=unknown error: failed
> Jun 26 07:02:14 [4552] alpha        cib:     info: cib_process_ping:
>  Reporting our current digest to alpha:
> 05f076e7f5fcb9bd9695af7a83f2ab0a for 1.443.2 (0x560b20c971a0 0)
-- 
Ken Gaillot <kgaillot at redhat.com>


