[ClusterLabs] Stop one VM, another tries to migrate
Ken Gaillot
kgaillot at redhat.com
Tue Jun 26 12:40:11 EDT 2018
On Tue, 2018-06-26 at 07:19 -0400, Jason Gauthier wrote:
> Greetings,
>
> I am using my cluster platform primarily for virtual machines.
> While I've still been in implementation mode, I felt like things were
> somewhat stable. However, I've noticed that sometimes when I stop a
> resource another resource tries to migrate. I did so this morning, and
> that scenario occurred. Basically, I ran 'crm resource stop Omicron',
> and the machine 'Lapras' tried to migrate as well. I've included
> cluster logs since I can't make heads or tails of this decision.
Have a look at resource-stickiness.
Basically, the cluster will by default try to balance the number of
resources across all nodes (subject to your constraints of course).
Stickiness tells it to prefer to keep running resources where they are,
and only consider balancing when starting a resource.
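As a rough sketch of what that could look like with crmsh (the shell you
are already using for 'crm resource stop'): the value 100 below is only an
assumed, illustrative number, not a recommendation, and the right value
depends on the scores of your other constraints. The crm_simulate line
replays the transition saved in the log below, so you can look at the
allocation scores behind the Lapras decision.

```shell
# Set a cluster-wide default stickiness for all resources.
# 100 is an assumed, illustrative value; pick one that fits your constraints.
crm configure rsc_defaults resource-stickiness=100

# Review the resulting configuration.
crm configure show

# Replay the saved policy engine input from the log (transition 268)
# and show the allocation scores that led to the decision.
crm_simulate -S -s -x /var/lib/pacemaker/pengine/pe-input-548.bz2
```

Stickiness can also be set as a meta-attribute on a single resource if
only some VMs should stay put.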
>
> I've attached a cluster log, but also put it inline here since I'm
> not sure of the preferred way. This log only pertains to the actions
> since issuing the resource stop.
>
> Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> Diff: --- 1.442.64 2
> Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> Diff: +++ 1.443.0 92508eef9d32f83b93e7f1ed2dff3340
> Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> + /cib: @epoch=443, @num_updates=0
> Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> + /cib/configuration/resources/primitive[@id='Omicron']/meta_attributes[@id='Omicron-meta_attributes']/nvpair[@id='Omicron-meta_attributes-target-role']: @value=Stopped
> Jun 26 07:01:49 [4557] alpha crmd: info:
> abort_transition_graph: Transition aborted by
> Omicron-meta_attributes-target-role doing modify target-role=Stopped:
> Configuration change | cib=1.443.0 source=te_update_diff:444
> path=/cib/configuration/resources/primitive[@id='Omicron']/meta_attributes[@id='Omicron-meta_attributes']/nvpair[@id='Omicron-meta_attributes-target-role']
> complete=true
> Jun 26 07:01:49 [4557] alpha crmd: notice:
> do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE |
> input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
> Jun 26 07:01:49 [4553] alpha stonith-ng: info:
> update_cib_stonith_devices_v2: Updating device list from the
> cib: modify nvpair[@id='Omicron-meta_attributes-target-role']
> Jun 26 07:01:49 [4553] alpha stonith-ng: info:
> cib_devices_update:
> Updating devices to version 1.443.0
> Jun 26 07:01:49 [4552] alpha cib: info:
> cib_process_request: Completed cib_apply_diff operation for section
> 'all': OK (rc=0, origin=alpha/cibadmin/2, version=1.443.0)
> Jun 26 07:01:49 [4553] alpha stonith-ng: info: cib_device_update:
> Device ipmi_alpha has been disabled on alpha: score=-INFINITY
> Jun 26 07:01:49 [4552] alpha cib: info: cib_file_backup:
> Archived previous version as /var/lib/pacemaker/cib/cib-83.raw
> Jun 26 07:01:49 [4552] alpha cib: info:
> cib_file_write_with_digest: Wrote version 1.443.0 of the CIB to disk
> (digest: 2a60981d2eceb59a6ed3015ce20f9dff)
> Jun 26 07:01:49 [4552] alpha cib: info:
> cib_file_write_with_digest: Reading cluster configuration file
> /var/lib/pacemaker/cib/cib.g9gWwY (digest:
> /var/lib/pacemaker/cib/cib.tslkpk)
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_online_status_fencing: Node beta is active
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_online_status: Node beta is online
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_online_status_fencing: Node alpha is active
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_online_status: Node alpha is online
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Calibre active
> on beta
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Calibre active
> on beta
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Iota active on
> beta
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Iota active on
> beta
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Lapras active
> on beta
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Lapras active
> on beta
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Tau active on
> beta
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Tau active on
> beta
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Omicron active
> on alpha
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Omicron active
> on alpha
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Plex active on
> alpha
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Plex active on
> alpha
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Umbreon active
> on alpha
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Umbreon active
> on alpha
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Nu active on
> alpha
> Jun 26 07:01:49 [4556] alpha pengine: info:
> determine_op_status: Operation monitor found resource Nu active on
> alpha
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> Omicron (ocf::heartbeat:VirtualDomain): Started alpha (disabled)
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> Calibre (ocf::heartbeat:VirtualDomain): Started beta
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> Iota (ocf::heartbeat:VirtualDomain): Started beta
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> Plex (ocf::heartbeat:VirtualDomain): Started alpha
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> Nu (ocf::heartbeat:VirtualDomain): Started alpha
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> ipmi_alpha (stonith:external/ipmi): Started beta
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> ipmi_beta (stonith:external/ipmi): Started alpha
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> Tau (ocf::heartbeat:VirtualDomain): Started beta
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> Lapras (ocf::heartbeat:VirtualDomain): Started beta
> Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> Umbreon (ocf::heartbeat:VirtualDomain): Started alpha
> Jun 26 07:01:49 [4556] alpha pengine: info: native_color:
> Resource Omicron cannot run anywhere
> Jun 26 07:01:49 [4556] alpha pengine: info:
> RecurringOp: Start recurring monitor (10s) for Lapras on alpha
> Jun 26 07:01:49 [4556] alpha pengine: notice: LogActions: Stop
> Omicron (alpha)
> Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> Calibre (Started beta)
> Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> Iota (Started beta)
> Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> Plex (Started alpha)
> Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> Nu (Started alpha)
> Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> ipmi_alpha (Started beta)
> Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> ipmi_beta (Started alpha)
> Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> Tau (Started beta)
> Jun 26 07:01:49 [4556] alpha pengine: notice: LogActions:
> Migrate Lapras (Started beta -> alpha)
> Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> Umbreon (Started alpha)
> Jun 26 07:01:49 [4556] alpha pengine: notice:
> process_pe_message:
> Calculated transition 268, saving inputs in
> /var/lib/pacemaker/pengine/pe-input-548.bz2
> Jun 26 07:01:49 [4557] alpha crmd: info:
> do_state_transition: State transition S_POLICY_ENGINE ->
> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE
> origin=handle_response
> Jun 26 07:01:49 [4557] alpha crmd: info: do_te_invoke:
> Processing graph 268 (ref=pe_calc-dc-1530010909-426) derived from
> /var/lib/pacemaker/pengine/pe-input-548.bz2
> Jun 26 07:01:49 [4557] alpha crmd: notice: te_rsc_command:
> Initiating stop operation Omicron_stop_0 locally on alpha | action
> 12
> Jun 26 07:01:49 [4554] alpha lrmd: info:
> cancel_recurring_action: Cancelling ocf operation
> Omicron_monitor_10000
> Jun 26 07:01:49 [4557] alpha crmd: info: do_lrm_rsc_op:
> Performing key=12:268:0:a472c072-7fdc-4996-abe6-64a46331a1df
> op=Omicron_stop_0
> Jun 26 07:01:49 [4554] alpha lrmd: info: log_execute:
> executing - rsc:Omicron action:stop call_id:162
> Jun 26 07:01:49 [4557] alpha crmd: notice: te_rsc_command:
> Initiating migrate_to operation Lapras_migrate_to_0 on beta | action
> 30
> Jun 26 07:01:49 [4557] alpha crmd: info: process_lrm_event:
> Result of monitor operation for Omicron on alpha: Cancelled |
> call=156 key=Omicron_monitor_10000 confirmed=true
> Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> Diff: --- 1.443.0 2
> Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> Diff: +++ 1.443.1 (null)
> Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> + /cib: @num_updates=1
> Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> + /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_0']:
> @operation_key=Lapras_migrate_to_0, @operation=migrate_to,
> @transition-key=30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @transition-magic=-1:193;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1530010909,
> @last-rc-change=1530010909, @exec-time=0, @mi
> Jun 26 07:01:49 [4552] alpha cib: info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=beta/crmd/117, version=1.443.1)
> VirtualDomain(Omicron)[30885]: 2018/06/26_07:01:49 INFO: Issuing
> graceful shutdown request for domain Omicron.
> Jun 26 07:01:54 [4552] alpha cib: info: cib_process_ping:
> Reporting our current digest to alpha:
> 31cad76e7e3b084f6b7ed1ea3e909c4b for 1.443.1 (0x560b20c971a0 0)
> Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> Diff: --- 1.443.1 2
> Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> Diff: +++ 1.443.2 (null)
> Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> + /cib: @num_updates=2
> Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> + /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_failure_0']:
> @operation_key=Lapras_migrate_to_0, @operation=migrate_to,
> @transition-key=30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @transition-magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @call-id=139, @rc-code=1, @op-status=2, @last-run=1530010909,
> @last-rc-change=1530010909, @exec-time=200
> Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> + /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_0']:
> @transition-magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> @call-id=139, @rc-code=1, @op-status=2, @exec-time=20004,
> @queue-time=1
> Jun 26 07:02:09 [4552] alpha cib: info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=beta/crmd/118, version=1.443.2)
> Jun 26 07:02:09 [4557] alpha crmd: warning: status_from_rc:
> Action 30 (Lapras_migrate_to_0) on beta failed (target: 0 vs. rc: 1):
> Error
> Jun 26 07:02:09 [4557] alpha crmd: notice:
> abort_transition_graph: Transition aborted by operation
> Lapras_migrate_to_0 'modify' on beta: Event failed |
> magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df cib=1.443.2
> source=match_graph_event:310 complete=false
> Jun 26 07:02:09 [4557] alpha crmd: info: match_graph_event:
> Action Lapras_migrate_to_0 (30) confirmed on beta (rc=1)
> Jun 26 07:02:09 [4557] alpha crmd: info:
> process_graph_event: Detected action (268.30)
> Lapras_migrate_to_0.139=unknown error: failed
> Jun 26 07:02:09 [4557] alpha crmd: warning: status_from_rc:
> Action 30 (Lapras_migrate_to_0) on beta failed (target: 0 vs. rc: 1):
> Error
>
> Jun 26 07:02:09 [4557] alpha crmd: info:
> abort_transition_graph: Transition aborted by operation
> Lapras_migrate_to_0 'modify' on beta: Event failed |
> magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df cib=1.443.2
> source=match_graph_event:310 complete=false
> Jun 26 07:02:09 [4557] alpha crmd: info: match_graph_event:
> Action Lapras_migrate_to_0 (30) confirmed on beta (rc=1)
> Jun 26 07:02:09 [4557] alpha crmd: info:
> process_graph_event: Detected action (268.30)
> Lapras_migrate_to_0.139=unknown error: failed
> Jun 26 07:02:14 [4552] alpha cib: info: cib_process_ping:
> Reporting our current digest to alpha:
> 05f076e7f5fcb9bd9695af7a83f2ab0a for 1.443.2 (0x560b20c971a0 0)
--
Ken Gaillot <kgaillot at redhat.com>