[ClusterLabs] Stop one VM, another tries to migrate
Jason Gauthier
jagauthier at gmail.com
Tue Jun 26 12:49:48 EDT 2018
On Tue, Jun 26, 2018 at 12:40 PM Ken Gaillot <kgaillot at redhat.com> wrote:
>
> On Tue, 2018-06-26 at 07:19 -0400, Jason Gauthier wrote:
> > Greetings,
> >
> > I am using my cluster platform primarily for virtual machines.
> > While I'm still in implementation mode, things had seemed
> > fairly stable. However, I've noticed that sometimes when I stop a
> > resource, another resource tries to migrate. I stopped one this morning,
> > and that scenario occurred: I ran 'crm resource stop Omicron',
> > and the machine 'Lapras' tried to migrate as well. I've included
> > cluster logs since I can't make heads or tails of this decision.
>
> Have a look at resource-stickiness.
>
> Basically, the cluster will by default try to balance the number of
> resources across all nodes (subject to your constraints of course).
> Stickiness tells it to prefer to keep running resources where they are,
> and only consider balancing when starting a resource.
>
Ah, I had no idea that was a thing! I wouldn't have noticed if the
migration hadn't failed, which is a secondary concern.
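
For reference, here is a minimal sketch of how a cluster-wide default
stickiness could be set with crmsh, the same 'crm' shell used for the stop
command. The value 100 is an arbitrary placeholder, and the function name and
the APPLY dry-run switch are hypothetical helpers for illustration; none of
them come from Ken's reply.

```shell
#!/bin/sh
# Sketch: set a default resource-stickiness for all resources via crmsh.
# Assumption: 100 is only an example value, not a recommendation from the thread.
# Dry-run by default: prints the command. Set APPLY=1 to actually run it.
set_default_stickiness() {
    cmd="crm configure rsc_defaults resource-stickiness=$1"
    if [ "${APPLY:-0}" = "1" ]; then
        $cmd
    else
        echo "$cmd"
    fi
}

set_default_stickiness 100
```

With a positive stickiness in place, a running resource such as Lapras should
score higher on its current node, so stopping Omicron alone would be less
likely to trigger a rebalancing migration.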
> > I've attached a cluster log, but also pasted it inline here since I'm
> > not sure which is preferred. This log only covers the actions
> > since issuing the resource stop.
> >
> > Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> > Diff: --- 1.442.64 2
> > Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> > Diff: +++ 1.443.0 92508eef9d32f83b93e7f1ed2dff3340
> > Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> > + /cib: @epoch=443, @num_updates=0
> > Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> > + /cib/configuration/resources/primitive[@id='Omicron']/meta_attributes[@id='Omicron-meta_attributes']/nvpair[@id='Omicron-meta_attributes-target-role']: @value=Stopped
> > Jun 26 07:01:49 [4557] alpha crmd: info:
> > abort_transition_graph: Transition aborted by
> > Omicron-meta_attributes-target-role doing modify target-role=Stopped:
> > Configuration change | cib=1.443.0 source=te_update_diff:444
> > path=/cib/configuration/resources/primitive[@id='Omicron']/meta_attributes[@id='Omicron-meta_attributes']/nvpair[@id='Omicron-meta_attributes-target-role']
> > complete=true
> > Jun 26 07:01:49 [4557] alpha crmd: notice:
> > do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE |
> > input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
> > Jun 26 07:01:49 [4553] alpha stonith-ng: info:
> > update_cib_stonith_devices_v2: Updating device list from the
> > cib: modify nvpair[@id='Omicron-meta_attributes-target-role']
> > Jun 26 07:01:49 [4553] alpha stonith-ng: info:
> > cib_devices_update:
> > Updating devices to version 1.443.0
> > Jun 26 07:01:49 [4552] alpha cib: info:
> > cib_process_request: Completed cib_apply_diff operation for section
> > 'all': OK (rc=0, origin=alpha/cibadmin/2, version=1.443.0)
> > Jun 26 07:01:49 [4553] alpha stonith-ng: info: cib_device_update:
> > Device ipmi_alpha has been disabled on alpha: score=-INFINITY
> > Jun 26 07:01:49 [4552] alpha cib: info: cib_file_backup:
> > Archived previous version as /var/lib/pacemaker/cib/cib-83.raw
> > Jun 26 07:01:49 [4552] alpha cib: info:
> > cib_file_write_with_digest: Wrote version 1.443.0 of the CIB to disk
> > (digest: 2a60981d2eceb59a6ed3015ce20f9dff)
> > Jun 26 07:01:49 [4552] alpha cib: info:
> > cib_file_write_with_digest: Reading cluster configuration file
> > /var/lib/pacemaker/cib/cib.g9gWwY (digest:
> > /var/lib/pacemaker/cib/cib.tslkpk)
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_online_status_fencing: Node beta is active
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_online_status: Node beta is online
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_online_status_fencing: Node alpha is active
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_online_status: Node alpha is online
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Calibre active
> > on beta
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Calibre active
> > on beta
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Iota active on
> > beta
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Iota active on
> > beta
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Lapras active
> > on beta
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Lapras active
> > on beta
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Tau active on
> > beta
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Tau active on
> > beta
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Omicron active
> > on alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Omicron active
> > on alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Plex active on
> > alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Plex active on
> > alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Umbreon active
> > on alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Umbreon active
> > on alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Nu active on
> > alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > determine_op_status: Operation monitor found resource Nu active on
> > alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > Omicron (ocf::heartbeat:VirtualDomain): Started alpha (disabled)
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > Calibre (ocf::heartbeat:VirtualDomain): Started beta
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > Iota (ocf::heartbeat:VirtualDomain): Started beta
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > Plex (ocf::heartbeat:VirtualDomain): Started alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > Nu (ocf::heartbeat:VirtualDomain): Started alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > ipmi_alpha (stonith:external/ipmi): Started beta
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > ipmi_beta (stonith:external/ipmi): Started alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > Tau (ocf::heartbeat:VirtualDomain): Started beta
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > Lapras (ocf::heartbeat:VirtualDomain): Started beta
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_print:
> > Umbreon (ocf::heartbeat:VirtualDomain): Started alpha
> > Jun 26 07:01:49 [4556] alpha pengine: info: native_color:
> > Resource Omicron cannot run anywhere
> > Jun 26 07:01:49 [4556] alpha pengine: info:
> > RecurringOp: Start
> > recurring monitor (10s) for Lapras on alpha
> > Jun 26 07:01:49 [4556] alpha pengine: notice: LogActions: Stop
> > Omicron (alpha)
> > Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> > Calibre (Started beta)
> > Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> > Iota (Started beta)
> > Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> > Plex (Started alpha)
> > Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> > Nu (Started alpha)
> > Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> > ipmi_alpha (Started beta)
> > Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> > ipmi_beta (Started alpha)
> > Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> > Tau (Started beta)
> > Jun 26 07:01:49 [4556] alpha pengine: notice: LogActions:
> > Migrate Lapras (Started beta -> alpha)
> > Jun 26 07:01:49 [4556] alpha pengine: info: LogActions: Leave
> > Umbreon (Started alpha)
> > Jun 26 07:01:49 [4556] alpha pengine: notice:
> > process_pe_message:
> > Calculated transition 268, saving inputs in
> > /var/lib/pacemaker/pengine/pe-input-548.bz2
> > Jun 26 07:01:49 [4557] alpha crmd: info:
> > do_state_transition: State transition S_POLICY_ENGINE ->
> > S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE
> > origin=handle_response
> > Jun 26 07:01:49 [4557] alpha crmd: info: do_te_invoke:
> > Processing graph 268 (ref=pe_calc-dc-1530010909-426) derived from
> > /var/lib/pacemaker/pengine/pe-input-548.bz2
> > Jun 26 07:01:49 [4557] alpha crmd: notice: te_rsc_command:
> > Initiating stop operation Omicron_stop_0 locally on alpha | action
> > 12
> > Jun 26 07:01:49 [4554] alpha lrmd: info:
> > cancel_recurring_action: Cancelling ocf operation
> > Omicron_monitor_10000
> > Jun 26 07:01:49 [4557] alpha crmd: info: do_lrm_rsc_op:
> > Performing key=12:268:0:a472c072-7fdc-4996-abe6-64a46331a1df
> > op=Omicron_stop_0
> > Jun 26 07:01:49 [4554] alpha lrmd: info: log_execute:
> > executing - rsc:Omicron action:stop call_id:162
> > Jun 26 07:01:49 [4557] alpha crmd: notice: te_rsc_command:
> > Initiating migrate_to operation Lapras_migrate_to_0 on beta | action
> > 30
> > Jun 26 07:01:49 [4557] alpha crmd: info: process_lrm_event:
> > Result of monitor operation for Omicron on alpha: Cancelled |
> > call=156 key=Omicron_monitor_10000 confirmed=true
> > Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> > Diff: --- 1.443.0 2
> > Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> > Diff: +++ 1.443.1 (null)
> > Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> > + /cib: @num_updates=1
> > Jun 26 07:01:49 [4552] alpha cib: info: cib_perform_op:
> > + /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_0']:
> > @operation_key=Lapras_migrate_to_0, @operation=migrate_to,
> > @transition-key=30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @transition-magic=-1:193;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1530010909,
> > @last-rc-change=1530010909, @exec-time=0, @mi
> > Jun 26 07:01:49 [4552] alpha cib: info:
> > cib_process_request: Completed cib_modify operation for section
> > status: OK (rc=0, origin=beta/crmd/117, version=1.443.1)
> > VirtualDomain(Omicron)[30885]: 2018/06/26_07:01:49 INFO: Issuing
> > graceful shutdown request for domain Omicron.
> > Jun 26 07:01:54 [4552] alpha cib: info: cib_process_ping:
> > Reporting our current digest to alpha:
> > 31cad76e7e3b084f6b7ed1ea3e909c4b for 1.443.1 (0x560b20c971a0 0)
> > Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> > Diff: --- 1.443.1 2
> > Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> > Diff: +++ 1.443.2 (null)
> > Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> > + /cib: @num_updates=2
> > Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> > + /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_failure_0']:
> > @operation_key=Lapras_migrate_to_0, @operation=migrate_to,
> > @transition-key=30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @transition-magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @call-id=139, @rc-code=1, @op-status=2, @last-run=1530010909,
> > @last-rc-change=1530010909, @exec-time=200
> > Jun 26 07:02:09 [4552] alpha cib: info: cib_perform_op:
> > + /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_0']:
> > @transition-magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @call-id=139, @rc-code=1, @op-status=2, @exec-time=20004,
> > @queue-time=1
> > Jun 26 07:02:09 [4552] alpha cib: info:
> > cib_process_request: Completed cib_modify operation for section
> > status: OK (rc=0, origin=beta/crmd/118, version=1.443.2)
> > Jun 26 07:02:09 [4557] alpha crmd: warning: status_from_rc:
> > Action 30 (Lapras_migrate_to_0) on beta failed (target: 0 vs. rc: 1):
> > Error
> > Jun 26 07:02:09 [4557] alpha crmd: notice:
> > abort_transition_graph: Transition aborted by operation
> > Lapras_migrate_to_0 'modify' on beta: Event failed |
> > magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df cib=1.443.2
> > source=match_graph_event:310 complete=false
> > Jun 26 07:02:09 [4557] alpha crmd: info: match_graph_event:
> > Action Lapras_migrate_to_0 (30) confirmed on beta (rc=1)
> > Jun 26 07:02:09 [4557] alpha crmd: info:
> > process_graph_event: Detected action (268.30)
> > Lapras_migrate_to_0.139=unknown error: failed
> > Jun 26 07:02:09 [4557] alpha crmd: warning: status_from_rc:
> > Action 30 (Lapras_migrate_to_0) on beta failed (target: 0 vs. rc: 1):
> > Error
> >
> > Jun 26 07:02:09 [4557] alpha crmd: info:
> > abort_transition_graph: Transition aborted by operation
> > Lapras_migrate_to_0 'modify' on beta: Event failed |
> > magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df cib=1.443.2
> > source=match_graph_event:310 complete=false
> > Jun 26 07:02:09 [4557] alpha crmd: info: match_graph_event:
> > Action Lapras_migrate_to_0 (30) confirmed on beta (rc=1)
> > Jun 26 07:02:09 [4557] alpha crmd: info:
> > process_graph_event: Detected action (268.30)
> > Lapras_migrate_to_0.139=unknown error: failed
> > Jun 26 07:02:14 [4552] alpha cib: info: cib_process_ping:
> > Reporting our current digest to alpha:
> > 05f076e7f5fcb9bd9695af7a83f2ab0a for 1.443.2 (0x560b20c971a0 0)
> > _______________________________________________
> > Users mailing list: Users at clusterlabs.org
> > https://lists.clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> --
> Ken Gaillot <kgaillot at redhat.com>
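
The quoted log notes that the inputs for transition 268 were saved to
/var/lib/pacemaker/pengine/pe-input-548.bz2. As a rough sketch, that file
could be replayed with Pacemaker's crm_simulate tool to see the placement
scores behind the "Migrate Lapras" decision. The function name, the PE_INPUT
variable and the APPLY dry-run switch are hypothetical helpers added for
illustration; only the file path comes from the log.

```shell
#!/bin/sh
# Sketch: replay a saved scheduler input to inspect what was decided and why.
# PE_INPUT defaults to the file named in the quoted log; override it as needed.
PE_INPUT="${PE_INPUT:-/var/lib/pacemaker/pengine/pe-input-548.bz2}"

replay_transition() {
    # --simulate:    run through the transition computed from this input
    # --show-scores: print the per-node allocation scores for each resource
    # --xml-file:    read the cluster state from the given saved input
    cmd="crm_simulate --simulate --show-scores --xml-file $1"
    if [ "${APPLY:-0}" = "1" ]; then
        $cmd
    else
        # Dry-run by default: only print the command that would be run.
        echo "$cmd"
    fi
}

replay_transition "$PE_INPUT"
```

Running it with APPLY=1 on the node that holds the file should list the
scores for Lapras on alpha and beta, which may make the balancing behaviour
Ken describes easier to see.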