[ClusterLabs] Stop one VM, another tries to migrate

Jason Gauthier jagauthier at gmail.com
Tue Jun 26 16:49:48 UTC 2018


On Tue, Jun 26, 2018 at 12:40 PM Ken Gaillot <kgaillot at redhat.com> wrote:
>
> On Tue, 2018-06-26 at 07:19 -0400, Jason Gauthier wrote:
> > Greetings,
> >
> >    I am using my cluster platform primarily for virtual machines.
> > While I've still been in implementation mode, I felt like things were
> > somewhat stable. However, I've noticed that sometimes when I stop a
> > resource another resource tries to migrate.   I did so this morning, and
> > that scenario occurred.   Basically, I ran 'crm resource stop Omicron',
> > and the machine 'Lapras' tried to migrate as well.  I've included
> > cluster logs since I can't make heads or tails of this decision.
>
> Have a look at resource-stickiness.
>
> Basically, the cluster will by default try to balance the number of
> resources across all nodes (subject to your constraints of course).
> Stickiness tells it to prefer to keep running resources where they are,
> and only consider balancing when starting a resource.
>

Ah, I had no idea that was a thing! I wouldn't have noticed if the
migration hadn't failed, though the failure itself is a secondary concern.
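
For anyone finding this thread later, here is a minimal sketch of setting a
cluster-wide default stickiness with the crm shell used above. The value 100
is an arbitrary assumption for illustration, not a value recommended in this
thread. A positive value raises the score of the node a resource is already
running on; whether that outweighs any location constraints depends on the
scores in your own configuration.

```shell
# Set a cluster-wide default so running resources prefer to stay where they are.
# 100 is an assumed example value; tune it against your other constraint scores.
crm configure rsc_defaults resource-stickiness=100

# Review the resulting configuration (look for the rsc_defaults section).
crm configure show
```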

> > I've attached a cluster log, but also put it in line here since I'm
> > not sure of the preferred way.  This log only pertains to the actions
> > since issuing the resource stop.
> >
> > Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
> >  Diff: --- 1.442.64 2
> > Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
> >  Diff: +++ 1.443.0 92508eef9d32f83b93e7f1ed2dff3340
> > Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
> >  +  /cib:  @epoch=443, @num_updates=0
> > Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
> >  +  /cib/configuration/resources/primitive[@id='Omicron']/meta_attributes[@id='Omicron-meta_attributes']/nvpair[@id='Omicron-meta_attributes-target-role']:  @value=Stopped
> > Jun 26 07:01:49 [4557] alpha       crmd:     info:
> > abort_transition_graph:      Transition aborted by
> > Omicron-meta_attributes-target-role doing modify target-role=Stopped:
> > Configuration change | cib=1.443.0 source=te_update_diff:444
> > path=/cib/configuration/resources/primitive[@id='Omicron']/meta_attributes[@id='Omicron-meta_attributes']/nvpair[@id='Omicron-meta_attributes-target-role']
> > complete=true
> > Jun 26 07:01:49 [4557] alpha       crmd:   notice:
> > do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE |
> > input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
> > Jun 26 07:01:49 [4553] alpha stonith-ng:     info:
> > update_cib_stonith_devices_v2:       Updating device list from the
> > cib: modify nvpair[@id='Omicron-meta_attributes-target-role']
> > Jun 26 07:01:49 [4553] alpha stonith-ng:     info:
> > cib_devices_update:
> >  Updating devices to version 1.443.0
> > Jun 26 07:01:49 [4552] alpha        cib:     info:
> > cib_process_request: Completed cib_apply_diff operation for section
> > 'all': OK (rc=0, origin=alpha/cibadmin/2, version=1.443.0)
> > Jun 26 07:01:49 [4553] alpha stonith-ng:     info: cib_device_update:
> >  Device ipmi_alpha has been disabled on alpha: score=-INFINITY
> > Jun 26 07:01:49 [4552] alpha        cib:     info: cib_file_backup:
> >  Archived previous version as /var/lib/pacemaker/cib/cib-83.raw
> > Jun 26 07:01:49 [4552] alpha        cib:     info:
> > cib_file_write_with_digest:  Wrote version 1.443.0 of the CIB to disk
> > (digest: 2a60981d2eceb59a6ed3015ce20f9dff)
> > Jun 26 07:01:49 [4552] alpha        cib:     info:
> > cib_file_write_with_digest:  Reading cluster configuration file
> > /var/lib/pacemaker/cib/cib.g9gWwY (digest:
> > /var/lib/pacemaker/cib/cib.tslkpk)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_online_status_fencing:     Node beta is active
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_online_status:     Node beta is online
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_online_status_fencing:     Node alpha is active
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_online_status:     Node alpha is online
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Calibre active
> > on beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Calibre active
> > on beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Iota active on
> > beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Iota active on
> > beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Lapras active
> > on beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Lapras active
> > on beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Tau active on
> > beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Tau active on
> > beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Omicron active
> > on alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Omicron active
> > on alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Plex active on
> > alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Plex active on
> > alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Umbreon active
> > on alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Umbreon active
> > on alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Nu active on
> > alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > determine_op_status: Operation monitor found resource Nu active on
> > alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  Omicron (ocf::heartbeat:VirtualDomain): Started alpha (disabled)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  Calibre (ocf::heartbeat:VirtualDomain): Started beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  Iota    (ocf::heartbeat:VirtualDomain): Started beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  Plex    (ocf::heartbeat:VirtualDomain): Started alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  Nu      (ocf::heartbeat:VirtualDomain): Started alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  ipmi_alpha      (stonith:external/ipmi):        Started beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  ipmi_beta       (stonith:external/ipmi):        Started alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  Tau     (ocf::heartbeat:VirtualDomain): Started beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  Lapras  (ocf::heartbeat:VirtualDomain): Started beta
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_print:
> >  Umbreon (ocf::heartbeat:VirtualDomain): Started alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: native_color:
> >  Resource Omicron cannot run anywhere
> > Jun 26 07:01:49 [4556] alpha    pengine:     info:
> > RecurringOp:  Start
> > recurring monitor (10s) for Lapras on alpha
> > Jun 26 07:01:49 [4556] alpha    pengine:   notice: LogActions:  Stop
> >  Omicron (alpha)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
> >  Calibre (Started beta)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
> >  Iota    (Started beta)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
> >  Plex    (Started alpha)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
> >  Nu      (Started alpha)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
> >  ipmi_alpha      (Started beta)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
> >  ipmi_beta       (Started alpha)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
> >  Tau     (Started beta)
> > Jun 26 07:01:49 [4556] alpha    pengine:   notice: LogActions:
> > Migrate Lapras  (Started beta -> alpha)
> > Jun 26 07:01:49 [4556] alpha    pengine:     info: LogActions:  Leave
> >  Umbreon (Started alpha)
> > Jun 26 07:01:49 [4556] alpha    pengine:   notice:
> > process_pe_message:
> >  Calculated transition 268, saving inputs in
> > /var/lib/pacemaker/pengine/pe-input-548.bz2
> > Jun 26 07:01:49 [4557] alpha       crmd:     info:
> > do_state_transition: State transition S_POLICY_ENGINE ->
> > S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE
> > origin=handle_response
> > Jun 26 07:01:49 [4557] alpha       crmd:     info: do_te_invoke:
> >  Processing graph 268 (ref=pe_calc-dc-1530010909-426) derived from
> > /var/lib/pacemaker/pengine/pe-input-548.bz2
> > Jun 26 07:01:49 [4557] alpha       crmd:   notice: te_rsc_command:
> >  Initiating stop operation Omicron_stop_0 locally on alpha | action
> > 12
> > Jun 26 07:01:49 [4554] alpha       lrmd:     info:
> > cancel_recurring_action:     Cancelling ocf operation
> > Omicron_monitor_10000
> > Jun 26 07:01:49 [4557] alpha       crmd:     info: do_lrm_rsc_op:
> >  Performing key=12:268:0:a472c072-7fdc-4996-abe6-64a46331a1df
> > op=Omicron_stop_0
> > Jun 26 07:01:49 [4554] alpha       lrmd:     info: log_execute:
> > executing - rsc:Omicron action:stop call_id:162
> > Jun 26 07:01:49 [4557] alpha       crmd:   notice: te_rsc_command:
> >  Initiating migrate_to operation Lapras_migrate_to_0 on beta | action
> > 30
> > Jun 26 07:01:49 [4557] alpha       crmd:     info: process_lrm_event:
> >  Result of monitor operation for Omicron on alpha: Cancelled |
> > call=156 key=Omicron_monitor_10000 confirmed=true
> > Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
> >  Diff: --- 1.443.0 2
> > Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
> >  Diff: +++ 1.443.1 (null)
> > Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
> >  +  /cib:  @num_updates=1
> > Jun 26 07:01:49 [4552] alpha        cib:     info: cib_perform_op:
> >  +  /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_0']:
> >  @operation_key=Lapras_migrate_to_0, @operation=migrate_to,
> > @transition-key=30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @transition-magic=-1:193;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1530010909,
> > @last-rc-change=1530010909, @exec-time=0, @mi
> > Jun 26 07:01:49 [4552] alpha        cib:     info:
> > cib_process_request: Completed cib_modify operation for section
> > status: OK (rc=0, origin=beta/crmd/117, version=1.443.1)
> > VirtualDomain(Omicron)[30885]:  2018/06/26_07:01:49 INFO: Issuing
> > graceful shutdown request for domain Omicron.
> > Jun 26 07:01:54 [4552] alpha        cib:     info: cib_process_ping:
> >  Reporting our current digest to alpha:
> > 31cad76e7e3b084f6b7ed1ea3e909c4b for 1.443.1 (0x560b20c971a0 0)
> > Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
> >  Diff: --- 1.443.1 2
> > Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
> >  Diff: +++ 1.443.2 (null)
> > Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
> >  +  /cib:  @num_updates=2
> > Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
> >  +  /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_failure_0']:
> >  @operation_key=Lapras_migrate_to_0, @operation=migrate_to,
> > @transition-key=30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @transition-magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @call-id=139, @rc-code=1, @op-status=2, @last-run=1530010909,
> > @last-rc-change=1530010909, @exec-time=200
> > Jun 26 07:02:09 [4552] alpha        cib:     info: cib_perform_op:
> >  +  /cib/status/node_state[@id='1084772369']/lrm[@id='1084772369']/lrm_resources/lrm_resource[@id='Lapras']/lrm_rsc_op[@id='Lapras_last_0']:
> >  @transition-magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df,
> > @call-id=139, @rc-code=1, @op-status=2, @exec-time=20004,
> > @queue-time=1
> > Jun 26 07:02:09 [4552] alpha        cib:     info:
> > cib_process_request: Completed cib_modify operation for section
> > status: OK (rc=0, origin=beta/crmd/118, version=1.443.2)
> > Jun 26 07:02:09 [4557] alpha       crmd:  warning: status_from_rc:
> >  Action 30 (Lapras_migrate_to_0) on beta failed (target: 0 vs. rc: 1):
> > Error
> > Jun 26 07:02:09 [4557] alpha       crmd:   notice:
> > abort_transition_graph:      Transition aborted by operation
> > Lapras_migrate_to_0 'modify' on beta: Event failed |
> > magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df cib=1.443.2
> > source=match_graph_event:310 complete=false
> > Jun 26 07:02:09 [4557] alpha       crmd:     info: match_graph_event:
> >  Action Lapras_migrate_to_0 (30) confirmed on beta (rc=1)
> > Jun 26 07:02:09 [4557] alpha       crmd:     info:
> > process_graph_event: Detected action (268.30)
> > Lapras_migrate_to_0.139=unknown error: failed
> > Jun 26 07:02:09 [4557] alpha       crmd:  warning: status_from_rc:
> >  Action 30 (Lapras_migrate_to_0) on beta failed (target: 0 vs. rc: 1):
> > Error
> >
> > Jun 26 07:02:09 [4557] alpha       crmd:     info:
> > abort_transition_graph:      Transition aborted by operation
> > Lapras_migrate_to_0 'modify' on beta: Event failed |
> > magic=2:1;30:268:0:a472c072-7fdc-4996-abe6-64a46331a1df cib=1.443.2
> > source=match_graph_event:310 complete=false
> > Jun 26 07:02:09 [4557] alpha       crmd:     info: match_graph_event:
> >  Action Lapras_migrate_to_0 (30) confirmed on beta (rc=1)
> > Jun 26 07:02:09 [4557] alpha       crmd:     info:
> > process_graph_event: Detected action (268.30)
> > Lapras_migrate_to_0.139=unknown error: failed
> > Jun 26 07:02:14 [4552] alpha        cib:     info: cib_process_ping:
> >  Reporting our current digest to alpha:
> > 05f076e7f5fcb9bd9695af7a83f2ab0a for 1.443.2 (0x560b20c971a0 0)
> > _______________________________________________
> > Users mailing list: Users at clusterlabs.org
> > https://lists.clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> --
> Ken Gaillot <kgaillot at redhat.com>