[ClusterLabs] Dangling migrate_to failures

Andrew Beekhof andrew at beekhof.net
Wed May 20 04:08:07 UTC 2015


Hmmm, I hadn’t noticed that before.
/me makes a note to investigate

> On 17 May 2015, at 5:10 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
> 
> Hi,
> 
> I noticed that failures of the migrate_to operation are not cleaned up from the CIB
> status section after failure-timeout expires.
> This is 9470f07c (May 11)
> 
> (word-wrap turned off)
> # date +"%s";cibadmin -Q|grep failure
> 1431846184
>        <nvpair name="start-failure-is-fatal" value="true" id="cib-bootstrap-options-start-failure-is-fatal"/>
>        <nvpair name="failure-timeout" value="10m" id="rsc_options-failure-timeout"/>
>            <lrm_rsc_op id="vm1_last_failure_0" operation_key="vm1_migrate_to_0" operation="migrate_to" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="597:550:0:e604cea6-9675-43a1-b159-dd9e23a628e8" transition-magic="0:1;597:550:0:e604cea6-9675-43a1-b159-dd9e23a628e8" on_node="v03-c" call-id="700" rc-code="1" op-status="0" interval="0" last-run="1431812101" last-rc-change="1431812101" exec-time="10625" queue-time="0" migrate_source="v03-c" migrate_target="v03-b" op-digest="45a721eebc50d085fae33232d7b6ad1a"/>
>            <lrm_rsc_op id="vm15_last_failure_0" operation_key="vm15_migrate_to_0" operation="migrate_to" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="628:537:0:e604cea6-9675-43a1-b159-dd9e23a628e8" transition-magic="0:1;628:537:0:e604cea6-9675-43a1-b159-dd9e23a628e8" on_node="v03-c" call-id="644" rc-code="1" op-status="0" interval="0" last-run="1431811303" last-rc-change="1431811303" exec-time="41915" queue-time="0" migrate_source="v03-c" migrate_target="v03-a" op-digest="65e7c71cc42a81df454517835ea03cd5"/>
>            <lrm_rsc_op id="vm18-vm_last_failure_0" operation_key="vm18_migrate_to_0" operation="migrate_to" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="648:537:0:e604cea6-9675-43a1-b159-dd9e23a628e8" transition-magic="0:1;648:537:0:e604cea6-9675-43a1-b159-dd9e23a628e8" on_node="v03-c" call-id="650" rc-code="1" op-status="0" interval="0" last-run="1431811327" last-rc-change="1431811327" exec-time="4232" queue-time="0" migrate_source="v03-c" migrate_target="v03-a" op-digest="6e36692d62d75497033cd6605b9834d6"/>
> 
> failure-timeout in rsc_defaults and cluster-recheck-interval are both 10m.
> 
> The VMs above no longer migrate; they are stopped and started instead, even though
> crm_mon shows no failures after the timeout has expired.
> 
> 
> Best,
> Vladislav
> 
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
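
For anyone who wants to check for the same symptom, here is a rough Python sketch
of the comparison made by hand in the report: it lists *_last_failure_0 entries
whose last-rc-change is older than the failure-timeout. It assumes the XML comes
from something like "cibadmin -Q"; the function name, the trimmed sample and the
<status> wrapper are made up for illustration, and 600 seconds stands in for the
10m timeout mentioned above.

```python
import xml.etree.ElementTree as ET


def expired_failures(cib_xml, now, failure_timeout_s):
    """Return (id, operation, age_in_seconds) for recorded failures older than the timeout."""
    root = ET.fromstring(cib_xml)
    expired = []
    for op in root.iter("lrm_rsc_op"):
        # Only look at the "last failure" records, as in the grep output above.
        if not op.get("id", "").endswith("_last_failure_0"):
            continue
        age = now - int(op.get("last-rc-change"))
        if age > failure_timeout_s:
            expired.append((op.get("id"), op.get("operation"), age))
    return expired


# Two entries from the report, trimmed to the attributes used here and
# wrapped in a hypothetical root element so the fragment parses on its own.
SAMPLE = """<status>
  <lrm_rsc_op id="vm1_last_failure_0" operation="migrate_to" rc-code="1" last-rc-change="1431812101"/>
  <lrm_rsc_op id="vm15_last_failure_0" operation="migrate_to" rc-code="1" last-rc-change="1431811303"/>
</status>"""

# 1431846184 is the output of `date +"%s"` in the report; 600 s = 10m.
for entry in expired_failures(SAMPLE, now=1431846184, failure_timeout_s=600):
    print(entry)
```

Any entry printed here is a failure record that is still present in the status
section even though it is older than the configured timeout, which is the
situation described in the report.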
