[ClusterLabs] Antw: Re: Antw: [EXT] clear_failcount operation times out, makes it impossible to use the cluster

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Thu Jan 5 02:22:07 EST 2023


>>> Krzysztof Bodora <krzysztof.bodora at open-e.com> schrieb am 04.01.2023 um
09:51
in Nachricht <f779bcf3-1936-25ba-c464-44144c4aff43 at open-e.com>:
> It's an old installation, the error started appearing when one of the 
> nodes was disconnected and the OS was re-installed, after which the 
> pacemaker configuration was restored from a backup (pcs cluster 
> cib-push) and the node rejoined the cluster. The failcount itself was at 
> 8 for some time before, though. The configuration looks like this:

My guess is that it was "not enough" to restore the pacemaker configuration
only.
What was the reason to re-install the OS? In the last 30 years I never had to
reinstall an OS, even if support suggested to do so ��.
I suggest to repeat and validate al lthe steps needed to set up a new
cluster.
What does "corosync-cfgtool -s" say?

Regards,
Ulrich


> 
> pcs config
> 
> Cluster Name:
> Corosync Nodes:
>   10.151.50.43 10.151.50.42
> Pacemaker Nodes:
>   swdal1-ISCSI01 swdal1-ISCSI02
> 
> Resources:
>   Clone: ping_resource-clone
>    Resource: ping_resource (class=ocf provider=pacemaker type=ping)
>     Attributes: multiplier=1000 dampen=15s attempts=4 timeout=3 
> host_list="10.151.16.60 10.151.16.50 10.151.17.50 10.151.17.60"
>     Operations: start interval=0s timeout=60 
> (ping_resource-start-interval-0s)
>                 stop interval=0s timeout=20
(ping_resource-stop-interval-0s)
>                 monitor interval=5s timeout=15s 
> (ping_resource-monitor-interval-5s)
>   Resource: Pool-1 (class=ocf provider=oe type=zfs)
>    Attributes: pool_name=Pool-1 pool_id=9200090953161398950 
> encryption_password_hash=None
>    Meta Attrs: failure-timeout=30 is-managed=True 
> encryption_password_hash=None
>    Operations: start interval=0s timeout=300 (Pool-1-start-interval-0s)
>                stop interval=0s timeout=300 (Pool-1-stop-interval-0s)
>                monitor interval=10 timeout=60 (Pool-1-monitor-interval-10)
>   Resource: Pool-0 (class=ocf provider=oe type=zfs)
>    Attributes: pool_name=Pool-0 pool_id=4165732781319344895 
> encryption_password_hash=None
>    Meta Attrs: failure-timeout=30 is-managed=True 
> encryption_password_hash=None
>    Operations: start interval=0s timeout=300 (Pool-0-start-interval-0s)
>                stop interval=0s timeout=300 (Pool-0-stop-interval-0s)
>                monitor interval=10 timeout=60 (Pool-0-monitor-interval-10)
> 
> Stonith Devices:
> Fencing Levels:
> 
> Location Constraints:
>    Resource: Pool-0
>      Enabled on: swdal1-ISCSI01 (score:1) 
> (id:location-Pool-0-swdal1-ISCSI01-1)
>      Constraint: location-Pool-0
>        Rule: score=-INFINITY boolean-op=or (id:location-Pool-0-rule)
>          Expression: pingd lt 1  (id:location-Pool-0-rule-expr)
>          Expression: not_defined pingd (id:location-Pool-0-rule-expr-1)
>    Resource: Pool-1
>      Enabled on: swdal1-ISCSI01 (score:1) 
> (id:location-Pool-1-swdal1-ISCSI01-1)
>      Constraint: location-Pool-1
>        Rule: score=-INFINITY boolean-op=or (id:location-Pool-1-rule)
>          Expression: pingd lt 1  (id:location-Pool-1-rule-expr)
>          Expression: not_defined pingd (id:location-Pool-1-rule-expr-1)
> Ordering Constraints:
> Colocation Constraints:
> 
> Resources Defaults:
>   resource-stickiness: 100000
> Operations Defaults:
>   record-pending: true
> 
> Cluster Properties:
>   batch-limit: 1
>   cluster-infrastructure: corosync
>   cluster-recheck-interval: 180
>   dc-version: 1.1.12-1.1.12+git+561c4cf
>   no-quorum-policy: ignore
>   stonith-enabled: false
>   stop-orphan-resources: false
> 
> W dniu 02.01.2023 o 13:12, Ulrich Windl pisze:
>> Hi!
>>
>> I wonder: Is this a new installation, or is it a new bug in an old 
> installation?
>> For the first case I'd recommend to start with current software, and for
the 
> second case please describe what had changed or what had triggered the 
> situation.
>> Also provide basic configuration data, please.
>>
>> Regards,
>> Ulrich
>>
>>
>>>>> Krzysztof Bodora <krzysztof.bodora at open-e.com> schrieb am 02.01.2023 um
12:16
>> in Nachricht <37d86b8d-c59f-5fe3-cba1-41d2c84fcb5b at open-e.com>:
>>> Hello Clusterlabs,
>>>
>>> I'm getting this error in the logs:
>>>
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:    error:
>>> print_synapse:     [Action    7]: In-flight crm op
>>> ping_resource_clear_failcount_0   on swdal1-ISCSI01 (priority: 0,
>>> waiting: none)
>>>
>>> My specfications:
>>>
>>> OS: Debian 8
>>> Pacemaker version: 1.1.12
>>> Kernel version: 4.19.190
>>>
>>> I'd like to know what can cause this error to happen and how to prevent
>>> it in the future. I'm also currently unable to update to a newer version
>>> of pacemaker.
>>>
>>> Here is some context for when it happens. It seems that the
>>> ping_resource resources are in 'Restart' state:
>>>
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:   notice:
>>> LogActions:        Restart ping_resource:0 (Started swdal1-ISCSI01)
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:   notice:
>>> LogActions:        Restart ping_resource:1 (Started swdal1-ISCSI02)
>>>
>>> which causes pacemaker to try to clear the failcounts on those resources:
>>>
>>> Dec 20 09:24:23 [57862] swdal1-ISCSI01       crmd:     info:
>>> do_state_transition:       State transition S_POLICY_ENGINE ->
>>> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE
>>> origin=handle_response ]
>>> Dec 20 09:24:23 [57862] swdal1-ISCSI01       crmd:     info:
>>> do_te_invoke:      Processing graph 11 (ref=pe_calc-dc-1671528262-59)
>>> derived from /var/lib/pacemaker/pengine/pe-input-518.bz2
>>> Dec 20 09:24:23 [57862] swdal1-ISCSI01       crmd:     info:
>>> te_crm_command:    Executing crm-event (7): clear_failcount on
>>> swdal1-ISCSI01
>>> Dec 20 09:24:23 [57862] swdal1-ISCSI01       crmd:     info:
>>> handle_failcount_op:       Removing failcount for ping_resource
>>> Dec 20 09:24:23 [57841] swdal1-ISCSI01        cib:     info:
>>> cib_process_request:       Forwarding cib_delete operation for section
>>>
//node_state[@uname='swdal1-ISCSI01']//lrm_resource[@id='ping_resource']/lrm
>>> _rsc_op[@id='ping_resource_last_failure_0']
>>> to master (origin=local/crmd/118)
>>> Dec 20 09:24:23 [57841] swdal1-ISCSI01        cib:     info:
>>> cib_process_request:       Completed cib_delete operation for section
>>>
//node_state[@uname='swdal1-ISCSI01']//lrm_resource[@id='ping_resource']/lrm
>>> _rsc_op[@id='ping_resource_last_failure_0']:
>>> OK (rc=0, origin=swdal1-ISCSI01/crmd/118, version=0.60.0)
>>> Dec 20 09:24:28 [57841] swdal1-ISCSI01        cib:     info:
>>> cib_process_ping:  Reporting our current digest to swdal1-ISCSI01:
>>> ccf71244504d3deb02d0da64fa72cedc for 0.60.0 (0x55788a83c4b0 0)
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:  warning:
>>> action_timer_callback:     Timer popped (timeout=20000, abort_level=0,
>>> complete=false)
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:    error:
>>> print_synapse:     [Action    7]: In-flight crm op
>>> ping_resource_clear_failcount_0   on swdal1-ISCSI01 (priority: 0,
>>> waiting: none)
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:   notice:
>>> abort_transition_graph:    Transition aborted: Action lost
>>> (source=action_timer_callback:772, 0)
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:   notice:
>>> run_graph:         Transition 11 (Complete=1, Pending=0, Fired=0,
>>> Skipped=9, Incomplete=2,
>>> Source=/var/lib/pacemaker/pengine/pe-input-518.bz2): Stopped
>>>
>>> Clearing the failcount fails, so the whole transition is aborted. This
>>> make it impossible to do anything in the cluster, for example move
>>> Pool-0 resource, as it also trigger the clear_failcount operation which
>>> fails and aborts the transition, for example:
>>>
>>> Dec 20 09:35:04 [57851] swdal1-ISCSI01    pengine:     info:
>>> RecurringOp:        Start recurring monitor (5s) for ping_resource:0 on
>>> swdal1-ISCSI01
>>> Dec 20 09:35:04 [57851] swdal1-ISCSI01    pengine:     info:
>>> RecurringOp:        Start recurring monitor (5s) for ping_resource:1 on
>>> swdal1-ISCSI02
>>> Dec 20 09:35:04 [57851] swdal1-ISCSI01    pengine:     info:
>>> RecurringOp:        Start recurring monitor (10s) for Pool-0 on
>>> swdal1-ISCSI02
>>> Dec 20 09:35:04 [57851] swdal1-ISCSI01    pengine:   notice:
>>> LogActions:        Restart ping_resource:0 (Started swdal1-ISCSI01)
>>> Dec 20 09:35:04 [57851] swdal1-ISCSI01    pengine:   notice:
>>> LogActions:        Restart ping_resource:1 (Started swdal1-ISCSI02)
>>> Dec 20 09:35:04 [57851] swdal1-ISCSI01    pengine:     info:
>>> LogActions:        Leave   Pool-1  (Started swdal1-ISCSI01)
>>> Dec 20 09:35:04 [57851] swdal1-ISCSI01    pengine:   notice:
>>> LogActions:        Move    Pool-0  (Started swdal1-ISCSI01 ->
>>> swdal1-ISCSI02)
>>> Dec 20 09:35:04 [57851] swdal1-ISCSI01    pengine:   notice:
>>> process_pe_message:        Calculated Transition 19:
>>> /var/lib/pacemaker/pengine/pe-input-519.bz2
>>> Dec 20 09:35:04 [57862] swdal1-ISCSI01       crmd:     info:
>>> do_state_transition:       State transition S_POLICY_ENGINE ->
>>> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE
>>> origin=handle_response ]
>>> Dec 20 09:35:04 [57862] swdal1-ISCSI01       crmd:     info:
>>> do_te_invoke:      Processing graph 19 (ref=pe_calc-dc-1671528904-75)
>>> derived from /var/lib/pacemaker/pengine/pe-input-519.bz2
>>> Dec 20 09:35:04 [57862] swdal1-ISCSI01       crmd:     info:
>>> te_crm_command:    Executing crm-event (7): clear_failcount on
>>> swdal1-ISCSI01
>>> Dec 20 09:35:04 [57862] swdal1-ISCSI01       crmd:     info:
>>> handle_failcount_op:       Removing failcount for ping_resource
>>> Dec 20 09:35:04 [57841] swdal1-ISCSI01        cib:     info:
>>> cib_process_request:       Forwarding cib_delete operation for section
>>>
//node_state[@uname='swdal1-ISCSI01']//lrm_resource[@id='ping_resource']/lrm
>>> _rsc_op[@id='ping_resource_last_failure_0']
>>> to master (origin=local/crmd/134)
>>> Dec 20 09:35:04 [57841] swdal1-ISCSI01        cib:    info:
>>> cib_process_request:       Completed cib_delete operation for section
>>>
//node_state[@uname='swdal1-ISCSI01']//lrm_resource[@id='ping_resource']/lrm
>>> _rsc_op[@id='ping_resource_last_failure_0']:
>>> OK (rc=0, origin=swdal1-ISCSI01/crmd/134, version=0.61.0)
>>> Dec 20 09:35:09 [57841] swdal1-ISCSI01        cib:     info:
>>> cib_process_ping:  Reporting our current digest to swdal1-ISCSI01:
>>> decc3ad1315820648f242167998a5880 for 0.61.0 (0x55788a8408e0 0)
>>> Dec 20 09:36:24 [57862] swdal1-ISCSI01       crmd:  warning:
>>> action_timer_callback:     Timer popped (timeout=20000, abort_level=0,
>>> complete=false)
>>> Dec 20 09:36:24 [57862] swdal1-ISCSI01       crmd:    error:
>>> print_synapse:     [Action    7]: In-flight crm op
>>> ping_resource_clear_failcount_0   on swdal1-ISCSI01 (priority: 0,
>>> waiting: none)
>>> Dec 20 09:36:24 [57862] swdal1-ISCSI01       crmd:   notice:
>>> abort_transition_graph:    Transition aborted: Action lost
>>> (source=action_timer_callback:772, 0)
>>> Dec 20 09:36:24 [57862] swdal1-ISCSI01       crmd:   notice:
>>> run_graph:         Transition 19 (Complete=1, Pending=0, Fired=0,
>>> Skipped=12, Incomplete=2,
>>> Source=/var/lib/pacemaker/pengine/pe-input-519.bz2): Stopped
>>>
>>> As you can see the 'stop' operation for resource Pool-0 did not even
>>> run, as the transition was stopped by the clear_failcount error. This
>>> error kept happening until we restarted pacemaker. Here is some more
>>> context from one of the times this error has happened:
>>>
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> process_pe_message:        Input has not changed since last time, not
>>> saving to disk
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:   notice:
>>> unpack_config:     On loss of CCM Quorum: Ignore
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> determine_online_status:   Node swdal1-ISCSI01 is online
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> determine_online_status:   Node swdal1-ISCSI02 is online
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> determine_op_status:       Operation monitor found resource Pool-0
>>> active on swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> determine_op_status:       Operation monitor found resource Pool-0
>>> active on swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> determine_op_status:       Operation monitor found resource Pool-1
>>> active on swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> determine_op_status:       Operation monitor found resource Pool-1
>>> active on swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> clone_print:        Clone Set: ping_resource-clone [ping_resource]
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> short_print:            Started: [ swdal1-ISCSI01 swdal1-ISCSI02 ]
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> native_print:      Pool-1  (ocf::oe:zfs):  Started swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> native_print:      Pool-0  (ocf::oe:zfs):  Started swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> get_failcount_full:        ping_resource:0 has failed 8 times on
>>> swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> common_apply_stickiness:   ping_resource-clone can fail 999992 more
>>> times on swdal1-ISCSI01 before being forced off
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> get_failcount_full:        ping_resource:1 has failed 8 times on
>>> swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> common_apply_stickiness:   ping_resource-clone can fail 999992 more
>>> times on swdal1-ISCSI01 before being forced off
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> check_action_definition:   params:reload   <parameters multiplier="1000"
>>> dampen="15s" host_list="10.151.17.50 10.151.16.50 10.151.17.60
>>> 10.151.16.60" attempts="4" timeout="3"/>
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> check_action_definition:   Parameters to ping_resource:0_start_0 on
>>> swdal1-ISCSI01 changed: was 57524cd0b7204dd60c127ba66fb83cd2 vs. now
>>> 1a37c0e0391890df8549f5fda647f4d9 (reload:3.0.9)
>>> 0:0;14:28:0:a0f1b96e-5089-4dad-9073-8c8feac4ea3a
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> get_failcount_full:        ping_resource:0 has failed 8 times on
>>> swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> check_action_definition:   params:reload   <parameters multiplier="1000"
>>> dampen="15s" host_list="10.151.17.50 10.151.16.50 10.151.17.60
>>> 10.151.16.60" attempts="4" timeout="3" CRM_meta_timeout="15000"/>
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> check_action_definition:   Parameters to ping_resource:0_monitor_5000 on
>>> swdal1-ISCSI01 changed: was f3b4adf4d46692f312296263faa50a75 vs. now
>>> c0d10fc8996c295dd1213d4ca058c0e7 (reload:3.0.9)
>>> 0:0;15:28:0:a0f1b96e-5089-4dad-9073-8c8feac4ea3a
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> get_failcount_full:        ping_resource:0 has failed 8 times on
>>> swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> check_action_definition:   params:reload   <parameters multiplier="1000"
>>> dampen="15s" host_list="10.151.17.50 10.151.16.50 10.151.17.60
>>> 10.151.16.60" attempts="4" timeout="3"/>
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> check_action_definition:   Parameters to ping_resource:1_start_0 on
>>> swdal1-ISCSI02 changed: was 57524cd0b7204dd60c127ba66fb83cd2 vs. now
>>> 1a37c0e0391890df8549f5fda647f4d9 (reload:3.0.9)
>>> 0:0;17:7:0:0ea53274-56ef-48f6-9de1-38d635fa2530
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> check_action_definition:   params:reload   <parameters multiplier="1000"
>>> dampen="15s" host_list="10.151.17.50 10.151.16.50 10.151.17.60
>>> 10.151.16.60" attempts="4" timeout="3" CRM_meta_timeout="15000"/>
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> check_action_definition:   Parameters to ping_resource:1_monitor_5000 on
>>> swdal1-ISCSI02 changed: was f3b4adf4d46692f312296263faa50a75 vs. now
>>> c0d10fc8996c295dd1213d4ca058c0e7 (reload:3.0.9)
>>> 0:0;18:7:0:0ea53274-56ef-48f6-9de1-38d635fa2530
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> RecurringOp:        Start recurring monitor (5s) for ping_resource:0 on
>>> swdal1-ISCSI01
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> RecurringOp:        Start recurring monitor (5s) for ping_resource:1 on
>>> swdal1-ISCSI02
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:   notice:
>>> LogActions:        Restart ping_resource:0 (Started swdal1-ISCSI01)
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:   notice:
>>> LogActions:        Restart ping_resource:1 (Started swdal1-ISCSI02)
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> LogActions:        Leave   Pool-1  (Started swdal1-ISCSI01)
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:     info:
>>> LogActions:        Leave   Pool-0  (Started swdal1-ISCSI01)
>>> Dec 20 09:24:23 [57851] swdal1-ISCSI01    pengine:   notice:
>>> process_pe_message:        Calculated Transition 11:
>>> /var/lib/pacemaker/pengine/pe-input-518.bz2
>>> Dec 20 09:24:23 [57862] swdal1-ISCSI01       crmd:     info:
>>> do_state_transition:       State transition S_POLICY_ENGINE ->
>>> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE
>>> origin=handle_response ]
>>> Dec 20 09:24:23 [57862] swdal1-ISCSI01       crmd:     info:
>>> do_te_invoke:      Processing graph 11 (ref=pe_calc-dc-1671528262-59)
>>> derived from /var/lib/pacemaker/pengine/pe-input-518.bz2
>>> Dec 20 09:24:23 [57862] swdal1-ISCSI01       crmd:     info:
>>> te_crm_command:    Executing crm-event (7): clear_failcount on
>>> swdal1-ISCSI01
>>> Dec 20 09:24:23 [57862] swdal1-ISCSI01       crmd:     info:
>>> handle_failcount_op:       Removing failcount for ping_resource
>>> Dec 20 09:24:23 [57841] swdal1-ISCSI01        cib:     info:
>>> cib_process_request:       Forwarding cib_delete operation for section
>>>
//node_state[@uname='swdal1-ISCSI01']//lrm_resource[@id='ping_resource']/lrm
>>> _rsc_op[@id='ping_resource_last_failure_0']
>>> to master (origin=local/crmd/118)
>>> Dec 20 09:24:23 [57841] swdal1-ISCSI01        cib:     info:
>>> cib_process_request:       Completed cib_delete operation for section
>>>
//node_state[@uname='swdal1-ISCSI01']//lrm_resource[@id='ping_resource']/lrm
>>> _rsc_op[@id='ping_resource_last_failure_0']:
>>> OK (rc=0, origin=swdal1-ISCSI01/crmd/118, version=0.60.0)
>>> Dec 20 09:24:28 [57841] swdal1-ISCSI01        cib:     info:
>>> cib_process_ping:  Reporting our current digest to swdal1-ISCSI01:
>>> ccf71244504d3deb02d0da64fa72cedc for 0.60.0 (0x55788a83c4b0 0)
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:  warning:
>>> action_timer_callback:     Timer popped (timeout=20000, abort_level=0,
>>> complete=false)
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:    error:
>>> print_synapse:     [Action    7]: In-flight crm op
>>> ping_resource_clear_failcount_0   on swdal1-ISCSI01 (priority: 0,
>>> waiting: none)
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:   notice:
>>> abort_transition_graph:    Transition aborted: Action lost
>>> (source=action_timer_callback:772, 0)
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:   notice:
>>> run_graph:         Transition 11 (Complete=1, Pending=0, Fired=0,
>>> Skipped=9, Incomplete=2,
>>> Source=/var/lib/pacemaker/pengine/pe-input-518.bz2): Stopped
>>> Dec 20 09:25:43 [57862] swdal1-ISCSI01       crmd:     info:
>>> do_state_transition:       State transition S_TRANSITION_ENGINE ->
>>> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd
]
>>>
>>> I'd appreciate some information about this topic.
>>>
>>> _______________________________________________
>>> Manage your subscription:
>>> https://lists.clusterlabs.org/mailman/listinfo/users 
>>>
>>> ClusterLabs home: https://www.clusterlabs.org/ 
>>
>>
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users 
>>
>> ClusterLabs home: https://www.clusterlabs.org/ 
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> ClusterLabs home: https://www.clusterlabs.org/ 





More information about the Users mailing list