[ClusterLabs] Antw: [EXT] Inquiry - remote node fencing issue
Andrei Borzenkov
arvidjaar at gmail.com
Fri Nov 5 04:22:56 EDT 2021
On 05.11.2021 01:20, Ken Gaillot wrote:
>>
>> There are two issues discussed in this thread.
>>
>> 1. The remote node is fenced when the connection to it is lost. As far
>> as I can tell, this is intended and expected behavior. That was the
>> original question.
>
> It's expected only because the connection can't be recovered elsewhere.
> If another node can run the connection, pacemaker will try to reconnect
> from there and re-probe everything to make sure of the current state.
>
That's not what I see in the sources and documentation, and not what I
observe. Pacemaker reprobes from another node only after attempting to
fence the remote node.
Documentation (Pacemaker Remote):

  reconnect_interval: If this is a positive time interval, the cluster will
      attempt to reconnect to a remote node after an active connection has
      been lost at this interval. Otherwise, the cluster will attempt to
      reconnect immediately (*after any fencing needed*).
Config:

node 1: ha1 \
        attributes pingd=1 \
        utilization cpu=20
node 2: ha2 \
        attributes pingd=1 \
        utilization cpu=20
primitive qnetd ocf:pacemaker:remote \
        params reconnect_interval=30s \
        op monitor timeout=5s interval=10s \
        meta target-role=Started
primitive stonith_sbd stonith:external/sbd \
        op monitor interval=3600
property cib-bootstrap-options: \
        cluster-infrastructure=corosync \
        cluster-name=ha \
        dc-version="2.1.0+20210816.c6a4f6e6c-1.1-2.1.0+20210816.c6a4f6e6c" \
        last-lrm-refresh=1635607816 \
        stonith-enabled=true \
        have-watchdog=true \
        stonith-watchdog-timeout=0 \
        placement-strategy=balanced
Logs (skipping CIB updates, remote node active on ha1, ha2 is DC):
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246]
(monitor_timeout_cb at controld_remote_ra.c:474) info: Timed out
waiting for remote poke response from qnetd
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246]
(process_lrm_event at controld_execd.c:2826) error: Result of monitor
operation for qnetd on ha1: Timed Out | call=3 key=qnetd_monitor_10000
timeout=5000ms
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246]
(lrmd_api_disconnect at lrmd_client.c:1640) info: Disconnecting TLS
qnetd executor connection
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246]
(lrmd_tls_connection_destroy at lrmd_client.c:562) info: TLS
connection destroyed
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246]
(remote_lrm_op_callback at controld_remote_ra.c:578) error: Lost
connection to Pacemaker Remote node qnetd
Nov 05 10:31:26.742 ha1 pacemaker-controld [3246]
(lrmd_api_disconnect at lrmd_client.c:1640) info: Disconnecting TLS
qnetd executor connection
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313]
(determine_online_status_fencing at unpack.c:1434) info: Node ha2 is
active
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313]
(determine_online_status at unpack.c:1574) info: Node ha2 is online
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313]
(determine_online_status_fencing at unpack.c:1434) info: Node ha1 is
active
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313]
(determine_online_status at unpack.c:1574) info: Node ha1 is online
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313]
(unpack_rsc_op_failure at unpack.c:3022) warning: Unexpected result
(error) was recorded for monitor of qnetd on ha1 at Nov 5 10:31:26 2021
| rc=1 id=qnetd_last_failure_0
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313]
(unpack_rsc_op_failure at unpack.c:3117) notice: qnetd will not be
started under current conditions
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313]
(pe_fence_node at unpack.c:143) warning: Remote node qnetd will be
fenced: remote connection is unrecoverable
Nov 05 10:31:26.773 ha2 pacemaker-schedulerd[3313]
(pcmk__native_allocate at pcmk_sched_native.c:626) info: Resource
qnetd cannot run anywhere
...
Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313]
(stage6 at pcmk_sched_allocate.c:1634) warning: Scheduling Node qnetd for
STONITH
Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313]
(log_list_item at output_log.c:198) notice: Actions: Fence (reboot)
qnetd 'remote connection is unrecoverable'
Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313]
(rsc_action_default at pcmk_output.c:928) info: Leave stonith_sbd
(Started ha2)
Nov 05 10:31:26.777 ha2 pacemaker-schedulerd[3313]
(log_list_item at output_log.c:198) notice: Actions: Stop qnetd
( ha1 ) due to node availability
...
Nov 05 10:31:26.869 ha2 pacemaker-controld [3314]
(te_rsc_command at controld_te_actions.c:320) notice: Initiating stop
operation qnetd_stop_0 on ha1 | action 4
Nov 05 10:31:26.869 ha2 pacemaker-controld [3314]
(te_fence_node at controld_fencing.c:869) notice: Requesting fencing
(reboot) of node qnetd | action=3 timeout=60000
Nov 05 10:31:26.869 ha2 pacemaker-fenced [3310]
(handle_request at fenced_commands.c:2759) notice: Client
pacemaker-controld.3314 wants to fence (reboot) qnetd using any device
Nov 05 10:31:26.869 ha2 pacemaker-fenced [3310]
(initiate_remote_stonith_op at fenced_remote.c:1162) notice: Requesting
peer fencing (reboot) targeting qnetd | id=9c3bee1c state=querying
base_timeout=60
...
Nov 05 10:31:38.441 ha2 pacemaker-fenced [3310]
(log_operation at fenced_commands.c:2122) notice: Operation 'reboot'
[3568] (call 2 from pacemaker-controld.3314) targeting qnetd using
stonith_sbd returned 0 (OK)
Nov 05 10:31:38.445 ha2 pacemaker-fenced [3310]
(remote_op_done at fenced_remote.c:544) notice: Operation 'reboot'
targeting qnetd by ha2 for pacemaker-controld.3314 at ha2: OK | id=9c3bee1c
Nov 05 10:31:38.445 ha2 pacemaker-controld [3314]
(tengine_stonith_callback at controld_fencing.c:721) notice: Stonith
operation 2/3:3:0:181e4342-7bd4-49c7-895e-aad769dd7f0a: OK (0)
Nov 05 10:31:38.445 ha2 pacemaker-controld [3314]
(tengine_stonith_callback at controld_fencing.c:754) info: Stonith
operation 2 for qnetd passed
...
Nov 05 10:31:38.449 ha2 pacemaker-controld [3314]
(tengine_stonith_notify at controld_fencing.c:502) notice: Peer qnetd
was terminated (reboot) by ha2 on behalf of pacemaker-controld.3314: OK
| initiator=ha2 ref=9c3bee1c-a4ab-46b5-9c79-9a0e1b3b8841
Nov 05 10:31:38.449 ha2 pacemaker-controld [3314]
(controld_delete_node_state at controld_based.c:258) info: Deleting all
state for node qnetd (via CIB call 46) |
xpath=//node_state[@uname='qnetd']/*
...
Nov 05 10:31:57.449 ha2 pacemaker-controld [3314]
(crm_timer_popped at controld_timers.c:122) info: Cluster Recheck Timer
just popped | input=I_PE_CALC time=19000ms
That message is interesting. I do not think this was really the recheck
timeout; it popped far too early. It is probably a consequence of the
successful fencing, but the message is certainly misleading.
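(cluster-recheck-interval is not set in the configuration above, so it
should still be at its default of 15 minutes. As a rough check, any
explicitly configured value could be queried with something like:

  # show the configured recheck interval, if any; no output means the
  # default (15 minutes) applies
  crm_attribute --type crm_config --name cluster-recheck-interval --query

so a timer that pops roughly 19 seconds after the fencing request can
hardly be the real recheck timer.)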
Nov 05 10:31:57.449 ha2 pacemaker-controld [3314]
(do_state_transition at controld_fsa.c:543) notice: State transition
S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_TIMER_POPPED
origin=crm_timer_popped
Nov 05 10:31:57.449 ha2 pacemaker-controld [3314]
(do_state_transition at controld_fsa.c:629) info: Progressed to state
S_POLICY_ENGINE after C_TIMER_POPPED
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(unpack_config at unpack.c:226) info: Watchdog-based self-fencing
will be performed via SBD if fencing is required and
stonith-watchdog-timeout is nonzero
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(determine_online_status_fencing at unpack.c:1434) info: Node ha2 is
active
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(determine_online_status at unpack.c:1574) info: Node ha2 is online
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(determine_online_status_fencing at unpack.c:1434) info: Node ha1 is
active
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(determine_online_status at unpack.c:1574) info: Node ha1 is online
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(pe_get_failcount at failcounts.c:344) info: qnetd has failed 1 times on ha1
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(pe__clear_failcount at failcounts.c:376) notice: Clearing failure of
qnetd on ha1 because it expired | qnetd_clear_failcount_0
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(check_operation_expiry at unpack.c:3486) info: Clearing monitor
failure will wait until any scheduled fencing of qnetd completes
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(pe_get_failcount at failcounts.c:344) info: qnetd has failed 1 times on ha1
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(pe__clear_failcount at failcounts.c:376) notice: Clearing failure of
qnetd on ha1 because it expired | qnetd_clear_failcount_0
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(check_operation_expiry at unpack.c:3486) info: Clearing monitor
failure will wait until any scheduled fencing of qnetd completes
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(unpack_rsc_op at unpack.c:3760) notice: Rescheduling
qnetd_monitor_10000 after failure expired on ha1 | actual=1 expected=0
magic=2:1;6:1:0:181e4342-7bd4-49c7-895e-aad769dd7f0a
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(pe_get_failcount at failcounts.c:344) info: qnetd has failed 1 times on ha1
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(pe__clear_failcount at failcounts.c:376) notice: Clearing failure of
qnetd on ha1 because it expired | qnetd_clear_failcount_0
...
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(rsc_action_default at pcmk_output.c:928) info: Leave stonith_sbd
(Started ha2)
Nov 05 10:31:57.453 ha2 pacemaker-schedulerd[3313]
(log_list_item at output_log.c:198) notice: Actions: Start qnetd
( ha1 )
Nov 05 10:31:57.462 ha1 pacemaker-controld [3246]
(do_lrm_rsc_op at controld_execd.c:2281) notice: Requesting local
execution of start operation for qnetd on ha1 |
transition_key=6:4:0:181e4342-7bd4-49c7-895e-aad769dd7f0a
op_key=qnetd_start_0
Nov 05 10:31:57.466 ha1 pacemaker-controld [3246]
(pcmk__connect_remote at remote.c:1126) info: Attempting remote
connection to 192.168.1.3:3121
...
Nov 05 10:32:03.974 ha1 pacemaker-controld [3246]
(lrmd_tls_connection_destroy at lrmd_client.c:562) info: TLS
connection destroyed
Nov 05 10:32:03.974 ha1 pacemaker-controld [3246]
(lrmd_tcp_connect_cb at lrmd_client.c:1381) info: Could not connect to
Pacemaker Remote at qnetd:3121: No route to host | rc=113
Nov 05 10:32:04.978 ha1 pacemaker-controld [3246]
(pcmk__connect_remote at remote.c:1126) info: Attempting remote
connection to 192.168.1.3:3121
Nov 05 10:32:06.978 ha1 pacemaker-controld [3246]
(lrmd_tls_connection_destroy at lrmd_client.c:562) info: TLS
connection destroyed
Nov 05 10:32:13.990 ha1 pacemaker-controld [3246]
(pcmk__connect_remote at remote.c:1126) info: Attempting remote
connection to 192.168.1.3:3121
Nov 05 10:32:15.994 ha1 pacemaker-controld [3246]
(lrmd_tls_connection_destroy at lrmd_client.c:562) info: TLS
connection destroyed
Nov 05 10:32:15.994 ha1 pacemaker-controld [3246]
(lrmd_tcp_connect_cb at lrmd_client.c:1381) info: Could not connect to
Pacemaker Remote at qnetd:3121: No route to host | rc=113
Nov 05 10:32:15.994 ha1 pacemaker-controld [3246]
(process_lrm_event at controld_execd.c:2826) error: Result of start
operation for qnetd on ha1: Timed Out | call=5 key=qnetd_start_0
timeout=20000ms
Nov 05 10:32:16.001 ha2 pacemaker-controld [3314]
(update_failcount at controld_te_events.c:172) info: Updating failcount
for qnetd on ha1 after failed start: rc=1 (update=INFINITY, time=1636097536)
...
Nov 05 10:32:16.041 ha2 pacemaker-schedulerd[3313]
(log_list_item at output_log.c:198) notice: Actions: Move
stonith_sbd ( ha2 -> ha1 )
Nov 05 10:32:16.045 ha2 pacemaker-schedulerd[3313]
(log_list_item at output_log.c:198) notice: Actions: Recover qnetd
( ha1 -> ha2 )
And only now does it move the resource to ha2, where the start succeeds.
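What finally forces the move is the INFINITY failcount recorded for the
failed start on ha1 (the update_failcount message above). For anyone who
wants to verify this on their own cluster, something along these lines
(using the resource and node names from this configuration) should report
INFINITY at that point:

  # query the failcount the scheduler uses for placement decisions
  crm_failcount --query --resource qnetd --node ha1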
>
>> 2. The remote resource appears not to fail over. I cannot reproduce
>> it, but then we also do not have the complete CIB, so something may
>> affect it. OTOH the logs shown stop before fencing could have
>> succeeded, so maybe the remote resource *did* fail over.
>>
>> What I see is: the connection to the remote node is lost, pacemaker
>> fences the remote node and attempts to restart the remote resource;
>> if this is unsuccessful (meaning the connection still could not be
>> established), the remote resource fails over to another node.
>>
>> I do not know if it is possible to avoid fencing of the remote node
>> under the described conditions.
>>
>> What is somewhat interesting (and looks like a bug): in my testing,
>> pacemaker ignored the failed fencing attempt and proceeded with
>> restarting the remote resource. Is that expected behavior?
>
> I don't see a failed fencing attempt (or any result of the fencing
> attempt) in the logs in the original message, only failures of the
> connection monitor.
>
Here are the logs without the reconnect_interval parameter explicitly defined:
Nov 05 11:08:37.339 ha1 pacemaker-controld [4649]
(lrmd_api_disconnect at lrmd_client.c:1640) info: Disconnecting TLS
qnetd executor connection
Nov 05 11:08:37.339 ha1 pacemaker-controld [4649]
(lrmd_tls_connection_destroy at lrmd_client.c:562) info: TLS
connection destroyed
Nov 05 11:08:37.339 ha1 pacemaker-controld [4649]
(remote_lrm_op_callback at controld_remote_ra.c:578) error: Lost
connection to Pacemaker Remote node qnetd
Nov 05 11:08:37.339 ha1 pacemaker-controld [4649]
(lrmd_api_disconnect at lrmd_client.c:1640) info: Disconnecting TLS
qnetd executor connection
...
Nov 05 11:08:37.355 ha1 pacemaker-schedulerd[4648]
(unpack_rsc_op_failure at unpack.c:3022) warning: Unexpected result
(error) was recorded for monitor of qnetd on ha1 at Nov 5 11:08:37 2021
| rc=1 id=qnetd_last_failure_0
Nov 05 11:08:37.355 ha1 pacemaker-schedulerd[4648]
(pe_fence_node at unpack.c:143) warning: Remote node qnetd will be
fenced: remote connection is unrecoverable
Nov 05 11:08:37.355 ha1 pacemaker-schedulerd[4648]
(RecurringOp at pcmk_sched_native.c:855) info: Start recurring
monitor (10s) for qnetd on ha1
Nov 05 11:08:37.355 ha1 pacemaker-schedulerd[4648]
(stage6 at pcmk_sched_allocate.c:1634) warning: Scheduling Node qnetd for
STONITH
Nov 05 11:08:37.355 ha1 pacemaker-schedulerd[4648]
(log_list_item at output_log.c:198) notice: Actions: Fence (reboot)
qnetd 'remote connection is unrecoverable'
Nov 05 11:08:37.355 ha1 pacemaker-schedulerd[4648]
(rsc_action_default at pcmk_output.c:928) info: Leave stonith_sbd
(Started ha2)
Nov 05 11:08:37.355 ha1 pacemaker-schedulerd[4648]
(log_list_item at output_log.c:198) notice: Actions: Recover qnetd
( ha1 )
...
Nov 05 11:08:37.363 ha1 pacemaker-fenced [4645]
(can_fence_host_with_device at fenced_commands.c:1784) notice: stonith_sbd
is not eligible to fence (reboot) qnetd: static-list
...
Nov 05 11:08:37.375 ha1 pacemaker-fenced [4645]
(stonith_choose_peer at fenced_remote.c:1301) notice: Couldn't find
anyone to fence (reboot) qnetd using any device
...
Nov 05 11:08:37.379 ha1 pacemaker-fenced [4645]
(remote_op_done at fenced_remote.c:544) error: Operation 'reboot'
targeting qnetd by unknown node for pacemaker-controld.4649 at ha1: No such
device | id=260f5b86
Nov 05 11:08:37.379 ha1 pacemaker-controld [4649]
(tengine_stonith_callback at controld_fencing.c:721) notice: Stonith
operation 2/2:2:0:809530af-8cc5-4ff5-9193-bfd2ddc6a442: No such device (-19)
Nov 05 11:08:37.379 ha1 pacemaker-controld [4649]
(tengine_stonith_callback at controld_fencing.c:797) notice: Stonith
operation 2 for qnetd failed (No such device): aborting transition.
... and still
Nov 05 11:08:37.467 ha1 pacemaker-schedulerd[4648]
(RecurringOp at pcmk_sched_native.c:855) info: Start recurring
monitor (10s) for qnetd on ha1
Nov 05 11:08:37.467 ha1 pacemaker-schedulerd[4648]
(rsc_action_default at pcmk_output.c:928) info: Leave stonith_sbd
(Started ha2)
Nov 05 11:08:37.467 ha1 pacemaker-schedulerd[4648]
(log_list_item at output_log.c:198) notice: Actions: Start qnetd
( ha1 )
Which fails, and only then does it move to ha2.

The difference seems to be the reconnect_interval parameter: if it is
present in the remote resource definition, pacemaker will not proceed
after a failed fencing attempt.

As there is no real documentation on how this is supposed to work, I do
not know whether all of this is a bug or not. But one thing is certain:
when the connection to a remote node is lost, the first thing pacemaker
does is fence it, and only then does it initiate any recovery action.
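For anyone who wants to look at this decision without breaking a live
connection, replaying one of the scheduler input files that
pacemaker-schedulerd saves should show the same "Fence (reboot) qnetd"
action, for example (the file name below is just a placeholder; the real
one is printed in the scheduler logs):

  # replay a saved transition and print the planned actions
  crm_simulate --simulate --xml-file /var/lib/pacemaker/pengine/pe-warn-0.bz2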