[ClusterLabs] "No such device" with fence_pve agent

Ken Gaillot kgaillot at redhat.com
Tue Mar 22 10:24:11 EDT 2016


On 03/22/2016 06:32 AM, Stanislav Kopp wrote:
> Hi,
> 
> I have a problem using the "fence_pve" agent with pacemaker: the agent
> works fine from the command line, but if I simulate a stonith action or
> use "crm node fence <node>", it doesn't work:
> 
>  Mar 22 10:38:06 [675] redis2 stonith-ng: debug:
> xml_patch_version_check: Can apply patch 0.50.22 to 0.50.21
> Mar 22 10:38:06 [675] redis2 stonith-ng: debug: stonith_command:
> Processing st_query 0 from redis1 ( 0)
> Mar 22 10:38:06 [675] redis2 stonith-ng: debug: stonith_query: Query
> <stonith_command name="stonith_command" t="stonith-ng"
> st_async_id="da507055-afc3-4b5c-bcc0-f887dd1736d3" st_op="st_query"
> st_callid="2" st_callopt="0"
> st_remote_op="da507055-afc3-4b5c-bcc0-f887dd1736d3" st_target="redis1"
> st_device_action="reboot" st_origin="redis1"
> st_clientid="95f8f271-e698-4a9b-91ae-950c55c230a5"
> st_clientname="crmd.674" st_timeout="60" src="redis1"/>
> Mar 22 10:38:06 [675] redis2 stonith-ng: debug: get_capable_devices:
> Searching through 1 devices to see what is capable of action (reboot)
> for target redis1
> Mar 22 10:38:06 [675] redis2 stonith-ng: debug:
> schedule_stonith_command: Scheduling list on stonith-redis1 for
> stonith-ng (timeout=60s)
> Mar 22 10:38:06 [675] redis2 stonith-ng: debug: stonith_command:
> Processed st_query from redis1: OK (0)
> Mar 22 10:38:06 [675] redis2 stonith-ng: debug: stonith_action_create:
> Initiating action list for agent fence_pve (target=(null))
> Mar 22 10:38:06 [675] redis2 stonith-ng: debug:
> internal_stonith_action_execute: forking
> Mar 22 10:38:06 [675] redis2 stonith-ng: debug:
> internal_stonith_action_execute: sending args
> Mar 22 10:38:06 [675] redis2 stonith-ng: debug:
> stonith_device_execute: Operation list on stonith-redis1 now running
> with pid=8707, timeout=60s
> Mar 22 10:38:07 [675] redis2 stonith-ng: debug:
> stonith_action_async_done: Child process 8707 performing action 'list'
> exited with rc 0

The fence parameter pcmk_host_check specifies how pacemaker determines
which nodes a given fence device can fence. The default is dynamic-list,
which means pacemaker runs the fence agent's list action to get the
targets. That is what we're seeing above ...
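
To see what pacemaker gets back from that list action, one option is to
run it by hand. A minimal sketch, assuming fence_pve accepts the usual
fence-agents key=value options on stdin (the same parameter names as in
the quoted primitive) and that the archive's "root at pam" stands for
root@pam:

```shell
# Feed the agent its parameters on stdin, roughly as stonith-ng does,
# and ask for the "list" action instead of "reboot".
printf '%s\n' \
    'ipaddr=192.168.122.6' \
    'login=root@pam' \
    'passwd=secret' \
    'action=list' | fence_pve
```

If nothing in the output corresponds to the node name redis1, that would
be consistent with the "Found 0 matching devices" result further down.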

> Mar 22 10:38:07 [675] redis2 stonith-ng: info: dynamic_list_search_cb:
> Refreshing port list for stonith-redis1
> Mar 22 10:38:07 [675] redis2 stonith-ng: debug:
> search_devices_record_result: Finished Search. 0 devices can perform
> action (reboot) on node redis1

... however, not all fence agents can figure out their targets
dynamically. The log above shows the list action succeeding (rc 0) but
no device matching redis1, so either that is the case here, or the
device really can't fence redis1.
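
A quick way to check what pacemaker currently believes, sketched with
stonith_admin (run on a cluster node; the node name is taken from the
post):

```shell
# Devices registered with stonith-ng on this node (as in the original post)
stonith_admin -L

# Devices that pacemaker thinks are able to fence redis1
stonith_admin -l redis1
```

As long as the second command lists no device, a fence request for
redis1 would be expected to fail with "No such device" as above.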

> Mar 22 10:38:07 [675] redis2 stonith-ng: debug:
> stonith_query_capable_device_cb: Found 0 matching devices for 'redis1'
> Mar 22 10:38:07 [675] redis2 stonith-ng: debug: stonith_command:
> Processing st_notify reply 0 from redis1 ( 0)
> Mar 22 10:38:07 [675] redis2 stonith-ng: debug:
> process_remote_stonith_exec: Marking call to reboot for redis1 on
> behalf of crmd.674@da507055-afc3-4b5c-bcc0-f887dd1736d3.redis1: No
> such device (-19)
> Mar 22 10:38:07 [675] redis2 stonith-ng: notice: remote_op_done:
> Operation reboot of redis1 by <no-one> for crmd.674@redis1.da507055:
> No such device
> Mar 22 10:38:07 [675] redis2 stonith-ng: debug: stonith_command:
> Processed st_notify reply from redis1: OK (0)
> Mar 22 10:38:07 [679] redis2 crmd: notice: tengine_stonith_notify:
> Peer redis1 was not terminated (reboot) by <anyone> for redis1: No
> such device (ref=da507055-afc3-4b5c-bcc0-f887dd1736d3) by client
> crmd.674
> Connection to 192.168.122.137 closed by remote host.
> 
> I already read a similar thread ("fence_scsi no such device"), but
> didn't find anything that helps me. I'm using pacemaker
> 1.1.14-2~bpo8+1 with corosync 2.3.5-3~bpo8+1 on Debian jessie.
> 
> some info:
> 
> redis1:~# stonith_admin -L
>  stonith-redis2
> 1 devices found
> 
> redis1:~# crm configure show
> node 3232266889: redis2
> node 3232266923: redis1
> primitive ClusterIP IPaddr2 \
>         params ip=192.168.122.10 nic=eth0 \
>         op monitor interval=10s \
>         meta is-managed=true
> primitive stonith-redis1 stonith:fence_pve \
>         params ipaddr=192.168.122.6 \
>         params login="root@pam" passwd=secret port=100 \
>         op start interval=0 timeout=60s \
>         meta target-role=Started is-managed=true

You can specify pcmk_host_list or pcmk_host_map to give the device a
static target list. For example, pcmk_host_list=redis1 says this fence
device can target redis1 only. pcmk_host_map does the same but lets you
specify a different name for the target when calling the device -- for
example, pcmk_host_map=redis1:1 would target redis1, but send just "1"
to the device.
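
Applied to the configuration quoted above, that might look like the
following. This is a sketch only: it assumes the Proxmox VM with ID 100
(the port value on stonith-redis1) is the one running redis1, which the
post does not state explicitly, and that "root at pam" is root@pam.

```shell
primitive stonith-redis1 stonith:fence_pve \
        params ipaddr=192.168.122.6 \
               login="root@pam" passwd=secret port=100 \
               pcmk_host_list=redis1 \
        op start interval=0 timeout=60s
```

With pcmk_host_map=redis1:100 in place of pcmk_host_list, the explicit
port parameter could presumably be dropped, since the mapped value is
what gets sent to the device.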

> primitive stonith-redis2 stonith:fence_pve \
>         params ipaddr=192.168.122.7 \
>         params login="root@pam" passwd=secret port=101 \
>         op start interval=0 timeout=60s \
>         meta target-role=Started is-managed=true
> location loc_stonith-redis1 stonith-redis1 -inf: redis1
> location loc_stonith-redis2 stonith-redis2 -inf: redis2
> property cib-bootstrap-options: \
>         have-watchdog=false \
>         dc-version=1.1.14-70404b0 \
>         cluster-infrastructure=corosync \
>         cluster-name=debian \
>         stonith-enabled=true \
>         no-quorum-policy=ignore \

FYI, with corosync 2 you can set "two_node: 1" in the quorum section of
corosync.conf and leave no-quorum-policy at its default in pacemaker.
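
A rough sketch of that change, assuming the config file lives at
/etc/corosync/corosync.conf and votequorum is the quorum provider
(neither is shown in the post):

```shell
# /etc/corosync/corosync.conf needs, inside its quorum section:
#
#     quorum {
#         provider: corosync_votequorum
#         two_node: 1
#     }
#
# After restarting corosync, check that the 2Node flag is reported:
corosync-quorumtool -s

# Then drop the explicit setting so pacemaker falls back to its default:
crm_attribute --type crm_config --name no-quorum-policy --delete
```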

>         last-lrm-refresh=1458576579
> 
> 
> Best,
> Stan




