[ClusterLabs] stonith - no route to host
Oscar Salvador
osalvador.vilardaga at gmail.com
Tue Jun 16 08:18:55 UTC 2015
2015-06-16 5:59 GMT+02:00 Andrew Beekhof <andrew at beekhof.net>:
>
> > On 16 Jun 2015, at 12:00 am, Oscar Salvador <osalvador.vilardaga at gmail.com> wrote:
> >
> > Hi,
> >
> > I've configured fencing with libvirt, but I'm having a problem with
> > stonith, due to the error "no route to host"
>
> That message is a bit wonky.
> What it really means is that there were no devices that advertise the
> ability to fence that node.
>
> In this case, pacemaker wants to fence “server” but hostlist is set to
> server.fqdn
> Drop the .fqdn and it should work
>
Getting rid of the .fqdn was not an option, sorry, but I managed to fix it
another way with the help of digimer.
I used fence_virsh, from the fence-agents package.
At first I configured it this way:
primitive fence_server01 stonith:fence_virsh \
    params ipaddr=virtnode01 port=server01.fqdn action=reboot login=root passwd=passwd delay=15 \
    op monitor interval=60s
primitive fence_server02 stonith:fence_virsh \
    params ipaddr=virtnode02 port=server02.fqdn action=reboot login=root passwd=passwd delay=15 \
    op monitor interval=60s
But when I tried to fence a node, I received these errors:
Jun 16 09:37:59 [1298] server01 pengine: warning: pe_fence_node: Node server02 will be fenced because p_fence_server01 is thought to be active there
Jun 16 09:37:59 [1299] server01 crmd: notice: te_fence_node: Executing reboot fencing operation (12) on server02 (timeout=60000)
Jun 16 09:37:59 [1295] server01 stonithd: notice: handle_request: Client crmd.1299.d339ea94 wants to fence (reboot) 'server02' with device '(any)'
Jun 16 09:37:59 [1295] server01 stonithd: notice: initiate_remote_stonith_op: Initiating remote operation reboot for server02: 19fdb8e0-2611-45a7-b44d-b58fa0e99cab (0)
Jun 16 09:37:59 [1297] server01 attrd: info: attrd_cib_callback: Update 12 for probe_complete: OK (0)
Jun 16 09:37:59 [1297] server01 attrd: info: attrd_cib_callback: Update 12 for probe_complete[server01]=true: OK (0)
Jun 16 09:37:59 [1295] server01 stonithd: notice: can_fence_host_with_device: p_fence_server02 can not fence (reboot) server02: dynamic-list
Jun 16 09:37:59 [1295] server01 stonithd: info: process_remote_stonith_query: All queries have arrived, continuing (1, 1, 1, 19fdb8e0-2611-45a7-b44d-b58fa0e99cab)
Jun 16 09:37:59 [1295] server01 stonithd: notice: stonith_choose_peer: Couldn't find anyone to fence server02 with <any>
Jun 16 09:37:59 [1295] server01 stonithd: info: call_remote_stonith: Total remote op timeout set to 60 for fencing of node server02 for crmd.1299.19fdb8e0
Jun 16 09:37:59 [1295] server01 stonithd: info: call_remote_stonith: None of the 1 peers have devices capable of terminating server02 for crmd.1299 (0)
Jun 16 09:37:59 [1295] server01 stonithd: warning: get_xpath_object: No match for //@st_delegate in /st-reply
Jun 16 09:37:59 [1295] server01 stonithd: error: remote_op_done: Operation reboot of server02 by server01 for crmd.1299@server01.19fdb8e0: No such device
Jun 16 09:37:59 [1299] server01 crmd: notice: tengine_stonith_callback: Stonith operation 3/12:1:0:a989fb7b-1af1-4bac-992b-eef416e25775: No such device (-19)
Jun 16 09:37:59 [1299] server01 crmd: notice: tengine_stonith_callback: Stonith operation 3 for server02 failed (No such device): aborting transition.
Jun 16 09:37:59 [1299] server01 crmd: notice: abort_transition_graph: Transition aborted: Stonith failed (source=tengine_stonith_callback:697, 0)
Jun 16 09:37:59 [1299] server01 crmd: notice: tengine_stonith_notify: Peer server02 was not terminated (reboot) by server01 for server01: No such device (ref=19fdb8e0-2611-45a7-b44d-b58fa0e99cab) by client crmd.1299
So I had to add the pcmk_host_list parameter, like this:
primitive fence_server01 stonith:fence_virsh \
    params ipaddr=virtnode01 port=server01.fqdn action=reboot login=root passwd=passwd delay=15 pcmk_host_list=server01 \
    op monitor interval=60s
primitive fence_server02 stonith:fence_virsh \
    params ipaddr=virtnode02 port=server02.fqdn action=reboot login=root passwd=passwd delay=15 pcmk_host_list=server02 \
    op monitor interval=60s
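Going by Andrew's explanation and the "dynamic-list" note in the log, the check that fails seems to be a plain name match between the node to fence and the hosts a device advertises. Here is a simplified Python model of that idea. It is only a sketch, not Pacemaker's code: the function name, the variable names, and the assumption that the device reports the libvirt domain names (server02.fqdn) when no pcmk_host_list is set are all mine.

```python
# Simplified model only; all names here are hypothetical, not Pacemaker's code.

def can_fence(target, advertised_hosts):
    """Return True if a device advertising these hosts may fence the target."""
    # Assumed rule: the cluster node name must match an advertised host exactly.
    return target in advertised_hosts


# Without pcmk_host_list: assume the device reports the libvirt domain names.
dynamic_list = ["server02.fqdn"]
# With pcmk_host_list=server02: the host list comes from the configuration.
static_list = ["server02"]

# The cluster asks to fence the node by its cluster name, "server02".
print(can_fence("server02", dynamic_list))
print(can_fence("server02", static_list))
```

If this model is roughly right, the first configuration fails because "server02" is never found in the list the device reports, and pcmk_host_list simply supplies a list in which it is found.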
Could you explain why this is needed? I hope this doesn't sound rude; I just
don't understand why.
Thank you very much
Oscar Salvador