[Pacemaker] wrong device in stonith_admin -l

Andrew Beekhof andrew at beekhof.net
Tue Dec 11 21:34:41 EST 2012


On Wed, Dec 12, 2012 at 11:51 AM,  <laurent+pacemaker at u-picardie.fr> wrote:
>
> Hi,
>
> I've just observed something weird.
> A node is running a stonith resource for which gethosts gives an empty
> node list, yet the output of stonith_admin -l still includes it in the
> device list!
>
> Result of "stonith_admin -l elasticsearch-05" run from
> elasticsearch-06:
>  stonith-xen-peatbull
>  stonith-xen-eddu
> 2 devices found
>
> stonith-xen-peatbull is a correct fencing device
> stonith-xen-eddu is a fencing device with an empty hostlist
>
> Running "my-xen0 gethosts" by hand with the stonith-xen-eddu params
> returns no hosts, and it exits with 0 (is it correct to return 0 with
> an empty host list?)
>
> Logs:
> Dec 12 01:09:10 elasticsearch-06 stonith-ng[18181]:   notice: stonith_device_register: Added 'stonith-cluster-xen' to the device list (6 active devices)
> Dec 12 01:09:10 elasticsearch-06 attrd[18183]:   notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
> Dec 12 01:09:10 elasticsearch-06 attrd[18183]:   notice: attrd_perform_update: Sent update 5: probe_complete=true
> Dec 12 01:09:11 elasticsearch-06 stonith-ng[18181]:   notice: stonith_device_register: Added 'stonith-xen-eddu' to the device list (6 active devices)
> Dec 12 01:09:11 elasticsearch-06 stonith-ng[18181]:   notice: stonith_device_register: Added 'stonith-xen-peatbull' to the device list (6 active devices)
> Dec 12 01:09:12 elasticsearch-06 stonith: [18434]: info: external/my-xen0-ha device OK.
> Dec 12 01:09:12 elasticsearch-06 crmd[18185]:   notice: process_lrm_event: LRM operation stonith-cluster-xen_start_0 (call=61,rc=0, cib-update=27, confirmed=true) ok
> Dec 12 01:09:14 elasticsearch-06 stonith: [18465]: info: external_run_cmd: '/usr/lib/stonith/plugins/external/my-xen0 status' output: elasticsearch-05
> Dec 12 01:09:14 elasticsearch-06 stonith: [18465]: info: external_run_cmd: '/usr/lib/stonith/plugins/external/my-xen0 status' output: elasticsearch-06
> Dec 12 01:09:15 elasticsearch-06 stonith: [18465]: info: external/my-xen0 device OK.
> Dec 12 01:09:15 elasticsearch-06 crmd[18185]:   notice: process_lrm_event: LRM operation stonith-xen-peatbull_start_0 (call=68, rc=0, cib-update=28, confirmed=true) ok
> Dec 12 01:09:15 elasticsearch-06 stonith: [18458]: info: external/my-xen0 device OK.
> Dec 12 01:09:15 elasticsearch-06 crmd[18185]:   notice: process_lrm_event: LRM operation stonith-xen-eddu_start_0 (call=66, rc=0, cib-update=29, confirmed=true) ok
> Dec 12 01:12:34 elasticsearch-06 stonith-ng[18181]:   notice: dynamic_list_search_cb: Disabling port list queries for stonith-xen-kornog (1): (null)
> Dec 12 01:12:34 elasticsearch-06 stonith-ng[18181]:   notice: dynamic_list_search_cb: Disabling port list queries for stonith-xen-nikka (1): (null)
> Dec 12 01:12:34 elasticsearch-06 stonith-ng[18181]:   notice: dynamic_list_search_cb: Disabling port list queries for stonith-xen-yoichi (1): (null)
> Dec 12 01:12:34 elasticsearch-06 stonith: [19301]: CRIT: external_hostlist: 'my-xen0 gethosts' returned an empty hostlist
> Dec 12 01:12:34 elasticsearch-06 stonith: [19301]: ERROR: Could not list hosts for external/my-xen0.
> Dec 12 01:12:37 elasticsearch-06 stonith: [19332]: CRIT: external_hostlist: 'my-xen0 gethosts' returned an empty hostlist
> Dec 12 01:12:37 elasticsearch-06 stonith: [19332]: ERROR: Could not list hosts for external/my-xen0.
> Dec 12 01:12:37 elasticsearch-06 stonith-ng[18181]:   notice: dynamic_list_search_cb: Disabling port list queries for stonith-xen-eddu (1): failed:  255
>
> David, I mentioned a node being wrongly fenced in the "stonith-timeout
> duration 0 is too low" bug. Could it be related?

Doubtful. What does your config look like?
IIRC, these agents need to be told which machines they can fence.
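
To make the expectation in the report concrete, here is a minimal sketch of matching a fencing target against per-device host lists. It is not stonith-ng's actual lookup logic. The helper name devices_for and the host list shown for stonith-xen-peatbull are assumptions for illustration; only the device names and the empty list for stonith-xen-eddu come from the report above.

```python
def devices_for(target, hostlists):
    """Return the devices whose host list names `target`.

    `hostlists` maps a device id to the hosts its agent reported.
    An empty list means the device claims no hosts, so it never matches.
    """
    return sorted(dev for dev, hosts in hostlists.items() if target in hosts)


# Hypothetical data modelled on the report; peatbull's hosts are assumed.
hostlists = {
    "stonith-xen-peatbull": ["elasticsearch-05", "elasticsearch-06"],
    "stonith-xen-eddu": [],  # gethosts returned nothing
}

print(devices_for("elasticsearch-05", hostlists))
```

Under this model a device with an empty host list would never be offered for any target, which is the behaviour the original poster expected from stonith_admin -l.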



