[ClusterLabs] fence_ipmilan power_timeout ValueError
Ron Kerry
ron.kerry at hpe.com
Thu Jul 13 09:11:54 EDT 2017
I have a customer who recently tried to create fence_ipmilan resources
with the power_timeout parameter set to 60s. His reasoning was that the
metadata declares the parameter as a string value, so he expected the
's' modifier on the 60 value to be interpreted correctly. It is not.
<parameter name="power_timeout" unique="0" required="0">
<getopt mixed="--power-timeout=[seconds]" />
<content type="string" default="20" />
<shortdesc lang="en">Test X seconds for status change after
ON/OFF</shortdesc>
</parameter>
Here is an example of his resource definition.
primitive smc0002_fencing stonith:fence_ipmilan \
    params ipaddr=X.X.X.X login=admin passwd=pppppp lanplus=true power_timeout=60s pcmk_host_check=static-list pcmk_host_list=smc0002 \
    op stop on-fail=ignore interval=0 timeout=20s \
    op monitor interval=60s timeout=20s \
    meta target-role=Started
STONITH does not work, and the following errors are logged.
-----
Jul 13 00:34:20 smc0001 stonith-ng[3921]: notice: Client
stonith_admin.30337.1d0dee53 wants to fence (reboot) 'smc0002' with
device '(any)'
Jul 13 00:34:20 smc0001 stonith-ng[3921]: notice: Requesting peer
fencing (reboot) of smc0002
Jul 13 00:34:20 smc0001 stonith-ng[3921]: notice: smc0002_fencing can
fence (reboot) smc0002: static-list
Jul 13 00:34:20 smc0001 stonith-ng[3921]: notice: smc0002_fencing can
fence (reboot) smc0002: static-list
Jul 13 00:34:20 smc0001 python: detected unhandled Python exception in
'/usr/sbin/fence_ipmilan'
Jul 13 00:34:20 smc0001 python: can't communicate with ABRT daemon, is
it running? [Errno 2] No such file or directory
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ Traceback (most recent call last): ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ File "/usr/sbin/fence_ipmilan", line 204, in <module> ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ main() ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ File "/usr/sbin/fence_ipmilan", line 200, in main ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ result = fence_action(None, options, set_power_status,
get_power_status, None, reboot_fn) ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ File "/usr/share/fence/fencing.py", line 973, in fence_action ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ status = get_multi_power_fn(tn, options, get_power_fn) ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ File "/usr/share/fence/fencing.py", line 880, in
get_multi_power_fn ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ plug_status = get_power_fn(tn, options) ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ File "/usr/sbin/fence_ipmilan", line 17, in get_power_status ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ output = run_command(options, create_command(options,
"status")) ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ File "/usr/share/fence/fencing.py", line 1192, in run_command ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ timeout = float(timeout) ]
Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
stderr: [ ValueError: invalid literal for float(): 60s ]
Jul 13 00:34:21 smc0001 python: detected unhandled Python exception in
'/usr/sbin/fence_ipmilan'
Jul 13 00:34:22 smc0001 python: can't communicate with ABRT daemon, is
it running? [Errno 2] No such file or directory
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ Traceback (most recent call last): ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ File "/usr/sbin/fence_ipmilan", line 204, in <module> ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ main() ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ File "/usr/sbin/fence_ipmilan", line 200, in main ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ result = fence_action(None, options, set_power_status,
get_power_status, None, reboot_fn) ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ File "/usr/share/fence/fencing.py", line 973, in fence_action ]
Command failed: No route to host
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ status = get_multi_power_fn(tn, options, get_power_fn) ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ File "/usr/share/fence/fencing.py", line 880, in
get_multi_power_fn ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ plug_status = get_power_fn(tn, options) ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ File "/usr/sbin/fence_ipmilan", line 17, in get_power_status ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ output = run_command(options, create_command(options,
"status")) ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ File "/usr/share/fence/fencing.py", line 1192, in run_command ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
stderr: [ timeout = float(timeout) ]
[root at smc0001 lib64]# Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning:
fence_ipmilan[30343] stderr: [ ValueError: invalid literal for float():
60s ]
Jul 13 00:34:22 smc0001 stonith-ng[3921]: error: Operation 'reboot'
[30343] (call 2 from stonith_admin.30337) for host 'smc0002' with device
'smc0002_fencing' returned: -201 (Generic Pacemaker error)
Jul 13 00:34:22 smc0001 stonith-ng[3921]: notice: Couldn't find anyone
to fence (reboot) smc0002 with any device
Jul 13 00:34:22 smc0001 stonith-ng[3921]: error: Operation reboot of
smc0002 by <no-one> for stonith_admin.30337 at smc0001.a7b73c3b: No route
to host
Jul 13 00:34:22 smc0001 crmd[3925]: notice: Peer smc0002 was not
terminated (reboot) by <anyone> for smc0001: No route to host
(ref=a7b73c3b-b67f-4a85-815e-df877055eb37) by client stonith_admin.30337
----
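The failure can be reproduced outside the cluster. The following is only a minimal sketch: to_seconds is a hypothetical stand-in for the float(timeout) line that the traceback shows in run_command, not the actual fencing.py code. The exact wording of the exception message depends on the Python version.

```python
def to_seconds(raw):
    # Hypothetical stand-in for the conversion shown in the traceback:
    # the option string is handed to float() without any preprocessing.
    return float(raw)


for raw in ("60", "60s"):
    try:
        print(raw, "->", to_seconds(raw))
    except ValueError as err:
        # "60s" ends up here because float() does not understand unit suffixes.
        print(raw, "-> ValueError:", err)
```

A bare "60" converts cleanly, while "60s" raises ValueError, which matches the last line of the traceback.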
So clearly, the code expects power_timeout to be a bare numeric value
(the traceback shows it being passed to float()). I suspect this is
also true for the other internal timeout values: timeout,
login_timeout, and shell_timeout.
Is this a bug? Should the metadata definitions define these parameters
as integers or floats instead of strings? Or should the code correctly
interpret the 's' (or other modifiers)? Or should the metadata
documentation clearly state that no time modifiers like 's' are allowed
in order to facilitate specifying fractional numbers of seconds? Or is
this just the way it is (if so, please give me some logical reasoning
that I can pass along to the customer)?
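To make the "interpret the modifier" option concrete, here is a rough sketch of what a suffix-tolerant conversion could look like. Everything in it is an assumption for illustration: parse_timeout is a hypothetical helper that does not exist in fencing.py, and the set of accepted unit suffixes is my own choice rather than anything the fence agents define.

```python
import re

# Hypothetical unit table (an assumption, not taken from fencing.py):
# maps a suffix to the number of seconds one unit represents.
_UNIT_SECONDS = {"": 1.0, "s": 1.0, "ms": 0.001, "m": 60.0, "h": 3600.0}

# A non-negative decimal number followed by an optional alphabetic suffix.
_TIMEOUT_RE = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*([a-z]*)\s*$")


def parse_timeout(value):
    """Return the timeout in seconds as a float, or raise ValueError."""
    match = _TIMEOUT_RE.match(str(value).lower())
    if match is None or match.group(2) not in _UNIT_SECONDS:
        raise ValueError("invalid timeout value: %r" % (value,))
    return float(match.group(1)) * _UNIT_SECONDS[match.group(2)]


print(parse_timeout("60s"))
print(parse_timeout("20"))
print(parse_timeout("2m"))
```

With something along these lines, both power_timeout=60 and power_timeout=60s would yield 60 seconds, fractional values such as 1.5 would still be accepted, and an unknown suffix would fail with a clear message instead of an unhandled traceback.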
--
Ron Kerry
Global Product Support
ron.kerry at hpe.com
Hewlett Packard Enterprise