[ClusterLabs] fence_ipmilan power_timeout ValueError

Marek Grac mgrac at redhat.com
Fri Jul 14 06:05:37 EDT 2017


Hi,

Fence agents did not understand the 'XXs' notation until recently. Currently
it is supported only in unreleased upstream code (the master branch of the git
repo), and it is planned for the next releases of both upstream and RHEL. As
it requires some changes in pcs, it is hard to predict when that will happen.
Until then, give the value as a plain number of seconds (power_timeout=60).

The bug is tracked at: https://bugzilla.redhat.com/show_bug.cgi?id=1377928
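
For illustration only, here is a minimal sketch of why the released agent
fails and how a value could be normalised before it reaches the agent. The
traceback quoted below ends in float(timeout), which rejects '60s'. The
helper parse_seconds is hypothetical: it is not part of fencing.py and it is
not the upstream fix, just an assumption about what such a conversion could
look like.

```python
import re


def parse_seconds(value):
    """Hypothetical helper (not in fencing.py): accept 60, '60', '60.5'
    or '60s' and return the number of seconds as a float."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*s?\s*", str(value))
    if match is None:
        raise ValueError("unsupported timeout value: %r" % (value,))
    return float(match.group(1))


# What the traceback shows the released agent doing: a bare float()
# conversion, which raises ValueError for a value with an 's' suffix.
try:
    float("60s")
    plain_float_ok = True
except ValueError:
    plain_float_ok = False

# The plain numeric form works with float() directly.
plain_number = float("60")

# The hypothetical helper accepts both forms.
with_suffix = parse_seconds("60s")
without_suffix = parse_seconds("60")
```

The sketch only handles an optional trailing 's'. Other units such as 'ms'
or 'm' are deliberately rejected here, because what the upstream change
accepts is not described in this thread.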

m,

On Thu, Jul 13, 2017 at 3:11 PM, Ron Kerry <ron.kerry at hpe.com> wrote:

> I have a customer who recently tried to create fence_ipmilan resources
> with the power_timeout parameter set to 60s. His reasoning was that the
> parameter is declared as a string value, so he expected the 's' modifier on
> the 60 value to be interpreted correctly. It is not.
>
>     <parameter name="power_timeout" unique="0" required="0">
>         <getopt mixed="--power-timeout=[seconds]" />
>         <content type="string" default="20"  />
>         <shortdesc lang="en">Test X seconds for status change after
> ON/OFF</shortdesc>
>     </parameter>
>
> Here is an example of his resource definition.
>
> primitive smc0002_fencing stonith:fence_ipmilan \
>         params ipaddr=X.X.X.X login=admin passwd=pppppp lanplus=true \
>         power_timeout=60s pcmk_host_check=static-list pcmk_host_list=smc0002 \
>         op stop on-fail=ignore interval=0 timeout=20s \
>         op monitor interval=60s timeout=20s \
>         meta target-role=Started
>
> STONITH does not work and gets the following errors.
>
> -----
> Jul 13 00:34:20 smc0001 stonith-ng[3921]:  notice: Client
> stonith_admin.30337.1d0dee53 wants to fence (reboot) 'smc0002' with device
> '(any)'
> Jul 13 00:34:20 smc0001 stonith-ng[3921]:  notice: Requesting peer fencing
> (reboot) of smc0002
> Jul 13 00:34:20 smc0001 stonith-ng[3921]:  notice: smc0002_fencing can
> fence (reboot) smc0002: static-list
> Jul 13 00:34:20 smc0001 stonith-ng[3921]:  notice: smc0002_fencing can
> fence (reboot) smc0002: static-list
> Jul 13 00:34:20 smc0001 python: detected unhandled Python exception in
> '/usr/sbin/fence_ipmilan'
> Jul 13 00:34:20 smc0001 python: can't communicate with ABRT daemon, is it
> running? [Errno 2] No such file or directory
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [ Traceback (most recent call last): ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [   File "/usr/sbin/fence_ipmilan", line 204, in <module> ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [     main() ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [   File "/usr/sbin/fence_ipmilan", line 200, in main ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [     result = fence_action(None, options, set_power_status,
> get_power_status, None, reboot_fn) ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [   File "/usr/share/fence/fencing.py", line 973, in fence_action ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [     status = get_multi_power_fn(tn, options, get_power_fn) ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [   File "/usr/share/fence/fencing.py", line 880, in
> get_multi_power_fn ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [     plug_status = get_power_fn(tn, options) ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [   File "/usr/sbin/fence_ipmilan", line 17, in get_power_status ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [     output = run_command(options, create_command(options,
> "status")) ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [   File "/usr/share/fence/fencing.py", line 1192, in run_command ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [     timeout = float(timeout) ]
> Jul 13 00:34:20 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30338]
> stderr: [ ValueError: invalid literal for float(): 60s ]
> Jul 13 00:34:21 smc0001 python: detected unhandled Python exception in
> '/usr/sbin/fence_ipmilan'
> Jul 13 00:34:22 smc0001 python: can't communicate with ABRT daemon, is it
> running? [Errno 2] No such file or directory
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [ Traceback (most recent call last): ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [   File "/usr/sbin/fence_ipmilan", line 204, in <module> ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [     main() ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [   File "/usr/sbin/fence_ipmilan", line 200, in main ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [     result = fence_action(None, options, set_power_status,
> get_power_status, None, reboot_fn) ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [   File "/usr/share/fence/fencing.py", line 973, in fence_action ]
> Command failed: No route to host
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [     status = get_multi_power_fn(tn, options, get_power_fn) ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [   File "/usr/share/fence/fencing.py", line 880, in
> get_multi_power_fn ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [     plug_status = get_power_fn(tn, options) ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [   File "/usr/sbin/fence_ipmilan", line 17, in get_power_status ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [     output = run_command(options, create_command(options,
> "status")) ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [   File "/usr/share/fence/fencing.py", line 1192, in run_command ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [     timeout = float(timeout) ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]: warning: fence_ipmilan[30343]
> stderr: [ ValueError: invalid literal for float(): 60s ]
> Jul 13 00:34:22 smc0001 stonith-ng[3921]:   error: Operation 'reboot'
> [30343] (call 2 from stonith_admin.30337) for host 'smc0002' with device
> 'smc0002_fencing' returned: -201 (Generic Pacemaker error)
> Jul 13 00:34:22 smc0001 stonith-ng[3921]:  notice: Couldn't find anyone to
> fence (reboot) smc0002 with any device
> Jul 13 00:34:22 smc0001 stonith-ng[3921]:   error: Operation reboot of
> smc0002 by <no-one> for stonith_admin.30337 at smc0001.a7b73c3b: No route to
> host
> Jul 13 00:34:22 smc0001 crmd[3925]:  notice: Peer smc0002 was not
> terminated (reboot) by <anyone> for smc0001: No route to host
> (ref=a7b73c3b-b67f-4a85-815e-df877055eb37) by client stonith_admin.30337
> ----
>
> So clearly, the code expects power_timeout to be a plain numeric value
> (it is passed straight to float()). I suspect this is also true for the
> other internal timeout values: timeout, login_timeout, shell_timeout.
>
> Is this a bug? Should the metadata definitions define these parameters as
> integers or floats instead of strings? Or should the code correctly
> interpret the 's' (or other modifiers)? Or should the metadata
> documentation clearly state that no time modifiers like 's' are allowed in
> order to facilitate specifying fractional numbers of seconds? Or is this
> just the way it is (if so, please give me some logical reasoning that I can
> pass along to the customer)?
>
> --
>
> Ron Kerry
> Global Product Support
>
> ron.kerry at hpe.com
> Hewlett Packard Enterprise