[ClusterLabs Developers] strange migration-threshold overflow, and fail-count update aborting its own recovery transition

Jan Pokorný jpokorny at redhat.com
Fri Apr 5 16:43:32 UTC 2019


On 05/04/19 17:19 +0200, Lars Ellenberg wrote:
> On Fri, Apr 05, 2019 at 09:56:51AM -0500, Ken Gaillot wrote:
>> On Fri, 2019-04-05 at 09:44 -0500, Ken Gaillot wrote:
>>> On Fri, 2019-04-05 at 15:50 +0200, Lars Ellenberg wrote:
>>>> But in this case, someone tried to be smart
>>>> and set a migration-threshold of "very large",
>>>> in this case the string in xml was: 999999999999, 
>>>> and that probably is "parsed" into some negative value,
>>> 
>>> Anything above "INFINITY" (actually 1,000,000) should be mapped to
>>> INFINITY. If that's not what happens, there's a bug. Running
>>> crm_simulate in verbose mode should be helpful.
> 
> I think I found it already.
> 
> char2score() does crm_parse_int(),
> and reasonably assumes that the result is the parsed int.
> Which it is not, if the result is -1, and errno is set to EINVAL or
> ERANGE ;-)
> 
> 	 char2score -> crm_parse_int 
> 		 "999999999999"  -> result of strtoll is > INT_MAX,
> 		 result -1, errno ERANGE
> 	 migration_threshold = -1;
> 
> Not sure what to do there, though.
> Yet another helper,
> mapping ERANGE to appropriate MIN/MAX for the conversion?
> 
> But any "sane" configuration would not even trigger that.

Exactly, but the configuration data model is sinfully underspecified,
although it's not the only problem there.
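
As for the helper idea quoted above, here is a minimal sketch of what
mapping ERANGE to the appropriate MIN/MAX could look like.  Everything
prefixed with sketch_ or SKETCH_ is hypothetical and not existing
Pacemaker API.  The cap of 1,000,000 is taken from Ken's remark, and
returning 0 for unparsable input is purely an assumption of mine.
Symbolic values such as "INFINITY" are deliberately left out.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical cap, following the "INFINITY is actually 1,000,000" remark. */
#define SKETCH_SCORE_INFINITY 1000000

/*
 * Hypothetical helper: parse a decimal integer, clamping to the int range
 * instead of reporting -1 on overflow.
 * Returns 0 on success (errno is ERANGE if the value was clamped),
 * or -1 with errno set to EINVAL if no digits could be parsed.
 * Trailing non-numeric characters are ignored in this sketch.
 */
static int
sketch_parse_int_clamped(const char *text, int *result)
{
    char *end = NULL;
    long long value;

    if (text == NULL || result == NULL) {
        errno = EINVAL;
        return -1;
    }

    errno = 0;
    /* On overflow strtoll() yields LLONG_MAX/LLONG_MIN and sets ERANGE. */
    value = strtoll(text, &end, 10);
    if (end == text) {
        errno = EINVAL;
        return -1;
    }

    if (value > INT_MAX) {
        *result = INT_MAX;
        errno = ERANGE;
    } else if (value < INT_MIN) {
        *result = INT_MIN;
        errno = ERANGE;
    } else {
        *result = (int) value;
    }
    return 0;
}

/*
 * Hypothetical score conversion built on the helper above: anything beyond
 * +/- SKETCH_SCORE_INFINITY is capped rather than turned into -1.
 * Assumption: unparsable input maps to 0.
 */
static int
sketch_score_from_string(const char *text)
{
    int value = 0;

    if (sketch_parse_int_clamped(text, &value) != 0) {
        return 0;
    }
    if (value > SKETCH_SCORE_INFINITY) {
        return SKETCH_SCORE_INFINITY;
    }
    if (value < -SKETCH_SCORE_INFINITY) {
        return -SKETCH_SCORE_INFINITY;
    }
    return value;
}
```

With something along these lines, the string "999999999999" from the
report would end up as the capped maximum instead of -1.  It still
would not tell the user that the value was out of range, which is the
part that better belongs to validation.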

> Where and how would we point out the "in-sane-ness" to the user,
> though?

I think the correct answer is version 4 of the CIB schema with
proper data-typing.  Version 3 was just an upgrade trigger + gating.

>>>> which means the fail-count=1 now results in "forcing away ...",
>>>> different resource placements,
>>>> and the file system placement elsewhere now results in much more
>>>> actions, demoting/role changes/movement of other dependent
>>>> resources
>>>> ...
>>>> 
>>>> 
>>>> So I think we have two issues here:
>>>> 
>>>> [...]
>>>> 
>>>> b) migration-threshold (and possibly other scores) should be
>>>> properly parsed/converted/capped/scaled/rejected
>>> 
>>> That should already be happening

See above: schema-based enforcement is currently missing, but it should
be in place so that such problems are caught as early as possible.

-- 
Nazdar,
Jan (Poki)