[ClusterLabs] [pacemaker] Discretion with glib v2.59.0+ recommended

Jan Pokorný jpokorny at redhat.com
Mon Feb 11 16:48:59 EST 2019


On 20/01/19 12:44 +0100, Jan Pokorný wrote:
> On 18/01/19 20:32 +0100, Jan Pokorný wrote:
>> It was discovered that this release of the glib project slightly
>> changed some parameters of how values are distributed within hash
>> table structures, undermining pacemaker's hard (alas unfeasible)
>> attempt to turn this data type into a fully predictable entity.
>> 
>> Current impact is unknown beyond some internal regression tests
>> failing due to this, so that, e.g., in the environment variables
>> passed in the notification messages, the order of the active nodes
>> (being a space-separated list) may appear shuffled in comparison with
>> the long-standing behaviour (perhaps giving a false impression of
>> determinism) witnessed with older versions of glib in the game.
> 
> Our immediate response is to, at the very least, make the
> cts-scheduler regression suite (the only localhost one that was
> rendered broken, with 52 tests out of 733 failing) skip those tests
> that rely on the exact order of hash-table-driven items, so it won't
> fail as a whole:
> 
> https://github.com/ClusterLabs/pacemaker/pull/1677/commits/15ace890ef0b987db035ee2d71994e37f7eaff96
> [above edit: updated with the newer version of the patch]

Shout-out to Ken for fixing the immediate fallout (deterministic
output breakages in some cts-scheduler tests, making the above
change superfluous) for the upcoming 2.0.1 release!
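
As a toy illustration of why such output could get shuffled in the
first place, here is a minimal sketch in Python.  It is neither glib's
actual hash table implementation nor the actual pacemaker fix: the
hash function, the bucket counts and the node names are all made up
for the example.  It only shows that walking the same set of keys
bucket by bucket yields a different order once an internal parameter
changes, and that imposing an explicit order removes the dependency.

```python
def toy_hash(key):
    """Deliberately simplistic hash: the sum of the code points."""
    return sum(ord(ch) for ch in key)


def iteration_order(keys, bucket_count):
    """Return the keys in the order a bucket-by-bucket walk visits them."""
    buckets = [[] for _ in range(bucket_count)]
    for key in keys:
        buckets[toy_hash(key) % bucket_count].append(key)
    return [key for bucket in buckets for key in bucket]


# Hypothetical node names; 5 and 4 are arbitrary "before" and "after"
# bucket counts standing in for a changed internal parameter.
nodes = ["node1", "node2", "node3"]
old_order = iteration_order(nodes, 5)
new_order = iteration_order(nodes, 4)

print(old_order)          # ['node1', 'node2', 'node3']
print(new_order)          # ['node2', 'node3', 'node1']
print(sorted(new_order))  # explicit sorting gives a stable order again
```

The membership is identical in both walks; only the order, which was
never promised, differs.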

>> Variations like these are expected, and you may take it as an
>> opportunity to fix incorrect assumptions about order (like in the
>> stated case).
> 
> [intentionally CC'd developers@, should have done so from the beginning]
> 
> At this point, testing with glib v2.59.0+, preferably using 2.0.1-rc3
> due to the release cycle timing, is VERY DESIRED if you are considering
> providing some volunteer capacity to the pacemaker project, especially
> if you have your own agents and scripts that rely on the exact (and
> previously likely stable) order of "set data made linear, hence
> artificially ordered", like with the
> OCF_RESKEY_CRM_meta_notify_active_uname environment variable in clone
> notifications (as was already suggested; the complete list is also
> unknown at this point, unfortunately, for lack of systematic and
> precise tracking of data items in general).

While some of these, if not all, are now ordered, I'd still regard
treating these variables as a "stable ordered list", as opposed to a
"plain unordered set", from within agents as frowned upon unless
that restriction is explicitly lifted.  For predictable
backward/forward pacemaker+glib version compatibility, if
for no other reason.

Ken, do you agree?

(If so, we shall keep that in mind for future documentation tweaks
[possibly including also OCF updates], so that no false assumptions
are made in new agent implementations going forward.)
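
To make the "plain unordered set" approach concrete, here is a minimal
sketch of how an agent written in Python might consume the variable.
Only the variable name comes from the discussion above; the helper
names are hypothetical, and nothing here is an official OCF or
pacemaker API.

```python
import os

NOTIFY_ACTIVE = "OCF_RESKEY_CRM_meta_notify_active_uname"


def active_nodes(environ=None):
    """Parse the space-separated node list into an unordered set."""
    env = os.environ if environ is None else environ
    return frozenset(env.get(NOTIFY_ACTIVE, "").split())


def is_active(node, environ=None):
    """Membership test that does not depend on the position in the list."""
    return node in active_nodes(environ)


def nodes_for_logging(environ=None):
    """Impose our own order where a stable one is needed, e.g. in logs."""
    return sorted(active_nodes(environ))
```

An agent written this way behaves the same whether the variable
arrives as "node1 node2 node3" or as "node2 node3 node1", and
whenever it needs a reproducible order it creates one itself
rather than relying on whatever order it was handed.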

>> More serious troubles stemming from this expectation-reality mismatch
>> regarding said data type cannot be ruled out at this point and are
>> subject to further investigation.  When in doubt, staying with glib up
>> to and including v2.58.2 (said tests are passing with it, though any
>> later v2.58.* may keep working "as always") is likely a good idea for
>> the time being.

I think this still partially holds, and only time will prove it fully
settled?  I mean, for anything truly reproducible (as in crm_simulate),
with pacemaker prior to 2.0.1 the glib version class (pre-2.59.0 versus
2.59.0 or later) needs to be kept uniform (reproducers need to follow
the original) to get the same results.  With pacemaker 2.0.1+,
identical results (but possibly differing from either of the former
combos) will _likely_ be obtained regardless of the particular run-time
linked glib version, but the strength of this "likely" will only be
established with future experience, I suppose (it shall universally
hold within the same glib class per the stated division, though, so
there is no change in this already positive regard).

I have only scratched the surface, so I will gladly be corrected.

-- 
Jan (Poki)