[ClusterLabs] [pacemaker] Discretion with glib v2.59.0+ recommended

Jan Pokorný jpokorny at redhat.com
Sun Jan 20 06:44:53 EST 2019


On 18/01/19 20:32 +0100, Jan Pokorný wrote:
> It was discovered that this release of the glib project slightly
> changed some parameters of how values are distributed within hash
> table structures, undermining pacemaker's hard (alas unfeasible)
> attempt to turn this data type into a fully predictable entity.
> 
> The current impact is unknown beyond some internal regression tests
> failing because of this. For example, in the environment variables
> passed in the notification messages, the order of the active nodes
> (a space-separated list) may appear shuffled in comparison with the
> long-standing behaviour (which perhaps gave a false impression of
> determinism) witnessed with older versions of glib.
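
To illustrate the effect described above, here is a toy sketch in
Python. It is not glib's actual implementation: the chained-bucket
layout, the CRC32 hash, the bucket counts and the name
iteration_order are all assumptions made purely for illustration.
It only shows that changing an internal parameter can change the
order in which a hash table yields its items, while the set of items
stays the same.

```python
import zlib


def iteration_order(keys, n_buckets):
    """Return keys in the order a toy chained hash table would yield them.

    Hypothetical model: each key lands in bucket crc32(key) % n_buckets,
    and iteration walks the buckets by ascending index.
    """
    buckets = [[] for _ in range(n_buckets)]
    for key in keys:
        index = zlib.crc32(key.encode("utf-8")) % n_buckets
        buckets[index].append(key)
    return [key for bucket in buckets for key in bucket]


nodes = ["node1", "node2", "node3", "node4"]

# Same keys, two different (made-up) internal parameters.
order_a = iteration_order(nodes, 8)
order_b = iteration_order(nodes, 11)

print("8 buckets: ", " ".join(order_a))
print("11 buckets:", " ".join(order_b))
# The two orders may differ, but the membership never does.
print("same members:", set(order_a) == set(order_b))
```

Anything that consumes such a list should therefore treat it as a
set, not as a sequence.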

Our immediate response is to, at the very least, make the
cts-scheduler regression suite (the only localhost one that was
broken, with 52 tests out of 733 failing) skip those tests
that rely on the exact order of hash-table-driven items,
so it won't fail as a whole:

https://github.com/ClusterLabs/pacemaker/pull/1677/commits/d76a2614ded697fb4adb117e5a6633008c31f60e

> Variations like these are expected, and you may take it as an
> opportunity to fix incorrect order-wise (like in the stated case)
> assumptions.

[intentionally CC'd developers@, should have done so from the beginning]

At this point, testing with glib v2.59.0+, preferably using pacemaker
2.0.1-rc3 due to the release cycle timing, is VERY MUCH DESIRED if you
are considering volunteering some capacity to the pacemaker project.
This applies especially if you have your own agents and scripts that
rely on the exact (and previously likely stable) order of "set data
made linear, hence artificially ordered", as with the
OCF_RESKEY_CRM_meta_notify_active_uname environment variable in clone
notifications (as was already suggested). Unfortunately, the complete
list of such items is also unknown at this point, for lack of
systematic and precise tracking of data items in general.
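
For agents and scripts that consume such a list, a minimal sketch of
order-insensitive handling could look as follows. It assumes only what
is stated above, namely that the variable holds a space-separated list
of node names. The helper names notify_active_nodes and
same_membership are hypothetical and not part of pacemaker or of any
resource agent library.

```python
import os

VAR_NAME = "OCF_RESKEY_CRM_meta_notify_active_uname"


def notify_active_nodes(environ=None):
    """Return the active node names in a normalized (sorted) order.

    Sorting makes the result independent of whatever order the
    cluster manager happened to emit the list in.
    """
    if environ is None:
        environ = os.environ
    raw = environ.get(VAR_NAME, "")
    return sorted(raw.split())


def same_membership(list_a, list_b):
    """Compare two space-separated node lists as sets, ignoring order."""
    return set(list_a.split()) == set(list_b.split())


if __name__ == "__main__":
    # Hypothetical values, as they might appear before and after the
    # glib change; only the order differs.
    before = {VAR_NAME: "node1 node2 node3"}
    after = {VAR_NAME: "node3 node1 node2"}
    print(notify_active_nodes(before) == notify_active_nodes(after))
    print(same_membership(before[VAR_NAME], after[VAR_NAME]))
```

Whether sorting or a set comparison fits better depends on what the
script does with the list. The point is only that no decision should
hinge on the position of a node within it.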

To do that, spinning up a test cluster with the current Fedora Rawhide[*]
(which has shipped glib v2.59 since the beginning of this year) is
perhaps the most convenient option. I've just built 2.0.1-rc3 packages
there, so they will eventually get to the distribution mirrors, or you
can grab them for your architecture at
https://koji.fedoraproject.org/koji/buildinfo?buildID=1180970
right away.

[*] it should be possible to point virt-install (or the respective
    dialog in virt-manager) to the direct location for Rawhide
    packages, see
    https://fedoraproject.org/wiki/Releases/Rawhide#Point_installer_to_Rawhide

> More serious troubles stemming from this expectation-reality mismatch
> regarding said data type cannot be ruled out at this point and are
> subject to further investigation.  When in doubt, staying with glib up
> to and including v2.58.2 (said tests pass with it, though any later
> v2.58.* may keep working "as always") is likely a good idea for the
> time being.

-- 
Nazdar,
Jan (Poki)