[Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

Thu Jun 27 00:28:19 EDT 2013

On 26/06/2013, at 10:37 PM, Lars Marowsky-Bree <lmb at suse.com> wrote:

> On 2013-06-26T21:31:14, Andrew Beekhof <andrew at beekhof.net> wrote:
> 
>>> Distributions can take care of them when they integrate them; basically
>>> they'll trickle through until the whole stack the distributions ship
>>> builds again.
>> If we let 2.0.x be anything like 1.1.x, I suspect this would be rather difficult.
> 
> Not sure. With the sore exception of 1.1.8, the integration effort was
> reasonable, even for an Enterprise distribution. Yes, for changes so
> large and intrusive, a temporary branch (or a longer release cycle)
> would probably be preferable.

I wouldn't say the 6 months between 1.1.7 and 1.1.8 was a particularly aggressive release cycle.
Generally though, it has always been hard/dangerous to backport specific fixes because of the complexity of the interactions - particularly in the Policy Engine (PE).

> 
>> The change I'm thinking of (CPG codepaths and global variables) was becoming a major support overhead and all-round headache.
>> I hadn't planned to make that change, but it was the best way to fix a bug that was holding up the release.
> 
> Yeah, that one. If it fixes a bug, it was probably unavoidable (though
> the specific commit (953bedf8f7f54f91a76672aeee5f44dc465741e9) didn't
> mention a bugzilla id).

It has always been the case that I find and fix far more bugs than people report.
I don't plan to start filing bugs for myself.

> But that trickles through all consumers here - OCFS2, DLM, sbd. Means we
> have to do more validation than a -rc should normally need - normally,
> during an rcX phase, I'd expect small, well-contained bugfixes for
> regressions only.
> 
> But perhaps this was one such exception.

Normally I would have waited until after the final release, and have done so in the past for other changes.

In this case though, I made an exception because the plan is to NOT have another 1.1.x and "it is still my intention not to have API changes in 2.0.x, so better before than after".

Granted I had completely forgotten about the plugin editions of ocfs2/dlm, but I was told you'd already deep frozen what you were planning to ship, so I don't understand the specific concern.

There is never a good point to make these changes. Even if I make them just after a release, people will just grumble when it comes time to look at the next one - just like you did above for 1.1.8.

> (Which bug did it fix, by the way? Can't immediately spot it from the
> commit code.)

Processes spinning for a few minutes while trying to send a CPG message.
First for corosync 2.x, then later for cman, then again for pacemakerd.

I balked at creating a third copy of that code when I noticed a bug in the second.
I much preferred the old cib_ais_dispatch() method signature, but making it work with corosync's API required all kinds of nastiness, which made it very brittle.
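
To illustrate why one shared send path is easier to keep correct than three copies, here is a minimal sketch of a bounded retry loop that backs off instead of spinning. It is not the code from the commit: send_with_retry(), the send_once callable and the OK/TRY_AGAIN status values are all hypothetical names made up for the example.

```python
import time

# Hypothetical status values; a real messaging layer has its own codes.
OK = "ok"
TRY_AGAIN = "try_again"


def send_with_retry(send_once, message, max_attempts=5, base_delay=0.1,
                    sleep=time.sleep):
    """Call send_once(message) until it reports OK or attempts run out.

    Returns True on success, False if every attempt reported TRY_AGAIN.
    """
    for attempt in range(max_attempts):
        if send_once(message) == OK:
            return True
        # Back off a little longer each time instead of busy-looping,
        # but do not sleep after the final failed attempt.
        if attempt + 1 < max_attempts:
            sleep(base_delay * (2 ** attempt))
    return False
```

With a single helper like this, every caller gets the same attempt limit and back-off, so a bug in the retry logic only has to be fixed once.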

>> Plus it is still my intention not to have API changes in 2.0.x, so better before than after.
> 
> I wonder how that will go ;-)

We did pretty well with 1.0 once the line was drawn (after about .5 iirc).

> I don't really mind the API changes much,
> for me it's mostly a question of timing and how big the pile is at every
> single release.

I thought you wanted longer release cycles... wouldn't that make the pile bigger?  
And is it not better to batch them up and have one point of incompatibility rather than a continuous stream of them?

> If you consider the API from a customer point of view, the things like
> build or runtime dependencies on shared libraries aren't so much of an
> issue - hopefully, the distribution provider hides that before they
> release updates. Hence my "Oh well, I don't care" stance.

Except if it affects ocfs2/dlm/sbd?

> What's more troublesome are changes to existing commands (even something
> minimal like "crm_resource -M" now generating a short location
> constraint,

I find it confusing how a contained 10-line change to a CLI tool is troublesome but you're prepared to wear the overhead of holding back API changes - which usually impact all sorts of code paths, sometimes across multiple projects.

Surely this would be the easiest of any possible change to hold back.

> which could potentially break scripts that interact with the
> CIB), or major changes to log messages (since those do break customer's
> scripts and monitoring environments).

CLI output I can usually be convinced of, but log messages are most definitely not something I will consider as a valid programming interface.

I have not and will not change them just to annoy people, but I must be allowed to reduce the level of noise and otherwise improve them, or rename the functions that produce them when appropriate (which changes the "functionname:" portion).

I have been hammered for years on the amount of logs Pacemaker produces, yet the moment I try to do something about it... sigh.

> 
>>> Important is to of course keep the major/minor numbers of the libraries
>>> updated so one doesn't get runtime problems.
>> I have been quite diligent running ./bumplibs.sh in preparation for releases for a while now.
> 
> Yes. Didn't mean to say it isn't working, just wanted to mention it.

The make target that generates the changelog prints out a reminder in bold magenta that ./bumplibs.sh needs to be run.
I have re-run it for 1.1.10 several times - it's "at your own risk" if you're taking something between 1.1.x and 1.1.y though.
Upstream can't be held responsible for that.
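
For readers unfamiliar with what bumping library versions involves, the following is a rough sketch of the standard libtool current:revision:age update rules. That ./bumplibs.sh follows exactly these rules is an assumption made for illustration; the function name and its parameters are hypothetical.

```python
def bump_version_info(current, revision, age, *,
                      source_changed, added, removed_or_changed):
    """Apply libtool's version-info update rules for one release.

    added              -- interfaces were added since the last release
    removed_or_changed -- interfaces were removed or changed incompatibly
    """
    if not source_changed:
        # Nothing changed: the version triple stays as it is.
        return (current, revision, age)
    if not (added or removed_or_changed):
        # Implementation-only change: bump revision.
        return (current, revision + 1, age)
    # Any interface change: bump current and reset revision.
    current += 1
    revision = 0
    if removed_or_changed:
        # Old consumers can no longer link against this version.
        age = 0
    else:
        # Purely additive: older interface versions remain supported.
        age += 1
    return (current, revision, age)
```

The age reset in the incompatible case is what makes an update fail to install until its dependents are rebuilt, rather than installing and breaking at runtime.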

> Because an update that fails to install until all dependencies are fixed
> is (mostly) fine, but one that installs and then breaks really annoys
> customers ;-)

Yep.