[ClusterLabs Developers] Reference to private bugzillas in commit messages

Tue Jan 9 10:37:32 EST 2018

Jan Pokorný <jpokorny at redhat.com> wrote: 
>On 09/01/18 10:35 +0000, Adam Spiers wrote: 
>>Andrei Borzenkov <arvidjaar at gmail.com> wrote: 
>>>On Tue, Jan 9, 2018 at 11:23 AM, Kristoffer Grönlund 
>>><deceiver.g at gmail.com> wrote: 
>>>>Andrei Borzenkov <arvidjaar at gmail.com> writes: 
>>>>
>>>>>I wonder what is the policy here. 
>>>>>
>>>>>commit 7b7521c95d635d8b4cf04f645a6badc1069c6b46 
>>>>>Author: liangxin1300 <XLiang at suse.com> 
>>>>>Date:   Fri Dec 29 15:27:40 2017 +0800 
>>>>>
>>>>>    fix: ui_resource: Using crm_failcount instead of 
>>>>>crm_attribute(bsc#1074127) 
>>>>>
>>>>>
>>>>>Apart from the obvious - how would contributor know what "bsc" is in the 
>>>>>first place and how to check it - attempt to access 
>>>>>https://bugzilla.suse.com/show_bug.cgi?id=1074127 gives 
>>>>>
>>>>>You are not authorized to access bug #1074127 
>>>>>
>>>>>Randomly checking other bsc# references gives the same "permissions 
>>>>>denied" result. 
>>>>
>>>>We include those bugzilla references to make it easier for ourselves to 
>>>>connect fixes to bugs in the rpm changelogs (for example). I can 
>>>>honestly say that I don't know if there is a policy or what it is in 
>>>>that case, it was "established practice" when I joined the project. 
>
>Well, in fact there is no such official policy around this, but 
>I tried to change that in past: 
>
>  https://github.com/ClusterLabs/pacemaker/pull/1119 
>
>as this no-open-access hubris (seconded by related 
>no-change-selfcontainment) disturbs me _a lot_ in the context 
>of _free_ (as in freedom) software.  Just think about it. 

I totally appreciate that sentiment, and with my pure FL/OSS hat on I 
agree that development should be as open as possible.  However ... 

>>>>I think Red Hat does the same? 
>
>The above reference gives you an answer that this camp is also not 
>guilt-free here (https://github.com/ClusterLabs/pacemaker/pull/887). 

Sure, but it's important to also see the other side: a lot of this 
development is funded by companies such as SUSE and Red Hat, which in 
turn are funded by their customers, and part of what their customers 
are paying them for is privacy.  I'm sure you can understand the 
ramifications of accidentally leaking confidential information through, 
say, an hb_report attached to a bugzilla entry which was accidentally 
public.  It's simply too risky for such a commercially-oriented 
bugzilla to make all submissions public by default. 

But there are IMHO perfectly acceptable workarounds to this which I've 
already alluded to in my previous post. 

>>>I have private account on (open)SUSE bugzilla and I'm denied access to 
>>>these bugs. 
>>
>>Some commercial products in the (open)SUSE bugzilla, presumably 
>>including SUSE Linux Enterprise High Availability, are configured such 
>>that newly submitted bugs default to being private to SUSE employees 
>>only, in order to protect potentially confidential information 
>>submitted by our customers.  My best guess is that the bug referenced 
>>above is one of these bugs which defaulted to private. 
>>
>>However, there is a solution!  Assuming there is no confidential 
>>information in a bug such as log files or other info provided by one 
>>of our customers 
>
>AFAIK, the privacy can be set on particular comment/attachment basis 
>in Bugzilla instances (ok, with the associated risk added that something 
>will leak unintentionally)... 

Correct - clearly the Powers That Be at SUSE decided that this was an 
unacceptable increase in risk. 

>>any SUSE employee can set any of these bugs as being visible 
>>externally.  And indeed this should be done as much as possible. 

There are better approaches than expecting every developer to remember 
to do this every time, e.g. creating an upstream product or component 
in a downstream bugzilla, or having a separate upstream bugzilla 
(which we already do) and making sure that upstream commits only refer 
to upstream bugzilla entries.  On top of that clean separation of 
course there can be links between upstream and downstream bugs where 
appropriate too, and that brings other advantages, such as cleanly 
distinguishing between upstream work (e.g. fix in git master) 
vs. downstream work (e.g. backport to old product releases). 

>... however, this is a moot discussion we would be better off avoiding 
>in the first place as: 
>
>1. the changes tracked in the repo would preferably be self-contained
>   as mentioned 
>
>   - on random commit access, the change should be comprehensible just
>     by the means of code + in-code comments + commit message, without
>     any reliance on external tracker or on out-of-repo PR comments

Absolutely, IMHO it's a cardinal sin for the commit message not to be 
self-contained :-)  This point is independent from the whole bugzilla 
discussion.

>     (e.g., I don't understand why the explanation did not go into
>     the commit itself in case of
>     https://github.com/ClusterLabs/pacemaker/pull/1402) -- come on
>     people, when the code base is to stand the test of time, is it
>     more likely that the context survives in the proprietary
>     free-of-charge service without massive replication, or in the
>     bits being indivisible part of the distributed repo?

+100.  This is mentioned here too: 

https://wiki.openstack.org/wiki/GitCommitMessages#Information_in_commit_messages 

    Do not assume the reviewer has access to external web services/site.

    In 6 months time when someone is on a train/plane/coach/beach/pub
    troubleshooting a problem & browsing Git history, there is no
    guarantee they will have access to the online bug tracker, or
    online blueprint documents. The great step forward with
    distributed SCM is that you no longer need to be "online" to have
    access to all information about the code repository. The commit
    message should be totally self-contained, to maintain that
    benefit.
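
A minimal sketch of how that self-containment rule could be checked
mechanically, in Python.  The function name and the heuristic (a body
line counts as explanation only if it is more than a bare tracker
reference or URL) are assumptions made for illustration; neither the
wiki page nor this thread defines such a check.

```python
import re

# Hypothetical heuristic: a line holding only a tracker reference such as
# "bsc#1074127" (optionally parenthesised) or only a URL explains nothing.
REFERENCE_ONLY = re.compile(r"^\s*(?:\(?\w+#\d+\)?|https?://\S+)\s*$")


def has_explanatory_body(message):
    """Return True if the commit message has a body line that is more
    than a bare bug reference or link."""
    lines = message.strip().splitlines()
    # Everything after the subject line, ignoring blank lines.
    body = [line for line in lines[1:] if line.strip()]
    return any(not REFERENCE_ONLY.match(line) for line in body)
```

A subject-only message, or one whose body is just a bug ID or a link,
returns False; a message with at least one sentence of explanation
returns True.  It cannot judge whether the explanation is any good,
only that one exists.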

>2. if the bug identifier is absolutely necessary for some reason,
>   ClusterLabs host the Bugzilla instance at
>   https://bugs.clusterlabs.org/
>
>   - items in other trackers could be cross-linked from there
>
>>If there *is* confidential information, but it is desired for the fix 
>>to be public (e.g. referenced within a commit message in, say, the 
>>Pacemaker repository), then I would recommend my colleagues to ensure 
>>that there are two bugs: a private one containing the confidential 
>>information, which links to a public one which contains all the 
>>information which can be shared with the upstream FL/OSS project. 
>
>Proper problem statement in the commit message accompanying the fix 
>would alleviate these sorts of redundancies, and would lead to 
>improvements on the non-code/soft-skills aspects of the contributions, 
>IMHO.

I totally agree, although I think we should aim for both practices 
(self-contained commits, *and* a hygienic upstream/downstream 
separation of bugs), not just one or the other. 

One last thought: even if we reach consensus in this thread, from 
experience, that consensus will be worth very little unless the policy 
is *documented* somewhere visible and only modifiable via peer review. 
I'd strongly suggest a git repository.  This kind of thing can even be
enforced via CI with tools such as http://danger.systems/ruby/
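
As a rough illustration of the kind of check such a CI job might run
(written in plain Python rather than as a Danger rule), the sketch
below flags references to a tracker whose bugs may be private.  The
function name and the prefix list are assumptions for illustration,
not an agreed policy; "bsc" is simply the prefix from the commit
quoted at the top of this thread.

```python
import re

# Hypothetical policy: prefixes of trackers whose bugs may be private.
# Only "bsc" comes from this thread; extend the tuple as needed.
PRIVATE_PREFIXES = ("bsc",)


def find_private_refs(message, prefixes=PRIVATE_PREFIXES):
    """Return references such as 'bsc#1074127' found in a commit message."""
    alternation = "|".join(re.escape(p) for p in prefixes)
    pattern = re.compile(r"\b(?:%s)#\d+" % alternation, re.IGNORECASE)
    return pattern.findall(message)
```

A CI job could feed it the output of `git log --format=%B` for each
new commit and fail the build when the returned list is non-empty.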

Adam