[cdwg] pending review list

Christopher J. Morrone morrone2 at llnl.gov
Tue Sep 3 21:53:16 PDT 2013


On 08/29/2013 06:10 AM, Denis Kondratenko wrote:
> Hi All,
[cut]
> What conclusions should be drawn? Numbers like
> 31, 26, 150, 180, 120, 30, 70, 120, 112
> indicate issues.
>
> Problem stages are: no update from engineer, no review.

I think we could have all agreed to that even without specific examples. 
:)  But I think I should clarify that "no update from engineer" means 
"no update from the patch submitter".

[cut]
> So I think you probably see the approach. We need overall statistics, not
> an individual check of every ticket.

I am not sure that I understand your point about needing more 
statistics.  I think the software developers all knew the major slow 
spots in the patch submission process without seeing these selected 
patch examples.

I think we are all in agreement: patch submission is sometimes a long 
process.  But it is not clear to me how additional statistics can solve, 
or really even help with, that problem.

I think what your numbers show is that a patch submitter must remain an 
active participant in the process.  If a submitter does not participate 
actively, the process can, and frequently will, take longer than is 
necessary.

> We might need some automatic notification for stages longer than 1-2
> weeks (whatever limit we define).

I think that such a notification would probably be more annoying than 
useful.  If I want to know the state of my patches, I can pull up gerrit 
at any time and learn their current status.  I also get a notification 
email any time one of my patches is reviewed, someone comments on the 
patch, someone comments in jira, etc.  In my opinion there are plenty of 
existing notifications, and I personally don't really want any 
additional notifications unless they provide new information.

Further, how could we possibly pick a number that makes sense?  Patches 
sitting in the review stage may do so for a whole host of reasons, many 
of which are perfectly reasonable.  For instance, a new feature that is 
submitted after feature freeze is fairly likely to sit unreviewed until 
master opens again for feature landings.  Patch submissions also need to 
be triaged and prioritized, and sometimes less important patches will 
need to wait until higher priority patches have been addressed.
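
For what it is worth, the mechanical part of such a tool is trivial.  A 
rough sketch in Python (the gerrit URL is a placeholder, and it assumes 
anonymous access to gerrit's REST API) might look something like this:

    import json
    import urllib.request

    GERRIT = "https://gerrit.example.org"   # placeholder, not our real server
    QUERY = "status:open+age:2w"            # open changes with no update in 2 weeks

    with urllib.request.urlopen(GERRIT + "/changes/?q=" + QUERY) as resp:
        body = resp.read().decode("utf-8")

    # Gerrit prefixes its JSON responses with ")]}'" to defeat XSSI; drop that line.
    changes = json.loads(body.split("\n", 1)[1])

    for change in changes:
        print(change["_number"], change["updated"], change["subject"])

Writing that query is the easy part.  The "age:2w" is exactly the 
arbitrary threshold I am worried about, and the list it produces still 
has to be interpreted by a person.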

A tool cannot tell the difference between a patch submitter working 
tirelessly for a week to find a better solution, and a patch submitter 
ignoring the patch for a week.

No simple time limit or accounting scheme is going to capture all of the 
reasonable delays.  And I would rather not go down the route of adding a 
great deal of complicated machinery to account for every possible 
reasonable delay.  After all, we are primarily here to develop Lustre 
software, not to spend our time developing patch management software. :)

Please don't take that to the extreme and think I am against all tool 
improvement.  I am heartily in favor of some improvements that were 
suggested in a previous thread.  I just want to spend our effort where 
it is likely to make the most impact.

The two problem stages that you point out really don't seem much 
different from what you would see in any other open source project of 
reasonable size.  Folks 
always need to prioritize the incoming work and take into account the 
stability of the project when considering landings.

Of course, there is one major difference from other open source 
projects, and that is the Lustre Development Community Tree Maintenance 
contract that OpenSFS has with Intel.  This contract means that we are 
guaranteed to have a say in the prioritizing of patch work.  If a 
developer feels that a patch is not prioritized high enough and moving 
fast enough towards landing, they may bring the patch to the attention 
of the CDWG and OpenSFS's Technical Representative on the contract 
(Chris Morrone).

This is another example of a way that the patch submitter needs to stay 
actively engaged in the process.  The CDWG and the Technical 
Representative are not likely to be much help with a patch problem if 
they are not made aware of the problem.

Please don't ask me to monitor all patches in gerrit from all 
organizations.  I simply don't have the time to do that, regardless of 
how good the tools are.  I need the individual software developers to 
make a reasonable attempt to first resolve the delays amongst themselves 
through normal conversation, be it in jira or in gerrit.  Then, if in 
their reasoned judgment a patch's priority needs to be raised so that it 
is worked on sooner, they can bring the patch to the attention of the 
CDWG and me.

> Numbers will not tell us the whole truth about reviews - they will
> indicate problem reviews and problem stages.

They will also reveal a large number of false positives.  Many patches 
sit for good reason.  How do we avoid wasting time on the false positives?

> That is what I want to discuss in the CDWG every week - which problem
> reviews and problem stages we have now.

I am certainly in favor of discussing process improvement on a regular 
basis!

But keep in mind that our goal here is not to land patches as fast as 
possible.  Our goal is to strike a reasonable balance in Lustre between 
stability, performance, and new features.  Each patch falls somewhere 
different on that continuum, and there are necessarily many judgment 
calls along the way to landing.  That will mean that some patches get 
attention within hours or days, while others will not get attention for 
weeks or months.

And that is OK.

That is just the reality of an active, vibrant open source development 
project.

> Please review and provide your feedback.
>
> Thanks,
> Denis


