Discuss auto-close stale PR's

This is a follow-up of this, and this discussion, as they have become… stale.

At the time of those discussions, the number of PR’s was lower than the current extreme number of 1.6k. The growth does not seem to stop, so something should be done to keep some sort of manageable number.

Based on the previous discussions I would like to propose to auto-close PR’s that have been stale (e.g. not a single comment or commit) for 6 months.

This is after the already automated “mark as stale” label+comment that occurs after 30 days, which could and should be a trigger for authors and reviewers to consider the PR again.
The stale message could be updated to give the author pointers how to continue: fix the requested changes or try to get a review if non is performed yet.

This proposal takes the points into consideration that were raised before:

  • Huge number of PR’s is intimidating for any contributor (see the poll results in the first discussion).
  • Closing PR’s without a reasonable amount of time (e.g. months) might be considered rude and discouraging to the author.
5 Likes

Here’s the number of open PRs over time:

(Made with Get PR info and count open PRs over time by hugovk · Pull Request #6 · hugovk/github-tools · GitHub)

And of the 1,582 open PRs: 873 were last updated more than six months ago, or 55%.

Would there be a way to visualize that number over time as well? Or some other indicator of “how stale” the average PR is?

In the end this all depends on what question we’re trying to answer. Just looking at the graph of open PRs over time does not provide much insight, maybe there are other ways to look at the data that will help us come up with a solution. For example of the PRs that do get merged, what distribution does the time from opening to merging have? How does that distribution change over time? If we knew this (and perhaps some other stats like about comments), we could make a statistical estimation of how likely any given PR is to ever be merged, and auto-close the ones that just haven’t got a chance any more.

2 Likes

Here’s the days taken to merge plotted against creation date:

Made from this output CSV from this script.

A pleasing decline. But of course, it’s impossible for old PRs to be created more recently (e.g. 3-year-old PR cannot have been created yesterday).

Here’s adding a line showing days since created, the upper bound:

1 Like

Cool, so each line represents a PR opened on that date and the height is how long it took to close? Could you show the reverse, i.e. for each date a PR is closed show how long it was open?

And what about PRs that aren’t merged yet? Are those not in the list at all? That’s a separately interesting graph – you could e.g. make a histogram showing for each week how many PRs were opened that week that aren’t merged or closed yet.

(Hm, what about PRs that were closed? Is that category large enough to learn something from it?)

While a pleasing decline, it only tells us that that younger PR’s are merged within their younger lifetime.

(Sorry, I have to split my post since new users are only allowed to post 2 links)

I took a closer look at just at the open PR’s, and my hypothesis is now two-fold:

  1. There is not enough review capacity to even look at PR’s:
    • There are 799 open RP’s that did not receive a single review (!!): Pull requests · python/cpython · GitHub
    • On top of that are 450 open PR’s that still need a core review or a review of changes: Pull requests · python/cpython · GitHub
    • Oke, that’s already 1250 PR’s that never got a (re) consideration by the Core Devs.
      But, some do have active discussions, about 420 of the awaiting review have more than 5 comments.
2 Likes
  1. There are contributors that never apply the requested changes.

What I also have noticed observing the PR’s over the past few weeks is that PR’s made by Core Devs have much shorter cycles in general. Core Devs are faster in reviewing and approving each other’s PR’s. This was shown graphically in the previous discussion: Decision needed: should we close stale PRs? and how many lapsed days are PRs considered stale? - #9 by ammaraskar
This can be explained by the fact that Core Devs are better accustomed to writing code in an acceptable way, but personally I get the feeling that “outsider”-PR’s just get less priority. I can’t back that up with numbers.

So, where does that leave us? These numbers make me reconsider if auto-closing is helping anything… It just masks the problem of not having enough reviewers.

3 Likes

Cool, so each line represents a PR opened on that date and the height is how long it took to close?

Yes

Could you show the reverse, i.e. for each date a PR is closed show how long it was open?

It’s pretty similar, the bars are shuffled a bit:

@Safihre says:

While a pleasing decline, it only tells us that that younger PR’s are merged within their younger lifetime.

Yep, not sure how useful these are.

And what about PRs that aren’t merged yet? Are those not in the list at all?

The above only shows PRs that were successfully merged.

That’s a separately interesting graph – you could e.g. make a histogram showing for each week how many PRs were opened that week that aren’t merged or closed yet.

Here’s all PRs opened per week, whatever their current status (blue) and only those still open (orange), skipping this half-week:

(via python count_pull_requests.py --opened_per_week > prs_opened_per_week.csv)

The blue is interesting in that it shows the number of PRs opened per week is fairly steady with a slight increase.

The orange is less useful: newer PRs haven’t been merged (yet), although there’s a good 5-10 being “forgotten” each week.

(Hm, what about PRs that were closed? Is that category large enough to learn something from it?)

Note: merged PRs are a subset of closed PRs:

(via python count_pull_requests.py ----closed_per_week > prs_closed_per_week.csv)

Again, fairly constant with a slight increase. But more are opened per week than closed, as shown in the first chart in this thread.

1 Like

Is closing 50% of all open PRs really going to help with reviewing? I would expect an order of magnitude (90%) would need to be closed for that. I’m not suggesting to reduce the threshold untouched day count, rather to have a quicker and easier triage process

At the time of those discussions, the number of PR’s was lower than the current extreme number of 1.6k

I think it isn’t a problem anymore.

That’s the number of issues, freshly migrated from BPO.

The number of PRs is still more-or-less the same, 1,548 showing as 1.5k:

1 Like