GitHub Issues Migration: label mapping

Thanks everyone for the feedback!

Fair enough. If the triagers are fine with the price of dealing with a long list of labels in order to have a more powerful issue search/filtering, then I guess it’s a worthy trade-off.

PIP currently has 90 labels (Labels · pypa/pip · GitHub) and it seems to mostly use one-letter prefixes that are not immediately obvious (in fact it took me several clicks to find this page that lists the labels) and also has some “long” prefixes (like type: *). Black also uses a similar one-letter convention and has 45 labels (Labels · psf/black · GitHub). Jupyter Notebook mostly uses “long” prefixes and has 48 labels (Labels · jupyter/notebook · GitHub). CPython now has 31 labels (Labels · python/cpython · GitHub – one disappeared overnight :thinking:).

All of them also have some unprefixed labels (e.g. trivial, stale, invalid). The colors seem somewhat inconsistent within the same category, except in the case of Jupyter which is fairly consistent. Even within category that use the same color, there are some differently-colored exceptions (e.g. the red type-security among the other blue type-*).

If everyone else has experience working on repos with a many labels, I would be curious to hear about it.


That said, if we agree on having more labels, we still need to decide which ones we want to keep. I will update the list above and ask again for feedback.

The good news is that the label name, description, and color can be changed at any time, so we will have plenty of time to bikeshed after the migration.

I think we need some issue management labels. Things like:

awaiting-information (OP was requested to clarify something).
pending-close (a core dev or triager suggests closing, but wants to give others a chance to object).

Maybe the bots can automatically close issues if they have one of these labels for a month.

6 Likes

FWIW, on the main Spyder repository, we get 150+ issues per month (and used to get double that or more before we implemented a bunch of automatic troubleshooting and triage steps).

Back when I was in charge of it, we introduced a status: Awaiting Followup tag for this case, since it was very common that users would not give us enough information to propose a solution, reproduce a bug, close as a duplicate or otherwise act on it. If they didn’t answer after 8 days, the issue would be closed.

It has since become very successful and kept our issue backlog more manageable, and I believe we’ve automated that nowadays (and there are many bots and GH Actions to do so). I’ve seen it used on a number of other projects as well, in combination with bots/GH Actions/etc that take care of the followup.

A big issue for us with tags in general was core devs not applying and using them consistently; having a dedicated and relatively experienced person in charge of triaging issues and PRs (me; now its handled by everyone on a rotating basis, one core dev per day of the week) was pretty critical for that. In this case, you have a whole team of them, so that should hopefully be somewhat less of a concern.

3 Likes

ISTM that both indicate an issue that should be closed unless more information that prove its validity are provided. Something like this is definitely useful to have, but perhaps we don’t need to distinguish between the two.

On bpo we currently have test needed and pending. There is some overlap between them and with the labels you suggest. I think pending could be mapped, but probably not test needed.

FTR Black and PyPI have S: awaiting response and S: needs repro, Jupyter has status:Needs info.

I think it would make sense to automatically close after a few days issues that have been marked with the aforementioned labels by a human, since they are not useful in their current state. I wouldn’t close old/stale issues just because they are old.

2 Likes

We already have a GH Action to label old PRs as stale after 30 days, and remove stale after activity.

After another 14 days, PRs labelled stale + CLA not signed are auto-closed.

(Those also with stale + CLA signed are not auto-closed.)

I can see value auto-closing issues that have stale and human-applied label(s), and we could use the same GH Action.

2 Likes

In JupyterLab we use Labeler · Actions · GitHub Marketplace · GitHub to apply labels by path. There is a limit of 100 labels that can be applied.

We also have a probot that automatically applies a “status:Needs Triage” label to new issues.

1 Like

compile error is fundamentally different from the other two, and I suspect if we don’t have a crash tag then people will be tempted to put [crash] in the title.

I’d just keep them separate. The definitions you have in the first post are (almost) clear enough to handle triaging (I’d clarify that compile error is for compiling CPython, rather than “Python”).

2 Likes

This sounds quite useful, especially because we have a number of labels that are directly correlated to specific files/paths. Thanks for sharing!

Upon further thought, I think it’s better keeping them separate, for two reasons:

  • While reporting, triagers know the difference and users can’t set labels directly, so misreporting shouldn’t be a problem;
  • While searching/filtering, devs might be interested in finding crashes and compile errors specifically, without having these kinds of issues lost among other generic bugs.

I would consider renaming type-compile-error to type-build in order to encompass all build related errors: configure not functioning as intended, makefile bugs, problems with build related tools (freeze, etc.), and of course compilation errors.

3 Likes

Note that we already have the Build and the Cross-Build components, and I was already planning to merge them. Should we also add type-compile-error and combine all three into type-build (or just build)?

2 Likes

Since renaming and removing labels can be easily done at any time after the migration, I decided to map most of the fields to labels during the migration, and we can then update the labels after the migration. Colors and descriptions can be updated too.

Merging label can technically be done after the migration too, by selecting all issues with label B, adding label A to all of them, and removing label B. Doing this after the migration however will generate two new events on each issue (addition of A and removal of B), and will update their “Last updated” date, making some searches more difficult. Because of this, doing it before the migration is better – the downside is that if we change our minds on a merge, it’s tricky to undo it.

Based on the feedback I received, I changed the following things:

  • Removed type-compile-error and type-performance and added a build and performance labels that can be combined with the other type-* labels
  • Temporarily renamed docs and tests to type-documentation and type-tests to match the existing labels in the CPython repo (these should be renamed)
  • Renamed type-enhancement to type-feature
  • Added labels for 3.7-3.11 (these can be removed afterwards)
  • Added release-blocker and deferred-blocker labels (these can also be removed)
  • Updated pending to map to pending instead of stale
  • Added labels for most components

Regarding the components, this is the full mapping:

  • Library (Lib)-> library
  • Documentation-> type-documentation
  • Interpreter Core-> interpreter-core
  • Windows-> OS-windows
  • Extension Modules-> extension-modules
  • Tests-> type-tests
  • asyncio-> expert-asyncio
  • IDLE-> expert-IDLE
  • Build-> build
  • email-> expert-email
  • IO-> expert-IO
  • macOS-> OS-mac
  • ctypes-> expert-ctypes
  • C API-> expert-C-API
  • Unicode-> expert-unicode
  • Installation-> expert-installation
  • Tkinter-> expert-tkinter
  • SSL-> expert-SSL
  • XML-> expert-XML
  • 2to3 (2.x to 3.x conversion tool)-> expert-2to3
  • Cross-Build-> build (easily searchable, not too useful)
  • Demos and Tools-> `` (only a few issues, not too useful)
  • Subinterpreters-> expert-subinterpreters
  • Regular Expressions-> expert-regex
  • Argument Clinic-> expert-argument-clinic
  • FreeBSD-> `` (only a few issues, easily searchable)
  • Parser-> interpreter-core (only a few issues, easily searchable)
  • Distutils-> library (only a few issues, easily searchable)

FreeBSD and Demos and Tools have no corresponding labels, Cross-build and Build have been merged into build, Distutils has been included into library, Parser into interpreter-core.

(I’ll update the initial post shortly) – done

What about renaming type-behavior to type-bug?

4 Likes

This can be easily done either before or after the migration, and renaming type-behavior to type-bug is fine with me.

The python/cpython repo already has type-bugfix for PRs though, so there are a few options:

  1. Rename type-bugfix to type-bug and use type-bug in the mapping, so that both issues and PRs will end up under the same label;
  2. Get rid of the type-bugfix label for PRs and only mark issues with type-bug (issues are linked to PRs anyway, so we don’t need to mark them twice unless some bot/script needs that label)
  3. Keep both labels, use type-bug for issues and type-bugfix for PRs.

+1

Nothing will be improved by having two different tags, you can filter with is:pr or is:issue (IIRC).

9 Likes

I’ve already renamed some of the labels on the python/cpython repo:

  • type-bugfixtype-bug (this will replace type-behavior too)
  • type-enhancementtype-feature
  • type-documentationdocs
  • type-performanceperformance
  • type-teststests

I also noticed that the python/cpython already has some stage-like labels: awaiting change review, awaiting changes, awaiting core review, awaiting merge, awaiting review. bpo has these stages test needed, needs patch, patch review, commit review, backport needed, resolved.

  • :question: Should we map patch review and commit review to awaiting review?

The awaiting * labels don’t seem to be documented, so I’m not entirely sure what mapping (if any) would make more sense.

IIRC some bot owns the ‘awaiting’ labels.

1 Like

Bedevere uses:

  • Awaiting review
  • Awaiting core review
  • Awaiting changes
  • Awaiting change review
  • Awaiting merge

2 Likes

Commit review typically means it’s been merged already, so that wouldn’t map to anything.

Probably best to leave those tags for PRs only. It’ll very likely confuse the bot if they start showing up on issues.

3 Likes

The reason I was suggesting this mapping is that patch review/commit review are currently used on bpo to indicate issues that have a proposed solution (either a PR or a patch) ready to be reviewed. If we add the awaiting review, it will be possible to filter for these issues with is:issue is:open label:"awaiting review".

In theory this should be possible by searching for is:issue is:open linked:pr, but this has two problems:

  • PR linking is not supported during the migration (I’m still looking for a solution)
  • Issues with a patch on bpo won’t be included in the search without a label

Moving forward it won’t be necessary to add this label to issues, since PRs will be linked properly, so it’s mostly a way to preserve some extra metadata of the migrated issues. If we decide against it, it will still be possible to search for issues awaiting for review on bpo even though the list will become outdated as more issues gets closed on GitHub.

FWIW the devguide says:

commit review: A triager performed a patch review and it looks good. This signals to core developers the patch or pull request needs a quick once-over to make sure nothing was overlooked before committing it.

Either way there are only 21 issues with commit review, so it doesn’t make much difference if they are not included. There are 2199 issues with patch review and also 2869 with the patch keyword (added automatically for PRs).

Any chance of adding ‘draft’ label? This will help with searching PRs.