GitHub Issues Migration: label mapping

ezio-melotti · March 12, 2022, 9:23am

On bpo we have a number of different fields such as Components, Versions, Resolution, Keywords, etc., that need to be transferred to GitHub. On GitHub there is a flat list of labels that can be assigned to both issues and pull requests.

During the migration, we have the opportunity to get rid of a few fields and values, reorganize or recombine some others, and possibly add a few new ones. I already created a mapping for most fields, but there are still a few open questions. I would also like to hear feedback from other core devs, triagers, and release managers to make sure that the proposed mapping works for their use-cases.

There are a few things to keep in mind:

You can now subscribe/follow specific labels on all repos of the python org on GitHub.
The nosy autocomplete of bpo, that allows you to filter users based on entries on the Expert Index, doesn’t exist on GitHub. Adding labels is the only way to indirectly subscribe someone else to an issue – assuming that they are following that label. Direct @mentions work too, if you know whom to add (as pointed out by @pradyunsg).
Our issue classification on bpo is very fine-grained, but also complex. Once we are on GitHub, a wider audience with different levels of expertise will be able to contribute, and a simpler classification system will make reporting and triaging easier.
Even though we have a lot of fields, several are rarely used and could be removed or combined.
Labels are shared between issues and pull requests, and we already have 32 PR-specific labels. Some of them can be reused for issues and some might be removed.
Having a lot of labels will affect both the “Labels” dropdown in the right sidebar (that is already 3 “pages” long), and the page that lists all issues, since labels are listed there too.
Labels can be grouped with a prefix (which makes the name longer) and/or with colors. This makes it easier to group related labels, but it’s not ideal.
After the migration is done, it will be difficult to change/add labels on all the migrated issues, so it’s better to get them right before the migration.
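To make the prefix-based grouping mentioned in the list above concrete, here is a minimal sketch that buckets label names by prefix. The helper name `group_by_prefix`, the choice of separator, and the handling of unprefixed labels are assumptions for illustration only; the label names come from the proposed mapping.

```python
from collections import defaultdict


def group_by_prefix(labels, sep="-"):
    """Group label names by the text before the first separator.

    Labels that do not contain the separator go into the "" group.
    (Hypothetical helper, not part of any migration tooling.)
    """
    groups = defaultdict(list)
    for label in labels:
        prefix, found, _rest = label.partition(sep)
        groups[prefix if found else ""].append(label)
    return dict(groups)


labels = ["type-bug", "type-feature", "expert-asyncio", "OS-windows", "easy", "invalid"]
grouped = group_by_prefix(labels)
```

One caveat of a plain hyphen as separator: a name such as interpreter-core would be treated as if "interpreter" were a prefix, so a grouping like this only works if hyphens are reserved for prefixes or a different separator is used.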

Below you can find the proposed mapping, copied here for your convenience from a message on Map bpo issue metadata to GitHub fields/labels · Issue #5 · psf/gh-migration · GitHub. I will keep both mappings in sync.

The first message from issue #5 has more background information if you are interested.

For more information about the migration see this Discourse thread and the update. Also remember to link your GitHub username to your profile on bpo.

Please discuss on this Discourse thread instead of using the GitHub issue to keep the discussion focused in one place.

type

type-bug: “An unexpected behavior, bug, or error”
type-feature: “A feature request or enhancement”
type-security: “A security issue”
type-crash: “A hard crash of the interpreter, possibly with a core dump”

In addition:

It was suggested to expand type-compile-error to include all build errors (e.g. configure/Makefile issues). Since we already have a build label, the type-compile-error has been removed.
Similarly, performance and resource usage have been replaced by a performance label that can be combined with either type-bug or type-feature.
bug, crash, compile error could be merged under type-behavior (users often have trouble telling them apart).
- Should we merge them or keep them separate?
  - type-crash has been kept, compile errors can be indicated with type-bug + build
- Should crash become a standalone label instead of a type-* label?
We might want to get rid of type-security if security issues should be reported under the Security tab of the repo.
I’m not sure if we can detect this when users select the issue type from the template, or when they add the label before they submit, but it could either be written in the template or be handled by an action after the report.
type-bugfix has been renamed to type-bug.
- do we need this classification for PRs when the issue is already classified?
type-documentation and type-tests have been renamed to docs and tests
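The type mapping described above could be written down roughly as follows. This is only a sketch: the bpo-side type names used as keys and the helper `labels_for_type` are assumptions, and for performance and resource usage the accompanying type-bug or type-feature label is deliberately left for a triager to choose.

```python
# Assumed bpo "type" values mapped to the proposed GitHub labels.
BPO_TYPE_TO_LABELS = {
    "behavior": ["type-bug"],
    "crash": ["type-crash"],
    # compile errors are expressed as type-bug + build
    "compile error": ["type-bug", "build"],
    "security": ["type-security"],
    "enhancement": ["type-feature"],
    # type-bug or type-feature still has to be added by a human
    "performance": ["performance"],
    "resource usage": ["performance"],
}


def labels_for_type(bpo_type):
    """Return the proposed labels for a bpo type, or [] if it is unknown."""
    return list(BPO_TYPE_TO_LABELS.get(bpo_type, []))
```

For example, `labels_for_type("compile error")` yields both type-bug and build, matching the decision to drop type-compile-error.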

stage

We can remove stages
We currently have awaiting change review, awaiting changes, awaiting core review, awaiting merge, awaiting review on python/cpython and test needed, needs patch, patch review, commit review, backport needed, resolved
- Should we map patch review and commit review to awaiting review?

components

Labels in this group are related to the location of the affected files:

library: “Python modules in the Lib dir”
documentation: “Documentation in the Doc dir”
interpreter-core: “Interpreter core (Objects, Python, Grammar, and Parser dirs)”
extension-modules: “C modules in the Modules dir”
tests: “Tests in the Lib/test dir”
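Since these labels follow the location of the affected files, the correspondence can be sketched as a path lookup. The function `component_label` and the table name are hypothetical; the directories and labels are the ones listed above.

```python
from pathlib import PurePosixPath

# Checked in order, so Lib/test has to come before Lib.
COMPONENT_DIRS = [
    ("Lib/test", "tests"),
    ("Lib", "library"),
    ("Doc", "documentation"),
    ("Objects", "interpreter-core"),
    ("Python", "interpreter-core"),
    ("Grammar", "interpreter-core"),
    ("Parser", "interpreter-core"),
    ("Modules", "extension-modules"),
]


def component_label(path):
    """Return the component label for a repo-relative path, or None."""
    parts = PurePosixPath(path).parts
    for directory, label in COMPONENT_DIRS:
        dir_parts = PurePosixPath(directory).parts
        # Compare whole path components, so "Library/x" does not match "Lib".
        if parts[:len(dir_parts)] == dir_parts:
            return label
    return None
```

Files outside the listed directories return None and would need a label chosen by hand.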

They could have their own namespace prefix (not sure what to use though, and the names are already long enough), or just a specific color.

expertise (was included in components before)

expert-asyncio: this is already on python/cpython
Could be grouped with expert-* or just by color
What other components do we want to keep? (e.g. email, IDLE, IO, Unicode, etc.)
- asyncio-> expert-asyncio
- IDLE-> expert-IDLE
- Build-> build
- email-> expert-email
- IO-> expert-IO
- ctypes-> expert-ctypes
- C API-> expert-C-API
- Unicode-> expert-unicode
- Installation-> expert-installation
- Tkinter-> expert-tkinter
- SSL-> expert-SSL
- XML-> expert-XML
- 2to3 (2.x to 3.x conversion tool)-> expert-2to3
- Subinterpreters-> expert-subinterpreters
- Regular Expressions-> expert-regex
- Argument Clinic-> expert-argument-clinic
FreeBSD and Demos and Tools have no corresponding labels, Cross-build and Build have been merged into build, Distutils has been included into library, Parser into interpreter-core.
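For reference, the component mapping above can be captured as a lookup table. This is a sketch only: the names `COMPONENT_TO_LABEL` and `labels_for_components` are made up for illustration, and components without a corresponding label (FreeBSD, Demos and Tools) are simply skipped.

```python
COMPONENT_TO_LABEL = {
    "asyncio": "expert-asyncio",
    "IDLE": "expert-IDLE",
    "Build": "build",
    "Cross-build": "build",
    "email": "expert-email",
    "IO": "expert-IO",
    "ctypes": "expert-ctypes",
    "C API": "expert-C-API",
    "Unicode": "expert-unicode",
    "Installation": "expert-installation",
    "Tkinter": "expert-tkinter",
    "SSL": "expert-SSL",
    "XML": "expert-XML",
    "2to3 (2.x to 3.x conversion tool)": "expert-2to3",
    "Subinterpreters": "expert-subinterpreters",
    "Regular Expressions": "expert-regex",
    "Argument Clinic": "expert-argument-clinic",
    "Distutils": "library",
    "Parser": "interpreter-core",
}


def labels_for_components(components):
    """Map bpo component names to a sorted, de-duplicated list of labels."""
    return sorted(
        {COMPONENT_TO_LABEL[c] for c in components if c in COMPONENT_TO_LABEL}
    )
```

An issue with both Build and Cross-build ends up with a single build label, and FreeBSD contributes nothing.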

OS (was included in components before)

OS-windows and OS-mac: these are already on python/cpython
We could add OS-FreeBSD and possibly others
Any other OS that deserves a label?
- no

versions

We already have needs backport to * on python/cpython
There is a discussion on Discourse about this
In the same thread, it was suggested to just have labels to indicate if it only applies to main, if it should be backported to maintenance releases, and also to security releases
- This could be inferred by the issue type (feature, bug, security) and marked with the needs backport to * labels
Should we remove versions, only keep two, or keep them all?
- all active versions (3.7-3.11) have been kept. They can be converted to milestones after the migration.
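The idea of inferring backports from the issue type could look roughly like this. It is a sketch under assumptions: the function `backport_labels` is hypothetical, the caller supplies which branches are in bugfix or security-only mode, and treating type-crash like type-bug is a guess rather than something decided in this thread.

```python
def backport_labels(issue_type, bugfix_branches, security_branches):
    """Suggest "needs backport to X" labels from the issue type.

    Features apply only to main, bugs go to branches in bugfix mode,
    and security issues go to bugfix and security-only branches.
    """
    if issue_type == "type-feature":
        targets = []
    elif issue_type in ("type-bug", "type-crash"):
        targets = list(bugfix_branches)
    elif issue_type == "type-security":
        targets = list(bugfix_branches) + list(security_branches)
    else:
        raise ValueError(f"unknown issue type: {issue_type!r}")
    return [f"needs backport to {branch}" for branch in targets]


# Example branch lists; the real split depends on the release schedule.
example = backport_labels("type-bug", ["3.10", "3.9"], ["3.8", "3.7"])
```

A triager would still need to confirm the result, since not every bug affects every maintained branch.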

resolution

I only kept invalid (since it was already on python/cpython). There is also a spam label.

priority

Are the RMs fine with using milestones/projects to track release/deferred, or do they prefer to have labels?
- they are fine, but for now the release blocker and deferred blocker labels have been added. This will make it easier to identify issues and add them to milestone/projects.

keywords

I only kept easy. The others are barely used.
Is there any other keyword that we should keep?
- no(?)

pradyunsg · March 12, 2022, 9:35am

If you know the specific individual, you can @-mention them in a comment – which will subscribe them.

It’s true that there’s no real way to find out who exactly is subscribed to an issue but I’ve always found that a bit weird on b.p.o anyway.

arhadthedev · March 12, 2022, 11:39am

Can CODEOWNERS be heavily employed for this instead, becoming a de facto expert list?

arhadthedev · March 12, 2022, 12:07pm

I see type labels as a way to swiftly prioritize public stuff. Like, loss of user data first, behavior not conformant to the documentation second, degradation of performance third, everything else fourth. With such a gradation, the exact effect stops mattering, be it a hard termination after detection (“crash”) or just an undetected deviation of logic (“bug”).

In this light, we can have the following mapping:

type-data-loss (crash)
type-not-as-documented (behavior, compile-error)
type-performance (stays as is)
type-enhancement (stays as is)

I’m open to ideas on how to make type-not-as-documented shorter, considering it would be the most popular label.

ezio-melotti · March 12, 2022, 12:26pm

As far as I understand, the CODEOWNERS file only works for pull requests, so it can not be used automatically for issues. In addition, the Expert Index also covers other categories and areas of interests, not just files.

arhadthedev · March 12, 2022, 12:37pm

Yeah, I overlooked that PRs are not required to be created together with bound issues. Like, there can be a sufficient delay until issue participants get a proper approach condensed and a draft implementation is published, not to mention discussion-only issues.

arhadthedev · March 12, 2022, 12:42pm

And here is a timely example (bpo-43224: Implement substitution of unpacked TypeVarTuple in C by serhiy-storchaka · Pull Request #31828 · python/cpython · GitHub):

I think we should pause work on this until we’ve decided on exactly which approach we’re proceeding with.

Jelle · March 12, 2022, 2:27pm

I’d really like labels for each of the expert areas, so I can easily find all bugs related to typing or sqlite or asyncio or some other major part of the stdlib. Maybe we just have to accept that the list of labels will be extremely long; I don’t think the costs you list are that bad.

pradyunsg · March 13, 2022, 9:29am

I’d recommend doing both.

I’m pretty sure that only distinguishing by color will not work well, since I’m pretty sure we’d have colour blind folks looking at these pages.
I don’t think that a long list of labels is problematic, as long as it’s well organised.

It takes a bit of upfront effort but that effort pays off well. It’s also a decent amount of work to apply them on every issue though, something that bpo currently offsets onto the person filing the issue; which we might want to make possible.

I’ve found pip’s issue tracker much easier to triage since we’ve moved to our prefix-based model for labels.

And, as a purely bikeshedding note: I think it’s reasonable to do something like “type: bug” instead of “type-bug”. Both of them will show up when you say “bug” or “type” or “type bug” in the label picker, and I personally find the visual separation of type: bug to be nicer than the lack of separation in type-bug.

hugovk · March 13, 2022, 10:03am

Yes, especially as the “Filter by label” box lets you find things quickly:

ezio-melotti · March 13, 2022, 12:44pm

Thanks everyone for the feedback!

Fair enough. If the triagers are fine with the price of dealing with a long list of labels in order to have a more powerful issue search/filtering, then I guess it’s a worthy trade-off.

pip currently has 90 labels (Labels · pypa/pip · GitHub) and it seems to mostly use one-letter prefixes that are not immediately obvious (in fact it took me several clicks to find this page that lists the labels) and also has some “long” prefixes (like type: *). Black also uses a similar one-letter convention and has 45 labels (Labels · psf/black · GitHub). Jupyter Notebook mostly uses “long” prefixes and has 48 labels (Labels · jupyter/notebook · GitHub). CPython now has 31 labels (Labels · python/cpython · GitHub – one disappeared overnight).

All of them also have some unprefixed labels (e.g. trivial, stale, invalid). The colors seem somewhat inconsistent within the same category, except in the case of Jupyter which is fairly consistent. Even within categories that use the same color, there are some differently-colored exceptions (e.g. the red type-security among the other blue type-*).

If anyone else has experience working on repos with many labels, I would be curious to hear about it.

That said, if we agree on having more labels, we still need to decide which ones we want to keep. I will update the list above and ask again for feedback.

The good news is that the label name, description, and color can be changed at any time, so we will have plenty of time to bikeshed after the migration.

iritkatriel · March 13, 2022, 7:42pm

I think we need some issue management labels. Things like:

awaiting-information (OP was requested to clarify something).
pending-close (a core dev or triager suggests closing, but wants to give others a chance to object).

Maybe the bots can automatically close issues if they have one of these labels for a month.
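The auto-close rule suggested here could be sketched as a simple check. Everything beyond the two label names is an assumption: "a month" is taken as 30 days, and `should_auto_close` is a hypothetical helper, not an existing bot API.

```python
from datetime import datetime, timedelta

CLOSE_LABELS = {"awaiting-information", "pending-close"}


def should_auto_close(labels, labeled_at, now, grace=timedelta(days=30)):
    """True if the issue has carried a close-candidate label for `grace`.

    `labeled_at` is when the label was applied; a bot would look this up
    from the issue's event history.
    """
    has_label = bool(CLOSE_LABELS & set(labels))
    return has_label and (now - labeled_at) >= grace


now = datetime(2022, 4, 15)
old = should_auto_close(["awaiting-information"], datetime(2022, 3, 13), now)
recent = should_auto_close(["pending-close"], datetime(2022, 4, 1), now)
```

A real bot would probably also reset the timer when the reporter replies, which this sketch does not model.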

CAM-Gerlach · March 13, 2022, 9:55pm

FWIW, on the main Spyder repository, we get 150+ issues per month (and used to get double that or more before we implemented a bunch of automatic troubleshooting and triage steps).

Back when I was in charge of it, we introduced a status: Awaiting Followup tag for this case, since it was very common that users would not give us enough information to propose a solution, reproduce a bug, close as a duplicate or otherwise act on it. If they didn’t answer after 8 days, the issue would be closed.

It has since become very successful and kept our issue backlog more manageable, and I believe we’ve automated that nowadays (and there are many bots and GH Actions to do so). I’ve seen it used on a number of other projects as well, in combination with bots/GH Actions/etc that take care of the followup.

A big issue for us with tags in general was core devs not applying and using them consistently; having a dedicated and relatively experienced person in charge of triaging issues and PRs (me; now it’s handled by everyone on a rotating basis, one core dev per day of the week) was pretty critical for that. In this case, you have a whole team of them, so that should hopefully be somewhat less of a concern.

ezio-melotti · March 14, 2022, 4:39am

ISTM that both indicate an issue that should be closed unless more information that proves its validity is provided. Something like this is definitely useful to have, but perhaps we don’t need to distinguish between the two.

On bpo we currently have test needed and pending. There is some overlap between them and with the labels you suggest. I think pending could be mapped, but probably not test needed.

FTR Black and PyPI have S: awaiting response and S: needs repro, Jupyter has status:Needs info.

I think it would make sense to automatically close, after a few days, issues that a human has marked with the aforementioned labels, since they are not useful in their current state. I wouldn’t close old/stale issues just because they are old.

hugovk · March 14, 2022, 8:08am

We already have a GH Action to label old PRs as stale after 30 days, and remove stale after activity.

After another 14 days, PRs labelled stale + CLA not signed are auto-closed.

(Those also with stale + CLA signed are not auto-closed.)

I can see value in auto-closing issues that have stale and human-applied label(s), and we could use the same GH Action.
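The PR rules described in this post amount to a small decision table, sketched below. This is not the actual GH Action configuration; `stale_action` is a hypothetical function, and it assumes that any activity resets the inactivity counter and removes the stale label.

```python
STALE_AFTER_DAYS = 30
CLOSE_AFTER_MORE_DAYS = 14


def stale_action(days_inactive, labels):
    """Return "close", "mark-stale", or "none" for a pull request."""
    labels = set(labels)
    close_at = STALE_AFTER_DAYS + CLOSE_AFTER_MORE_DAYS
    # Only PRs that are stale AND lack a signed CLA are auto-closed.
    if days_inactive >= close_at and {"stale", "CLA not signed"} <= labels:
        return "close"
    if days_inactive >= STALE_AFTER_DAYS and "stale" not in labels:
        return "mark-stale"
    return "none"
```

Extending this to issues would mean replacing the "CLA not signed" condition with a check for a human-applied label.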

blink1073 · March 14, 2022, 10:22am

In JupyterLab we use Labeler · Actions · GitHub Marketplace · GitHub to apply labels by path. There is a limit of 100 labels that can be applied.

We also have a probot that automatically applies a “status:Needs Triage” label to new issues.

steve.dower · March 14, 2022, 5:18pm

compile error is fundamentally different from the other two, and I suspect if we don’t have a crash tag then people will be tempted to put [crash] in the title.

I’d just keep them separate. The definitions you have in the first post are (almost) clear enough to handle triaging (I’d clarify that compile error is for compiling CPython, rather than “Python”).

ezio-melotti · March 15, 2022, 1:21pm

This sounds quite useful, especially because we have a number of labels that are directly correlated to specific files/paths. Thanks for sharing!

Upon further thought, I think it’s better keeping them separate, for two reasons:

While reporting, triagers know the difference and users can’t set labels directly, so misreporting shouldn’t be a problem;
While searching/filtering, devs might be interested in finding crashes and compile errors specifically, without having these kinds of issues lost among other generic bugs.

erlendaasland · March 15, 2022, 2:49pm

I would consider renaming type-compile-error to type-build in order to encompass all build related errors: configure not functioning as intended, makefile bugs, problems with build related tools (freeze, etc.), and of course compilation errors.

ezio-melotti · March 15, 2022, 7:37pm

Note that we already have the Build and the Cross-Build components, and I was already planning to merge them. Should we also add type-compile-error and combine all three into type-build (or just build)?