Improve communication with contributors on the issue tracker and PRs (Language summit follow up)

I created an issue on the core-workflow tracker to add “valid” and remove “invalid”. This particular proposal can be discussed there:


First we should reach an agreement on the goals and the semantics of this new label, then we can apply it accordingly. Most of the open issues are already triaged/acknowledged, so they won’t need the label. We could still add it to issues with no replies or that lack certain labels (e.g. type-*).

My goal was to make sure that all new issues get looked at by a triager and/or core dev. The untriaged label can be used to automatically mark these new issues, so that triagers can look for them. Triagers can decide to remove the label or ping the relevant experts and let them remove it.

That’s not the goal of the “acknowledged/valid” label, so can we discuss that on a different thread?

“looked at by a triager/core dev” or “has an expert-* label” is a much lower bar than what we are discussing for this new label.

There is some overlap, though: if the issue has been triaged and the untriaged label removed, it’s implicitly valid/accepted (otherwise it would have been closed).

I don’t think so. Most triage work doesn’t go as far as determining whether the issue is valid. For example, putting an “expert-asyncio” label just means the issue is related to asyncio. It says nothing about whether the user misunderstood the output or reported a valid bug that needs to be fixed.

As I mentioned above, in that case the triagers should leave the label and let the experts remove it or close the issue. This was also discussed here: Triaging/reviewing/fixing issues and PRs - #15 by ezio-melotti (and in the following two messages).

It doesn’t make sense to me that a label is called “untriaged” and it’s the expert’s (not the triager’s) job to bring the issue to a state where the label can be removed.

I’m open to using a different name. Previous rounds of bikeshedding included new, unclassified, classify [1], awaiting triage, needs triage, unacknowledged, waiting-triager.


  1. this is meant to indicate the next action that should be taken ↩︎

All those names sound like something that a triager can quite quickly remove from an issue after a cursory look. So you clearly have in mind a label for triagers, while the “valid” label is primarily for experts to use (we don’t expect triagers to be able to make that decision). There really isn’t much overlap between these two labels. They mean different things and they will be applied by different people.

I created an issue for the “valid” label on the core-workflow tracker. How about you create another issue for the triagers’ label, and then they can be discussed on separate threads, which will hopefully reduce the confusion.


As a reminder, if consensus can’t be reached, you can always ask the SC to weigh in and make a call.


I think you are misunderstanding me. I’ll try to summarize:

  • issues can be grouped in three categories: “valid”, “invalid”, “not yet determined”
  • invalid issues can be closed, leaving us with “not yet determined” and “valid”
  • the distinction between the two can be determined by the presence (or absence) of a label

I think we both agree on this.


One of the initial points of contention, which is also related to the name of the label and to the group of people who apply it, is what exactly “valid” means.

An issue can initially go through the following stages:

  • new/untriaged → partially triaged → fully triaged → approved

My initial goal was to draw the line around “fully triaged”[1], whereas you want to draw it further ahead, around “approved”[2]. This is fine with me, because it seems to accomplish both my and your goals:

  • My goal was to make sure that all new issues were looked at by a triager/core-dev. If you search for issues that are not yet approved and sort by creation date (which is already the default), you will find the new/untriaged ones on top. There will also be some additional issues that have ongoing discussions (or are languishing in limbo) at the bottom of the list, but I can easily live with that.
  • Your goal was (IIUC) to determine if the issue is valid or not, in order to tell the OP whether they still have to do something or if it’s now up to the core-devs to push the issue forward.

This is why I said that there is overlap, even though with the line pushed further ahead, the name (un)triaged clearly becomes inappropriate.


The next point of contention is whether to:

  • have “not yet determined” as the default, and add a label to mark issues as “valid”
  • have a label for “not yet determined” added automatically on new issues, and remove it once the issue is accepted as “valid”

I prefer the latter, for the reasons stated above:

  • it’s easy to search for issues that need attention
  • it’s easier to remove the label than to add it
  • it doesn’t add an extra label to all open issues

Another point of contention (somewhat irrelevant, but I might as well clarify), is who should approve issues:

  • some issues are clearly invalid: both triagers and experts can close them;
  • some issues are clearly valid: both triagers and experts can add/remove the label;
  • some issues are ambiguous: triagers should ping relevant experts and let them decide.

Finally, the name of the label should be decided based on whether the label should be removed or added:

  • if the label is added automatically to new issues: awaiting approval, needs approval, pending approval or something along these lines might work. Since relatively few issues will have the label for a short period of time before they are approved or rejected, it doesn’t matter if it’s a bit long.
  • if the label is added once the issue is approved: valid, approved, accepted, acknowledged might work.

In addition, I also wanted to ask you if you are still planning to use this label programmatically to have a bot pinging the OP/core-devs after a period of inactivity.


  1. meaning that all relevant labels have been set and the experts mentioned ↩︎

  2. meaning that either the bug has been confirmed valid, or a feature request approved ↩︎


Looks like we’re beginning to converge towards some common understanding. Indeed, we are targeting different problems as you described.

I have a couple of comments on what you wrote:

  1. I don’t know what “fully triaged” means. When would you say that an issue is fully triaged?

  2. I don’t think we should assume that all currently open issues are valid (re “it doesn’t add an extra label to all open issues”.)

I don’t think this is what will happen. There are many issues that are undecided for a long time. “valid” essentially means “we should definitely do something about this issue”. It’s often not an easy decision to make.


It basically means that the triager’s job here is done: all the relevant labels (type, versions, experts, etc) have been applied, and relevant people/teams have been pinged. This is in contrast with a partial triaging, where e.g. the type has been set, but the affected versions have not.

This distinction was relevant in my initial proposal (since a triager could perform a partial triage and leave the untriaged label until someone else did the rest), but now that we have moved the line to “accepted” it is no longer relevant and can be ignored.

What approach are you suggesting if we add a valid label? Should we re-evaluate all issues individually before adding the label?

If we do this, eventually almost all open issues will end up with the valid label (the others will either be closed as invalid or will have ongoing discussions before deciding their validity and adding the label or closing them).

In addition, we won’t be able to deploy a bot that pings the OP for issues not yet marked as valid until we have marked most of them; if we do, a lot of OPs will get pinged even for issues that are technically approved but don’t have the label yet.

The approach I suggest with a label like needs approval is to:

  • assign it automatically to all new issues
  • look for issues with no (or few) replies and assign it to those too
  • assume that most other open issues are valid (if they aren’t the label can still be added at any time)

Even in this case, I think there will be far fewer of them than valid ones, and they will either be new issues that need attention, or old issues that sit at the bottom of the list and are easy to ignore.
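
To make the rule above concrete, here is a minimal sketch of how the “needs approval” approach could decide which issues get the label. Everything in it is hypothetical: the `Issue` class, the `apply_needs_approval` function, and the `max_replies` threshold are illustrative assumptions, not part of any existing bot or GitHub API.

```python
from dataclasses import dataclass, field

# One of the candidate names floated in this thread; not an existing label.
NEEDS_APPROVAL = "needs approval"


@dataclass
class Issue:
    """Hypothetical, simplified view of a tracker issue."""
    number: int
    reply_count: int
    is_new: bool
    labels: set[str] = field(default_factory=set)


def apply_needs_approval(issue: Issue, max_replies: int = 1) -> Issue:
    """Label new issues and open issues with no (or few) replies.

    All other open issues are assumed valid and left untouched;
    the label can still be added to them by hand at any time.
    """
    if issue.is_new or issue.reply_count <= max_replies:
        issue.labels.add(NEEDS_APPROVAL)
    return issue


# A brand-new issue, a quiet old one, and a well-discussed one.
fresh = apply_needs_approval(Issue(number=1, reply_count=0, is_new=True))
quiet = apply_needs_approval(Issue(number=2, reply_count=1, is_new=False))
busy = apply_needs_approval(Issue(number=3, reply_count=12, is_new=False))
```

Only the first two issues end up with the label here; the heavily discussed one is assumed to be valid, which matches the third bullet of the list.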

Pretty much, yes. Most of the open issues are not on anyone’s todo list at the moment, and they need to be re-evaluated to decide their fate.

That would be a good outcome. When most of the open issues are valid we have an issue tracker with a very high signal to noise ratio. It will take a lot of work to get there, I’m afraid. But that’s what this whole effort is aiming at.


Thank you. I’ve requested a discussion by the steering council.

Let’s go with a triaged label.

– Petr, on behalf of the SC


Some reasoning (not approved communication from the SC as a whole, but also not just my personal thoughts):

  • For communicating with contributors it would be good to use comments, since they don’t necessarily know the workflow and meaning of labels. Labels are useful for categorization, searching, and bots.
  • For searches, “valid”/“invalid” or “triaged”/“untriaged” only changes the GH query by a -, so it doesn’t matter much.
    • Having a bot add a label that people would remove sounds like an unnecessary step, compared to people adding a label.
  • valid could imply that other issues are “invalid”, even if a triager just didn’t get to them yet.
  • As Ezio pointed out, acknowledged / accepted might create false expectations.
  • As Irit points out, “triaged” is an ongoing process rather than a boolean state. The triaged label would signify that the process is done (for now, at least) – and since the issue wasn’t closed, it’s “valid” and waiting for a core dev.
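
To illustrate the point about searches, here is a small sketch showing that the “has the label” and “lacks the label” queries differ only by a leading `-`. The helper `issue_query` is hypothetical, and the default repository `python/cpython` is an assumption; the `repo:`, `is:`, `label:` and `-label:` qualifiers are standard GitHub search syntax.

```python
def issue_query(label: str, has_label: bool, repo: str = "python/cpython") -> str:
    """Build a GitHub search query for open issues with or without a label.

    Assumes the label name contains no spaces (a label with spaces
    would need to be quoted in the query).
    """
    prefix = "" if has_label else "-"
    return f"repo:{repo} is:issue is:open {prefix}label:{label}"


# Issues that went through triage vs. issues that still need a look.
triaged = issue_query("triaged", has_label=True)
not_triaged = issue_query("triaged", has_label=False)
print(triaged)
print(not_triaged)
```

Either polarity of label supports both searches, so the choice between adding a label and removing one has to be made on other grounds.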

For better or worse, triaged has been added but, as far as I could see, not documented. I think it belongs in GitHub Labels, under ‘Other’.

As I see it, such a label, be it triaged or any of the other variants, inverted (untriaged) or not, is only of real value if actively used. If triaged is not used, it may end up being noise. IMHO: either we actively use triaged, or we should consider removing it. Currently, it seems only a handful of people are actively using triaged.

I created “Document the `triaged` label” (Issue #929 in python/devguide).