Pre-PEP: describing/documenting extras

Hi everyone,

I’d like to start a discussion on an idea relating to extras, in order to see if there is sufficient consensus to warrant a PEP.

Background

As part of the PEP 771 discussion, @aragilar mentioned the following idea:

In general, discoverability of extras is not great at the moment. For example, as far as I can tell, pip and uv can’t easily tell you what extras exist for a package (though I’m happy to be corrected on this), and PyPI doesn’t show any information about extras. Part of the issue is that even if it were possible to list extras, it’s not always clear what each one does based just on the name.

If it were possible to document extras in a standardized way, it could make it easier for tools to improve discoverability.

Example

Documenting extras could, for example, be done via a new section in project metadata which would list a description for each extra, such as:

[project.optional-dependencies.descriptions]
recommended = "Dependencies that are recommended for most users"
docs = "Dependencies needed for building the documentation"
jupyter = "Dependencies that can optionally provide enhanced functionality in Jupyter notebooks and lab"

[project.optional-dependencies]
recommended = [
    "scipy",
    "matplotlib",
]
docs = [
    "sphinx"
]
jupyter = [
    "ipywidgets",
]

Of course, if well named, extras might not always need a description (e.g. recommended and test), although even here it’s not really obvious what jupyter does without a description. Does it install Jupyter notebook/lab? Or does it add functionality for users who already have Jupyter notebook/lab? Is it required for users who want to use Jupyter (but still listed as optional in general because not everyone uses Jupyter)?

There might also be cases where it would be useful to tell the user about interactions between different extras, for instance stating that the user should pick one of a few options but not install all extras, and so on.

What this could be used for

If this did become a PEP, I don’t think it should mandate any particular way tools should use the information. But as an example, one could imagine command-line tools implementing something like:

$ pip extras package
recommended: Dependencies that are recommended for most users
docs: Dependencies needed for building the documentation
jupyter: Dependencies that can optionally provide enhanced functionality in Jupyter notebooks and lab

or this could be part of e.g. pip show or another similar command. A similar approach could be taken in uv and other tools.

These tools could already list extras today if they wanted to, and that alone would be useful, but the listing would be much more useful if each extra came with a description.
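
As a rough sketch of what listing extras from existing metadata can look like: the core metadata file already records each extra in a Provides-Extra field, and the file can be read with the standard library email parser. The sample metadata and the list_extras helper below are made up for illustration.

```python
from email import message_from_string

# Hypothetical METADATA content for a project called "package".
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: package
Version: 1.0
Provides-Extra: recommended
Provides-Extra: docs
Provides-Extra: jupyter
Requires-Dist: numpy
Requires-Dist: scipy; extra == "recommended"
Requires-Dist: matplotlib; extra == "recommended"
Requires-Dist: sphinx; extra == "docs"
Requires-Dist: ipywidgets; extra == "jupyter"
"""


def list_extras(metadata_text: str) -> list[str]:
    """Return the extras declared via Provides-Extra, in file order."""
    message = message_from_string(metadata_text)
    # get_all returns None when the field is absent.
    return message.get_all("Provides-Extra") or []


print(list_extras(SAMPLE_METADATA))
```

For a package that is already installed, importlib.metadata.metadata("package").get_all("Provides-Extra") should give the same information without parsing the file by hand. Either way you only get the names, which is exactly the gap descriptions would fill.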

Another example of usage would be to list available extras on PyPI. Again, there’s no reason this couldn’t be done already, but having a description would make this a lot more useful.

Prior art

Tox uses a similar approach, where it’s possible to describe the different factors:

description =
    run tests
    recdeps: with recommended optional dependencies
    alldeps: with all optional and test dependencies

and it then makes it possible for tox -l to list environments with useful information for the user, such as:

$ tox -l
py311-test-recdeps -> run tests with recommended optional dependencies
py311-test-alldeps -> run tests with all optional and test dependencies

Open questions and next steps

There are a lot of open questions and details to figure out related to this idea, such as whether we would want a single description for a given extra, or both a short (with character limit) and long (with no limit) description; whether PyPI would expose this metadata to tools so that they don’t have to download a package to find out what extras are present; whether having a description for a non-existent extra should be silently ignored or raise errors; and so on.
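
To illustrate the last of those questions, here is a small sketch of the two possible behaviours for a description that refers to a non-existent extra. The check_descriptions function and its strict flag are hypothetical and only meant to show the trade-off, not to suggest which one a PEP should pick.

```python
def check_descriptions(extras, descriptions, *, strict=False):
    """Return only the descriptions that belong to a declared extra.

    strict=False: descriptions for unknown extras are silently dropped.
    strict=True:  descriptions for unknown extras raise ValueError.
    """
    unknown = sorted(set(descriptions) - set(extras))
    if unknown and strict:
        raise ValueError(
            "descriptions given for undeclared extras: " + ", ".join(unknown)
        )
    return {name: text for name, text in descriptions.items() if name in extras}


declared = {"recommended", "docs"}
given = {"docs": "Dependencies needed for building the documentation",
         "dcos": "A typo that the lenient mode would hide"}

print(check_descriptions(declared, given))  # lenient: the typo is dropped
```

The lenient mode is friendlier to older tools, but as the typo above shows, it can also hide mistakes from the package author.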

However, at this point, I’m more interested in understanding whether there is appetite for something like this, in which case we can try and figure out all the details. Alternatively, maybe it’s something that is not desirable conceptually?

Are you asking about before a package is installed? If yes, then it’s possible but optional as it requires the metadata file for the project to be served separately. If you’re asking about after installation then it is definitely available.

In this case I mean specifically before the project is installed. How would one do that currently with pip or uv? (or are you saying they could implement it currently but haven’t necessarily done so?)

For pip it’s the latter. We haven’t implemented this because no-one has asked for it, and also because pip doesn’t provide commands to display information about packages that aren’t installed (so such a feature is basically out of scope).

Would it be safe to assume that these descriptions would never be dynamic and therefore be consistent across wheels so that anything consuming this data (including some hypothetical display on PyPI) doesn’t have to deal with the per-distribution vs per-artifact metadata conundrum?

Isn’t pip index versions an example of such a command?

Ok thanks for the clarification! Just a correction on my original post: PyPI does list available extras these days, but having descriptions would make it possible to e.g. show each extra’s description as alt-text.

It is, but it was added as essentially just a way to replace the (awful) hack people were using of checking the error message from pip install pkg== to find what versions of pkg existed.

We could expand on the command to do more index queries, but as I said, no-one has so far requested it, or offered to write a PR. Personally, I would prefer it if index queries were handled by a separate utility (the implementation would be almost entirely independent of pip’s main functionality), but I don’t know how the other maintainers feel.

Regardless, it is true that this functionality can be implemented with existing metadata, but pip doesn’t include it. Adding descriptions for extras would require an extension to the standards, though. Personally, I feel that it’s probably not worth it given that people haven’t until now even seemed that bothered about having tools for listing the extras.

The one qualification I’ll add is that there does seem to be an upsurge in interest in extras in general. But rather than adding bits of functionality piecemeal, I’d argue that if we want to extend extras, we should look at the concept as a whole, deciding what use cases we’re trying to address, and how best to achieve that. We shouldn’t constrain ourselves by the limitations of the existing model.

Extras/optional dependencies have always struck me as a little odd: they are a well-supported part of the pyproject.toml spec, core metadata spec, and dependency specifier spec, but there is no easy way to see (as a consumer) what an extra actually means.

I think this example is a great demonstration: there’s currently no way[1] (other than putting a comment in pyproject.toml and/or a description in an “Installation” documentation page) to actually say what it means to include the “jupyter” extra, or even to see what that will install.

Most projects are pretty good about putting that in the documentation, but there have been a handful of times where I’ve had to go to a project’s repo and look directly at the setup.py/pyproject.toml to see what will be included with each extra and then just guess about why.

If pip/uv/some other tool added support to query and display this metadata, I’d definitely use it

My own bikeshedding of how it could be written would be to allow the values in the optional-dependencies table to be either an array of strings (what is currently allowed) or a subtable with “description” and “dependencies”[2] keys, e.g.

[project.optional-dependencies]
recommended = ["scipy", "matplotlib"]
docs = ["sphinx"]
jupyter = { dependencies = ["ipywidgets"], description = "Dependencies that can optionally provide enhanced functionality in Jupyter notebooks and lab" }

or

[project.optional-dependencies]
recommended = ["scipy", "matplotlib"]
docs = ["sphinx"]

[project.optional-dependencies.jupyter]
dependencies = ["ipywidgets"]
description = "Dependencies that can optionally provide enhanced functionality in Jupyter notebooks and lab"

  1. at least to my knowledge, someone please feel free to correct me if I’m wrong ↩︎

  2. though I actually don’t like repeating “dependencies” again, but couldn’t think of a better option right now ↩︎

To Paul’s point, there have been quite a few different topics lately that have dealt with extras either directly or indirectly (e.g. PEP 771 default extras, support in PEP 751 lock files) and the current specs are pretty limiting, so maybe a ground-up rework would be more successful.

I haven’t checked in on the “PEP 771: Default Extras for Python Software Packages” threads in a while, so they may have already considered and rejected the idea, but using a subtable for the optional dependencies could support that concept as well, by just adding a boolean-valued default key rather than the separate default-optional-dependency-keys array in the project table that the current proposal uses.

Some other features that’d be more easily supported with a subtable approach could be:

  • Recursive optional dependencies
  • Mutually exclusive optional dependencies
  • Multiple kinds of default extras
  • Any other metadata to add about the extra itself
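
As a very rough sketch of how the default key and mutual exclusion could sit in a subtable: the default key is the one mentioned above, while the conflicts key and the pyqt/pyside extras are entirely made up for illustration. None of this is in any current spec.

```python
# Already-parsed optional-dependencies table using hypothetical keys.
EXTRAS = {
    "recommended": {"dependencies": ["scipy", "matplotlib"], "default": True},
    "pyqt": {"dependencies": ["PyQt6"], "conflicts": ["pyside"]},
    "pyside": {"dependencies": ["PySide6"], "conflicts": ["pyqt"]},
}


def default_extras(extras):
    """Names of extras flagged with the hypothetical default = true key."""
    return [name for name, spec in extras.items() if spec.get("default", False)]


def find_conflicts(extras, selected):
    """Sorted list of (a, b) pairs of selected extras that exclude each other."""
    chosen = set(selected)
    pairs = set()
    for name in chosen:
        for other in extras.get(name, {}).get("conflicts", []):
            if other in chosen:
                pairs.add(tuple(sorted((name, other))))
    return sorted(pairs)


print(default_extras(EXTRAS))
print(find_conflicts(EXTRAS, ["pyqt", "pyside"]))
```

The point is only that per-extra keys compose naturally once each extra is a table. Whether installers should enforce something like conflicts is a separate question.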

Another angle on this, to expand our view, is that this is a suggestion for structured documentation, and extras are just the usage site of current interest.
From that perspective, the idea could be phrased as [project.documentation.optional-dependencies].

This may open up future use cases, or it might be a distraction – I can’t really tell.

What I do know is that some of my packages have soft dependencies which are not extras, and the proposal doesn’t give me a way to document those. Nor does it leave space for such a solution to be designed and incorporated in the future: it only supports extras.


I’m struggling a little to see the user story here. A user has identified that they want to install or consider installing a package. They do not want to open that package’s documentation (why not?). Their toolchain (IDE?) prefetches this structured doc and displays it to them (when? What interaction mode told the IDE to do this?). Now they can see the extras and select pkg[all].

For me, this is too much guesswork right now. I’d prefer to discuss this with some more concrete scenarios in mind. Specific packages and specific workflows.


A bit of UX to think about: how long are these descriptions? If they’re unbounded, consider display, wrapping, etc. If they have a max length, will I be able to fit a link to docs in there?
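
To get a feel for the display side, here is a small sketch of how a command-line tool might wrap unbounded descriptions, or truncate them if a maximum length were chosen. It only uses the standard textwrap module; format_extras, the width and the max_length parameter are illustrative assumptions, not a proposal for actual limits.

```python
import textwrap


def format_extras(descriptions, width=60, max_length=None):
    """Render 'name: description' lines, wrapped to the given width."""
    lines = []
    for name, text in descriptions.items():
        if max_length is not None:
            # Truncate on a word boundary and mark the cut with "...".
            text = textwrap.shorten(text, width=max_length, placeholder="...")
        prefix = f"{name}: "
        lines.append(
            textwrap.fill(
                text,
                width=width,
                initial_indent=prefix,
                # Align continuation lines under the start of the text.
                subsequent_indent=" " * len(prefix),
            )
        )
    return "\n".join(lines)


descriptions = {
    "docs": "Dependencies needed for building the documentation",
    "jupyter": "Dependencies that can optionally provide enhanced "
               "functionality in Jupyter notebooks and lab",
}
print(format_extras(descriptions, width=40))
```

Wrapping handles long text reasonably well in a terminal, but truncation would cut a trailing docs link first, which is an argument for either no limit or a separate field for the link.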
