PEP 771: Default Extras for Python Software Packages

For me the main benefit is keeping library dependencies as light as possible. Many packages have dependencies that are only used by a subset of their functionality. If I want to depend on, for example, the xarray.Variable API but not the xarray.Dataset API, I do not need pandas to be installed.

Arguably, this could also be resolved by having an ergonomic way to turn large projects into monorepos of smaller projects with more tightly-scoped dependencies, so that my library might depend on xarray-variable or xarray-dataset, while a user planning on having tools at hand would just install xarray.

2 Likes

If extras were documented properly and discoverable, I do not see why that would be problematic. How is this expectation any higher than actually being able to use SymPy (or any other Python library)? How far do we have to lower the bar? Silly argument, but: Python is already the most popular programming language, isn’t it? :smiley:

In principle, I am sympathetic to making things easy for beginners of course, but I am not convinced this proposal has a good enough value-to-cost ratio. It seems to me that there are other, more impactful ways to improve user friendliness in this area with fewer downsides. I was initially enthusiastic about this idea when it was suggested a couple of years ago (or a very similar one), but reading about the finer details even then made me change my position.

Anyway, I do not have any weight in the decision nor any horse in this race. I am mostly just worried about the churn it will cause and, as others have said above, that it might be abused and overall not be much of a win.

3 Likes

For me, the benefit is relatively clear (people have been asking for something like this for some time now) but the cost isn’t. Specifically there are two elements of the cost that I don’t feel that I really understand yet:

  1. The cost for tool developers of coming up with a good UI for this feature. Tool maintainers will have to decide things like whether to have an opt-out flag and what form it takes, whether the user needs to be notified that a default extra has been included, etc. None of these questions have (to me) clear answers, and the PEP currently doesn’t give much guidance.
  2. The cost for the community of misuse of the feature. I have no real feel about whether the design of this feature tends to lead people towards good usage patterns and away from bad ones, or vice versa. I think that when used well, and with care, default extras could be beneficial, but there’s very little in the way of guidance, either in how the feature is designed or in the documentation in the PEP, to help users know what is a good usage of the feature.

My fear is that default extras will turn out to have a complicated UI, and be too easy to use incorrectly. And this discussion hasn’t yet done much to reassure me that that won’t be the case.

3 Likes

Specifically regarding --no-default-extras <pkg>, I think a global switch could be problematic.

If pkg defines default extras, then it’s reasonable for downstream package authors to expect that putting pkg in their project.dependencies means that default extras will be present. If they wanted to support a minimal install of pkg and default to getting the default extras, they could put pkg[] in project.dependencies and pkg into their own default extras, like so:

[project]
dependencies = ["pkg[]"]
default-optional-dependency-keys = ["recommended"]
[project.optional-dependencies]
recommended = ["pkg"]

That clearly expresses “I support pkg with or without its default extras”, so I don’t think that same intent is expressed by this pyproject.toml fragment:

[project]
dependencies = ["pkg"]

In the context of the former being possible, this seems to say “I require pkg with its default extras.”

So although --no-default-extras :all: might be useful, I would expect that it results in broken installs at least some of the time.

If we want to say that declaring pkg in project.dependencies doesn’t guarantee your package the default extras, that’s okay. But the PEP should declare whether or not installers should be allowed to omit the default extras in these cases.

Given that the two aforementioned [project] tables are possible and seem to say different things about what the package author thinks they’re signing up to support, I don’t think leaving this entirely up to installers is a good idea.

6 Likes

From a user perspective, I don’t want the default behavior to be installing more than minimal dependencies and having to opt out in ways that may not be obvious, especially with transitive dependencies.

From a software reliability perspective, I’ve seen this break minimal builds in multiple ecosystems, multiple times: developers did not realize something was provided by a recommended, but not required, dependency.[1]


  1. I think if you depend on a transitive dependency you should explicitly depend on it anyway, but the reality is that people don’t always do this, and are later broken by a dependency dropping a dependency or moving it to recommended but not required.

2 Likes

Thanks for your work on this PEP @trobitaille and @jonathandekhtiar. It looks like a net win to me to have this functionality.

I’d like to request some improvements to the Motivation section, because not all examples there seem relevant, in particular for the “multiple backend” part. There are three bullet points in that section:

  • Qt. This use case is clear, and all concrete packages listed as examples (kivy et al) are for Qt.
  • BLAS/LAPACK. This case isn’t so clear to me, because normally those shared libraries are linked against directly and then vendored into the wheel by auditwheel & co. It could work in theory, but I’m not aware of any concrete package that can depend on either OpenBLAS or MKL in practice. Do you know of any? If so, I suggest adding them. If not, I suggest removing this bullet point.
  • FFT libraries. Same - it’s possible in principle, but there is no package given as an example. The case isn’t too strong, because the functionality provided by both scipy.fft and pyfftw can also be obtained from numpy.fft - with roughly equal performance since numpy 2.0, so I suspect anyone with this use case will anyway have a numpy dependency and therefore use numpy.fft instead of an extra like this.

The other comment I have on the multiple backends use case is that it would be more naturally covered by the ability to specify a logical OR between dependencies. That is both more clear and avoids pitfalls with cases like pip install pkg[other-extra] dropping the default backend. I suspect someone brought this up before, but I don’t see it in this thread nor in the Rejected ideas section. Did you consider this?

The “recommended but not required” use case seems much more broadly applicable. That said, the tensorflow example isn’t worked out and probably not ideal - the build variant support in wheels that has been heavily discussed in other threads is a much more appropriate solution for this than moving an [and-cuda] extra to a default extra. That just swaps CUDA support from opt-in to opt-out, and many users will now get very large downloads unnecessarily before realizing they now need the [cpu] extra - I don’t think that’s an improvement. In this section, I suggest removing the tensorflow example and adding some new examples of packages where the benefit is more clear. A bunch of people in this thread already said “I’d use this for X”, so it shouldn’t be too hard to do.

6 Likes

Indeed, though I think we should perhaps think of such a flag/option as being analogous to --no-build-isolation: it’s an advanced option aimed at developers or more advanced users who know what they are doing, but one that would carry risks. As a developer, I could use this to find out whether any of my packages are implicitly depending on default extras being present, and then explicitly specify extras in my dependencies as a result. But I don’t think this should be seen as a common user option that would give a minimal-footprint installation of a whole dependency tree while having everything work as normal. The current wording of the PEP regarding such an option is:

It would also carry risks for users who might disable all default extras in a big dependency tree, potentially breaking packages in the tree that rely on default extras at any point

I’d be happy to amend the PEP to say that package installers implementing support for default extras must also provide a global way to opt out of these, but I think we want to also make it clear that the primary audience for this option should not be the casual user.

1 Like

In fact, given what you’re saying here, I’d prefer (as a pip maintainer) for it not to say that, and we could just not provide an opt-out in pip.

I’m also a little surprised at your qualification - “package installers implementing support for default extras”. Wouldn’t that be all package installers, if we assume that standards are expected to be implemented across the ecosystem?

1 Like

Yes, apologies for the poor wording – what I meant is that we could say that when (not if) they implement support for default extras, they should implement that option at the same time.

I don’t really have a strong opinion about whether this should be mandated or not though – the current wording in the PEP was meant to leave it as a choice for package installers, which I personally thought was better than being over-prescriptive, but I’m happy to follow the general consensus.

I’m also happy to leave it as a choice. This discussion only really started because I was seeing comments about use cases that seemed to rely on having an opt-out available. And people assuming something that the PEP doesn’t require is a potential issue.

I still think the PEP is weak in describing how the default extras feature should be used in practice - and the speculation in this thread about ways to manage a project’s extras suggests that I’m not the only person who isn’t sure how things will work in practice.

The PEP currently has some common concrete examples as a subsection of the Specification section, and we discuss guidelines for using this in practice and migrating in the package authors subsection of How to teach this. Are you suggesting we add some specific examples (real or hypothetical) of existing packages that would need to manage the transition, and work through how this could be done? (Along the lines of the Transition section you were suggesting previously.) The PEP editors were not keen previously on adding new top-level sections, so do you see this as something that should be added to How to teach this?

I think there needs to be a subsection in the specification which describes required and permitted installer behaviors. Maybe there is one and I missed it, but --no-default-extras is currently only mentioned in the rejected ideas section.

Given a package pkg1 with a dependency on pkg2, are installers allowed to omit pkg2’s default extras?
I don’t think the answer is a strict yes or no based on my interpretation of the document. But in spite of the nuance, we should strive for clarity. “SHOULD install the default extras”, “MAY implement a user facing mechanism to opt out”, etc.

1 Like

I’ve made a draft PR at feat: support PEP 771 by henryiii · Pull Request #226 · pypa/pyproject-metadata · GitHub. That would allow scikit-build-core, meson-python, and pdm-backend to try it out.

I think it would be good to ensure that developers understand that default-extras are not less “weighty” than a normal dependency; most users will be getting them, and dependencies of dependencies can’t opt out easily, so it’s not intended to add dependencies you think many users won’t need. It’s a way to allow opting out for certain cases (or to pick between backends, etc). This could be both mentioned in the PEP, and be part of the eventual update to the packaging guide. The pytest example is a good one, maybe something like that could be included.

I wonder if listing a larger collection of examples of packages that plan to use the feature would help? For example, build will build packages over a second faster by default, with fewer crashes, and it will also get better for bootstrapping in the long run.

Also, the prevalence of pipx run and uvx combined with Wasm might be a reason we are seeing more need for this now. These tools are much nicer if you can just type the package name, but Wasm really cares about minimal dependencies in webpages, and you don’t need things like click or rich on Wasm – there’s no CLI. repo-review and validate-pyproject would both really benefit from this.

If [] is added (a fork of pip was just used to show this to be possible a day or two ago), then [typo] should be identical to []. I think that also makes this feature simpler: x is shorthand for x with its default extras, and including the brackets means you are listing the extras explicitly, without defaults. That is much easier to reason about and teach. It’s also nicely backward compatible: if a adds a default extra, and b depends on a and doesn’t need the default extra, then b can explicitly depend on a[] and it will work on older versions of pip.
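As a concrete sketch of what that would look like in a downstream package’s metadata (hypothetical package names, and assuming the [] syntax described above is accepted):

```toml
[project]
name = "b"
version = "1.0"
dependencies = [
    # Explicit brackets mean "only these extras, no defaults":
    # a's default extras are NOT pulled in, even if a adds some later.
    "a[]",
]
```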

It might be worth mentioning that a package can re-export default extras. You can depend on a[], re-export a’s extras, then re-export a’s default extra. I’d do that with sp-repo-review, which currently maps sp-repo-review[cli] -> repo-review[cli]; with default extras it could do the same with the defaults.
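A hypothetical sketch of that re-export pattern (the metadata here is an illustration, not copied from the real sp-repo-review project):

```toml
[project]
name = "sp-repo-review"
# Opt out of repo-review's default extras for the minimal install
dependencies = ["repo-review[]"]
default-optional-dependency-keys = ["recommended"]

[project.optional-dependencies]
cli = ["repo-review[cli]"]     # existing re-export of the cli extra
recommended = ["repo-review"]  # re-export repo-review's default extras
```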

I think “MAY implement a user-facing mechanism” to opt out sounds good. I’d say tools like pip and uv could wait to see whether overuse of this feature by packages that have dependencies becomes an actual problem before adding such a flag. In fact, uv’s override feature might be very easy to extend to override default extras too; in that case, there wouldn’t even be a new required flag.

From what I can tell, this isn’t really an issue with default features (otherwise, you could make a PR to remove the features in the packages that had the issue!), but just with lots of dependencies in general. Rust makes it very easy to use dependencies, and extra dependencies don’t affect the end user, only compilation (for the most part), so having lots of dependencies is common. uv, for example, has over 110 direct dependencies listed in its Cargo.toml. Unlike Python, Rust is not “batteries included”, and core things like regex and AST parsing require dependencies. That’s not an indication that default features don’t work.

I’ll probably try to turn some of these suggestions into PRs toward the PEP.

4 Likes

I think this approach deserves serious consideration, and should appear in the “Rejected Ideas” if this PEP is to move forward.

It’s already quite easy to simply split up your package into “package” (with default dependencies) and “package-core” (without it) and have the former depend upon the latter. You can even pin the versions together if you want. It provides exactly the behaviour being sought for the case that everyone agrees this PEP would apply, and resolves the “what if I don’t want it” case directly.
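A minimal sketch of that arrangement, with hypothetical names and versions: pkg becomes a metadata-only package that pins pkg-core and adds the recommended dependencies.

```toml
# pyproject.toml for the metadata-only "pkg" (no code of its own)
[project]
name = "pkg"
version = "1.2.3"
dependencies = [
    "pkg-core==1.2.3",   # pin the two packages together
    "recommended-dep",   # the extra dependencies most users want
]
```

Users who want the minimal install depend on pkg-core directly; everyone else keeps installing pkg.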

I honestly don’t see how the complexity being proposed for installers and users is worth it just to save a handful of package maintainers a bit of work splitting up their packages. If we need the build backends to get over their “one package per repo” idea, then let’s help them get over that.

8 Likes

I get the impression from comments here that you’re underestimating how disruptive splitting up packages is. But having said that, I’ve not seen any concrete analysis of the actual cost, nor have I seen any consideration of what we could do to mitigate that cost (which would be an alternative to this PEP). I agree, it should be covered in “Rejected Ideas”, at least.

I think the whole “packaging is bad at monorepos” idea has become a bit entrenched, and I agree it would be good to fix that. uv seems to have a working approach with its workspace model, based on Cargo. I don’t know if PDM, Poetry or hatch have anything similar, but maybe we should be looking at extracting the common ideas from all of them and standardising those. It might take longer to come to a consensus on how to do that, but it also has the possibility of being a better long-term solution. In particular, I’d like to have an answer that doesn’t lock people into a single tool (“use uv” might be a great answer, but it doesn’t help projects that have invested time into adopting Poetry).

As things stand, I’m starting to feel that PEP 771 may not be the best solution here, but it’s the only one that anyone is actively working on. That’s a shame, because I don’t think that’s a good criterion for choosing what to standardise.

@trobitaille @jonathandekhtiar I would strongly recommend that the PEP include some investigation into the approach of splitting packages into “core” and “standard” versions, both as things stand now, and in terms of what potential tool improvements could offer. Even if the result is that you decide it’s a “Rejected Idea”, I want the PEP to be able to answer the questions “why isn’t splitting packages sufficient?” and “why wouldn’t a PEP to make working with monorepos easy make default extras unnecessary?”

4 Likes

One thing to point out here is that Rust has both (default) features and workspaces, which suggests that one doesn’t replace the other.

I think they’re mostly separate ideas entirely. There’s one use case where they overlap with the “foo-core” type of arrangement, but usually I think of extras as pulling in third-party dependencies, in which case workspaces aren’t relevant.

1 Like

Or that one did replace the other, but they can’t just remove a feature. It’s worth digging into this rather than simply assuming it supports your preferred approach.

1 Like

It isn’t really splitting up a package, though. It’s just replacing your “main” one with a metadata-only list of dependencies, and renaming the one with code to have some kind of suffix.

1 Like

This is where a worked example, ultimately as a recipe in some documentation somewhere, would help enormously. Maybe you could write up such an example, as the idea that the PEP would then reject (assuming the authors still prefer to have default extras)?

I think a big part of this is inertia and a question of whether or not such a rearrangement is “worth it”. For me, this is the issue – splitting up my example package would be pretty easy at a technical level, but I’d have to talk with other org members about adding new repos or experiment with a novel (for me) monorepo layout where there is a package at the root and others in subdirs.

Reshaping a package is the kind of challenge that projects can and do tackle. (See: everyone who moved to a src/ layout when that became more prevalent.) It’s also enough of a cost that you would hesitate to take it on if you’re unsure the result will be a significant improvement.

Even if it’s straightforward to do, it requires a significant degree of commitment to the new project structure.

Default extras are a lot easier to apply and back away from if it turns out to be a mistake.

I think the best answer to “why don’t packages just restructure?” is that the cost benefit of restructuring is not all that clear. I’m not a huge proponent of the default-extras solution (basically a +0 here), but I think this is a legitimate argument to make: it makes it much easier for packages to change their default and minimal dependencies.

Mostly OT, but I’ve seen less evidence that the packaging tools themselves struggle to handle monorepos gracefully and more evidence that a wide variety of tools have issues with this. Packaging is, I think, taking the blame for more general ecosystem problems. I’ve only used a couple of monorepos at work but the packaging parts seem totally fine.

It’s the ancillary tools where we see the most issues. pre-commit, mentioned earlier, can be unpleasant to configure with per-project settings. I’ve had issues getting isort to behave itself. Repackaging workflows get more complex.

Lack of support for relative path dependencies is a problem, but usually tool-specific options let us work around that issue.

1 Like