PEP 771: Default Extras for Python Software Packages (Round 2)

I’m not sure I follow why you expect pip install foo to act differently than pip install foo[something] here?

The name foo means “foo and its default extras”, just like foo[something] means “foo and the something extra”? As far as I can tell, foo never acts differently no matter where you use it as PEP 771 is currently written: it always acts as if you entered foo[any, default, extras], and foo[] is what you type if you don’t want to pull in the default extras.
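As a rough illustration of the rule described above, the sketch below models how a requirement string selects extras. The helper name effective_extras, the regular expression, and the extra name "recommended" are assumptions made for this example; it is not pip's parser and not text from the PEP.

```python
import re

# Hypothetical, simplified requirement pattern: a name plus optional [extras].
# Version specifiers and environment markers are deliberately not handled.
_REQUIREMENT = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(\[([^\]]*)\])?\s*$")


def effective_extras(requirement, default_extras):
    """Return (name, extras) selected by a requirement string.

    Models the behaviour described in the post above:
    - "foo"            -> the package's default extras
    - "foo[]"          -> no extras at all
    - "foo[something]" -> exactly the listed extras
    """
    match = _REQUIREMENT.match(requirement)
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    name, brackets, inner = match.group(1), match.group(2), match.group(3)
    if brackets is None:
        # No brackets were written, so the defaults apply.
        return name, set(default_extras)
    # Brackets were written (possibly empty), so only the listed extras apply.
    return name, {extra.strip() for extra in inner.split(",") if extra.strip()}


defaults = {"recommended"}  # assumed default extra for a made-up package "foo"
for spec in ("foo", "foo[]", "foo[something]"):
    print(spec, "->", effective_extras(spec, defaults))
```

Under this model the bare name is just shorthand for listing the defaults explicitly, which is why foo behaves the same wherever it appears.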


This whole discussion has only convinced me more that extras are a confusing and unnecessary dimension of packaging, and default extras would only exacerbate that.

An assumption I see lurking in the background of most perspectives here is that the “normal” way to install somepackage is (or should be) pip install somepackage [1]. But I think instead we should assert that the “normal” way to install somepackage is to look at the documentation for that package and do what it tells you to do to install it.

Maybe that will usually be pip install somepackage, but maybe if the package author thinks some other things are good to have then they can tell people “you probably want to do pip install somepackage nicetohave otherstuff”, or they can tell people “if you are in situation X then pip install these things, if in situation Y then pip install these other things”. And if they want to provide multiple convenient predefined combinations of stuff then they can create packages for those things and tell the users to install those packages.

All this can be done today and does not require anyone to use extras, let alone default extras. If someone does pip install blah and it somehow doesn’t work, their first recourse should be to the package docs where they will see “you should do pip install blah stuff whatever” and then they can install those things. That just seems way less confusing to me than all the thorny questions being raised here about the interactions of default extras with dependencies and so forth.

It just seems to me that we are doing a lot of bending over backwards just to “save” people from the horror of looking at a basic package readme, and/or typing more than one package name in their pip install command.


  1. or maybe uv pip install somepackage, etc., which only adds more dimensions of potential divergence in behavior


Can people please just read the thread (preferably including the parent one). Every question that’s been asked since this little conversational revival has already been answered in detail with real world examples. Every point that’s been made has been seen and heard already. This conversation has been going round and round for the last year and a half.

Ultimately this decision all comes down to a series of tradeoffs. Maybe the majority of people genuinely don’t care if dependency trees become murky with bits that shouldn’t be there or are hard to tell why they’re there. Maybe the handful of legitimate usages outweigh the attractive bad ones. Maybe it’s worth making repackagers unreliable and/or inefficient so that package authors can do whatever they feel they need to do. Maybe it really is too much to ask for packages to explain what their extras are for and for users to read those instructions.

But we can’t have that conversation if we’re stuck in this endless loop of “let’s do it! It has no drawbacks at all (apart from maybe bikeshedding the syntax)!” “No there are actually drawbacks (the misconception that there are no drawbacks is one of the main drawbacks)” “WDYM? Please spend ages spelling them out all over again.”


Thanks everyone for the renewed discussion! I’ve been mostly unavailable for the last few months, but plan to become more actively involved again. Before responding to specific points (which I’ll do in follow-up replies), I wanted to share some thoughts on how I think we should proceed in general.

The idea in PEP 771 has now been discussed over nearly six years — starting with Jonathan’s original thread in August 2020 (which itself referenced a setuptools issue from 2017), through three rounds of PEP discussion, and involving many people across the packaging community. I think there is by now a reasonable understanding of the trade-offs involved, even if not everyone agrees on how to weigh them.

Over the course of this discussion, different approaches have been suggested — most significantly, having recommended extras be opt-in rather than opt-out. However, I think it’s important that this PEP continues to represent the opt-out approach that it was designed around. This is what a significant number of people in this and previous discussions have expressed a desire for, and it is what the PEP was written to propose.

If the opt-out approach is ultimately rejected, that would at least give a clear signal to the community, and would leave the door open for a separate PEP proposing an opt-in or metadata-only approach. In other words, I think it would be a mistake to water down this PEP into something substantially different just to avoid a rejection — better to have a clear decision on the idea as proposed.

That said, there are of course improvements we can and should make to the PEP based on the discussion here, and this is what I plan to focus on going forward.


Eh. I disagree here?

If pip install somepackage (or uv, or whatever) isn’t the “normal” way to install a package and you’re expected to “just go look at the documentation and do what it tells you to do”, then you can easily say things like:

  • Listing the required dependencies for a project is superfluous and causes complications, after all you can just document pip install somepackage andanotherpackage andathirdpackage.
  • Build backends and the PEP 517 interface are superfluous and cause complications, after all you can just document the correct commands to run to install and invoke the correct tool to produce a wheel file that you pass into pip.
  • Wheel files, and pip itself, are superfluous even, after all you could just document the command that tells the package to install itself.

What you’re describing as a hypothetical “this is what the normal way to install a thing should be” is what the normal way to install something was 20 years ago. The last 20 years of packaging evolution within not just Python, but the entire industry, has basically been people going “you know, maybe it isn’t actually a good thing if every project needs to document exactly how to install itself, especially when for most of them we can provide structured metadata that provides a consistent experience rather than each project being a special snowflake”.

Obviously there are some projects that are just too complicated or complex to fit within those parameters, and sometimes some dimensions introduce far more complexity than they’re worth, and those projects have to be outliers. I don’t think that means we should have some sort of blanket idea that every project should have to document “how do I install this project” and think that’s going to be reasonable (or that we’re even going to be successful in convincing users when the whole rest of the industry has been moving the other way).

FWIW I’d contend that it’s incorrect to say “become”, given that a number of widely used projects are already listing “nice to have but ultimately not really required” dependencies as required dependencies.

Pretending that dependency trees are pristine and perfect today is ignoring the ground truth of the ecosystem, they’re already murky, and in some cases that murkiness is a direct result of not having a feature like this. For those projects, the concerns around repackagers are improved by this metadata, not hurt.

Yes, there is a trade off, and my stance has been that we should be enabling the project authors, who have far more context available to them on what makes sense for their users, to make the right choice for their users.

As I see it, the trade offs are basically:

If we keep the status quo, the set of dependencies installed by default will continue to be limited to the set of dependencies the package author declares are “required”. Packages with optional, but highly recommended or “default backend” dependencies will have to make a choice between:

  • Specifying those highly recommended dependencies as optional dependencies, and requiring users to seek them out and ask for them to be installed to get the “recommended” out of the box experience.
  • Specifying those highly recommended dependencies as required dependencies, and forcing people who don’t want those dependencies to manually determine which are truly required and which are actually optional and fork the package (or “break” their install to remove them even though the metadata says they’re required).
  • Implementing the “default extras” feature themselves, by making a {}-core package which has the true required dependencies specified, and a {} package which depends on that and the recommended/default dependencies.

We know that all 3 of those cases are happening today, so none of them are hypothetical.
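To make the three status quo options concrete, here is a minimal sketch that models each as a toy package index and computes what a plain install pulls in. Every name in it (foo, foo-core, corelib, fastbackend, plain_install) is made up for illustration, and the resolver ignores versions entirely.

```python
# Hypothetical project "foo": "corelib" is truly required, while "fastbackend"
# is optional but highly recommended. None of these are real distributions.

OPTION_1 = {  # recommended dependency declared as an ordinary extra
    "foo": {"requires": {"corelib"}, "extras": {"recommended": {"fastbackend"}}},
}
OPTION_2 = {  # recommended dependency over-specified as required
    "foo": {"requires": {"corelib", "fastbackend"}, "extras": {}},
}
OPTION_3 = {  # "core package" pattern: foo is a meta package on top of foo-core
    "foo-core": {"requires": {"corelib"}, "extras": {}},
    "foo": {"requires": {"foo-core", "fastbackend"}, "extras": {}},
}


def plain_install(index, name):
    """Return the distributions pulled in by installing `name` with no extras."""
    installed = set()
    pending = [name]
    while pending:
        current = pending.pop()
        if current in installed:
            continue
        installed.add(current)
        # Distributions missing from the toy index are treated as leaves.
        pending.extend(index.get(current, {"requires": set()})["requires"])
    return installed


print(sorted(plain_install(OPTION_1, "foo")))       # minimal; extra must be requested
print(sorted(plain_install(OPTION_2, "foo")))       # everything, with no way to opt out
print(sorted(plain_install(OPTION_3, "foo")))       # everything, via the meta package
print(sorted(plain_install(OPTION_3, "foo-core")))  # minimal, under a different name
```

Only the third model lets both a minimal and a recommended install be expressed purely through metadata, at the price of a second distribution.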

The first case is the best case for the repackagers and for the understandability of dependency trees. Unfortunately it’s also the worst experience for users out of the box, so projects that want to focus on their out of the box experience are incentivized to pick one of the other options.

The second option is the worst case for repackagers and for the understandability of dependency trees, but it provides that “out of the box” experience that those projects want. Unfortunately it has the lowest friction for package authors, so unless they feel strongly about keeping their “required” dependencies list to be what is actually required, this is going to be an attractive option.

The third option is a middle ground between them: it’s easy for repackagers to determine what is truly required ({}-core vs {}), but dependent projects have to make their own decisions about whether they depend on {}-core and keep their dependency tree minimal (and if they do, whether they surface their own core vs default package) or whether they depend on {} and pull in the recommended dependencies. Unfortunately this option has the highest friction for package authors: they’re suddenly forced to rename their “real” package from foo to foo-core and publish two packages instead of one (which then adds its own questions: should foo depend on foo-core with ==, with >=,<, or with no constraint at all? Should the foo-core and foo readmes be the same? Should one be empty? Should both be maintained?).

As far as I can reason, default extras as specified by PEP 771 are basically equivalent in spirit to the third option above, except it’s foo and foo[] instead of foo and foo-core. Where it differs from that, I see primarily positives:

  • It does not introduce much, if any, additional friction for package authors, so they’re far more likely to actually use it instead of the second option above where they over-specify their dependencies.
  • It still requires dependent projects to make their own decisions about whether they depend on foo (with default extras) or foo[] (without default extras), and similarly I expect the “I just didn’t think about it” option to be the “with default optional dependencies” choice, but it doesn’t make that any worse, it just changes whether the minimal dependency is spelled foo-core or foo[].
  • It does not introduce the weirdness around how foo should depend on foo-core, surfacing readmes and documentation, etc.
  • It wastes fewer shared resources on PyPI and on users’ machines, as there are fewer of these empty “meta packages” lying around (though these meta packages are likely to be pretty small).
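The point about dependent projects can be sketched with a toy resolver. This is only a model of the behaviour discussed in this thread: the default_extras key is a made-up stand-in for whatever metadata the PEP defines, and foo, corelib, fastbackend, bar-full, and bar-minimal are hypothetical distributions.

```python
# Toy index modelling one package with a default extra and two dependents.
INDEX = {
    "foo": {
        "requires": ["corelib"],
        "extras": {"recommended": ["fastbackend"]},
        "default_extras": ["recommended"],  # assumed key, for illustration only
    },
    "bar-full": {"requires": ["foo"]},       # dependent that did not think about it
    "bar-minimal": {"requires": ["foo[]"]},  # dependent that opts out of the defaults
}


def split(requirement):
    """Split 'name[a,b]' into (name, extras); extras is None without brackets."""
    name, bracket, rest = requirement.partition("[")
    if not bracket:
        return name, None
    return name, [e.strip() for e in rest.rstrip("]").split(",") if e.strip()]


def resolve(requirement, index=INDEX):
    """Return the set of distribution names installed for one requirement."""
    installed, seen, pending = set(), set(), [requirement]
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        name, extras = split(current)
        meta = index.get(name, {})
        if extras is None:
            # Bare name: fall back to the package's default extras, if any.
            extras = meta.get("default_extras", [])
        installed.add(name)
        pending.extend(meta.get("requires", []))
        for extra in extras:
            pending.extend(meta.get("extras", {}).get(extra, []))
    return installed


print(sorted(resolve("bar-full")))     # pulls in fastbackend through foo's defaults
print(sorted(resolve("bar-minimal")))  # stays minimal because it asked for foo[]
```

In this model the choice a dependent makes between foo and foo[] plays the same role as the choice between foo and foo-core, without a second distribution on the index.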

As far as I can tell from the various discussions, the only real negatives PEP 771 has relative to the “core package” pattern are:

  • It has less friction, therefore package authors are more likely to use it, so the negatives of the “core package” pattern that still remain are more likely to be encountered.
  • Stuff that’s inherent to new features: it’ll require ecosystem effort to make sure it’s well supported (for instance, making sure that the requested extras are surfaced in tool output), it’ll require teaching people about it, and it adds another concept to know.

Of those, I think the first is the only real fundamental question about the conceptual trade offs of PEP 771 (i.e. not questions around syntax or specific details, but the idea in general), and when we compare that to the status quo:

Do we think it’s better to enable the projects that are currently choosing the worst case option for the concerns folks like you have raised to switch to a middle ground option that still gives them most of what they want without much additional friction, at the risk that maybe some projects who are currently choosing the first option are being held back from choosing the third option only due to the increased friction?

For me personally, I think the cohort of projects that are currently choosing option 1 and are being held back from option 3 only by the amount of friction it entails is pretty small, and I believe the cohort of projects who are choosing option 2, but would choose PEP 771 if it were available, is much larger [1].

I think the group of people who are willing to blindly add things to a list of default extras without understanding the ramifications but who also care enough about keeping their “required dependencies” list honest and/or minimal to not just blindly add stuff to that list is a pretty small group.

My rationale for that is that if you don’t care about or understand the ramifications of “these extras will be installed by default”, which is one piece of dependency metadata, then it’s hard to imagine that you’re going to treat another piece of dependency metadata much differently.

As far as I can tell though, those are the actual trade offs, and just as it’s silly for proponents of PEP 771 to pretend that there are absolutely no downsides to PEP 771 [2], it’s also silly for the detractors to pretend that all packages are currently choosing option 1 and thus our dependency trees are clean and pure, or that option 3 doesn’t have almost all of the same downsides they’re worried about for PEP 771 and then some.


  1. And TBH, I expect this includes a number of projects that have genuinely required dependencies today, but where a dependency is only used for one small “side” feature, and who could be persuaded to make that dependency optional for people who aren’t using that small side feature.

  2. Literally every new feature has a downside! Even if that downside is just “it’s one more thing to document/teach/understand/etc”.


On a strict reading of what I said, in some sense this is true. However, it is mostly only true for this case:

Perhaps I am more inured to that as I’m used to conda-installing as many deps as I can before pip-installing something that isn’t available on conda-forge. 🙂 But, in practical terms, there is a difference between a large, nested set of dependencies and a small, flat set of “extras” (i.e., between pip install thingone thingtwo and pip install thingone thingtwo thingthree ... thingfifty).

As I said, on a pedantic reading of what I said, these would be included. However, as the examples I gave suggested, what I’m mostly focusing on is simply specifying additional packages on the install command line. That is not comparable to manually re-doing the whole build process.

I also want to emphasize that the process I described is not hypothetical. It is already the situation. As you said:

But it’s not just complicated and complex. Some packages do recommend, for instance, that you install them with pipx, or using a requirements.txt, or via conda, or what have you. And that is my point. There will never be a world in which people can install absolutely any package they ever want without ever having to look at packaging documentation, because there will always be cases for which the simplest install approach doesn’t work. And from my perspective everything associated with extras falls into the category of “it’s fine if people have to look at the docs in this case”.[1]

If we want to say the default approach is “try pip install thepackage, and then if it somehow doesn’t work, check the docs for thepackage”, I’m fine with that modification to what I said in my previous post. My point is just that we should not consider “oh no I had to look at the docs to see what to do because the first thing I tried didn’t work perfectly” as some kind of onerous imposition on people trying to install packages.

I agree that in the end this is a matter of weighing the tradeoffs differently. Just to clarify my own weighting, though: essentially my position is that the downside you list in your footnote is already larger than the benefit of having this feature. However, this is not because I think our dependency trees are clean and pure without default extras; I prefer a clean dependency tree all in all, but I’m not fixated on that.

Rather, in my view, if anything is unclean, it is extras. Regardless of the dependency situation, having some things be installed as packages and some as extras is an unnecessary complication. I realize this may make some people dismiss my position out of hand, but a decent part of my opposition to this proposal is that I do not think we should do anything to further legitimize or encourage the use of extras in any way. In other words, the use of extras as a way to handle this problem (or any problem) is itself a downside.

On the other side of the tradeoff weighing:

I think a lot of our disagreement comes down to how we weigh this downside. I don’t see it as that big of a deal, especially if we also consider the option of foo and foo-full or the like, where the unadorned package name is the plain one and there is a separate one that means “gimme the works”. I have a few reasons for this:

  1. We could easily add stuff to the packaging docs telling people about this and explicitly recommending some such convention, forestalling some of the disruption caused by migrating the package name.
  2. Other improvements in packaging tooling could significantly ameliorate this. It could become really easy to specify certain stuff in pyproject.toml and have tools automatically build and publish a set of packages instead of just one, autogenerate readmes that refer to the main package, etc. I would see this as preferable to any extras-based solution.
  3. Conceptually it just makes more sense to me to follow the principle of “if you want more, ask for more”, rather than something like this proposal[2] in which you must ask for less.

I won’t go through all the rest of your pros and cons, because overall I think your framing is good, I just disagree about the weighting of those factors (in some cases drastically :-).


  1. Not least because people already have to look at the docs to even know that a package has extras! Yes, it can be argued that default extras are different here because they’re supposed to be, well, a default, but a user can’t even make an informed choice about whether they want that default without looking to see that the package is using a default extra.

  2. or even the foo-core/foo alternative


And has the ability to shunt the code that’s unusable without the recommended dependencies out of {}-core and into {}. It’s the only option that actually satisfies the goal of not making someone pay for what they don’t need. Dead weight in a package is no less dead or weighty than dead weight in its dependency tree. Both extras and this proposal only address the latter whilst to some extent encouraging patterns that worsen the former.

If this proposal allowed shedding parts of the package itself as well as dependencies, analogous to Linux distros’ -dev/-doc/-debug [1][2], then it would be more appealing – I’d still be wary, but at least it would actually be solving its intended problem.


  1. in fact giving numpy/scipy/… a way to drop debug symbols and test suites and examples and optional accelerator extension modules and still have them installable with synced versions would likely do way more than any focus on dependencies

  2. possibly doable now without any PyPA support – just a tool that takes a wheel, a mapping of subpackages to filename patterns, and how they depend on each other, then turns that into several wheels
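The partitioning step of such a tool might look roughly like the sketch below. It only assigns file names to sub-packages using glob patterns; actually writing the resulting wheels (including their METADATA and RECORD files) and wiring up the dependencies between them is left out. The function name partition_files, the part names, and the example paths are all hypothetical.

```python
from fnmatch import fnmatch


def partition_files(file_names, patterns_by_part, default_part="core"):
    """Assign each file to the first sub-package whose pattern matches.

    Files that match no pattern stay in `default_part`.
    """
    parts = {default_part: []}
    for part in patterns_by_part:
        parts.setdefault(part, [])
    for file_name in file_names:
        for part, patterns in patterns_by_part.items():
            if any(fnmatch(file_name, pattern) for pattern in patterns):
                parts[part].append(file_name)
                break
        else:
            # No pattern matched, so the file belongs to the core sub-package.
            parts[default_part].append(file_name)
    return parts


# Made-up contents of a wheel and a made-up mapping of sub-packages to patterns.
wheel_files = [
    "pkg/__init__.py",
    "pkg/_speedups.so",
    "pkg/tests/test_api.py",
    "pkg/examples/demo.py",
]
mapping = {"tests": ["*/tests/*"], "examples": ["*/examples/*"]}

for part, files in partition_files(wheel_files, mapping).items():
    print(part, files)
```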


Maybe? I’m not sure that I’ve ever seen someone do the “core package” pattern and actually shunt that code out of the “core” package and into the meta package, but it’s definitely possible. Though doing that has increased friction over an empty meta package as now your testing has gotten significantly more complicated (something folks have complained that default extras does, but not nearly as bad as this shunting would):

  • You can no longer test just the core package with various sets of dependencies installed, but now you need to test your meta package and the core package both.
  • The versioning story between your core package and the meta package is possibly more critical to get correct now. With it all in one package, your extra features could easily rely on internal details of the rest of your application, as it was all part of a single version. Now you either have to have at least a quasi-public API for that functionality, or you need the meta package to pin with == to the core package version it supports (and then force dependents to update the meta and core package in lock step).

Of course, PEP 771 does not preclude shunting that code, and I find it hard to believe that there is a large population of people who are willing to pay the cost of the core package plus the cost of shunting that code like you’ve suggested, who are then going to suddenly decide that it’s not worth it because of default extras.

Folks who are paying that cost are almost certainly doing so because they feel motivated by the benefits of breaking those things apart, and are not likely to lose that motivation because there’s another option that doesn’t give them those particular benefits.

But we’re still left with the people who are choosing to over-specify their dependencies, and whether or not the core package concept allows them to shunt their code, that doesn’t seem to be a compelling enough benefit for them to pay that cost, which means that the goal of not paying for something you don’t need is even further away for their users.


Howdy 👋,

Greetings from the Django community. We just received a related proposal that I wanted to link here.
Django, being one of the bigger and more complex packages, might be a good candidate to work out some quirks.

Here’s the related proposal: Add project metadata for all optional dependencies · Issue #157 · django/new-features · GitHub

Cheerio!

Joe

I’m pretty certain you’re misunderstanding what this PEP does. django[db] can only point to a fixed set of database dependencies with or without PEP 771. You could have django[postgresql] but you can already do that. With PEP 771 you could then also make django equivalent to django[whateverthedefaultis] but that default is sqlite so it would be pointless.
