Thanks for the thoughtful response. We are in sync that we want to work toward a good solution for all parties.
I haven’t read all of the responses since yesterday but I will do my best to explain my earlier comment re: “nice to have” dependencies.
I’m going to use napari as an example as it is cited in the PEP.
napari users: Users are typically bioscience researchers (and sometimes geoscience) who need n-dimensional visualization (such as researching cancer cells and layers). Some are computational scientists who are familiar with command-line tools and programming. Another group is bench scientists, who typically use napari as an application and are not experienced with the command line or programming.
Ways napari is used: as a library, as an application, and as an application with plugins. Plugins are both napari-written and community-developed.
Tool expectations of users: Users will use pip, conda, uv, and pixi to install the library and application. We do have an application bundle as well but it is less well maintained (developer time limitations).
OS support and GPUs: Windows is frequently used by bench scientists. Mac and Linux are also used. GPUs are typically used by all.
Our explanation of installation is covered in our docs ("How to install napari" and "Choosing a different Qt backend"). We currently recommend pip install napari[all]. [all] is currently an extra that installs napari, pyqt5, and a limited set of optional dependencies. Installing with plain pip install napari would not provide the user with Qt.
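For readers less familiar with how that's wired up, roughly what it looks like in metadata (a simplified, illustrative sketch, not napari's actual pyproject.toml):

```toml
# Illustrative only; napari's real metadata lists many more packages and markers.
[project]
name = "napari"
dependencies = [
    "numpy",   # core, Qt-free dependencies live here
]

[project.optional-dependencies]
# "pip install napari[all]" pulls in a Qt backend plus a few optional extras;
# plain "pip install napari" does not provide Qt.
all = [
    "pyqt5",
]
```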
tl;dr:
Suffice it to say that there is a lot of complexity and many permutations across OS/GPU/Qt/Python versions. My hope is that we can continue to improve the experience for users and maintainers as well as packaging tool creators. Having standards is helping. I personally view this PEP (in its current or future form) as a way to help reduce complexity. (As an aside, I am hopeful that wheel-next and variants will help as well.)
Just to make sure there’s no confusion here, I don’t think this statement is true.
$ poetry add httpx
Creating virtualenv example-1k4QdtUV-py3.12 in /Users/zb/Library/Caches/pypoetry/virtualenvs
Using version ^0.28.1 for httpx
Updating dependencies
Resolving dependencies... (0.6s)
Package operations: 8 installs, 0 updates, 0 removals
...
- Installing httpx (0.28.1)
Writing lock file
And uv add will perform an installation too, after adding the package to the dependency list.
I’ll echo the concerns of others in this thread. While this seems nice, it’s very hard to define in practice and will cause a lot of problems. I don’t think a request for a package should result in different dependencies in different installation contexts.
On another note, I think the PEP would benefit from some examples requiring a package with default extras via a project.dependencies table in a pyproject.toml. A lot of packaging users don’t use pip install <package> and I think we should be documenting experiences with the standard project tables as a first-class part of any proposal.
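For example, something along these lines (hypothetical package names; the foo[] opt-out spelling is the one proposed and discussed here):

```toml
# Hypothetical consumer project; "somepkg" is assumed to declare a default extra.
[project]
name = "my-app"
version = "0.1.0"
dependencies = [
    "somepkg",      # resolves to somepkg plus its default extra
    "otherpkg[]",   # proposed opt-out: otherpkg without its default extra
]
```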
Hi Zanie, would you mind clarifying this a bit more?
All, would anyone be able to summarize with a bullet list what the primary concerns are? There's a lot of complexity and detail around all things packaging, especially for those of us who are users of packaging tools. As a user of packaging tools, I'm not as up to speed on the details as those who are working to build the tools. Instead, I'm coming from the perspective of a busy maintainer with a complex project to maintain. Like the folks who are creating the tools, I'm just trying to help our users, who range from command-line newbies to experts, succeed in installing our project.
which is the standardized, tool-agnostic way to declare dependencies. The current examples center the conversation on pip, which has an interface not defined by the standards. I think they're still valuable, but it's critical to cover the standardized path.
Still, I think of that install as a side effect.
(I’m probably very biased. The poetry applications I work on have workflows which never use that managed virtualenv).
From my perspective, the "add" operation is still:

1. Update project state in metadata files:
   - add to project.dependencies
   - update the lock
2. Synchronize that state to the managed venv (sketched below)
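Roughly, the state change looks like this (a sketch only; the exact specifier style and which table gets written vary by tool and version):

```toml
# After "poetry add httpx" or "uv add httpx": the requirement lands in the
# project metadata, the lock file is regenerated, and the managed venv is
# synchronized to match.
[project]
dependencies = [
    "httpx>=0.28.1",
]
```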
It might be moot anyway, since it seems that we're building consensus that specifying behavior only for user-requested or "direct" installs doesn't work. It appealed to me at first, but having toyed with the idea, I think it opens up thorny UX problems.
Thanks Zanie for the helpful response. For my understanding, what’s the best place to read up on the tool-agnostic standard? I tried searching the packaging docs and it wasn’t jumping out at me.
I think your suggestion to work through an example like you describe makes good sense.
I would also like to re-echo this concern: there's a discussion about something like pip install foo vs. something like uv add foo.
But the closer equivalent to uv add foo in the pip world is to add foo to your pyproject.toml and then run pip install ., and in the pip world, when you run pip install ., only your local package is a "direct requirement"; foo would not be.
This already causes confusion: for complicated requirements, the resolver behaves differently for pip install -r requirements.txt compared to pip install . where the pyproject.toml has the same dependencies (I'm hoping to minimize these differences in the future). Having default extras behave differently in these two scenarios seems ripe for user confusion.
*raises hand* I'll do my best to summarize. Although I might not be the perfect candidate to do it, at worst we'll have a Cunningham's Law moment.
This is my understanding of the issues / debate items, with editorial comments at the end:
1. This is similar to recommended packages in apt and dnf, and those mechanisms haven't worked well historically.[1] We may be about to repeat their mistakes without learning key lessons.
2. For package consumers who really want a minimal install tree, this makes things harder. In particular, the concern here is inclusion of default extras in second- and third-order dependencies, which can't be controlled by the package consumer. Possible bad outcomes include more disk and network usage, package version conflicts, and increased security risk exposure.
3. Package maintainers who add a default extra are changing the install behavior of their package. This can be a breaking change (e.g., if it holds back the version of a second-order dependency) and needs to be socialized properly as a package design decision with nonzero impact.
4. Most packages already fail to test scenarios which their metadata claims to support (e.g., testing your maximal Python version with your minimal supported dependency versions). This mechanism encourages packages to have two supported install modes, with and without the default extra, and we need to socialize the testing requirement this implies for maintainers.
5. This pattern encourages the use of extras in general. If extras have a bad design, this doubles down on that bad design.
6. Default extras can lead to inconsistency across distribution channels, as each redistributor (e.g., linux distro packages) may make a different decision about whether or not to include the default extra.
7. The current discussion does not sufficiently explore the alternative of publishing multiple distinct and related packages. Why are packages which provide applications + libraries not satisfied to publish a library package and a separate application package?
A couple of these are "this whole thing is a bad idea". We can acknowledge downsides and keep them in mind. We may be able to mitigate some of the harms foreseen from large/uncontrollable install trees, but I don't see all of these as criticisms which can be "addressed" per se, because some are fundamental. I'd like us to get more clarity on these so that we can at least understand why "recommended packages" don't work and see if the same logic applies to Python. @bwoodsend wrote some good content about this in the other thread which I want to reread.
Changes to install behavior and testing requirements are social issues. Your perspective on it will vary based on how optimistic/pessimistic you are about package maintainers doing a Good Job. At a minimum we need good guidance and documentation. There will be mistakes in how packages declare their dependencies. I do not find this argument compelling when it presupposes that it is categorically impossible to ever guide users towards proper use of the feature.
Partly, I find “this feature is a bad idea inherently” getting mixed with “proper usage is way more complex and expensive than everyone seems to think”. But if there is such a thing as proper usage and it’s wildly complicated, I’d like to see someone take a crack at defining that proper usage so that we can understand why it’s so hard to use well.
On splitting packages in particular, I think we can do more to understand/explain why this doesn’t work for all packages. We have enough smart and hard working people in the conversation that if this were really an equivalent alternative, we would be simply talking about ways of making that workflow better for maintainers. Many of us don’t want to split our packages – for my part, I think it makes the redistribution problem even more confusing/worse. Presumably there are other reasons for this? I can guess at some, but I emphatically reject “I don’t want the work of doing two releases” as sufficient – there’s something more here, since we are all generally willing to put in effort when it improves things for our users.
Here I’m just echoing what’s been said. I have never used “recommended” package features of my linux package managers, so I have no idea about this. ↩︎
A use case we have in Flask and Werkzeug is to decouple the development server from the framework. Flask depends on Werkzeug, and right now Werkzeug is both a framework and a convenient development server. The server must only be used during development; it is not secure, stable, or performant for production use.
Right now, all users, tutorials, etc. assume that if they install flask they have access to the dev server through flask run. I could split Werkzeug into werkzeug and werkzeug-dev-server, and have flask run or from werkzeug import run_simple show an error explaining that they need to install the dev server.
If we had a default extra that we could opt out of, I could say flask has a default dev extra which installs werkzeug-dev-server. You would install flask and get the same experience you get now. If you wanted to exclude the dev server, you would add flask[] to dependencies, and flask to dependency-groups.dev.
Without default extras, I could instead require users to write flask[dev] to opt into the dev server instead of opting out. But developing is a much more common activity than deploying to production, so it feels like getting the dev server should be the default.
Substitute werkzeug and werkzeug[] above as well, since it can be used as a standalone framework without flask. Then you also get an example of second-level extras: flask would depend on werkzeug[] and have a default extra with werkzeug, which would have a default extra of werkzeug-dev-server.
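To make that concrete, flask's side might look roughly like this (a hypothetical sketch; I've omitted whatever marker the PEP would use to declare dev as the default extra, since that spelling isn't defined here):

```toml
# Hypothetical flask/pyproject.toml sketch, not real metadata.
[project]
name = "flask"
dependencies = ["werkzeug[]"]   # core framework only, opting out of default extras

[project.optional-dependencies]
# intended to be flask's default extra: "werkzeug" (no brackets) would pull in
# werkzeug's own default extra, which in turn installs werkzeug-dev-server
dev = ["werkzeug"]
```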
I could have sworn I commented about this use case already, but I can’t find it in this or the previous topic. Sorry if this is repeating myself.
I'd just like to briefly reiterate for those who have not seen the post about Cargo features from a previous thread: the problems that arise from default extras are well known in that ecosystem as well, and there's a wealth of prior art in this space. I think the PEP should be clear about what we've learned from other ecosystems and how we're avoiding these pitfalls.
Thinking about this more, I'm not sure this is even covered in the PEP, or would be possible. If flask[] depends on werkzeug[] and flask's default extra lists werkzeug, does werkzeug[] (no extras) or werkzeug (default extras) "win"? If this behavior can't be defined consistently (especially with deeper nesting), or if it's an example of the mistakes we don't want to repeat, I'm fine with that answer too.
I don’t understand why that makes the runtime check any less effective though? My reading of what the difference would be is:
To the developer, the chain of "is x installed" checks just gets longer. This is dull to write, but it's only more code as opposed to more complicated code. It's also something that you'd have to do anyway even with this PEP.
To the non-programmer, I'd presume that they'll follow the instructions that tell them to use pip install napari[all] and don't have any prior experience or intuition to mislead them into thinking they don't need to read anything and can guess pip install napari. I'm sympathetic to the pains of delivering to this kind of user [1], but I can't think why it would even matter what this command is as long as it's portable (no shellisms, platform specifics, current working directory assumptions, activation scripts [2]).
To the user that has used pip before, I'd expect this to be all they need. If the guidance you need to put in the error message starts to look too long/convoluted, then you can always preface it with an "if in doubt, just install napari[all]".
What/who am I missing?
In my previous life, I was writing Python for dentists. The troubles I had there are mostly what got me into Python packaging. ↩︎
ignoring the problem that getting and using pip itself requires dealing with these things ↩︎
I think that's also the default for most people. I know I've frequently run pip install thing and then later realised I really meant pip install thing[extra].
I disagree with that statement, although I do agree with the conclusion you draw.
The current examples centre the conversation on installing a project, as opposed to installing its dependencies. This looks like it’s focusing on pip, but solely because pip is the only tool which implements a “pure install” (you could use uv pip install, but I guess you’d argue that’s “just pip compatibility” and not a native interface for uv).
The uv and poetry add commands don’t actually do a “pure install” of the requested project. Instead, they add the requested package to the current project’s dependencies, and then do a pure install (or sync) of the current project. That distinction is subtle, and one that I don’t expect users to be aware of (indeed, uv seems to be making an explicit effort to hide that distinction by not supporting “projectless” workflows outside of the uv pip interface). But it is a real distinction, and it’s what was behind the idea that we could distinguish direct installs.
It's clear at this point that having different behaviour for direct installs is not going to work, at least in part because tools like uv and poetry conflate direct installs and dependencies. So at this point the difference isn't that important. But I do think it's still important that the PEP gives examples of both types of usage and explores the implications of the differences between them.
(On a personal note, I’m somewhat frustrated that “pure installs” and “projectless” workflows appear to be getting sidelined by the current crop of workflow tools. But that’s a side issue for this discussion, so I don’t want to debate it here. I just want to be clear that projectless workflows are still extensively used, and the PEP needs to consider them - it’s just that it also needs to cover project-based workflows).
For 2, I think it's a bit more subtle than that. Unlike linux distros (and similar packaging systems), there are no transitions, no coordination group, no central overrides. This means that most likely the first time someone finds out that a dependency has switched to using default extras is when they get a bug report noting that they need to add [] to said dependency specifier. Likely what will happen is they realise that they need to add [] to all their dependencies (because otherwise they will get more bug reports), and then they will get on social media and complain about Python packaging. And if said maintainer is busy with other things (and hence there is no central group to fix said issue), the person with the original bug report is stuck with something that is unfixable on PyPI (local forks are how I would solve this, but that naturally does not scale, and doesn't fix the community effects), and then they will get on social media and complain about Python packaging.
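Concretely, the defensive reaction I'm describing looks something like this in a downstream pyproject.toml (hypothetical package names; [] is the proposed opt-out spelling):

```toml
# Hypothetical downstream project reacting to its dependencies adopting default
# extras: every specifier grows a "[]" to opt out, just in case.
[project]
name = "my-lib"
dependencies = [
    "somedep[]",
    "otherdep[]",
    "thirddep[]",
]
```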
I personally have no issue with there being additional metadata that can aid discovery (or with extras in general); it's the reinterpretation of what an existing dependency specifier means that is the concerning part. I'll note that uv already has a feature that says "solve for dependencies as if it was this date"; this shouldn't be required, and its existence says we currently have an issue where existing metadata cannot be fixed. PEP 771 will only make this worse.
I think having this changed to show the default extras, if a package has any, would be a good solution, though I guess that's a question for the warehouse devs as to how easy that is.