PEP 735: Dependency Groups in pyproject.toml

Maybe I’m still missing something about it too, but I don’t understand why someone would ever use only-deps inside a pyproject.toml file at all. It seems like a feature for pip, not for a PEP.

I’ll just add that my earlier proposal was a strawman and I wouldn’t worry too much about it–I was just trying to work out how external tools might declare editable installs, without it being spelled out here. But after writing it out, it felt too messy and not the right solution.

(emphasis mine)

FWIW: my primary use case for pip install -e is very different. I simply want something that can make sure that my test harness can actually find the code it’s supposed to test, even if I haven’t committed it yet - without having to build and install a new wheel for each test candidate (never mind setting up an entire new environment; Pytest’s own test isolation is plenty) and without having to learn about some precommit toolchain or whatever (for that matter, just because the tests pass on my working copy doesn’t mean I’m ready to commit it!). My test code needs an absolute import, so that I can have the tests as a separate package (instead of a subpackage, which would in turn complicate my distribution process); I find the pip install -e . approach far cleaner than anything involving PYTHONPATH or (heaven forbid) an in-code sys.path hack.

@sirosen Sorry for the overload of information. Sorry I do not have time to write what I have in mind in a more digestible way.

For me what is important is to be able to opt out of editable dependencies, path dependencies, VCS dependencies, URL dependencies, and so on, if they end up being added to a dependency group notation standard. Why? Because of abstract vs. concrete dependencies. What needs to be installed should be kept separate from how it should be installed.

I am not sure why the “editable” dependency case gets much more attention than the others, but I don’t mind using it as the main example for the reasoning as well. Keep in mind, though, that I mean this for all “concrete” parts of the dependency notation.

So for example not everyone wants to work with editable installations. I use editable installations all the time but I do not want to impose it on anyone else. As we have seen editable installations have their flaws. So why “force” editable installations on all the contributors of the project?

And then, in my mind, if the concrete part of the dependency specification can be opted out of and overridden, then it is easier/clearer to reason with if this concrete part is stored separately. Otherwise people will struggle figuring out what is concrete and what is abstract. What can I opt out of?

If we end up standardizing dependency group notation that includes the concrete part, for example some-group = [{path = "../my-dependency", editable = true}], it is not a deal breaker as long as tools make it easy to opt out of and override with a different path.
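As a rough illustration of what “opt out of and override” could mean for a tool, here is a minimal Python sketch. It assumes the table form {path = ..., editable = ...} from the example above; the helper name override_paths and its parameters are hypothetical and do not describe any existing tool.

```python
def override_paths(group, path_overrides=None, allow_editable=True):
    """Return a copy of a dependency group with its concrete parts adjusted.

    group: list of requirement strings and/or tables (dicts) such as
           {"path": "../my-dependency", "editable": True}.
    path_overrides: hypothetical mapping from a declared path to the path
           this contributor wants to use instead.
    allow_editable: when False, the contributor opts out of editable installs.
    """
    path_overrides = path_overrides or {}
    result = []
    for entry in group:
        if isinstance(entry, dict) and "path" in entry:
            entry = dict(entry)  # copy so the caller's data is not mutated
            entry["path"] = path_overrides.get(entry["path"], entry["path"])
            if not allow_editable:
                entry.pop("editable", None)
        result.append(entry)
    return result


some_group = ["requests", {"path": "../my-dependency", "editable": True}]
print(override_paths(
    some_group,
    {"../my-dependency": "/opt/src/my-dependency"},
    allow_editable=False,
))
```

Abstract entries (plain requirement strings) pass through untouched; only the concrete parts are rewritten, which is the separation being argued for here.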

I am a bit worried that the “opt out and override” idea will not happen if it is not taken into account right now in the standardization of dependency groups by clearly separating abstract and concrete. I am also not that worried, because I understand that “override” can be handled separately (by each tool as they see fit or by a different standard), it is kind of an orthogonal thing.

I assume Poetry might not be completely opposed to this kind of idea (separating abstract from concrete), since they kind of have to do it anyway if they want to add support for PEP 621 one day: Use the [project] section in pyproject.toml according to PEP-621 · Issue #3332 · python-poetry/poetry · GitHub. Indeed they currently have mixed abstract and concrete in their packaging metadata dependency notation and now they are somewhat stuck because of this.

It’s OK, I couldn’t remember myself :smile: I looked up some details and it made me realise there’s a much more fundamental issue here, which also ties into this comment:

The critical point here is that all of these extended dependency types are only defined by implementation-specific behaviours in pip. There is no standard semantics for what any of these things mean.

For example, pip uses the name of a URL dependency to determine whether the dependency is already installed in the environment. Whether pip reinstalls a dependency that’s already installed is non-trivial, implementation defined logic. Also, the URL format for URL dependencies is implementation-defined. The question of whether editable installs replace a regular install, or whether editable installs get reinstalled if they are already present, is implementation-defined. There are probably plenty more aspects that are only defined in the pip source code.

Furthermore, not all tools that consume dependency group data will be installers. How would a tool that, when given the name of a dependency group, listed the names and versions of all packages to be installed by that group, get that information? Would it have to know how to get name and version from an arbitrary source path (meaning it needs to be able to invoke a build backend)?
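To make the problem concrete, here is a small Python sketch of such a non-installer tool. It only handles names (versions would need a resolver), the list_names helper is hypothetical, and the regular expression is a simplified reading of the name rules in PEP 508. Plain requirement strings can be read statically; a path table cannot.

```python
import re

# Simplified PEP 508 project name: letters, digits, '.', '_' and '-',
# starting and ending with a letter or digit.
_NAME = re.compile(r"^\s*([A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?)")


def list_names(group):
    """Split a group into statically readable names and everything else."""
    names, unresolved = [], []
    for entry in group:
        if isinstance(entry, str):
            match = _NAME.match(entry)
            if match:
                names.append(match.group(1))
            else:
                unresolved.append(entry)
        else:
            # A path table carries no name; finding one would mean reading
            # metadata from the path, possibly by invoking a build backend.
            unresolved.append(entry)
    return names, unresolved


names, unresolved = list_names(["pytest<8", "coverage", {"path": ".", "editable": True}])
print(names)
print(unresolved)
```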

Maybe it’s OK for all these semantics to remain as implementation decisions, and the PEP to punt on them. But if so, how can users rely on them? One of the key points of standardising behaviours is to allow multiple implementations. If dependency groups are defined in such a way that users will write them so that they can only be used with pip as the installer, we’ve failed in that goal, and we’ve further entrenched the idea of pip as the only possible package installer for Python[1].


  1. I don’t know if we’ll ever be able to interoperate with conda at this sort of level, for example, but I’d prefer not to close the door on the possibility. ↩︎

As it happens: python - Installing local dependency with pip install - Stack Overflow :D

As a separate process-y note: I’d appreciate if this PEP could actually be published on peps.python.org so that there’s a document I can read in one sitting and comment on.

While I appreciate that many folks engaging here have a lot of energy to put towards this topic (orders of magnitude more than me), I don’t think it’s reasonable to expect that everyone else will also be able to engage with the same amount of time – and one of the ways we accommodate for that in our standards process is by having a condensed proposal + discussion takeaways that people can read through (ala the PEP).

It’s still a draft PR, and it seems like the PEP has been continuously updated as this discussion has gone on. The merged PEP does not need to be perfect (the PEP template has an open issues section!) but it is useful for engaging in the discussion.

I’ve been holding off on engaging in the discussion around this proposal, and I realised this morning that both this thread and the PR have over 100 comments each (~240, although I’m sure that I’m mixing editorial and non-editorial comments in that count, which was another concern I had). That’s… a lot.

While this is a topic I’m interested in, I am realistically not catching up on this discussion at this point; hence this note that having the PEP up on peps.python.org would be appreciated.

And as the sponsor I’m fine with seeing the current draft merged and updated as appropriate (it just requires updating the Post-History as major revisions are made).

It was, for which I apologize. I thought moving it quickly and capitalizing on the energy that was being poured in was the right thing to do, but it delayed getting the doc into a mergeable state and made things confusing and hard to follow. My misunderstanding of the process – merged draft, then discussion – contributed to this.

It’s not an ideal start, but I’m hoping that now that it’s merged we can get this back on track with minimal fuss.

I agree. And I don’t see much point in just specifying some config-file syntax that says (for instance) “this means a path dependency” if we can’t standardize what “path dependency” means. Nor do I think it’s wise to say “well that’ll just be up to individual tools”, because then different tools will do it differently and the syntax in pyproject.toml won’t have any clear meaning and will be another source of confusion.

To my eye, this discussion seems to be demonstrating how trying to standardize something like a particular format for pyproject.toml metadata winds up, in its ramifications, connecting to many thorny conceptual questions (like “what does it mean, in a tool-independent sense, to specify an editable dependency/path dependency/etc.”). On the one hand, I think this means we shouldn’t rush to try to standardize something until we’re fairly sure it’s not just going to lead to more confusion (because of tools interpreting the metadata differently). On the other hand, I’m not sure if there’s any way to do it without increasing confusion, because as long as there are multiple competing tools that do things differently we’re going to have this problem of how to define some notion in a way that will be meaningfully consistent across tools.

It looks like the way the discussion is evolving is people are coming up with various edge cases that some users might want to have handled in some particular context (e.g., “I want my dev environment to install this package in editable mode”). I don’t think it’s necessary for this PEP to handle all such edge cases. It would be okay if the PEP just provided a way to define groups of dependencies that must be installed. And all these wrinkles could be handled individually by tools, because that’s probably what will wind up happening anyway.

As long as the dependency groups are readable and usable by tools, it doesn’t seem like it would be that insane for individual projects to just specify particular incantations that have to be done to set up particular environments for dev/test/etc. So like if you’re in the case where you want to install the dependencies needed by three packages in a monorepo, but you want those three packages themselves to be installed in editable mode, then you just have to do something like:

mytool --only pyproject.toml:dev -e package1 -e package2 -e package3

Where :dev means “the group labeled as dev in the TOML”, --only means “only install the packages listed in that group, and not anything else” (like for instance not the package1/2/3 themselves, whose metadata is actually defined in that TOML), and then you use -e to “manually” specify that you want to install the three packages individually. Yes, this means people working on this particular project can’t get everything they want just from a minimal command like mytool pyproject.toml:dev, but so what? For all of these dev/test/etc. use cases that are targeted at audiences who want to do something other than just install the package and use it, there’s going to have to be a note in the readme anyway saying “here’s how to set up for dev/test/etc.”.

In other words I feel like dependency groups will be more useful if they just provide a way for users to use tools to mix and match individual groups with “other stuff” like editable installs. As long as a tool has some mechanism like --only to say “just give me exactly this one part of the metadata”, the user can always back off to whatever relevant set of dependencies is conveniently retrievable from the TOML, and then incrementally add on anything else that doesn’t fit into that.
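A minimal Python sketch of how the hypothetical mytool command line above might be interpreted. Everything here is an assumption for illustration: mytool does not exist, the FILE:GROUP spelling is taken from the example command, and the groups argument stands in for data already read from the TOML file.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="mytool")
    # --only FILE:GROUP selects exactly one dependency group and nothing else.
    parser.add_argument("--only", metavar="FILE:GROUP", required=True)
    # -e may be repeated; each use adds one editable install.
    parser.add_argument("-e", dest="editables", action="append", default=[])
    return parser


def plan(argv, groups):
    """Return (requirements, editables) for a command line.

    groups: mapping of group name to its list of requirements, standing in
            for the dependency groups already parsed from the named file.
    """
    args = build_parser().parse_args(argv)
    _file, _, group_name = args.only.rpartition(":")
    return list(groups[group_name]), args.editables


requirements, editables = plan(
    ["--only", "pyproject.toml:dev", "-e", "package1", "-e", "package2", "-e", "package3"],
    {"dev": ["pytest", "ruff"]},
)
print(requirements, editables)
```

The group supplies the abstract requirements and the -e flags supply the “other stuff”, so nothing about editable installs has to live in the metadata.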

Of course, this really means that what matters is not really the metadata but what tools let users do with it, which is what we keep coming up against in these discussions. As a result, although I overall like the direction this PEP is trying to go in, I’m not super optimistic that it’s going to reduce confusion in the packaging world. Whatever metadata this PEP defines, if there are multiple competing tools that choose to do divergent things with it (or define their own alternatives because this doesn’t meet their needs), then confusion will increase; if they all do the same with it, or if one tool becomes dominant, then confusion will decrease.

Would it make sense to reboot the discussion thread? There have been some comments about how long and difficult-to-follow this thread is (apologies for my part of that). Some of the discussion is no longer relevant so it might be nice to start fresh, and let people punt on this thread.

As PEP-delegate, I’ll point out that spreading discussion over multiple threads is more difficult for me. Just because a PEP revision says “this covers all the discussion in the previous thread” doesn’t mean I can accept that - I still need to review it for possibly missed concerns. And Discourse isn’t very good when it comes to searching multiple threads.

So one monster thread is easier for me - but it’s harder for all the other participants, so I’m OK with a reboot thread if that’s what people prefer. As long as people are careful not to simply repeat points from the previous thread (except if they need to say “I said X in the previous discussion and you actually didn’t address that in the PEP revision” of course!)

The current draft PEP is available at PEP 735 – Dependency Groups in pyproject.toml | peps.python.org.


Commenting on the 20-Nov-2023 draft …

path

The path must refer to the path to a built distribution (wheel, sdist, or any future file format) or a directory containing python package source code.

Is having a single key name for so many potential types the best way to go? The PEP doesn’t say how to tell what the path is pointing at. Is the assumption that tools will somehow infer it based on the contents of the path, be it a file or a directory?
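For illustration, this is roughly the inference a tool would have to do if a single path key covers all of these cases. It is a Python sketch under my own assumptions, not anything the draft specifies: the classify_path helper is hypothetical, and it relies only on directory checks and on the standard .whl and .tar.gz file suffixes.

```python
from pathlib import Path


def classify_path(path):
    """Guess what a `path` entry points at, using only the filesystem and name."""
    p = Path(path)
    if p.is_dir():
        return "source-tree"
    if p.name.endswith(".whl"):
        return "wheel"
    if p.name.endswith(".tar.gz"):
        return "sdist"
    # Anything else would need a rule the draft does not currently give.
    return "unknown"


for candidate in [".", "dist/foo-1.0-py3-none-any.whl", "dist/foo-1.0.tar.gz"]:
    print(candidate, "->", classify_path(candidate))
```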

The extras key is a list of strings, each of which is the name of an extra which should be included in the package installation.

This should probably explicitly call out project.optional-dependencies as what you mean by “extras”.

If only-deps is true, implementations MUST NOT install the package at the specified path. Instead, they should only install the dependencies of that package. This may still require building a package from a source tree in order to discover dynamic dependency data.

I’m not sure if this is worth including. As someone who has asked for --only-deps from pip, I would effectively have to set this for every dependency group, since I can’t really think of a situation where I wouldn’t want it that isn’t context-sensitive (e.g., when developing locally without a src/ layout I don’t need my code installed, but I may want it installed to verify builds in CI).

One thing that I think is missing from the PEP is something that causes project.dependencies not to be included in what to install. An example of this would be the docs example from PEP 735 – Dependency Groups in pyproject.toml | peps.python.org. If your documentation doesn’t require your package and its dependencies to be installed because you’re not using any “auto” feature of Sphinx, then why install it? This is the use case of wanting to specify the development tools you need to work on a project, but which operate on the static source files and not on the project being installed (or its dependencies).

I also ignored editable since I don’t have a use for it and I think we admit we want more of a workspace solution, but I also acknowledge that some people do have a use for it and editable installs aren’t worthless.

For now, let’s stick with one thread. I know I’ve seen “take two” and “reboot” threads for PEPs in the past, but I’m not convinced it won’t do more harm than good right now. We can always create a new summary thread, lock this one, and migrate discussion to a “reboot” if it seems like the right decision in the future.


I’ve just put in a body of work on the PEP to document the JS and Ruby ecosystem solutions.
Writing down some of the details about JS was interesting and revealing in a few ways. As a particular point of interest, peerDependenciesMeta in package.json relies on being able to match names against peerDependencies in very much the same way that we’ve suggested here that tool configs could refer to packages to extend standard metadata with tool-specific data.

It would be nice to have a smooth upgrade path from using string dependency specifiers to using tables of structured data, and to use the keys in the table representation as the package names. Combining these characteristics seems to be troublesome.
To clarify, this is a nice thing for new users to learn:

[dependency-groups]
test = ["pytest<8", "coverage"]

and this is a nice thing for ensuring each dependency is named:

[dependency-groups.test]
pytest = {version = "<8"}
coverage = {}

Perhaps simply allowing these two options side-by-side is best? (Either a dependency group is a list of strings or it is a table?)

I admit to wanting “something better” without seeing an obvious realistic option. How you represent coverage with no version in the above example is a particular sticking point.

It’s also unclear if this implies that an include of one dependency group in another needs to be named.

I was expecting that tools would inspect the path, but I’m not sure that there are realistic usages for built distributions here. The only use-cases I know of for path are path = "." and path = "../foo/".

Perhaps the correct step forward here is to restrict this to directories-only.
It would be relatively easy to extend this in a number of ways in the future if sdist or wheel paths turn out to be important. I’m not aware of any need to support the non-directory cases.

This comment tells me that something has gone wrong between my intent and the document. This should not only be clearly possible, but one of the main use cases being supported. I need to read with a critical eye to see how this was lost or muddied.

A Dependency Group does not implicitly include the current package (if there is one) or [project.dependencies]. So the non-autodoc sphinx case is readily supported by

[dependency-groups]
docs = ["sphinx"]

This is part of why I want to include path support in this proposal, so that {path = "."} can be used for Dependency Groups which want to include the current package.

In this way, Dependency Groups should read like a formalization of requirements.txt files, and many requirements files should translate naturally.
For example,

# test.txt
pytest
.

becomes

test = ["pytest", {path = "."}]
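A sketch of that translation in Python, covering only the simplest requirements.txt lines. The translate_requirements helper is hypothetical, the table keys follow the current draft’s path and editable notation, and refusing other pip options is a choice made for this sketch rather than anything the draft says.

```python
def translate_requirements(lines):
    """Translate simple requirements.txt lines into dependency group entries."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        if line.startswith("-e "):
            entries.append({"path": line[3:].strip(), "editable": True})
        elif line.startswith((".", "/")):
            entries.append({"path": line})
        elif line.startswith("-"):
            # Options such as -r or --find-links are pip-specific.
            raise ValueError(f"pip-specific option not translatable: {line}")
        else:
            entries.append(line)
    return entries


print(translate_requirements(["# test.txt", "pytest", "."]))
```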

This relates to the question of how to support the use cases which drive --only-deps, but they are also somewhat separate. The current draft’s only-deps flag provides for referring to [project.dependencies] directly, as --only-deps would. Otherwise, there is no way to refer to the contents of [project.dependencies] except to install the package.

If we drop only-deps from the spec, then I would want an example of dynamic [project.dependencies] content which refers to a dependency group.
For example:

[dependency-groups]
production = ["arrow"]  # a dependency group is declared
[project]
dynamic = ["dependencies"]  # dependencies is declared as dynamic

# and the build backend does the necessary translation
# potentially it errors at build time if the dependency data
# includes path dependencies
[tool.some-build-tool.dynamic]
dependencies = {dependency-groups = ["production"]}

The key thing is that it’s somehow possible to refer to the same data as [project.dependencies] but without the project. The above dynamic approach could achieve that, if build backends are interested in supporting it.
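What the backend side of this might look like, as a Python sketch. The some-build-tool configuration is the made-up one from the example above, the dynamic_dependencies helper is hypothetical, and the dict below is simply that example written out as already-parsed data.

```python
def dynamic_dependencies(pyproject, group_names):
    """Collect the requirement strings a backend could emit as dependencies."""
    groups = pyproject.get("dependency-groups", {})
    requirements = []
    for name in group_names:
        for entry in groups[name]:
            if not isinstance(entry, str):
                # As suggested above, a backend might error at build time
                # when the group contains path dependencies.
                raise ValueError(f"group {name!r} has a non-string entry: {entry!r}")
            requirements.append(entry)
    return requirements


pyproject = {
    "dependency-groups": {"production": ["arrow"]},
    "project": {"dynamic": ["dependencies"]},
    "tool": {
        "some-build-tool": {
            "dynamic": {"dependencies": {"dependency-groups": ["production"]}}
        }
    },
}
selected = pyproject["tool"]["some-build-tool"]["dynamic"]["dependencies"]["dependency-groups"]
print(dynamic_dependencies(pyproject, selected))
```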

I had a case where I needed to patch a library and always use my local copy in my environment. Being able to specify either the path to the actual file to install, or the equivalent of pip’s --find-links was needed for this.

I think it would be better to mention these things once the pull request is merged and the PEP document is updated. The changes you are referring to are not shown here yet:

Also the PEP would be clearer if it showed more complete/realistic examples of how a pyproject.toml making use of the relevant features might look.

This is why I want to see a more complete example of the pyproject.toml. I am pretty sure that when put together it will seem like this is a bit backwards: having dependency-groups refer to optional-dependencies but not the other way round is not how I would expect these things to be designed if starting from scratch. The PEP example already seems backwards to me:

[dependency-groups]
test = ["pytest", "coverage", {path = ".", editable = true}]
docs = ["sphinx", "sphinx-rtd-theme"]
typing = ["mypy", "types-requests", {path = ".", extras = ["types"], only-deps = true}]
typing-test = [{include = "typing"}, {include = "test"}, "useful-types"]

[project.optional-dependencies]
types = ["typing-extensions"]

Here we have the awkward construct {path = ".", extras = ["types"], only-deps = true} that extracts the dependencies from the .[types] extra minus the project itself so that they can be included in a dependency group. It would seem much more natural to me to define this the other way around and have the extra reference the dependency group:

[dependency-groups]
test = ["pytest", "coverage", {path = ".", editable = true}]
docs = ["sphinx", "sphinx-rtd-theme"]
types_extra = ["typing-extensions"]
typing = ["mypy", "types-requests", {include = "types_extra"}]
typing-test = [{include = "typing"}, {include = "test"}, "useful-types"]

[project.optional-dependencies]
types = [{include = "types_extra"}]

Now only-deps isn’t needed anymore because the dependency group types_extra does not implicitly include the project. I am sure that I would want to do this for project.dependencies as well. Also I would expect to use the same syntax in all three sections project.dependencies, project.optional-dependencies and dependency-groups but currently it seems like dependency-groups uses a very different syntax.
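To show that this direction needs nothing beyond include expansion, here is a Python sketch that flattens the groups from the second example. The expand_group helper is hypothetical and the cycle check is my addition; the draft’s exact rules for include may differ.

```python
def expand_group(groups, name, _seen=()):
    """Flatten {include = "..."} entries, failing on include cycles."""
    if name in _seen:
        raise ValueError("include cycle: " + " -> ".join(_seen + (name,)))
    expanded = []
    for entry in groups[name]:
        if isinstance(entry, dict) and "include" in entry:
            expanded.extend(expand_group(groups, entry["include"], _seen + (name,)))
        else:
            expanded.append(entry)
    return expanded


groups = {
    "test": ["pytest", "coverage", {"path": ".", "editable": True}],
    "docs": ["sphinx", "sphinx-rtd-theme"],
    "types_extra": ["typing-extensions"],
    "typing": ["mypy", "types-requests", {"include": "types_extra"}],
    "typing-test": [{"include": "typing"}, {"include": "test"}, "useful-types"],
}
print(expand_group(groups, "typing"))
print(expand_group(groups, "typing-test"))
```

The typing group expands to plain requirement strings without touching the project itself, which is why only-deps is not needed in this arrangement.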

I doubt we will want to change the specification of the [project] table, which seems to be required for this suggestion.

Regarding path/editable dependencies, if PDM and Poetry already have that format, who is concretely benefiting by keeping paths in this proposal?

  • Hatch wouldn’t because workspaces are coming and that is what users actually want.
  • Redistributors like Debian wouldn’t because why would they ever use editable installations for builds?
  • Dependency scanners certainly would not want to complicate their already complex logic, for a single language, by having to read the metadata of that path or literally building if referring to a project with only a setup.py.
  • etc. there are others but I don’t have much time right now

Yep, that was a bit of a faux-pas! The PEP doc has now been updated.
I’ll slow my roll and make sure updates are merged before I mention them.

My answer is that there are two cases:

  • PDM and Poetry would be harmed by their omission[1]
  • Use of path dependencies in requirements.txt

Before Poetry and PDM were available, a monorepo could declare path dependencies using requirements.txt data and have a build pipeline which runs pip install -r. That’s still valid usage today.


  1. Open for debate, but that’s my line of thought right now. ↩︎

My question was very specifically about who would benefit and so the first one I sort of disregard, leaving the requirements.txt point. I would say that file has all sorts of other features that are specific to pip and therefore I see even less of a justification now if that is your main point.
