PEP 735: Dependency Groups in pyproject.toml

pradyunsg · December 19, 2023, 3:14pm

The full regex for non-normalised names is here: Package name normalization - Python Packaging User Guide

I think we should permit non-normalised names here, since not doing so would be inconsistent with extras; especially given that we’ve gone down the route of using the same normalisation approach for extras.

ofek · December 19, 2023, 5:42pm

Sorry I didn’t quite understand that comment, are you saying we should or should not normalize?

flyinghyrax · December 19, 2023, 6:00pm

I believe this is very subjective.

I have the opposite intuition. For me it only makes sense for project.dependencies to refer to a dependency group, and not the other way around. Personally I might even prefer if it could only be either a list of dependencies or the name of a single dependency group!

If - in the abstract - the dependency groups feature is modeling a project’s set of sets of dependencies, then the contents of project.dependencies are a member of that set. The configuration can then work in either direction and seem more or less intuitive person to person, either:

project.dependencies is like an implicit dependency group, so other explicit dependency groups can refer to it and include its contents.
project.dependencies is strictly wheel metadata, so dependencies are all specified in one place (dependency groups) and build tools can be told to populate the metadata using a dependency group

Personally I found 2 more intuitive, but others find 1 more intuitive, and I don’t know which would be more “valid”. There may be additional interpretations, this is just my reading of the back-and-forth in the thread thus far.

Edited to add: the interaction between this proposal and PEP 725 (external dependencies) is definitely… thought provoking. Using both mechanisms in a single pyproject.toml file seems like it would feel inconsistent?

oscarbenjamin · December 19, 2023, 6:43pm

The question is not about centrality but about how you can define the groups without duplication. The groups can be combined as set unions but not intersections, subtractions etc so e.g. it is not possible to define group B as “group A except without dependency Y” or something like that.

Since the only way to reuse groups when defining new groups is by adding to them you want to build larger groups out of smaller groups. If any dependency group is a strict subset of project.dependencies then it cannot be defined in terms of project.dependencies. In that case if project.dependencies cannot be defined in terms of any dependency group then its contents will need to be duplicated in both project.dependencies and in dependency-groups possibly along with version constraints etc.

It is more natural that you have a single place dependency-groups where all groups are defined and where larger groups are made out of smaller groups in an organised way without duplication. Then project.dependencies and optional-dependencies can just be references to these groups.

I have also done this but I don’t consider this interface convenient. I recently wanted to know what was the earliest version of A that had a dependency constraint B > x.y: I had to backtrack through the PyPI version pages downloading sdists.

It would be much better if PyPI could just display the dependency information directly on the web page like crates.io

pradyunsg · December 19, 2023, 9:02pm

We should permit non-normalised names and normalise them after parsing them when we need to compare things (i.e. handle these names like package names and extra names). In other words, I’m suggesting that the specification change like:

The dependency-groups table contains an arbitrary number of user-defined keys, each of which has, as its value, a list of requirements (defined below). These keys must be valid non-normalized names and should be normalised before comparisons. Tools SHOULD prefer to present the unmodified non-normalized name to users by default.

(using links to packaging.python.org where appropriate)

brettcannon · December 19, 2023, 11:01pm

If this is meant to be a way to sneak guidance to tool authors via this PEP then I think it should advise tools to validate input data upfront, then work with the data. That lets them provide better context when an unexpected input is provided.

pf_moore · December 19, 2023, 11:46pm

I wouldn’t call it “sneaking” guidance What I was thinking of was wording along the lines of “tools should ignore dependency groups that use content not recognised as conforming to this spec, but should not raise an error in such a situation”. Or maybe “tools must raise an error if unrecognised constructs are found” - I don’t actually care what decision the PEP makes (as long as it’s justified), I just think it should make some decision, and be explicit about it.

sirosen · December 20, 2023, 12:38am

I proposed pre-normalized names (thanks for the catch on the regex being incorrect, btw) in order to save implementers a little bit of work, but I’m not overly attached to it. If people like having the option for non-normalized names or want stronger consistency with extras, the only thing I’d like to call out is that I think the spec should recommend emitting an error whenever duplicate names (post normalization) are encountered.

The first draft had a construction like this, in a slightly different context, which instructed tools to ignore unrecognized data.
The feedback seemed pretty strongly to indicate that people wanted tools to error, since ignoring new data could be construed as a silent failure (and nobody needs convincing that those are bad).

I want to phrase this very carefully because I think there’s an opportunity to thread the proverbial needle here.

Tools SHOULD error when processing unrecognized data in Dependency Groups. They SHOULD NOT eagerly validate the list contents of Dependency Groups.

There, now you can have

[dependency-groups]
foo = ["click"]
bar = [{crazy-cool-new-feature = true}]

Don’t use bar with older tools when crazy-cool-new-feature is added, but you can still use foo.

Does that work as well as I think it does?

brettcannon · December 21, 2023, 8:32pm

I would tweak that phrasing to say, “They SHOULD NOT eagerly validate the list contents of all Dependency Groups” or “They SHOULD NOT eagerly validate the list contents of Dependency Groups that will not be used by the tool”. Basically clarify you can very much validate what you’re planning to use upfront, just don’t worry about stuff that doesn’t concern you.

But then again, if you’re a pyproject.toml linter, you want to phrase it such that they are not somehow suggested to not process everything.

pradyunsg · December 21, 2023, 9:01pm

A sentence with SHOULD seems reasonable to me.

rgommers · December 21, 2023, 9:30pm

I don’t think so, they’re about separate things. One is dev/test/doc etc. tools from PyPI, the other is things you need at build/runtime not on PyPI. I don’t see much interaction there.

I thought of one more reason why project.dependencies should not refer to dependency groups. Dependency groups without that link are something that only build frontends (and higher-level tools) have to know about, build backends do not. The PEP explicitly says:

Build backends MUST NOT include Dependency Group data in built distributions as package metadata.

and since dependency groups are not needed for building the package nor creating sdist/wheel metadata ([build-system] and [project] are enough), there is no reason for build backends to care. Hence it saves build backends the implementation effort to not have that link, and it’s a good separation of concerns.

sirosen · January 3, 2024, 11:55pm

There’s another update now live for PEP 735.

(Aside: would an admin mind editing the initial post here to have a link to the PEP? I’m not able to edit it, which I’m guessing is a Discourse rule for threads with many replies. But I think only having the draft doc link in there is less nice than a link to the actual draft PEP.)

To summarize the changes:

Remove the PEP 723 use case.

Given that 723 is moving towards “script metadata” as opposed to “embedded pyproject.toml”, this makes sense to drop. The IDE Use Case Appendix Item has been updated because it previously referred to some of the content from the PEP 723 use case.

Change to non-normalized names which require normalization.

This is covered in the start of the specification section.

Clarify Include meaning and behaviors and forbid cycles.

Consider jumping to the section on includes to review.

The bit on cycles is almost a verbatim copy of Paul’s suggestion, stating that data with include cycles is invalid and that tools should error on cycles.
Clarifying the meaning of includes was a tricky balance between specificity and verbosity. Hopefully the current text got that right.

Add a section on validation and compatibility which clearly defines future-compatible behavior
Update the Open Issues to remove “include lists” and add “includes of [project] tables”.

There’s still a significant open issue, which is how to share data across the [dependency-groups]/[project] table boundaries.

I see three basic options:

declare the problem out of scope and hope we can solve it in a future PEP
declare a syntax for [project] to include from [dependency-groups]
declare a syntax for [dependency-groups] to include from [project] (probably with the restriction that the data MUST be static, as discussed above)

I’m very wary of punting on this, since it may be harder to introduce later than in the initial spec.
Perhaps (3) should be our choice on the basis of its practicality. The spec can include a new syntax for this with the requisite rules and there’s no backwards compatibility question (since the table is new).

For an example of possible syntax:

[dependency-groups]
foo = ["a"]
# bar = ["a", "b", "c"]
bar = [{include = "foo"}, {project-include = "dependencies"}]
# baz = ["d", "e"]
baz = [{project-include = "optional-dependencies.snork"}]
[project]
dependencies = ["b", "c"]
[project.optional-dependencies]
snork = ["d", "e"]

project-include could be defined to be an include which operates on the list found at "project.{value}".
Is this appealing? I’m posting this as a kind of “thinking aloud”.

pf_moore · January 4, 2024, 11:22am

I’m not 100% happy with (3) because of the messy questions about dynamic dependencies, and the fact that I still feel that it has information flowing “the wrong way”. But I can see the argument that (2) opens up too many questions about extending the syntax of [project].

Some questions, also very much of a “thinking aloud” nature:

Are there any use cases where someone would want to use a dependency group that referenced the project dependencies, where they didn’t also want to install the project itself? To put it another way, in your example syntax, what is a real-world example of using bar, which doesn’t also install the project?^[1]
Is the only reason for not having a syntax to say “this project” in a dependency group (i.e., what was previously suggested as ".") because people might sometimes want to install the project as editable? If so, what’s wrong with a group bar = [{include = "foo"}, "."] and pip install -group bar (normal install) or pip install -e . -group bar (editable install)? I’m not suggesting re-introducing path syntax, but rather simply having an equivalent of "self".
It feels confusing to me to use the term “dependency” in both “dependency groups” and “project/extra dependencies” if we allow project/extra dependencies to be in dependency groups. I can’t really articulate my problem very clearly, but I feel like this could result in a lot of user confusion if we’re not careful. On the other hand, I don’t have that same sense of confusion with option (2) - the idea that the project dependencies are made up of one or more groups of dependencies feels natural to me, in a way that having a dependency group include the project dependencies doesn’t.

This feels like the same sort of unease that I have with the --only-deps idea for pip. It feels somehow linked to using pip for build workflow management rather than for pip’s core purpose, which is installation. Maybe PDM, Poetry or hatch would view this feature differently. Isn’t that how we started down this route? ↩︎

jamestwebber · January 4, 2024, 3:11pm

This might be bikeshedding, but does this version open up any cans of worms?

[project]
dependencies = ["b", "c"]
[project.optional-dependencies]
snork = ["d", "e"]
[dependency-groups]
foo = ["a"]
bar = [{include = "foo"}, {include = "project.dependencies"}]
baz = [{include = "project.optional-dependencies.snork"}]

I prefer the way this reads (and writes, although that may be unfamiliarity). But separating the keys probably would make implementation easier.

This could be restricted to the project table for now with the potential to allow other tables in the future (this might be a path toward including dynamic dependencies in the future?)

brettcannon · January 5, 2024, 12:41am

I like option 2, OK w/ option 3, and agree that we should just solve this now.

If you don’t use a src/ layout for pure Python projects (like me), then an editable install is redundant. This is why I sought out an --only-deps solution.

The only time I remember . being brought up was by me, but it tried to do more due to {include} not existing as an idea yet. There’s also the concern we will get asked to support the syntax in project.-optional-dependencies for specifying extras, e.g.:

[project.optional-dependencies]
test = ["pytest"]
lint = ["ruff"]
dev = [".[lint, test]"]

And this isn’t hypothetical; I know of projects relying on pip supporting a project’s own name being used in an extra.

Personally, I’m fine w/ this solution.

jamestwebber · January 5, 2024, 1:10am

Another flavor would be something like {include = "project"}. This fits together with my previous post in that I’m essentially defining the value of {include} as “some table in this file”.

To really formalize it, maybe require a preceding . to denote “one of the tables in the [dependency-group] table”, like it’s a relative path. So a full example would be:

[project]
dependencies = ["a", "b"]
[project.optional-dependencies]
snork = ["d", "e"]

[dependency-groups]
foo = ["a"]
# this would only install deps, not the project
bar = [{include = ".foo"}, {include = "project.dependencies"}]
# equivalent to bar + the project itself
bar2 = [{include = ".foo"}, {include = "project"}] 
baz = [{include = "project.optional-dependencies.snork"}]

The relative path feels clunky to me (I think I’d forget) but it avoids the possibility of name collisions.

EpicWink · January 5, 2024, 4:32am

What use-cases are there for specifically requiring the dependencies in an extra-group and not the main runtime dependencies? This new proposal looks overly complex, and I can only think of situations where you want the project’s dependencies and some set of the project’s extras.

For that matter, is there a use-case for using dependency-groups and installing the project’s dependencies without the project itself? I get the feeling it was alluded to in this thread, but I can’t find it with a quick search. I think it could be solved outside of pyproject.toml, eg with:

pip install --groups test,docs -e '.[cuda]' --no-deps

PEP editorial comments:

The use-cases appendix’s section headers are at the wrong level
The use-cases link in the rationale is broken

Only if you don’t need the project accessible anywhere (without setting PYTHONPATH). At that point, having an installable project is unnecessary, and could be just a script or set of scripts (which I guess leads the discussion of having a pyproject.toml for non-installable projects).

Even without the src-layout, I install projects as editable so I can make changes and run a script-entrypoint from anywhere (after activating the virtual environment).

ofek · January 8, 2024, 4:17pm

True until Python 3.12 where you can now exclude the current working directory from consideration (I believe the plan is for this to become the default).

brettcannon · January 9, 2024, 12:06am

Not that I’m aware of, especially since that would break workflows. Plus I’m sure there would be a way to opt back in even if this did occur.

sirosen · January 11, 2024, 12:53am

In 100% agreement with your footnote, I think this is the same as the --only-deps case. It’s not a case that I’ve experienced myself, so my understanding of this case is weak. I may need to reread some of the threads on this topic.

I think any such need could be solved by having dependency group inclusion interact with [project.dependencies] in either direction, since any desire for --only-deps can be satisfied by making a dependency group which is synonymous with [project.dependencies].

As we discussed the path dependencies, one of the things which I grew uncomfortable with was the realization that specifying . would mean that a build happens. Even if behavior is newly defined for this, it’s not necessarily the same as whatever the project’s preferred build frontend does. If a project has any need to configure the build environment, this could start to break down.

Having thought more about this, I’m not ready to reintroduce the idea that a dependency group can refer to the current package as a package. I could be convinced that it would be okay to do so, but right now I would need convincing.

I’m not sure we have a clear use case we’re satisfying by including . as a dependency. It “feels natural” and we know that users will want environments with a dependency group (like “test”) + the current package. But are we making things significantly better for users by including it?

For a tool like hatch or tox which can already install . in addition to some extended set of dependencies or extras, I don’t think there’s any particular benefit in having a dependency group include .. For example, for tox, we’re really talking about replacing

[testenv]
deps = -r test-requirements.txt
commands = pytest

with

[testenv]
dependendency-groups = test
commands = pytest

In both cases the installation of . is managed separately. I think tox would actually find this harder if test included a reference to ., since . is being requested twice via two different paths: once in the dependency group and once implicitly as part of a tox testenv without skip_install=true.

For direct pip usage, we’re comparing pip install --dependency-group test against pip install . --dependency-group test. (Or maybe two invocations of pip, but still something like that.)
So it’s really similar in simplicity, although you could argue that it’s subtle.

Before we reintroduce . of “self” or “current package”, I want to have a clearer handle on what we’re gaining by including it and what we’re losing by excluding it.

This is consistent with @EpicWink’s comment:

I’m not sure that there is a clear use case for it. A lot of this thread assumes that there is such a case, but as far as I know we haven’t clearly articulated cases. I will have one such case detailed below, at the end of this comment.

Right now, I’m reading and trying to get a better understanding of what kinds of includes between [project] and [dependency-groups] are important or useful.

I appreciate that you’re playing around with syntaxes for this, and I think the “relative path” trick is cute/clever (I mean that in a positive way) but too subtle to be a good interface.

We want it to be relatively obvious at a glance what each “thing” in the [dependency-groups] and [project] is. Currently {include = ...} means a Dependency Group Include. If we need multiple types of includes, maybe include is a bad keyword to use as a bare name, and it should be a vocabulary of things like

{include-group = "foo"}  # Dependency Group
{include-optional-dependencies = "bar"}  # Extra
{include-dependencies = true }  # [project.dependencies] list

Thanks for these! I’ll get the PEP updated with some fixes.

Use Cases for `[project]` Includes

I only have one user story which I can articulate clearly, and that’s related specifically to static analysis.
This applies mostly to type checkers, but other analyzers like pylint sometimes have similar requirements.

Basically, type checking a codebase requires that all of the runtime dependencies of that project are present. It also often requires that some or all extras are installed. (Theoretically, there could be conflicting/mutually exclusive extras, which could all also be needed over multiple runs of the analysis, but I’ve never seen this in practice.)

So, for a simple case, imagine a project with some library requirements and one extra:

[project]
dependencies = ["a", "b"]
[optional-dependencies]
foo = ["c", "d"]

In order to type check this project, assuming the use of mypy, the following packages need to be installed:

mypy, a, b, c, d

How should this project write the requirements?

Well, here’s how we can write it in one of the proposed forms from this thread:

[project]
dependencies = [{include = "runtime"}]
[optional-dependencies]
foo = [{include = "foo"}]
[dependency-groups]
runtime = ["a", "b"]
foo = ["c", "d"]
typing = ["mypy", {include = "runtime"}, {include = "foo"}]

Name the group pylint and swap mypy for pylint and you have the same case, but for a different analyzer.

Could this be satisfied by installing .[foo]? Yes, but two things have been lost:

installing . is wasted work – the analyzer doesn’t need it, but it is being built and installed anyway
if the typing dependency group cannot express the need for these other dependencies, it is now incomplete

There’s another element of the way that a test dependency group would typically interact with extras which I’m thinking about but not yet sure how it impacts things:

test requirements are often needed as part of a matrix of build configurations over multiple extras, which may be dependent on the python runtime version.
For example, the following test configurations may be desired for a package where a toml extra refers to tomli and a yaml extra refers to pyyaml:

dependency_groups	extras	pythons
test	(none)	3.9, 3.10, 3.11, 3.12
test	toml	3.9, 3.10
test	yaml	3.9, 3.10, 3.11, 3.12
test	toml, yaml	3.9, 3.10

etc.

tox, nox, etc already let you build these kinds of matrices. Do we benefit from allowing dependency groups to include extras in these cases, or vice-versa? I tend to think not, but maybe I’m missing a potential interaction here.

PEP 735: Dependency Groups in pyproject.toml

Use Cases for [project] Includes

Use Cases for `[project]` Includes