PEP 751: one last time

The pull request “PEP 751: drop the requirement that installers default to not installing any extras or dependency groups” (python/peps #4300, by brettcannon) drops the requirement that installers default to not installing any extras or dependency groups. It also fixes the typos that Randy noticed involving extras and dependency-groups.


I just read through the latest proposal and only have very minimal feedback. Thanks @brettcannon!

  • Could we make packages.sdist.name and packages.wheels.name optional, so that they can be omitted if they’re equivalent to the last segment in the path?
  • Does [[packages.dependencies]] have a schema? Does it need one?
  • I’m a little surprised that the extras and dependency-groups markers were added but have no objection to the design. The only concern is that they do add a meaningful amount of work to the PEP implementation for installers…

One more thing regarding dependency-groups: I think it is necessary to define a name for the main dependencies (from project.dependencies) so that all lockers and installers use the same name. Otherwise, you cannot unselect them when installing, can you?

Just don’t select them in the first place? Given that what extras to install by default is tool specific, mechanisms for deselecting what is installed by default should also be tool specific.


I think it is not quite clear what I meant. For example, let’s assume a package is required for Python < 3.11 in project.dependencies and always in the dependency group “dev”. The marker in the lock file could look like this:

marker = 'python_version < "3.11" or "dev" in dependency_groups'

However, how shall an installer recognize that it must not install the package if only the “docs” group (without the main dependencies from project.dependencies) is requested by the user?

My point is that we need an implicit name for project.dependencies so that it can be used with the dependency_groups marker. For example, Poetry uses the name “main”, which would result in the following marker:

marker = '(python_version < "3.11" and "main" in dependency_groups) or "dev" in dependency_groups'

It shouldn’t recognise that. I’m afraid I still don’t understand your point here. That marker says “install this file if you’re on Python 3.10 or less, or if you requested the dev dependency group”. If you install stating you want the docs group and you’re on Python 3.10, then that marker says you install the file.

What I think you want is

marker = '(python_version < "3.11" and "dev" not in dependency_groups) or "dev" in dependency_groups'

I don’t see why this needs a “main” dependency group name.
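
As a sanity check on the two markers above, here is a minimal sketch that models them as plain Python predicates (this is not a real marker parser, and the function names are made up for illustration). It suggests the rewritten marker accepts exactly the same inputs as the original one, since `(A and not B) or B` reduces to `A or B`:

```python
from itertools import product

def original_marker(python_version, dependency_groups):
    # python_version < "3.11" or "dev" in dependency_groups
    return python_version < (3, 11) or "dev" in dependency_groups

def rewritten_marker(python_version, dependency_groups):
    # (python_version < "3.11" and "dev" not in dependency_groups)
    #     or "dev" in dependency_groups
    return (
        python_version < (3, 11) and "dev" not in dependency_groups
    ) or "dev" in dependency_groups

# Hypothetical inputs: a few interpreter versions and group selections.
versions = [(3, 10), (3, 11), (3, 12)]
group_selections = [set(), {"dev"}, {"docs"}, {"dev", "docs"}]

# True when both markers give the same answer for every combination.
markers_agree = all(
    original_marker(v, g) == rewritten_marker(v, g)
    for v, g in product(versions, group_selections)
)

# Requesting only "docs" on Python 3.10 still selects the package.
docs_only_on_310 = original_marker((3, 10), {"docs"})
```

Under this model, either spelling selects the package when only “docs” is requested on Python 3.10.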

Let’s specify the example in more detail. We have the implicit main group and two other groups:

  • “main” (aka project.dependencies): requires the package for python_version < "3.11"
  • “dev” always requires the package
  • “docs” does not require the package

How can the marker be defined correctly without an implicit main group so that:

  1. requesting only “docs” correctly does not install the package?
  2. requesting “main” and “docs” together correctly installs the package?
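
To make the two requirements concrete, here is a small sketch that assumes markers can be modelled as plain Python predicates over the Python version and the set of requested group names. The `"main"` name is the hypothetical synthetic group under discussion, not something PEP 751 defines, and the function names are illustrative only.

```python
def marker_without_main(python_version, dependency_groups):
    # python_version < "3.11" or "dev" in dependency_groups
    return python_version < (3, 11) or "dev" in dependency_groups

def marker_with_main(python_version, dependency_groups):
    # (python_version < "3.11" and "main" in dependency_groups)
    #     or "dev" in dependency_groups
    return (
        python_version < (3, 11) and "main" in dependency_groups
    ) or "dev" in dependency_groups

py310 = (3, 10)

# Case 1: only "docs" requested, so the package should NOT be installed.
case1_without_main = marker_without_main(py310, {"docs"})
case1_with_main = marker_with_main(py310, {"docs"})

# Case 2: "main" and "docs" requested, so the package should be installed.
case2_with_main = marker_with_main(py310, {"main", "docs"})
```

In this model, the marker without “main” selects the package in case 1 on Python 3.10, because nothing in its inputs says whether project.dependencies was requested. The marker with “main” gives the desired answer in both cases.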

To put it another way, poetry envisions that some dependency groups could be installed in a standalone environment without the primary project’s dependencies?

For tools that want to support that, couldn’t they define a dependency group called “main” (or some other name) and put nothing in the project.dependencies table?

Poetry’s CLI could then opt to automatically select that group unless the user requests that it not be installed.

That seems more explicit than having an implicit group for the base project, and thus makes it clear to the reader whether this is an expected manner of use for a particular lock file.

Edit: as discussed below, this should say the dependency-groups key in a pylock.toml file, not project.dependencies (which is part of pyproject.toml).


Yes. This is already state of the art.

They could (if you only consider locking/installing) but project.dependencies have a special meaning when building [1]. We could declare project.dependencies as dynamic and define a dependency group but that feels like a bad workaround.


  1. These are the runtime dependencies of the package.

This is where I get lost. There’s no “implicit main group”. All groups are explicit. If a requirement is in project.dependencies, then the project depends on it, and therefore it’s always installed.

Let me try to be even more precise.

  • We have a project, A. A depends on B (in project.dependencies) with a marker of python_version < "3.11".
  • We also have a “dev” dependency group, that installs A, plus another dependency C.
  • And we have a “docs” dependency group that installs a dependency D, but crucially does not install A - because if it did, B would necessarily be installed (you declared that it had to be!)

I’m concerned that you think it’s possible to declare that “docs” depends on “A without B”. Maybe that’s possible in Poetry, but it’s not supported by the existing standards, and so it’s not necessary for the lockfile spec to support it. Adding support would be a new standards PEP, which would have to cover the impact of the new feature on lockfiles, rather than being something needed in the lockfile spec.

Assuming my definition of the dependencies and dependency groups is accurate, though, we then lock the project. I’m going to assume we want a “universal” lockfile of some form, so installers can (in a tool-specific manner!) support requests like “install the project”, or “install the docs dependency group on its own”.

OK. So with that background, I think I (finally!) see the question you’re getting at. Lockfiles are supposed to be a list of files, which can be considered in isolation. So an installer can’t, when trying to decide if it should install B, use any knowledge of what else is being installed. And that’s a problem because “did the user ask for A to be installed” isn’t something that can be expressed in the marker syntax.

But that seems right to me - marker syntax shouldn’t reflect things like that.

I’m struggling a little to decide what’s best here, because the idea of a lockfile that “locks project A” being used to install a set of packages that doesn’t include A, just seems wrong to me. My recommendation (and my personal preferred workflow) would be to create a separate lockfile that holds the docs dependency group. But let’s accept this is something that people (or Poetry, at least) want.

Let me ask a different question. What would the installer UI look like here?

# 1. Install A
pip install --lockfile pylock.toml
# 2. Install the dev dependency group
pip install --lockfile pylock.toml --group dev
# 3. Install the docs dependency group
pip install --lockfile pylock.toml --group docs

How does pip have any idea that in case (2), A should be installed but in case (3) it shouldn’t? All it has to look at is the individual package sections in the lockfile, one by one. There’s literally no other information.

Furthermore, in (1), how does pip know to install A? The answer is that in the lockfile, there’s nothing to say don’t install A. That’s how everything gets handled - install it unless there’s something that says not to. So I assume that in this scenario, A would have to come with a marker that said “only if docs is not in the requested dependency groups” to support (3). But then we’re back to the original answer - if A is installed unless docs is in the requested dependency groups, we push that condition down onto A’s dependencies (recursively).
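
A rough sketch of that per-package evaluation, assuming each lock entry carries a marker that is modelled here as a plain Python predicate. The `packages` list and the `select` helper are hypothetical and are not pip's implementation. The “docs not requested” condition on A is repeated on B, A's dependency:

```python
def a_marker(python_version, groups):
    # A: install unless the docs group was requested.
    return "docs" not in groups

def b_marker(python_version, groups):
    # B: A's condition pushed down, combined with B's own version marker.
    return "docs" not in groups and python_version < (3, 11)

def c_marker(python_version, groups):
    # C: only part of the dev group.
    return "dev" in groups

def d_marker(python_version, groups):
    # D: only part of the docs group.
    return "docs" in groups

packages = [("A", a_marker), ("B", b_marker), ("C", c_marker), ("D", d_marker)]

def select(python_version, groups):
    # Each entry is judged in isolation, using only its own marker.
    return [name for name, marker in packages if marker(python_version, groups)]

install_project = select((3, 10), set())     # case (1)
install_dev = select((3, 10), {"dev"})       # case (2)
install_docs = select((3, 10), {"docs"})     # case (3)
```

These simple markers already misbehave if “dev” and “docs” are requested together: A is dropped even though “dev” wants it, so A's marker would have to become something like `"dev" in groups or "docs" not in groups`, and the same change would have to be repeated on B.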

This will rapidly become pretty complex, but honestly, I feel like that’s OK. The single-environment case remains my key priority here, and as long as we can support other cases somehow, that’s sufficient. I’ve said this before, but if including dependency groups and extras becomes problematic, I’ll be asking for them to be removed, not for additional complexity to be added.

My main objection to “just” agreeing a name for the default dependencies is that it has wider ramifications - it affects markers wherever they are used, and it impacts other proposals like PEP 771- Default extras. Plus, it breaks any current code that uses that name as a “normal” extra. I honestly don’t want to have to deal with working through those implications. And given that as far as I can see it isn’t needed - at best, it simplifies some of the marker expressions - I’d prefer not to get into those questions.


Fair point, and I agree that nothing should change in pyproject.toml. I meant to refer to the dependency-groups key in the pylock.toml file, not those listed in pyproject.toml, but used the wrong identifier. As far as I understand, nothing requires that the two files use the same list of dependency groups (since pylock.toml does not require a pyproject.toml file to exist at all).


Is it true that pylock.toml files are explicitly tied to locking a project (or that the project itself be installed)? In the “requirements.txt 2.0” use-case, there’s not really even a need for there to be a “project” against which to lock, just some requirements.

I agree that adding an implicit dependency group for the “main” group of dependencies (which could then be disabled) seems unwise. But I’m curious about whether you feel it would be permissible for a locker to add a “main” dependency group to the lock file that isn’t present in the pyproject definition to support this behavior.

The biggest downside I see to that design is that you end up with no “required” dependencies, so pip install --lockfile pylock.toml would not install anything - the user would be required to provide a dependency group to get anything to happen. That seems fine to me (this behavior is mostly to support poetry using this format to replace poetry.lock, and poetry would presumably default to including the “main” dependency group), but you may disagree.

Regarding your hypothetical:

IMO, it must be that (2) and (3) should behave the same and install any required dependencies and the requested dependency-group.

With my proposal that poetry add an explicit “main” dependency group to the pylock file, you could get the desired behavior using:

# 1. Install A
pip install --lockfile pylock.toml --group main
# 2. Install the dev dependency group
pip install --lockfile pylock.toml --group main --group dev
# 3. Install the docs dependency group
pip install --lockfile pylock.toml --group docs

This makes it explicit that under scenario (3), the user doesn’t want the “main” group. The only “trick” is that “main” was created by the locker (poetry) rather than being specified in the pyproject.toml file.
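
A minimal sketch of how those three invocations could resolve, assuming markers are modelled as plain Python predicates and that “main” is a locker-created group. The package names B, C, and D and the `install_set` helper are hypothetical; nothing here reflects an actual pip or Poetry implementation.

```python
PY310 = (3, 10)

# Hypothetical lock entries: package name -> marker predicate.
packages = {
    # Needed by the main dependencies on Python < 3.11, and always by dev.
    "B": lambda py, groups: (py < (3, 11) and "main" in groups) or "dev" in groups,
    # Only needed by the dev group.
    "C": lambda py, groups: "dev" in groups,
    # Only needed by the docs group.
    "D": lambda py, groups: "docs" in groups,
}

def install_set(py, groups):
    # Select every package whose marker is true for the requested groups.
    return sorted(name for name, marker in packages.items() if marker(py, groups))

only_main = install_set(PY310, {"main"})             # --group main
main_and_dev = install_set(PY310, {"main", "dev"})   # --group main --group dev
only_docs = install_set(PY310, {"docs"})             # --group docs
no_groups = install_set(PY310, set())                # no --group at all
```

With no group requested, nothing is selected in this model, which is the downside noted earlier: an installer that does not default to “main” installs nothing unless the user names a group.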



This behavior may not be defined in standards, but it’s already how uv, maybe Poetry, and pipenv work. You can lock some requirements without having your own package defined. uv uses the presence of a “build-system” key to indicate whether a package is being defined or not. If not, then there isn’t any “implicit main” group of dependencies.


Not at all, no. And in fact, my personal use cases are all more around locking an environment rather than locking a project (the “single-environment” case rather than the “multi-environment” one). But @radoering seems to be talking in the context of locking a project, so I’m trying to frame things in those terms.

I’m happy for lockers to do anything they want to get the results they are aiming for. The key for me is that because we are standardising lockfiles, it should be possible to install from a lockfile with a tool different than the locker, and get reasonable results - otherwise, why bother with a standard? What I’m uncomfortable about is adding more complexity to the lockfile standard to support more and more use cases that are unlikely to ever be used in a “one tool locks, a different one installs” scenario.

Yes, that’s a UX issue that is IMO outside the scope of the PEP. It’s a workable solution, but requiring people to type pip install --lockfile pylock.toml --group main is mildly annoying. Poetry could choose to default to main, but you can’t expect other installers to.

I agree - if Poetry only uses this internally in lockfiles that are only intended to be consumed by Poetry itself, that seems fine to me.

I disagree. Why do you think that (2) and (3) must install A (and by implication, its required dependencies)? It’s only the dependency group definition that says whether A is part of the dependency group. Extras are different - they can only be specified if you’re installing the project itself - so the problem doesn’t arise there.

I think this is the key point as far as the “everything is optional” tool specific use case goes: it is supported in the format (by the tool defining a “main” dependency group in pylock.toml rather than marking the default dependencies from pyproject.toml as unconditional), but actually using that capability means that exporting a separate lock file will still be necessary when defining specific group selections for basic installers to consume (no major process change from the status quo, just more consistent file formats)


Ah - I had misunderstood. In this hypothetical, package A is the local project itself and the question is whether it should be considered as a dependency under some dependency groups, but not others?

There had been discussion some time back about whether the project associated with the lock file should be included in the lock file. My recollection (which might be wrong!) was that, since there is no real meaning of “locking” something that is under development, it would not be part of the pylock.toml spec (this seems to be borne out in the current text of the PEP).

Thus: tools that are managing the project would likely look to pyproject.toml to make a determination about when the local package itself should be installed. pip, conversely, would not. Since the lock file never includes the local project, I would think that pip never installs that project (in the example above, “A”) when provided “just” a lock file as input.

This leads to a potential UX problem for pip, even with the “requirements.txt 2.0” use case: how should I as a user specify to pip “install the environment as described in this lock file, and also install the local project, but nothing else”? The existing “requirements.txt” format allows the inclusion of -e . to explicitly request the local project be installed, but that is not part of the pylock.toml spec.

I only raise it because this will surely be a common use case, and novice users (following an “install this app” style tutorial) may well be asked to follow it. To the extent that the PEP can help enable pip and other installers to streamline this particular workflow (without too much complexity), it’s at least worth considering.

This is off-topic for here, as it’s purely a pip UI issue, but I don’t see what’s wrong with

pip install --lockfile pylock.toml
pip install -e .

Not everything needs to be a single command.


I’m fine with that if no one objects.

I tried to word [[packages.dependencies]] such that you replicated the data for an entry as required to disambiguate it from other entries in [[packages]] (as inspired by uv). So I would say it implicitly has a schema of what [[packages]] can contain. If that isn’t clear I’m happy to try and clarify it somehow.

Miracles do happen and I occasionally catch a break. 😉

That’s why I proposed the table solution initially as I think that would have been easier to implement. But then you and Randy said you liked this solution more, so 🤷.

This came up before when we discussed whether dependency groups imply installing what project.dependencies represents (I believe @zanie brought this up?). In the end it was considered a UX decision of the installer as to whether the user needed to specify whether any “default” packages should be installed with a dependency group (which conceptually makes sense to me since dependency groups are designed for that use-case).

True, but thanks to what PEP 685 brought in you could use a name that couldn’t be represented outside of the lock file, e.g. “Default” – which PDM uses – or “Main” as uppercase letters are no longer allowed. Now this doesn’t apply to dependency groups, though, as the restrictions are much smaller, but you still could do _Default, _, etc. to have a name inside the lock file that no one could use from a pyproject.toml file for any modern configuration.
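
To illustrate the naming trick, here is a small sketch that assumes a lock-file-only name is “safe” when it is either not a valid name at all or not in normalized form. The `normalize` rule and the validity pattern are the ones from the packaging name specifications; the `is_normalized` helper and the candidate list are hypothetical, and whether a particular tool rejects non-normalized names is not something this sketch can confirm.

```python
import re

def normalize(name):
    # Name normalization rule applied to extras by PEP 685:
    # collapse runs of "-", "_" and "." into "-" and lowercase the result.
    return re.sub(r"[-_.]+", "-", name).lower()

# Valid names start and end with a letter or digit.
VALID_NAME = re.compile(r"^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$", re.IGNORECASE)

def is_normalized(name):
    # True only for names that are valid and already in normalized form.
    return bool(VALID_NAME.match(name)) and name == normalize(name)

candidates = ["Default", "Main", "_Default", "_", "main"]

# Names that a normalized extra can never be equal to.
collision_free = [name for name in candidates if not is_normalized(name)]
```

In this model, “Default” and “Main” are ruled out by normalization, “_Default” and “_” by the validity pattern, and only “main” could clash with a name written in pyproject.toml.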

The other option is a Boolean flag on [[packages]].

I don’t have an opinion on all of this (at least yet; I’m on vacation and a bit sleep deprived thanks to the baby having a bad night’s sleep).

I think in that instance it’s a question as to how often and in what scenarios will the tool generating the lock file not be the tool performing the installation? In the case of some service doing it (e.g. some cloud host), the PEP explicitly says they should check for a lock file just for that service or a dependency group in pylock.toml, so in that instance there’s a way to do this with Poetry making “main” an internal detail that it can use.

The discussion about including the project itself came down to wanting an easy way to leave it out when it wasn’t desired. As such, it was easier to consider it an optional thing to include depending on the purpose of the lock file (i.e. if the lock file is meant to exist independent of any other code, or it’s considered part of the project). So the PEP doesn’t really say anything on this topic, which implies you have to explicitly include a self-referential entry if you want it.

And the PEP does let you lock against a source tree, so you can do it.


From reading the latest (final?) version of the PEP, I have only minor notes; I hope they weren’t already discussed in one of the DPO threads.

Nits:

Given that packages.wheels.name exists, must the last component of packages.wheels.path or packages.wheels.url respectively be a valid wheel filename?

Regarding environments and the [[packages.dependencies]] being optional: When a tool produces a single platform lockfile, how does an installer know whether it is on the wrong platform? (For requirements.txt files, this is worked around by always doing a resolution, then installing the full set of packages from the resolution, or in the hash-checking case erroring because the resolution pulled in packages without a hash).

Two cases interesting for users are installing across tools (say, lock with uv and install with pip) and updating the lockfile across tools. My understanding is that the first case is a motivation for the PEP, while the latter is unsupported in the general case. Can we add a sentence about this to the “How to Teach This” section?

For directories, is an installer allowed to change whether they are installed as editable or not? Users often want to have editable installation by default for development, but a non-editable installation toggle for production (this e.g. helps with docker multi-stage builds because .venv no longer depends on other files).

Semantic differences with requirements.txt files
Backwards Compatibility

An interesting checkbox for the PEP: Does the new lockfile replace all uses of requirements.txt (the output format, not the input format), i.e. can we universally recommend migration from requirements.txt to pylock.toml? Otherwise, what features are we removing in pylock.toml with respect to an existing requirements.txt setup?

Regarding the marker syntax:

Why do we introduce 'async' in extras, when extra == 'async' exists? While the existing extra == syntax is not good syntax for extras, it’s the syntax supported for project.dependencies and the marker tooling is built around it; we’d be creating a special case for one file format only.

Third, the “<marker_op> operators that are not in <version_cmp>” will be changed from operating “the same as they do for strings” to “the same as they do for containers”. There are no backwards-compatibility concerns as strings are containers themselves.

I’m having trouble following this sentence, what happens when someone specifies 'extra-1' < extras?

I realized this morning a third option is introducing a default-group key that takes a string and represents a synthetic dependency group (I would probably leave the synthetic name out of dependency-groups, but that’s not a strong opinion). That gets around the name clash issue as tools can make up a name that does not clash. It also keeps it unified with marker expressions compared to a Boolean flag. It’s then up to the installer to provide the UX around when the default group is used.

No, and that’s never been guaranteed anyway by any standard (I’ve asked @charliermarsh about this before in regards to uv’s lock file and he said it wasn’t an issue yet, but also not a concern either).

If it leaves out environments it won’t, and that’s a bug in the locker.

Correct, and I can add a sentence about that.

packages.directory.editable isn’t a “SHOULD” in terms of following it, but it could be. I honestly put the flag in there because I think @charliermarsh and @radoering asked for it; I have no great attachment to it as I think whether something is editable can be a decision at install time.

No, because requirements files allow you to embed index details due to them potentially requiring resolving packages at install time (e.g. --constraint, --index-url, etc.).

There also isn’t a concept of referring to another lock file, i.e. -r.

Finally, environment variables are not supported.

I would argue that it isn’t really supported for project.dependencies either. The dependency specifier spec says, using extra is “An error except when defined by the context interpreting the specification”, and in that case I would say it’s only valid in Requires-Dist in core metadata.

Also, how do you specify those semantics? That extra operates on == as equality in specific scenarios but like in in other situations? I don’t see how that’s any better than a new marker with clearer semantics. An installer would still have to have two different ways of interpreting a marker expression compared to supporting another marker.

Two things. One is don’t do that. 😉 I would say that’s a bug in the locker for writing out that requirement. And since the PEP says extras and dependency_groups are only valid in lock files which are explicitly not for being handwritten, there isn’t a way for a person to write that out manually without circumventing tools (i.e. I wouldn’t say “someone” in your sentence, I would say “something”).

Two, that would be a runtime error anyway since 'extra-1' < [] is an error in Python. This is why the PEP says that the extras and dependency_groups markers represent containers and not strings like the other markers, as that influences what’s valid based on the fact that you can only specify strings to operate against. I can make the PEP explicitly say extras and dependency_groups cannot be strings and all preexisting markers are strings.
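
The Python behaviour referred to here can be sketched as follows. The list is only a stand-in for a container-valued marker, and how a real installer would surface the error is tool-specific and not shown.

```python
# Stand-in for a container-valued marker such as extras.
extras = ["extra-1", "async"]

# Membership tests are the operations that make sense on a container.
is_member = "extra-1" in extras
is_absent = "docs" not in extras

# Ordering a string against a list is rejected by Python itself.
try:
    "extra-1" < extras
    ordering_error = None
except TypeError as exc:
    ordering_error = exc

# Pre-existing markers are strings, where "in" is a substring test.
is_substring = "linux" in "linux2"
```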
